
Subject: Re: [docbook] RFE: RDFa in Docbook 5

From: Joshua Wulf <jwulf@redhat.com>
To: Jirka Kosek <jirka@kosek.cz>
Date: Wed, 14 Dec 2011 01:33:07 +1000

Thanks for your reply Jirka, and for explaining the situation with RDF.
I've been reading up on the points you mentioned earlier. I'll take
another look at profiling. My sense of it is that it might do some of
the things we need, but not all. I'll investigate it further, along with
microformats and microdata.

Some history of what we've been up to might provide useful context:

We've been authoring using topics in Docbook for over a year now. Our
toolchain (Publican) supports Docbook 4.5, so we're currently using the
Docbook 4.5 DTD. We recast the <section> element as a <topic> and made
it do.... "unnatural things" (drums fingers together).

The section element is a good fit because its structural role in
compiled output is deterministic.

Initially we stored the topics in an svn repository; wrote some
command-line automation around creating and retrieving them; used
xi:include over HTTP to build with them; put our metadata into comments
in the XML; and used OpenGrok to index the svn repository and provide
our discovery mechanism for authors.

This year we moved our metadata store into a relational database. That's
when we really started to notice things, because suddenly we could query
it using SQL.

The obvious kinds of "programmatic recombination", if you will, are
something like: "Give me all the Concepts for this Platform, Grouped by
Technology Component, ordered alphabetically" = hey presto, a glossary.
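
A minimal sketch of that glossary query, using Python's built-in sqlite3
module. The `topics` table, its column names, and the sample rows are all
hypothetical, invented for illustration; the actual schema is not described
in this message.

```python
import sqlite3

# Hypothetical schema: one row per topic, with its metadata dimensions
# stored as plain columns. None of these names come from the real system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topics ("
    "id INTEGER PRIMARY KEY, title TEXT, type TEXT, "
    "platform TEXT, component TEXT)"
)
conn.executemany(
    "INSERT INTO topics (title, type, platform, component) VALUES (?, ?, ?, ?)",
    [
        ("Session Bean", "Concept", "PlatformX", "EJB"),
        ("Entity", "Concept", "PlatformX", "EJB"),
        ("Deploying an application", "Task", "PlatformX", "EJB"),
        ("Queue", "Concept", "PlatformX", "Messaging"),
        ("Topic", "Concept", "PlatformY", "Messaging"),
    ],
)


def glossary(conn, platform):
    """Return {component: [concept titles]} for one platform, sorted."""
    rows = conn.execute(
        "SELECT component, title FROM topics "
        "WHERE type = 'Concept' AND platform = ? "
        "ORDER BY component, title",
        (platform,),
    ).fetchall()
    grouped = {}
    for component, title in rows:
        # Rows arrive already ordered, so each list stays alphabetical.
        grouped.setdefault(component, []).append(title)
    return grouped


for component, titles in glossary(conn, "PlatformX").items():
    print(component, titles)
```

The grouping and ordering are just the `ORDER BY` columns, so picking a
different pair of metadata dimensions gives a different view of the same
topics.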

My understanding is that a SPARQL query could do this kind of thing, but
I'm not sure if profiling could.

We can build output using a content specification that is the equivalent
of a DITA map (we reference the topics by their unique ID in our
database for the machine, and by the topic title for the humans), and we
can also build an "information cloud" (semantic web?) output by defining
which metadata dimensions we want to use as the grouping and ordering
factors (much like any author would do when deciding how to structure a
book). Where we want to take that is to allow the end-user, the reader,
to determine which view of the data they want.

We use our metadata store to encode relationships like: "This task topic
A is a prerequisite of this task topic B". This way, building a document
with topic B can pull in topic A as a dependency. These kinds of
relationships seem easy to encode in RDF, but I don't think profiling
can do this.
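
A rough sketch of how such a prerequisite relationship could pull in
dependencies at build time. The triples mimic the subject/predicate/object
shape of RDF, but the predicate name `isPrerequisiteOf`, the topic IDs, and
both function names are assumptions made up for this example, not part of
any real vocabulary or of the toolchain described here.

```python
# Hypothetical (subject, predicate, object) triples, shaped like RDF.
TRIPLES = [
    ("topic-A", "isPrerequisiteOf", "topic-B"),
    ("topic-B", "isPrerequisiteOf", "topic-C"),
]


def prerequisites_of(topic, triples):
    """Return the topics that are direct prerequisites of `topic`."""
    return [s for (s, p, o) in triples if p == "isPrerequisiteOf" and o == topic]


def build_order(topic, triples):
    """Return `topic` plus its transitive prerequisites, prerequisites first."""
    ordered = []
    in_progress = set()

    def visit(current):
        if current in ordered:
            return
        if current in in_progress:
            raise ValueError(f"circular prerequisite involving {current}")
        in_progress.add(current)
        for prereq in prerequisites_of(current, triples):
            visit(prereq)
        in_progress.remove(current)
        ordered.append(current)

    visit(topic)
    return ordered


print(build_order("topic-B", TRIPLES))
print(build_order("topic-C", TRIPLES))
```

This is a plain depth-first traversal, so a document built around one topic
picks up everything it depends on, in an order where each prerequisite
appears before the topic that needs it.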

Maybe we just continue to keep the RDF/microformat/microdata separate
from the topics, as we are doing now for the metadata...

I guess the reason that I would like RDFa flowing from Docbook to HTML
is that it means that we could use pre-rendered views (like games do),
and not require a lot of processor power on the server to manipulate and
render Docbook on the fly. If the HTML is already rendered, and has the
metadata embedded in it, then it's less processor-intensive.
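
To make the idea concrete, here is a small sketch of what a pre-rendered
topic with embedded metadata might look like. It is not output from the
DocBook stylesheets. It assumes RDFa 1.1 style attributes (`prefix`,
`about`, `property`, `rel`) and the Dublin Core terms `dcterms:title` and
`dcterms:requires`; the function name, topic IDs, and HTML layout are
hypothetical.

```python
from html import escape


def render_topic(topic_id, title, body, requires=()):
    """Render one topic as an HTML fragment carrying RDFa attributes."""
    tid = escape(topic_id)  # escape() also escapes quotes by default
    prereqs = "".join(
        f'<li><a rel="dcterms:requires" href="#{escape(r)}">{escape(r)}</a></li>'
        for r in requires
    )
    return (
        f'<div prefix="dcterms: http://purl.org/dc/terms/" '
        f'about="#{tid}" id="{tid}">'
        f'<h2 property="dcterms:title">{escape(title)}</h2>'
        f"<p>{escape(body)}</p>"
        f"<ul>{prereqs}</ul>"
        "</div>"
    )


print(render_topic("topic-b", "Install & configure", "Body text.", ["topic-a"]))
```

Because the fragment is rendered once and the relationships travel inside
it, a client or crawler could read the prerequisites without going back to
the Docbook source or the metadata database.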

Again, maybe the answer is to pre-render the Docbook topics to HTML, and
keep the metadata separate but available at "runtime".

Either way, I think that the topic path leads inexorably toward semantic
metadata. :-)

Thanks for the heads-up on profiling, microformats, and microdata. I'll
investigate those further. As you've explained, the amount of work
involved in integrating RDFa into Docbook is non-trivial, and it is far
from the clear victor in the marketplace of ideas, even in its
birthplace, the W3C.

We may not be able to make the books sentient just yet (Hitchhiker's
Guide to the Galaxy, anyone?), but I think we can realize some more of
the potential of being liberated from a preprinted linear format, and
having a medium that can participate, along with the author and the
reader, in the creation of narrative.

- Josh

On 12/14/2011 12:30 AM, Jirka Kosek wrote:
> On 13.12.2011 13:50, Joshua Wulf wrote:
> 
>> Some decisions are non-arbitrary, in the sense that there is only one
>> clear "right choice". For example: in a guide for beginners we would
>> omit information that is clearly for advanced use cases. In an
>> "Administrators Guide" we would omit information that is only of
>> relevance to Developers.
> 
> For this you can use profiling. It's an easy concept and well supported in
> tools:
> 
> http://www.sagehill.net/docbookxsl/Profiling.html
> 
>> What if... we could encode that information in the built html?
>>
>> In that case, as well as offering a set of predefined static narratives,
>> we could also offer the reader the ability to navigate the information
>> set along multiple potential pathways. In other words, the reader
>> participates in the creation of the narrative (in an assisted way), in
>> much the same way that an author would have created a "best guess" or
>> "one size fits all" narrative for them.
> 
> You can implement this on the server side, and then you will work
> directly with DocBook/XML sources. If you want this to work on the client
> side then you are stuck with JavaScript, and I don't think that I would like
> to manipulate RDFa in JavaScript.
> 
>> It also opens the door to more targeted search, the opportunity for
>> external systems to discover, consume, and mashup our content, and even
>> the possibility of expert systems that could use our documentation to
>> answer user questions.
> 
> I have studied knowledge engineering and AI, and since that time I'm a
> little bit skeptical of such ideas. But this shouldn't prevent others from
> "trying".
> 
> 				Jirka
> 

