Subject: DOCBOOK: Re: doc domain vs. problem domain semantics
Sorry for letting so much time go by before responding to this.

>From: Norman Walsh <ndw@nwalsh.com>
>To: "Matt G." <matt_g_@hotmail.com>
>CC: docbook@lists.oasis-open.org
>Subject: Re: doc domain vs. problem domain semantics
>Date: Thu, 03 Jan 2002 12:23:12 -0500
>
> >>I'm not sure I understand what you're trying to say.
>[...]
>| the two former. What I was saying is that "tag abuse", as you
>| called it, effectively ruins the semantics of the abused tag.
>| So, a
>
>A few observations:
>
>1. The semantics in question are only "ruined" for the document(s) in
>which tag abuse occurs.

Or for any collection of documents into which the problematic document(s) are indistinguishably lumped. If the tag abuse is widespread enough, something like a mere 10% of bad documents (under certain circumstances) can theoretically cause tool authors to do The Wrong Thing(tm), effectively ruining the semantics of the tag for all. As I'm sure you're aware, entropy is a powerful force.

>2. You can use the role attribute to distinguish your "abusive" uses
>from "real" uses and thereby avoid ruining anything irreparably.

That's a good suggestion for the well-intentioned and enlightened abuser. The others, being lazy or ignorant enough to resort to tag abuse in the first place, are the least likely to bother with (or even consider) the "role" attribute.

>3. The DocBook Technical Committee (TC) is actively maintaining
>DocBook. If you have a construct for which there is no suitable tag,
>and the problem domain you are working in is not too far afield,
>chances are the TC will address the issue.

The problem is that this approach doesn't scale well. That's what I'm trying to address.

>4. If you need a new element and either can't wait for the TC to
>consider it, or if the TC rejects your use case for some reason, you
>can always add it yourself.

I'd like to see people start maintaining sets of application-specific customizations + stylesheets for DocBook.
Then people could assemble packages that include these customization modules and their associated stylesheet modules. Of course, without namespaces, this approach won't scale very well.

> >>The entire design of DocBook is geared to make it possible for
> >>you to write customization layers that provide the exact markup
> >>that you need.

I see this as the core competency of the TC. If they do a good job with document structure and meta info (and they have), then there can be customizations for dozens of fields of every sort.

>On the one hand, there are publications for which vastly more
>presentational information is required (layout-driven magazine
>publication, for example). I don't think DocBook should go there.

Actually, it seems like you could probably even use DocBook for layout-intensive publications, like magazines and newspapers. You'd use DocBook for the article sources, then use a separate page layout tool into which the text flows are imported. You wouldn't put any images, or maybe even sidebars, in your DocBook source - all of that would be relegated to your layout tool (the text for sidebars could be kept in separate DocBook documents).

>If you really wanted to keep structure and presentation separate,
>and let's say you wanted to use DocBook for the structural part, my
>"off the top of my head" solution would be to design a new
>vocabulary for describing highly detailed presentational semantics
>and then point from that document back into the "semantic" DocBook
>document.

Right, and the layout tool I described could be used to produce files of that type. Perhaps it could even be implemented on top of XSL, since you'd want some very declarative, general mechanisms for styling objects embedded in the text flows (e.g. headings, various inline elements, etc.).

>On the other hand, there are publications that have highly detailed
>semantic constructs that aren't used in computer software and
>hardware documentation.

Right.
And I believe that up to about 250 of DocBook's elements would be useful in such a context. So all the architecture work that went into creating those, as well as the associated stylesheet components, documentation, etc., could be leveraged for potentially tens or hundreds of other fields.

>cleanly into DocBook. I think if we tried to make DocBook the kitchen
>sink of semantic markup, we'd end up with 2000 elements and the whole
>enterprise would collapse under its own weight.

I think two things are happening. DocBook is maturing, which is a good thing, because it's also reaching the limits of the TC's ability to maintain and extend it. Of course, that's a very uninformed opinion, so I could be completely wrong. There's no doubt, though, that it's getting big enough to intimidate new users. Organizing the documentation in a more logical and structured fashion could go a long way toward addressing this.

>My recommendation, if you want to use DocBook in another community,
>would be that you find a few other people in that community that
>share your interest and design the semantic constructs that you
>need. Then make a customization layer of DocBook that discards the
>things you don't need and add the things you do. (I'd be happy to
>participate, at least as an observer, in such a process.)

What would be great is to update Ch. 5 of TDG to include more guidelines for designing a new module and for architecting the associated stylesheet customizations (kept high-level, assuming people know XSL) and documentation. Also, maybe I'm missing where this is addressed, but Ch. 5 of TDG seems like it could use a section, "Alternatives to Customizing DocBook", which could describe use of the 'role' attribute.

>| I agree (with you and those people). Though I disagree with the
>| approach of Simplified DocBook (possibly because it's intended to
>| solve some problems I'm not concerned with).
>| I think a more
>| appropriate solution would be to partition the elements into a
>| document domain group, and a number of different problem
>| domain-specific groups (e.g. publishing meta, program sourcecode
>| doc, program usage doc, hw/sw concepts, and misc.). Put them in
>| separate schemas, and maybe even namespaces. Also, document them
>| in separate groups.
>
>Looking at this pragmatically, I observe that what you're suggesting
>would be *a lot* of work and it wouldn't directly benefit DocBook's
>principal community in any direct way.

I disagree. For one thing, software is often written to solve problems in a domain other than computer hardware/software. Making it easier for people to add one or more customization modules specific to other fields should be seen as being in line with the TC's goals.

But you do have a point. The biggest advantages of this effort would be felt slightly outside the TC's purview. It would still benefit the core DocBook user community, though, because the overall user community could quickly grow by an order of magnitude or more. That translates into better tools, better support, and better documentation for all users.

>That isn't a good reason not to do it, but it does mean that I want
>to wait until there's at least one other community that would
>directly benefit from this exercise.

Are you certain that's not already the case?

>| However, here's a suggestion: rather than simply structuring it
>| that way, internally, why not do one or both of the following:
>|   * Document it that way, rather than just lumping all the
>|     elements together
>|   * provide a release of the DTD and/or stylesheets without
>|     any of the HW/SW-specific stuff.
>
>I tell you what.
>If you take the list of elements in DocBook and
>divide them into those two groups: foundational and HW/SW-specific,
>post your division to the list, and see if there's any disagreement,
>and if we (the readers and posters on the list) can reach a mutual
>understanding of where the dividing line is, I'll consider it.
>
>I think you'll find 100 elements in the former category, 100 in the
>latter, and about 100 that no one can agree on.

The disputed elements could always be duplicated. I think 100 is a bit much, though. Nearly all the elements seem to fit in some distinct category or another. I'll send my list in a follow-up message.

>| Huh? What do you mean by "included fragments"? You mean like the
>| 'fileref' attribute of <imagedata> instances? That's an example
>| of what I think it'd be nice to use a command-line XPath or XQuery
>| tool to collect. I'll probably just end up writing an XSLT script
>| to do it, though (obviously, a separate means would be necessary
>| to collect entity references, unless XSLT 2.0 includes this info).
>
>I often use tools to extract bits of files or preprocess files to
>produce something I can include in my document. For example, this
>Makefile rule extracts a fragment of addrbook-old.xml and produces
>address.1 which I include in my source document.
>
>  address.1: addrbook-old.xml
>          xinclude -d -x "/*/address[1]" $< $@
>
>A tool that notices that mydoc.xml depends on address.1 isn't very
>useful (IMHO). And I can't think of any way to encapsulate the rule
>above in my document for an automatic tool to extract.

Actually, this is a perfect example of what I was talking about. You have a naming convention such that 'address.<n>' is the nth entry of addrbook-old.xml. So, you might rewrite your rule as:

    address.%: addrbook-old.xml
            xinclude -d -x "/*/address[$*]" $< $@

I think that syntax might be specific to GNU Make, but the '$*' expands to the text that the '%' matched.
Anyhow, suppose you have a tool that parses mydoc.xdbk into a makefile fragment that gets included in your makefile, so that make knows mydoc.xdbk depends on address.1, address.17, address.473, and address.94371, all of which the pattern rule tells it how to build. This is the tool I still haven't gotten around to writing.

>| resolution. There's no way I want to be forced to maintain a
>| separate list of locations for each entity I'm using in my
>| document.
>
>If -I would find it for you, why do you have to maintain it by hand?
>
>Actually, I think I need an example, I'm not sure what you're looking
>for.

That's a good point. I think I'll write a tool that takes '-I' options, searches for the PUBLIC identifiers of all the external entities, and generates a catalog file that contains the mapping. I know it's not what PUBLIC identifiers were intended for, but I can't put explicit relative or absolute paths in the entity definitions, since automatically generated external entities might be produced in different directories and at different depths, depending on certain build options.

Matt Gruenke
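For illustration only, here is a rough sketch of what the dependency-scanning tool described above might look like. It is not an existing program. It assumes the document pulls in its pieces either through external entities declared in the internal DTD subset or through 'fileref' attributes, and the names collect_dependencies and makefile_fragment are made up for this example. It uses Python's standard xml.parsers.expat module, which reports entity declarations without trying to load the referenced files.

```python
import xml.parsers.expat


def collect_dependencies(xml_text):
    """Return a sorted list of files that the given document refers to.

    Hypothetical helper: looks only at external entity declarations in the
    internal DTD subset and at 'fileref' attributes on any element.
    """
    deps = set()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External parsed entities have no literal value, but they do
        # carry a system identifier naming the file to include.
        if value is None and system_id is not None and notation is None:
            deps.add(system_id)

    def on_start_element(tag, attrs):
        # e.g. <imagedata fileref="..."/>
        if "fileref" in attrs:
            deps.add(attrs["fileref"])

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text, True)
    return sorted(deps)


def makefile_fragment(target, deps):
    """Format one 'target: prerequisites' line for inclusion in a makefile."""
    if not deps:
        return ""
    return "{}: {}\n".format(target, " ".join(deps))


if __name__ == "__main__":
    import sys

    # Usage (assumed): python scan_deps.py mydoc.xdbk > mydoc.d
    path = sys.argv[1]
    with open(path, encoding="utf-8") as handle:
        sys.stdout.write(makefile_fragment(path, collect_dependencies(handle.read())))
```

The generated line could then be pulled into the main makefile with an 'include' directive, leaving a pattern rule such as the 'address.%' one to say how each prerequisite is built. Entities declared in an external DTD subset, and files referenced through a catalog, are outside the scope of this sketch.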