Subject: DOCBOOK: Re: doc domain vs. problem domain semantics
>From: Norman Walsh <ndw@nwalsh.com>
>To: docbook@lists.oasis-open.org
>Subject: DOCBOOK: Re: doc domain vs. problem domain semantics
> (Re[2]: listitem)
>Date: Mon, 31 Dec 2001 16:43:41 -0500
>
>/ "Matt G." <matt_g_@hotmail.com> was heard to whine:
>| As a matter of fact, I'd guess that more often than not,
>| variablelist is used to list things other than variables.
>
>Ah, the "variable" in variablelist isn't really for programming
>language variables. It's really a description list. In fact, if
>HTML had come first, we probably would have called it a
>descriptionlist. As it is, I don't really recall the etymology.

Okay, that sure had me confused. Why not transition to descriptionlist?

>| This gets the subject of
>| my message, and the tangent the thread is getting off to, which
>| is that since there aren't semantics rich enough to describe the
>| types of
>| formatting structures people use in documents, the more
>| domain-specific ones are fallen back upon, as a crutch. This
>| has the effect of ruining the semantics of the domain-specific
>| markup, particularly if it's uses are mixed, within a single
>| document.
>
>I'm not sure I understand what you're trying to say.

Sorry, I should have been consistent and said "problem domain-specific". I should clarify that I'm referring to 3 domains:

  problem domain: what the author is trying to describe (e.g. classes, types, commands, processes, pipes, message queues, streams, groups, RPC calls, sockets, daemons, character devices, etc.)

  document domain: the document constructs (I mean structural, as in paragraphs and tables, but the line between document structures and presentation is fuzzy, in places)

  application domain (or solution domain): generically, the means by which the problem is solved (e.g. typeface, font style and size, margins, pagebreaks, indentation, etc.)

The richness of the structure and information decrease, as you get more towards the application domain. Clearly, DocBook is focused on the two former.
What I was saying is that "tag abuse", as you called it, effectively ruins the semantics of the abused tag. So, a deficiency in problem domain semantics, w/o a suitable fallback in the document domain, leads to the potential abuse of another problem domain construct (which directly damages the most richly structured information that has the greatest potential longevity and utility). I thought variablelist was an example of this. On the other hand, if you have a document construct on which to fall back, your document may not be as richly structured as it could be, but at least it's not as destructive as tag abuse.

>But I will point out that there's a constant tension between
>general markup and specific markup. DocBook tries to achieve a
>good balance for computer software and hardware documentation. But

Of course. As any experienced schema designer can attest, formalizing concepts can be difficult. I think DocBook generally does a good job.

>The entire design of DocBook is geared to make it possible for
>you to write customization layers that provide the exact markup
>that you need.

Right, but do you think DocBook is rich enough to serve as an intermediate format for most types of publications, without resorting to tag abuse? In other words, are its document domain semantics sufficiently rich to provide all the structural constructs most documents need?

If not, do you consider this goal to be realistic? If you do, then how far off the mark do you consider DocBook to be? Where would you draw the line; for what types of publications could the document domain semantics of DocBook (or a spiffed up version) be used, as an intermediate format? Textbooks? Newspapers? Magazines? (The latter two are really collections of documents, of course.) How would you characterize the dichotomy between document structures that are (or would be) supported and those that aren't?
For example, it's true that some magazines are awfully layout-oriented, but if DocBook (or some derivative format) isn't suitable for authoring them, why not? Where does the real problem lie?

>| More importantly (in the short-term) it doesn't even appear to
>| be nested, at all, in the DSSSL print style-sheets (version
>| 1.74b - the latest).
>
>I've lost the beginning of this thread, what doesn't appear nested?

variablelists. They don't nest properly, with DSSSL print style-sheets (version 1.74b), using the TeX backend & OpenJade v1.3. I suppose I should whip up an example and submit a bug report.

>| So, is there really no desire to augment it to be better suited
>| for more general documentation tasks and more easily adaptable
>| to other sorts of problem domains than HW/SW?
>
>There are thousands of things that we could add that would
>ideally suit the needs of one community or another. DocBook
>could be extended to provide structures suitable for medical
>publishing, for legal publishing, for automotive manufacturing
>publishing, etc. ad infinitum.

See, that's exactly *not* what I'm talking about. I'm wondering how suitable of a *foundation* (for layering or augmentation) you think DocBook is or could be, so *others* could leverage much of the work done on DocBook and many of the existing (and future) tools.

>But I'm not sure that's the best approach. People often complain
>that DocBook is too big. Making it 10 times bigger is probably not
>a good idea.

I agree (with you and those people). Though I disagree with the approach of Simplified DocBook (possibly because it's intended to solve some problems I'm not concerned with). I think a more appropriate solution would be to partition the elements into a document domain group, and a number of different problem domain-specific groups (e.g. publishing meta, program sourcecode doc, program usage doc, hw/sw concepts, and misc.). Put them in separate schemas, and maybe even namespaces.
Also, document them in separate groups.

...and now comes the really foolish thing:

>| IMO, the DocBook DTD (which, admittedly, I haven't really spent
>| much time dissecting) should be partitioned into document
>| construct and HW/SW constructs (in addition to the various other
>| classes of attribute and entity definitions). Stylesheets, too.
>| This would make it easier for say a biotech publication or
>| physics department of a major university to use the core
>| documentation semantics as a foundation for their own
>| field-specific documentation vocabulary, without carrying extra
>| baggage or suffering with unnecessary name collisions with
>| semantics foreign to their domain.
>
>With respect, DocBook is designed *specifically* to make this
>possible. Perhaps you ought to spend some time looking at it.

"with respect"? I don't see why ;)

Okay, my apologies. I sometimes get very idealistic and consumed with thinking about how things *should* be structured, while being a little slow to dig into the details (perpetually feeling like "I'm really too busy to spend much time on this, just now").

However, here's a suggestion: rather than simply structuring it that way, internally, why not do one or both of the following:

* Document it that way, rather than just lumping all the elements together
* provide a release of the DTD and/or stylesheets without any of the HW/SW-specific stuff.

>| Do you see that what I'm interested in is two things:
>| 1) Preserving the semantics of HW/SW-specific constructs, by
>| providing suitable fall-backs
>| 2) Allowing DocBook to be more easily adapted to other domains,
>| either through augmentation or as a richly structured
>| intermediate format.
>
>I suppose. I think it would be helpful if you made some concrete
>proposals.

As for point #1, I can probably spend some time going through TDG w/ a fine-tooth comb and might come up w/ more examples like "variablelist".
On the topic of #2, I wish I could, but I'm no publishing guru, and that's the kind of person who I imagine could say whether "it's all there", or identify "what's missing", or decide that "DocBook is too far off the mark and there's not the will or resources among the TC to get there", or "it just doesn't make sense to carry all the legacy of DocBook, and it'd be easier to start from scratch". To be honest, I'm definitely not that person, so I'm really just wondering aloud, and am interested in hearing your/others' views on this issue.

>| So, you don't have a tool to generate your dependencies
>| automatically, do you? I'll soon whip one up, in Python. I
>| probably won't bother to
>
>Generating dependencies for things like entities is easy. But the
>processing semantics of included fragments isn't self-evident so
>I'm not sure there's a way to make a tool for it.

Huh? What do you mean by "included fragments"? You mean like the 'fileref' attribute of <imagedata> instances? That's an example of what I think it'd be nice to use a command-line XPath or XQuery tool to collect. I'll probably just end up writing an XSLT script to do it, though (obviously, a separate means would be necessary to collect entity references, unless XSLT 2.0 includes this info).

So, what are you saying has processing semantics such that it'd be unclear whether you'd want to rebuild the document, if it changed?

>| would support XSLT 1.1. I also dearly wish it had a
>| command-line flag for specifying an SYSTEM id search path (for
>| external entities and DTD subsets), similar to the '-I' option
>| supported by most C/C++ compilers!!
>
>Check out the XML Catalogs specification and try using public
>identifiers.

I already use automatically generated catalog files to resolve the latest (theoretically) compatible DTD, for a given document. But I don't think catalog files are an efficient way to manage entity resolution.
There's no way I want to be forced to maintain a separate list of locations for each entity I'm using in my document. Furthermore, Catalog files' inability to provide more than one layer of indirection forces you to automatically generate them, for use within a directory tree that's under source control and used by multiple developers. Finally, most(?) XSLT tools don't even support catalog files.

Why is OpenJade's '-D' (which works like '-I', for most C preprocessors) a bad way to go? I think it's the best tradeoff between control, ease of use, and low maintenance burden, for my purposes. I just wish Xalan supported it.

> Be seeing you,
> norm

Yeah, I appreciate your good humor about these things. I do tend to ramble.

Matt Gruenke
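
P.S. For what it's worth, here is a rough sketch of the kind of dependency collection discussed above: gathering the 'fileref' attributes of <imagedata> elements so a build can tell when a document needs regenerating. It is only an illustration under some assumptions, not an existing tool: the function names (collect_filerefs, make_rule), the sample document, and the make-style output format are all made up for this example. It also does not handle external entity references, which, as noted above, would need a separate mechanism.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def collect_filerefs(xml_text):
    """Return the distinct 'fileref' values of all <imagedata> elements,
    in document order."""
    root = ET.fromstring(xml_text)
    refs = []
    for elem in root.iter():
        if local_name(elem.tag) == "imagedata":
            ref = elem.get("fileref")
            if ref and ref not in refs:
                refs.append(ref)
    return refs


def make_rule(target, deps):
    """Format a make-style dependency line (hypothetical output format)."""
    return f"{target}: {' '.join(deps)}"


# Hypothetical sample document, used only to exercise the functions.
SAMPLE = """<article>
  <mediaobject>
    <imageobject><imagedata fileref="figs/a.png"/></imageobject>
  </mediaobject>
  <para>Some text.</para>
  <mediaobject>
    <imageobject><imagedata fileref="figs/b.png"/></imageobject>
  </mediaobject>
  <mediaobject>
    <imageobject><imagedata fileref="figs/a.png"/></imageobject>
  </mediaobject>
</article>"""

if __name__ == "__main__":
    print(make_rule("article.pdf", collect_filerefs(SAMPLE)))
```

A document that declares entities in an external DTD subset would need a parser that actually loads that subset; the standard-library parser used here will reject entity references it cannot resolve.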