OASIS Mailing List Archives

 



dita message



Subject: RE: [dita] Proposal for Consideration: Default Behavior for List Items


In my own schema development work I distinguish between paragraph-level elements, which may contain only text and phrase-level (inline) elements, and more complex blocks such as lists and tables, which contain paragraph-level elements and other such "block" elements.
 
With this distinction, what I advocate is that an element be either a paragraph-type element like <p>, admitting only phrase-level children and #PCDATA, or the more complex type like <li> which admits <p> and <table> and <ol> etc. but does not admit #PCDATA and phrase-level elements.
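
As a rough illustration of the either/or rule described above, here is a minimal Python sketch of a checker. The element sets (PHRASE, PARAGRAPH_TYPE, BLOCK_CONTAINER) and the function names are assumptions chosen for the example; they are not the DITA content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical classification for illustration only (not the DITA DTD).
PHRASE = {"b", "i", "ph", "xref"}
PARAGRAPH_TYPE = {"p"}
BLOCK_CONTAINER = {"li", "ol", "ul", "section"}
BLOCKS = PARAGRAPH_TYPE | BLOCK_CONTAINER | {"table"}


def has_text(elem):
    """True if elem has non-whitespace character data among its direct content."""
    if elem.text and elem.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in elem)


def violations(root):
    """List (tag, reason) pairs for elements that break the either/or rule."""
    found = []
    for elem in root.iter():
        children = {child.tag for child in elem}
        # Paragraph-type elements admit only text and phrase-level children.
        if elem.tag in PARAGRAPH_TYPE and children & BLOCKS:
            found.append((elem.tag, "block inside paragraph-type element"))
        # Block containers admit only blocks, no bare text or phrases.
        if elem.tag in BLOCK_CONTAINER and (has_text(elem) or children & PHRASE):
            found.append((elem.tag, "text or phrase directly inside block container"))
    return found
```

Under these assumptions, a list whose items each wrap their text in <p> produces no violations, while an <li> holding bare text is reported.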
 
Michael justly objected that such a change to DITA would wreak havoc with specializations and with processors/processes. I concurred that it seems to be too late to rectify this in DITA.
 
However, it is useful to distinguish between paragraph-level and block-type elements. It is only in the latter that this particular sort of rendering issue intrudes into what should be strictly a structural/semantic markup, and equivocation between the two clouds discussion.
 
To be more explicit, browsers interpret
 
.....<ol>
.......<li>
...........<p>Text</p>
.......</li>
.......<li>
...........<p>Text</p>
.......</li>
.....</ol>

with extra white space between the list items and
 
.....<ol>
.......<li>
...........Text
.......</li>
.......<li>
...........Text
.......</li>
.....</ol>
 
without. As I understand this thread, something like this behavior was carried forward into the OT treatment of <li> in DITA.
 
The preceding is an example of text containing a single sentence interrupted by two block-type examples.

However, note that even in HTML <p> can contain only phrase-level elements, so that the above example must be marked up as something like
 
<p>Browsers interpret</p>
<blockquote>
.....&lt;ol&gt;
.......&lt;li&gt;
...........&lt;p&gt;Text&lt;/p&gt;
.......&lt;/li&gt;
 
.......&lt;li&gt;
...........&lt;p&gt;Text&lt;/p&gt;
.......&lt;/li&gt;
.....&lt;/ol&gt;
</blockquote>
<p>with extra white space between the list items and</p>
 
etc. In other words, it is not the case in HTML that <p> is a block-type element that can be interrupted by other block-type elements. As the HTML spec says, "The P element represents a paragraph. It cannot contain block-level elements (including P itself)." In XML, the semantic container is the parent of these children, such as <section> or <concept>.
 
Erik, you said:
>From an authoring perspective, it is attractive to be able to avoid
>unneeded markup (extra noise) when a list item has a single block.
Authors typically depend on an authoring tool to handle the markup. In an authoring tool, it is easy to provide for automatic insertion of a default <p>, and only a very slight nuisance for the author to delete or move the <p> if some other element such as <table> is to come first. Consequently, I think the authoring perspective has little weight in this particular issue, in which the primary concern is the separation of format/rendering from structure/semantics. All I have attempted then, barring any real change, is to affirm the importance of distinguishing between levels of structural complexity, and of not using a category such as "block" equivocally (as, e.g., the HTML spec does even while making this distinction).
 
    /BN


From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Sunday, June 15, 2008 12:23 PM
To: Bruce Nevin (bnevin)
Cc: Andrzej Zydron; dita@lists.oasis-open.org; Michael Priestley
Subject: RE: [dita] Proposal for Consideration: Default Behavior for List Items

Hi, Andrzej, Bruce, and Michael:

Not to protract the thread, but Jim Early and Scott Hudson (I think) had the insight based on their mapping work that the DITA block elements are correctly understood as block containers.

That is, the start tag and end tag declare block boundaries, but the element itself can contain multiple blocks, as in:

... <section>
....... First block.
....... <p>
........... Second block.
........... <lq>
........... Third block
........... </lq>
........... Fourth block.
....... </p>
....... Fifth block.
... </section>

From a processing perspective, there's nothing ambiguous about that. The processor can find the block boundaries without issue.
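
To make the processing point concrete, here is a minimal Python sketch of how a processor might recover the blocks from the example above. It assumes every element is a block container, so it does not handle phrase-level elements; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def blocks(elem):
    """Yield each run of text bounded by a start or end tag, in document order.

    Assumption: every element is a block container, so any tag boundary
    is also a block boundary.
    """
    if elem.text and elem.text.strip():
        yield elem.text.strip()
    for child in elem:
        yield from blocks(child)
        # Text after a child's end tag starts a new block in the parent.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


section = ET.fromstring(
    "<section> First block. <p> Second block. <lq> Third block </lq>"
    " Fourth block. </p> Fifth block. </section>"
)
print(list(blocks(section)))
```

For the sample markup this yields the five text runs in order, from "First block." through "Fifth block."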

From an authoring perspective, it is attractive to be able to avoid unneeded markup (extra noise) when a list item has a single block.

....... <li>Single block.</li>

but take advantage of additional markup for multiple blocks:

....... <li>First block.
........... <p>Second block</p>
....... </li>

People have been doing this without issue in HTML for ages.

So, the big question is whether authors are significantly more likely to abuse the markup as follows:

... <section>
....... A sentence spanning
....... <p>A paragraph.</p>
....... before finishing.
... </section>

than as follows:

... <section>
....... <p>A sentence spanning</p>
....... <p>A paragraph.</p>
....... <p>before finishing.</p>
... </section>

The markup itself can no more prevent the second case than the first.

Neither usage would occur to me. Are there a significant number of users who would create the first but not the second example?


Hoping that's useful,


Erik Hennum
ehennum@us.ibm.com


