dita message

Subject: RE: [dita] Proposal for Consideration: Default Behavior for List Items
From: "Bruce Nevin (bnevin)" <bnevin@cisco.com>
To: "Andrzej Zydron" <azydron@xml-intl.com>, "Michael Priestley" <mpriestl@ca.ibm.com>
Date: Tue, 10 Jun 2008 17:06:29 -0400

Thanks for the cc to me, Andrzej. (Could someone please point me, offline,
to the right help file to learn how to engage this thread properly? Is it
a subscription thing?)

> If it's a relic of HTML, I'm not sure why it's a bad relic. The 
> adoption of HTML hasn't exactly been crippled by this approach.


I suspect that on careful consideration, Michael, you might want to
rephrase that. I know I don't have to tell you that HTML browsers have a
much simpler rendering task because they're just about format, so the
HTML spec can get away with ignoring semantic criteria. We can't. That's
why a relic of HTML that is not semantically motivated is a bad thing
for XML, and a bad thing for DITA.

Case in point: To say that the text before a list (which may contain
paragraphs, tables, lists, figures, etc.) is in the _same_paragraph_ <p>
as the text after that list is perverse and contrary to any ordinary
notion of "paragraph". Conversely, if DITA has its own definition of
"paragraph" allowing that, then why not allow <p> as a child of <p>? The
same logic that proscribes that should proscribe anything like it,
including lists.

There is undoubtedly a cost to correcting bad decisions after their
effects have become established. Bear in mind, though, that if they are
not corrected, such decisions may become a barrier to adoption in the
future, once the utility of an on-ramp to XML wears off and users want
closer semantic control of their content. We spoke of usability issues
in the TC today.
Here is one staring us in the face. Users found it confusing. Very
possibly OT developers found it confusing, whence the disparate
rendering. Making a clean categorization of elements in terms of their
complexity could reduce confusion and simplify OT work. By complexity I
mean something like: phrase can only contain #PCDATA, para can only
contain phrase and #PCDATA, "block" = {list, table, ...} can only
contain para and "block", etc.
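
A minimal sketch of how such a layered categorization could be checked
mechanically. The category sets below are assumptions chosen for
illustration, not the DITA content model, and the function names are
hypothetical. The two samples are the list-item forms quoted later in
this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical categories for illustration only; not the real DITA model.
PHRASE = {"ph", "b", "i", "u"}
PARA = {"p"}
BLOCK = {"li", "ul", "ol", "table"}


def has_pcdata(elem):
    """True if elem directly carries non-whitespace text."""
    if elem.text and elem.text.strip():
        return True
    # Text that follows a child element is stored on that child's tail.
    return any(child.tail and child.tail.strip() for child in elem)


def violations(elem):
    """Collect messages for elements that break the layered model."""
    found = []
    child_tags = {child.tag for child in elem}
    if elem.tag in PHRASE and child_tags:
        found.append(f"<{elem.tag}>: phrase may contain only #PCDATA")
    elif elem.tag in PARA and child_tags - PHRASE:
        found.append(f"<{elem.tag}>: para may contain only phrase and #PCDATA")
    elif elem.tag in BLOCK and (has_pcdata(elem) or child_tags - PARA - BLOCK):
        found.append(f"<{elem.tag}>: block may contain only para and block")
    for child in elem:
        found.extend(violations(child))
    return found


# Mixed form: text and a list inside the same <p>, bare text inside <li>.
MIXED = """<li>Do something.
  <p>One of three things happens:
    <ul><li>A</li><li>B</li><li>B</li></ul>
    trailing text
  </p>
</li>"""

# Layered form: every run of text is wrapped in its own <p>.
CLEAN = """<li><p>Do something</p>
  <p>One of three things happens:</p>
  <ul><li><p>A</p></li><li><p>B</p></li><li><p>B</p></li></ul>
</li>"""

for name, sample in (("mixed", MIXED), ("clean", CLEAN)):
    print(name, len(violations(ET.fromstring(sample))))
```

Under these assumed categories the mixed form is flagged at the outer
<li>, at the <p>, and at each of the three inner <li> elements, while
the layered form passes.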

This is representative of a larger issue. Another example is the
decision to make lists and tables semantically distinct. That is
properly a rendering distinction. Any table can be rendered as a list
whose list items (the row elements) are parallel in structure. Any list
whose items are parallel in structure (such as a list of steps) can be
rendered as a table. Development of adaptable facilities for semantic
tables is one of the unresolved challenges and potential benefits of
XML, and that decision to sunder lists from tables obscures the means.
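
To make the interchangeability concrete, here is a rough sketch that
treats "parallel in structure" as "every item has the same field names
in the same order". That reading, the function names, and the sample
steps are all assumptions made for illustration.

```python
def list_to_table(items):
    """Render structurally parallel items as a header plus rows."""
    header = list(items[0])
    if any(list(item) != header for item in items):
        raise ValueError("items are not parallel in structure")
    return header, [[item[name] for name in header] for item in items]


def table_to_list(header, rows):
    """Render each table row as one list item labeled by column."""
    return ["; ".join(f"{name}: {cell}" for name, cell in zip(header, row))
            for row in rows]


# Hypothetical list of steps whose items share the same structure.
steps = [
    {"step": "1", "action": "Open the file"},
    {"step": "2", "action": "Save the file"},
]

header, rows = list_to_table(steps)
for item in table_to_list(header, rows):
    print(item)
```

The same data passes through both renderings unchanged, which is the
sense in which the list/table choice is a matter of presentation here.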

That's a digression from the current thread, so we ought not to pursue
it here. I just mention it to indicate that this is part of a larger
issue of relics of HTML format markup that may be lurking, which should
have been put in question relative to the SGML standard during the
inception of DITA, but which for whatever reason were not. 

	/BN

> -----Original Message-----
> From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
> Sent: Tuesday, June 10, 2008 4:05 PM
> To: Michael Priestley
> Cc: dita@lists.oasis-open.org; Bruce Nevin (bnevin)
> Subject: Re: [dita] Proposal for Consideration: Default 
> Behavior for List Items
> 
> Hi Michael,
> 
> Your example failed to highlight the real problem, which is:
> 
> <li>Do something.
>       <p>One of three things happens:
>           <ul><li>A</li>
>                   <li>B</li>        
>                   <li>B</li>        
>           </ul>
>         that really screw up segmentation, translation and any sane
>         form of linguistic processing.
>       </p>
> </li>
> 
> The problem is that HTML was a VERY BAD IMPLEMENTATION of 
> SGML. It concentrated on form rather than structure (mixing 
> up both which is, if not a sin against humanity, then 
> definitely one against common sense ;) ), which is why we 
> needed XML. Basing an XML vocabulary on HTML (which would not 
> even parse in SGML terms after about version 2.0) was, at 
> best IMHO a dubious choice.
> 
> Rather like <b>, <u>, <i> and translatable attributes this 
> should all be consigned to the DITA 'deprecated' bin of 
> history (BTW the same should be true of CONREF for individual 
> nouns or noun phrases), and good riddance to it all. Anybody 
> who has had to cope with translating such documents will 
> testify to the difficulties involved therein.
> 
> Best Regards,
> 
> AZ
> 
> 
> 
> Michael Priestley wrote:
> >
> > A few points:
> >
> > - This would be a backwards-incompatible change. That is, it would
> > render invalid a large proportion of the existing DITA content out
> > there. I think we could consider this for 2.0 if the cost of
> > converting all back-level content was justified by the benefits (I'm
> > not currently convinced myself, but that would be the timeline to make
> > the arguments)
> > - This would also render the current task specialization invalid,
> > since it specializes a <ph> element as the first child of <step>. As
> > an exercise, see what any of the list specializations would look like,
> > if only block-level elements were allowed (I suspect it would break
> > most of them).
> >
> > Finally, and leaving aside the pragmatic reasons not to make a
> > backwards-incompatible change to the schemas and DTDs at this point,
> > I'm still not sure why this:
> >
> > <li><p>Do something</p>
> >        <p>One of three things happens:</p>
> >        <ul><li><p>A</p></li>
> >                <li><p>B</p></li>        
> >                <li><p>B</p></li>        
> >        </ul>
> > </li>
> >
> > Is better than this:
> >
> > <li>Do something.
> >       <p>One of three things happens:
> >           <ul><li>A</li>
> >                   <li>B</li>        
> >                   <li>B</li>        
> >           </ul>
> >       </p>
> > </li>
> >
> > If it's a relic of HTML, I'm not sure why it's a bad relic. The
> > adoption of HTML hasn't exactly been crippled by this approach.
> >
> > Michael Priestley
> > Lead IBM DITA Architect
> > mpriestl@ca.ibm.com
> > http://dita.xml.org/blog/25
> >
> >
> > *"Bruce Nevin (bnevin)" <bnevin@cisco.com>*
> >
> > 06/10/2008 12:22 PM
> >
> > To: <dita@lists.oasis-open.org>
> > cc: "Bruce Nevin (bnevin)" <bnevin@cisco.com>
> > Subject: RE: [dita] Proposal for Consideration: Default Behavior for
> > List Items
> >
> > [Not sure if this is the right way to contribute to this thread, but I
> > don't see any contributor hooks on the page or in the Help. Responding
> > to http://lists.oasis-open.org/archives/dita/200804/msg00060.html.]
> >  
> > I agree that rendering is an OT issue.
> >  
> > The real issue IMO is that <li> permits #PCDATA and phrase-level
> > elements. These should only be permitted in paragraph-level elements,
> > and any element that permits paragraph-level or "larger" elements as
> > children should not permit #PCDATA and phrase-level elements. This
> > behavior seems to be a relic of the HTML standard.
> >  
> > It is easy for OT and vendors to insert <p> by default, and if <li> 
> > begins with some other child element it is only a minor nuisance to 
> > delete <p> or insert that child ahead of <p>.
> >  
> > This would simplify the work of rendering and remove the ambivalence
> > that is the topic of this thread.
> >  
> > Perhaps this is already being considered for 1.3 or 2.0.
> >  
> >     /Bruce Nevin
> 
> --
> email - azydron@xml-intl.com
> smail - c/o Mr. A.Zydron
> 	PO Box 2167
>         Gerrards Cross
>         Bucks SL9 8XF
> 	United Kingdom
> Mobile +(44) 7966 477 181
> FAX    +(44) 1753 480 465
> www - http://www.xml-intl.com
Follow-Ups:
- RE: [dita] Proposal for Consideration: Default Behavior for List Items
  - From: Michael Priestley <mpriestl@ca.ibm.com>
References:
- Re: [dita] Proposal for Consideration: Default Behavior for ListItems
  - From: Andrzej Zydron <azydron@xml-intl.com>