

Subject: Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)



This would add <text> to a lot of different contexts, which is exactly what I didn't want, since it has no semantics tied to it, and if we add semantics to it then we have yet another phrase-level ancestor with nothing to choose between it vs. ph/keyword.

My proposal was to add <text> to <ph> and <keyword> (and their specializations) only, so it is always in a context that can be (or has been) specialized to provide semantics. This would allow reuse below the <ph> or <keyword> level without adding yet another peer element that competes with them as a specialization ancestor.

Most contexts that allow PCDATA should already allow either <keyword> or <ph> or both - if we find a case that doesn't, I'd suggest adding whichever makes sense, but NOT adding <text> itself.
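
As a rough illustration of the restricted approach described above, the following self-contained document declares a toy DTD in which <text> is permitted only inside <ph> and <keyword>. The content models are deliberately simplified assumptions, not the DITA 1.1 or 1.2 declarations, and the sample strings and ids are made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: simplified, hypothetical content models. -->
<!DOCTYPE p [
  <!-- Block-level context: text is NOT allowed here directly. -->
  <!ELEMENT p       (#PCDATA | ph | keyword)*>
  <!-- text is added only to ph and keyword. -->
  <!ELEMENT ph      (#PCDATA | ph | keyword | text)*>
  <!ELEMENT keyword (#PCDATA | text)*>
  <!-- Assumption: text holds character data only. -->
  <!ELEMENT text    (#PCDATA)>
  <!ATTLIST text    id     NMTOKEN #IMPLIED
                    conref CDATA   #IMPLIED>
]>
<p>Install <keyword><text id="productA">Product A</text> for
<text id="platformB">Platform B</text></keyword> first.</p>
```

A validating parser should accept this document and reject one that places <text> directly inside <p>. That is the point of the restriction: every <text> sits under an element that carries, or can be specialized to carry, semantics.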

Michael Priestley
Lead IBM DITA Architect
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25



Stan Doherty <Stanley.Doherty@Sun.COM>
Sent by: Stanley.Doherty@Sun.COM

09/11/2007 07:25 AM
Please respond to
Stanley.Doherty@Sun.COM

To
"Grosso, Paul" <pgrosso@ptc.com>
cc
dita@lists.oasis-open.org, Deborah_Pickett@moldflow.com
Subject
Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)





Hi Paul --

You are correct. Deborah Pickett has drafted a proposal (attached) for
TC and offline discussion. If the sentiment of the TC is to proceed with
her proposal as the foundation for ongoing discussion (my
recommendation), Deborah or I will upload the XML as an official
proposal for 12020.

Many many thanks to Deborah for jumping in here.

Stan
Grosso, Paul wrote:
> I'll reply to this message, though I have also seen some later ones.
>
> I think we will need to see an actual new proposal before we'll
> be able to make any progress on this.  As it stands, I still
> don't have a good idea what is being proposed.
>
> paul
>
>  
>> -----Original Message-----
>> From: Stanley.Doherty@Sun.COM [mailto:Stanley.Doherty@Sun.COM]
>> Sent: Sunday, 2007 September 09 21:50
>> To: dita@lists.oasis-open.org; azydron@xml-intl.com; Grosso, Paul
>> Cc: Stanley.Doherty@Sun.COM
>> Subject: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse
>> of small text)
>>
>> Hi --
>>
>> At the last TC meeting, attending members converged on a new strategy
>> for addressing this issue. I took the action item to
>> summarize the new
>> strategy and to send that summary out to the list for further
>> review and
>> discussion.
>>
>> 1. The original scope for 12020 needs to be expanded beyond
>> the nesting
>> of keyword. Although nested keywords would provide one
>> solution, it does
>> not address many of the use cases for reusing bits of text outside the
>> context of keywords, e.g. reusing a text bit in <uicontrol>.
>>
>> 2. Michael Priestley proposed that DITA 1.2 add the <text> element to
>> both <keyword> and <ph>. This fairly restricted implementation would
>> allow for reuse of <text> without creating more general
>> confusion about
>> exporting semantic or non-semantic applications of <keyword> and <ph>
>> inappropriately. Specializations of <keyword> and <ph> would
>> inherit <text>.
>>
>> 3. Eliot suggested that we audit DITA 1.2 for situations
>> where (#PCDATA)
>> is not allowed. In those situations we need to verify that
>> <keyword> or
>> <ph> are allowed.
>>
>> If you have comments, use cases, or alternate approaches, please send
>> them out to the list.
>>
>> Pax,
>> Stan Doherty
>>
>>    

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE reference PUBLIC "-//IBM//DTD DITA Reference//EN" "../dtd/reference.dtd">
<reference id="IssueNumber12020">
 <title>DITA Proposed Feature # 12020</title>
 <shortdesc>Generic <keyword>text</keyword> element</shortdesc>
 <refbody>
   <section id="section_38D173E0641D48A2AD9EDA69FF7AD77C">
     <title>Longer description</title>
     <p>Expand the content model of base elements which can contain #PCDATA so that they can also contain a new <keyword>text</keyword> element.</p><p>Previous discussion of this proposed feature:<ul id="ul_A068CC31FAB344A98C0AB6AB0F2890A3"><li id="li_3B95833007A74F17971B2B3A1E69C14F"><xref href="http://www.oasis-open.org/committees/download.php/15522/IssueNumber02.html" scope="external">Proposal for nesting keywords, from DITA 1.1</xref></li></ul></p>
   </section>
   <section id="section_837BA91C7C84446C89261B16CEA3A376">
     <title>Statement of Requirement</title>
     <p>Heavy users of conref have a need to create fragments of text which can be re-used in almost any context.  This example is from the DITA 1.1 keyword nesting proposal:<lq href="http://www.oasis-open.org/committees/download.php/15522/IssueNumber02.html">The <keyword>keyword</keyword> element is often used to store common text such as
product or platform names. This is done because keywords are
allowed in nearly all locations that allow text. However, it is not
possible to combine common strings into one single <keyword>keyword</keyword> for
reuse. For example, both <q>Product A</q> and <q>Platform B</q> are common
strings, and are often used together as <q>Product A for Platform B</q>.
To use the combined value, users must enter a new copy of each
string in another <keyword>keyword</keyword>. Alternatively, they could always
reference the two in text as <codeph>&lt;keyword
conref="#topic/product"/&gt; for &lt;keyword
conref="#topic/platform"/&gt;</codeph>. So, it is impossible to
reuse the string <q>Product A for Platform B</q> without repeating text
somewhere.</lq>Similar issues happen with other elements such as <keyword>term</keyword> and specializations of these and of <keyword>ph</keyword>.</p><p>Specializers may have a requirement to disallow mixed content in an element, to ensure correct arity of child elements.  For example, a specialization of <keyword>keyword</keyword> may need strict alternation of text with <keyword>data</keyword> elements:<codeblock>&lt;spec-keyword&gt;
 text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
 text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
 text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
&lt;/spec-keyword&gt;</codeblock>This cannot be validated with XML Schema or DTD because mixed content models are too lax.</p>
   </section>
   <section id="section_DF08B44F81F3469CB8CD4A57D14268B6">
     <title>Use Cases</title>
     <dl><dlentry><dt>Conref of small pieces of text</dt><dd><p>Some elements such as <keyword>keyword</keyword> and <keyword>term</keyword> have a content model which does not allow further nesting of elements (discounting <keyword>tm</keyword>, which is usually not generic enough).  Building such an element from multiple strings does not give the user a place to hang a conref.</p><p>Similarly, this problem can flow through to specializations, so that it is not possible to conref any text into a <keyword>wintitle</keyword> element.</p><p>With this proposal, <keyword>text</keyword> is available in all elements (including <keyword>text</keyword> itself) that don't contain <keyword>ph</keyword>. Strings can be built from pieces and inserted by conref into any context:<codeblock>&lt;topic id="strings"&gt;
 ...
 &lt;body&gt;
   &lt;text id="productA"&gt;Product A&lt;/text&gt;
   &lt;text id="platformB"&gt;Platform B&lt;/text&gt;
   &lt;text id="AforB"&gt;&lt;text conref="#strings/productA"/&gt; for &lt;text conref="#strings/platformB"/&gt;&lt;/text&gt;
 &lt;/body&gt;
&lt;/topic&gt;

&lt;topic&gt;
 &lt;title&gt;Using &lt;keyword&gt;&lt;text conref="strings.xml#strings/AforB"/&gt;&lt;/keyword&gt;&lt;/title&gt;
 ...
&lt;/topic&gt;</codeblock></p></dd></dlentry><dlentry><dt>Removing mixed content from a specialization</dt><dd>Mixed content models in DTD and XML Schema cannot prevent the appearance of text in undesired places:<codeblock>&lt;!ELEMENT spec-keyword ((#PCDATA | spec-data-1 | spec-data-2)*)&gt;</codeblock>By placing the text content inside a container element:<codeblock>&lt;spec-keyword&gt;
 &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
 &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
 &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
&lt;/spec-keyword&gt;</codeblock>validation can now be more strict:

<codeblock>&lt;!ELEMENT spec-keyword ((text, spec-data-1, spec-data-2)*)&gt;</codeblock><note>It may be desirable for <keyword>text</keyword> to be specializable so that it has a more appropriate name in the context of the specialization.</note>
</dd></dlentry></dl><draft-comment>Should specialization of <keyword>text</keyword> be allowed?  It seems harmless to allow it (and one fewer exception to remember) but it would conflict with the tenet that <keyword>text</keyword> has no associated semantics. Allowing specializations of <keyword>text</keyword> would allow specializers great power.  I can even see specialization of <keyword>text</keyword> as a way of getting Japanese Ruby to every context universally.</draft-comment></section>
   <section id="section_93C48ACC12164596AB1FE632A5E9983C">
     <title>Scope</title>
     <p>Minor.  Content models for a few dozen elements require an additional entry.</p>
   </section>
   <section id="section_A3670E84AD4343B1BD0D9062761A3C15">
     <title>Technical Requirements</title>
     <p>The <keyword>text</keyword> element should be added to any element which allows #PCDATA but does not allow <keyword>ph</keyword>.<draft-comment>This differs from Michael Priestley's suggestion that only <keyword>keyword</keyword> and <keyword>ph</keyword> receive <keyword>text</keyword>. Adding to <keyword>term</keyword> and <keyword>tm</keyword> seems to be necessary too.  The remaining elements (marked with an asterisk) already allow <keyword>keyword</keyword> but not <keyword>ph</keyword>.  They all use the <keyword>%words.cnt</keyword> entity, so adding <keyword>text</keyword> to these would just be a matter of adding it to <keyword>%words.cnt</keyword>.</draft-comment> In DITA 1.1, these elements are<ul id="ul_7A6820240788452C833C969ACE232928"><li id="li_D0C3B1D145634630B16E34292580E542"><keyword>keyword</keyword></li><li id="li_08B0D42EC3D54E79ABD3E49F2A3E6971"><keyword>term</keyword></li><li id="li_BBAC82708A3447FCBC550AEF460C6E7F"><keyword>tm</keyword></li><li id="li_85474913837D43899FA6D920B4B0CE38"><keyword>alt</keyword>*</li><li id="li_AA6625B28E9A42E5B97D755BD2810090"><keyword>indexterm</keyword>*</li><li id="li_4E8127C6475848AB9AB169715B7609EA"><keyword>index-base</keyword>*</li><li id="li_6272A38DCE0D4E5EB0EE5294C39F3FA4"><keyword>linktext</keyword>*</li><li id="li_AC3F050A345D41DD934CA14FF4C4685E"><keyword>navtitle</keyword>*</li><li id="li_B869A888A28147A8BD854D31D94787C1"><keyword>searchtitle</keyword>*</li><li id="li_C89F78F307974D3694C3D0D3C54CB19D"><keyword>author</keyword>*</li><li id="li_D9488575FB2E42498BED6AF85236AB51"><keyword>source</keyword>*</li><li id="li_AFAAFE18142743FBAC1ED8C496144A45"><keyword>publisher</keyword>*</li><li id="li_F735077886214064B1C9DC351E0F6FAC"><keyword>copyrholder</keyword>*</li><li id="li_6076C79D88164C5FA065793FE1F7BAAA"><keyword>category</keyword>*</li><li id="li_DAA9BFB343F24F67B0F6FE8EC78BD569"><keyword>prodname</keyword>*</li><li 
id="li_0F44DE80E38F4E9382FCAF78556AD2C5"><keyword>brand</keyword>*</li><li id="li_7E5229DF049340069E8CF1EAB5C1916E"><keyword>series</keyword>*</li><li id="li_D837FCA69BBB41EF8AA59936F81CE99E"><keyword>platform</keyword>*</li><li id="li_88079AD03A30489FB731DBD5580EAF0F"><keyword>prognum</keyword>*</li><li id="li_6FBBA75946784B8B98B21923057FBD70"><keyword>featnum</keyword>*</li><li id="li_7F1BFB902A3742519F72947F152746EB"><keyword>component</keyword>*</li></ul>Items marked with an asterisk can contain <keyword>keyword</keyword> but not <keyword>ph</keyword>.</p><p>The <keyword>text</keyword> element should also be added to the content model of <keyword>&lt;ph&gt;</keyword> so that text can be conreffed into a phrase without additional semantics.</p><p>Specializations of these elements should also decide whether to include <keyword>text</keyword>. In DITA 1.1, these elements are<dl><dlentry><dt>Bookmap</dt><dd><keyword>revisionid</keyword>, <keyword>year</keyword>
, <keyword>month</keyword>, <keyword>day</keyword>, <keyword>edition</keyword>, <keyword>isbn</keyword>, <keyword>volume</keyword>, <keyword>person</keyword>, <keyword>organization</keyword>, <keyword>summary</keyword>, <keyword>printlocation</keyword>, <keyword>bookpartno</keyword>, <keyword>booknumber</keyword>
</dd></dlentry><dlentry><dt>Indexing domain</dt><dd><keyword>index-see</keyword>, <keyword>index-see-also</keyword>, <keyword>index-sort-as</keyword></dd></dlentry><dlentry><dt>Programming domain</dt><dd><keyword>option</keyword>, <keyword>parmname</keyword>, <keyword>synph</keyword>, <keyword>apiname</keyword>, <keyword>kwd</keyword>, <keyword>var</keyword>, <keyword>oper</keyword>, <keyword>delim</keyword>, <keyword>sep</keyword>, <keyword>repsep</keyword></dd></dlentry><dlentry><dt>Software domain</dt><dd><keyword>msgph</keyword>, <keyword>msgblock</keyword>, <keyword>filepath</keyword>, <keyword>userinput</keyword>, <keyword>systemoutput</keyword>, <keyword>msgnum</keyword>, <keyword>cmdname</keyword>, <keyword>varname</keyword></dd></dlentry><dlentry><dt>UI domain</dt><dd><keyword>wintitle</keyword>, <keyword>shortcut</keyword>, <keyword>uicontrol</keyword></dd></dlentry><dlentry><dt>Utilities domain</dt><dd><keyword>coords</keyword>, <keyword>shape</keyword></dd></dlentry><dlentry><dt>xNAL domain</dt><dd><keyword>honorific</keyword>, <keyword>firstname</keyword>, <keyword>middlename</keyword>, <keyword>lastname</keyword>, <keyword>generationidentifier</keyword>, <keyword>postalcode</keyword>, <keyword>country</keyword>, <keyword>contactnumber</keyword>, <keyword>otherinfo</keyword>, <keyword>addressdetails</keyword>, <keyword>locality</keyword>, <keyword>localityname</keyword>, <keyword>administrativearea</keyword>, <keyword>thoroughfare</keyword>, <keyword>emailaddress</keyword>, <keyword>url</keyword></dd></dlentry></dl></p><p>The content model of <keyword>text</keyword> is<codeblock>&lt;!ELEMENT text ((#PCDATA | text)*)&gt;</codeblock><keyword>text</keyword> contains all universal DITA attributes.</p>
   </section>
   <section id="section_D3C1B9A9A4A044D6BEC2B75EA17843F7">
     <title>New or Changed Specification Language</title>
     <p>This element requires no addition to the architectural specification.</p><p>The language specification should have an entry in the element reference for <keyword>text</keyword>.  Here is a suggested description:<lq><p>The <keyword>text</keyword> element
associates no semantics with its content. It exists to serve as a container for text where a container is needed (e.g., for conref, or for restricted content models in specializations).  Unlike <keyword>ph</keyword>, <keyword>text</keyword> cannot contain images.
The <keyword>text</keyword> element contains only text data, or nested <keyword>text</keyword> elements.  All universal attributes are available on <keyword>text</keyword>.</p><p>For contexts where <keyword>ph</keyword> is available, authors should use that element.  Where <keyword>ph</keyword> is not available, <keyword>text</keyword> can be used to pull content by conref.</p></lq></p>
   </section>
   <section id="section_9C191253350C40A5A286FDF7AB8D490E">
     <title>Costs</title>
     <p>DTDs and Schemas must be updated.</p><p>Implementations will need to include processing for <keyword>text</keyword>.  This may require implementations to handle XML fragments where they once needed to handle only strings. Fallback behaviour (flattening the XML a la xsl:value-of) is not appropriate for <keyword>text</keyword> if attributes like <keyword>dir</keyword> or <keyword>translate</keyword> or filtering properties are present.<draft-comment>Flattening is also not appropriate for specializations of <keyword>text</keyword>, should they be allowed.</draft-comment></p>
   </section>
   <section id="section_C7EF8D6AF0FF4AAF9FDDA3F9412775EB">
     <title>Benefits</title>
     <p>Greater re-use of boilerplate text by users and fewer (to users) arbitrary limitations.  Specializers can have more control over content models by avoiding mixed content.</p>
   </section>
 </refbody>
</reference>
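
The Technical Requirements section of the attached proposal notes that the asterisked elements all use the %words.cnt entity. A minimal sketch of that change, written as an external DTD module fragment, might look like the following. The entity value and the element declarations other than text are simplified assumptions for illustration (the real DITA content models contain more, such as tm inside keyword); only the text content model is taken from the proposal itself.

```xml
<!-- Sketch of an external DTD module fragment; simplified, not the
     actual DITA declarations. -->

<!-- Assumed current value: "#PCDATA | keyword | term".
     The proposal would append text. -->
<!ENTITY % words.cnt "#PCDATA | keyword | term | text">

<!-- Elements built on the shared entity pick up text automatically. -->
<!ELEMENT navtitle (%words.cnt;)*>
<!ELEMENT linktext (%words.cnt;)*>

<!-- keyword and term need text added to their own models. -->
<!ELEMENT keyword  (#PCDATA | text)*>
<!ELEMENT term     (#PCDATA | text)*>

<!-- Content model as given in the proposal. -->
<!ELEMENT text     (#PCDATA | text)*>
```

Parameter entity references inside declarations are only legal in an external subset, so this fragment is meant to live in a .mod or .dtd file rather than in a document's internal subset.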

issue_12020_pickett.pdf


