
Subject: Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)

From: Michael Priestley <mpriestl@ca.ibm.com>
To: Stanley.Doherty@Sun.COM
Date: Tue, 18 Sep 2007 11:50:50 -0400

This would add <text> to a lot of different contexts, which is exactly what I didn't want: <text> has no semantics tied to it, and if we add semantics to it then we have yet another phrase-level specialization ancestor with nothing to distinguish it from <ph> or <keyword>.

My proposal was to add <text> to <ph> and <keyword> (and their specializations) only, so it is always in a context that can be (or has been) specialized to provide semantics. This would allow reuse below the <ph> or <keyword> level without adding yet another peer element that competes with them as a specialization ancestor.

Most contexts that allow PCDATA should already allow either <keyword> or <ph> or both - if we find a case that doesn't, I'd suggest adding whichever makes sense, but NOT adding <text> itself.
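
To make the restricted proposal concrete, here is a minimal sketch of what reuse might look like if <text> were allowed only inside <ph>, <keyword>, and their specializations. The topic id, element id, and surrounding content are hypothetical and exist only for illustration, and <text> itself is still only a proposal at this point.

```xml
<!-- Illustrative only: <text> is a proposed element, and the ids are made up. -->
<topic id="strings">
  <title>Using <keyword><text conref="#strings/productA"/></keyword></title>
  <body>
    <!-- The reusable fragment lives inside <ph>, so the context can still be
         specialized to carry semantics. -->
    <p><ph><text id="productA">Product A</text></ph></p>
    <!-- <uicontrol> is a specialization of <ph>, so it would inherit <text>
         and could pull the fragment in by conref. -->
    <p>Click <uicontrol><text conref="#strings/productA"/></uicontrol> to continue.</p>
  </body>
</topic>
```

In this sketch <text> never appears as a direct child of <p> or <title>; it is always wrapped by <ph>, <keyword>, or one of their specializations.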

Michael Priestley
Lead IBM DITA Architect
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25

From: Stan Doherty <Stanley.Doherty@Sun.COM>
Date: 09/11/2007 07:25 AM
To: "Grosso, Paul" <pgrosso@ptc.com>
Cc: dita@lists.oasis-open.org, Deborah_Pickett@moldflow.com
Subject: Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)

Hi Paul --

You are correct. Deborah Pickett has drafted a proposal (attached) for TC and offline discussion. If the sentiment of the TC is to proceed with her proposal as the foundation for ongoing discussion (my recommendation), Deborah or I will upload the XML as an official proposal for 12020.

Many many thanks to Deborah for jumping in here.

Stan

Grosso, Paul wrote:
> I'll reply to this message, though I have also seen some later ones.
>
> I think we will need to see an actual new proposal before we'll
> be able to make any progress on this. As it stands, I still
> don't have a good idea what is being proposed.
>
> paul
>
>> -----Original Message-----
>> From: Stanley.Doherty@Sun.COM [mailto:Stanley.Doherty@Sun.COM]
>> Sent: Sunday, 2007 September 09 21:50
>> To: dita@lists.oasis-open.org; azydron@xml-intl.com; Grosso, Paul
>> Cc: Stanley.Doherty@Sun.COM
>> Subject: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)
>>
>> Hi --
>>
>> At the last TC meeting, attending members converged on a new strategy
>> for addressing this issue. I took the action item to summarize the new
>> strategy and to send that summary out to the list for further review and
>> discussion.
>>
>> 1. The original scope for 12020 needs to be expanded beyond the nesting
>> of keyword. Although nested keywords would provide one solution, it does
>> not address many of the use cases for reusing bits of text outside the
>> context of keywords, e.g. reusing a text bit in <uicontrol>.
>>
>> 2. Michael Priestley proposed that DITA 1.2 add the <text> element to
>> both <keyword> and <ph>. This fairly restricted implementation would
>> allow for reuse of <text> without creating more general confusion about
>> exporting semantic or non-semantic applications of <keyword> and <ph>
>> inappropriately. Specializations of <keyword> and <ph> would
>> inherit <text>.
>>
>> 3. Eliot suggested that we audit DITA 1.2 for situations where (#PCDATA)
>> is not allowed. In those situations we need to verify that <keyword> or
>> <ph> are allowed.
>>
>> If you have comments, use cases, or alternate approaches, please send
>> them out to the list.
>>
>> Pax,
>> Stan Doherty

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE reference PUBLIC "-//IBM//DTD DITA Reference//EN" "../dtd/reference.dtd">
<reference id="IssueNumber12020">
  <title>DITA Proposed Feature # 12020</title>
  <shortdesc>Generic <keyword>text</keyword> element</shortdesc>
  <refbody>
    <section id="section_38D173E0641D48A2AD9EDA69FF7AD77C">
      <title>Longer description</title>
      <p>Expand the content model of base elements which can contain #PCDATA so that they can also contain a new <keyword>text</keyword> element.</p>
      <p>Previous discussion of this proposed feature:
        <ul id="ul_A068CC31FAB344A98C0AB6AB0F2890A3">
          <li id="li_3B95833007A74F17971B2B3A1E69C14F"><xref href="http://www.oasis-open.org/committees/download.php/15522/IssueNumber02.html" scope="external">Proposal for nesting keywords, from DITA 1.1</xref></li>
        </ul>
      </p>
    </section>
    <section id="section_837BA91C7C84446C89261B16CEA3A376">
      <title>Statement of Requirement</title>
      <p>Heavy users of conref have a need to create fragments of text which can be re-used in almost any context. This example is from the DITA 1.1 keyword nesting proposal:
        <lq href="http://www.oasis-open.org/committees/download.php/15522/IssueNumber02.html">The <keyword>keyword</keyword> element is often used to store common text such as product or platform names. This is done because keywords are allowed in nearly all locations that allow text. However, it is not possible to combine common strings into one single <keyword>keyword</keyword> for reuse. For example, both <q>Product A</q> and <q>Platform B</q> are common strings, and are often used together as <q>Product A for Platform B</q>. To use the combined value, users must enter a new copy of each string in another <keyword>keyword</keyword>.
Alternatively, they could always reference the two in text as <codeph>&lt;keyword conref="#topic/product"/&gt; for &lt;keyword conref="#topic/platform"/&gt;</codeph>. So, it is impossible to reuse the string <q>Product A for Platform B</q> without repeating text somewhere.</lq>
        Similar issues happen with other elements such as <keyword>term</keyword> and specializations of these and of <keyword>ph</keyword>.</p>
      <p>Specializers may have a requirement to disallow mixed content in an element, to ensure correct arity of child elements. For example, a specialization of <keyword>keyword</keyword> may need strict alternation of text with <keyword>data</keyword> elements:
<codeblock>&lt;spec-keyword&gt;
  text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
  text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
  text&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
&lt;/spec-keyword&gt;</codeblock>
        This cannot be validated with XML Schema or DTD because mixed content models are too lax.</p>
    </section>
    <section id="section_DF08B44F81F3469CB8CD4A57D14268B6">
      <title>Use Cases</title>
      <dl>
        <dlentry>
          <dt>Conref of small pieces of text</dt>
          <dd>
            <p>Some elements such as <keyword>keyword</keyword> and <keyword>term</keyword> have a content model which does not allow further nesting of elements (discounting <keyword>tm</keyword>, which is usually not generic enough). Building such an element from multiple strings does not give the user a place to hang a conref.</p>
            <p>Similarly, this problem can flow through to specializations, so that it is not possible to conref any text into a <keyword>wintitle</keyword> element.</p>
            <p>With this proposal, <keyword>text</keyword> is available in all elements (including <keyword>text</keyword> itself) that don't contain <keyword>ph</keyword>. Strings can be built from pieces and inserted by conref into any context:
<codeblock>&lt;topic id="strings"&gt;
  ...
  &lt;body&gt;
    &lt;text id="productA"&gt;Product A&lt;/text&gt;
    &lt;text id="platformB"&gt;Platform B&lt;/text&gt;
    &lt;text id="AforB"&gt;&lt;text conref="#strings/productA"/&gt; for &lt;text conref="#strings/platformB"/&gt;&lt;/text&gt;
  &lt;/body&gt;
&lt;/topic&gt;
&lt;topic&gt;
  &lt;title&gt;Using &lt;keyword&gt;&lt;text conref="strings.xml#strings/AforB"/&gt;&lt;/keyword&gt;&lt;/title&gt;
  ...
&lt;/topic&gt;</codeblock>
            </p>
          </dd>
        </dlentry>
        <dlentry>
          <dt>Removing mixed content from a specialization</dt>
          <dd>Mixed content models in DTD and XML Schema cannot prevent the appearance of text in undesired places:
<codeblock>&lt;!ELEMENT spec-keyword (#PCDATA | spec-data-1 | spec-data-2)*&gt;</codeblock>
            By placing the text content inside a container element:
<codeblock>&lt;spec-keyword&gt;
  &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
  &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
  &lt;text&gt;text&lt;/text&gt;&lt;spec-data-1/&gt;&lt;spec-data-2&gt;...&lt;/spec-data-2&gt;
&lt;/spec-keyword&gt;</codeblock>
            validation can now be more strict:
<codeblock>&lt;!ELEMENT spec-keyword ((text, spec-data-1, spec-data-2)*)&gt;</codeblock>
            <note>It may be desirable for <keyword>text</keyword> to be specializable so that it has a more appropriate name in the context of the specialization.</note>
          </dd>
        </dlentry>
      </dl>
      <draft-comment>Should specialization of <keyword>text</keyword> be allowed? It seems harmless to allow it (and one fewer exception to remember) but it would conflict with the tenet that <keyword>text</keyword> has no associated semantics. Allowing specializations of <keyword>text</keyword> would allow specializers great power. I can even see specialization of <keyword>text</keyword> as a way of getting Japanese Ruby to every context universally.</draft-comment>
    </section>
    <section id="section_93C48ACC12164596AB1FE632A5E9983C">
      <title>Scope</title>
      <p>Minor.
Content models for a few dozen elements require an additional entry.</p>
    </section>
    <section id="section_A3670E84AD4343B1BD0D9062761A3C15">
      <title>Technical Requirements</title>
      <p>The <keyword>text</keyword> element should be added to any element which allows #PCDATA but does not allow <keyword>ph</keyword>.
        <draft-comment>This differs from Michael Priestley's suggestion that only <keyword>keyword</keyword> and <keyword>ph</keyword> receive <keyword>text</keyword>. Adding to <keyword>term</keyword> and <keyword>tm</keyword> seems to be necessary too. The remaining elements (marked with an asterisk) already allow <keyword>keyword</keyword> but not <keyword>ph</keyword>. They all use the <keyword>%words.cnt</keyword> entity, so adding <keyword>text</keyword> to these would just be a matter of adding it to <keyword>%words.cnt</keyword>.</draft-comment>
        In DITA 1.1, these elements are
        <ul id="ul_7A6820240788452C833C969ACE232928">
          <li id="li_D0C3B1D145634630B16E34292580E542"><keyword>keyword</keyword></li>
          <li id="li_08B0D42EC3D54E79ABD3E49F2A3E6971"><keyword>term</keyword></li>
          <li id="li_BBAC82708A3447FCBC550AEF460C6E7F"><keyword>tm</keyword></li>
          <li id="li_85474913837D43899FA6D920B4B0CE38"><keyword>alt</keyword>*</li>
          <li id="li_AA6625B28E9A42E5B97D755BD2810090"><keyword>indexterm</keyword>*</li>
          <li id="li_4E8127C6475848AB9AB169715B7609EA"><keyword>index-base</keyword>*</li>
          <li id="li_6272A38DCE0D4E5EB0EE5294C39F3FA4"><keyword>linktext</keyword>*</li>
          <li id="li_AC3F050A345D41DD934CA14FF4C4685E"><keyword>navtitle</keyword>*</li>
          <li id="li_B869A888A28147A8BD854D31D94787C1"><keyword>searchtitle</keyword>*</li>
          <li id="li_C89F78F307974D3694C3D0D3C54CB19D"><keyword>author</keyword>*</li>
          <li id="li_D9488575FB2E42498BED6AF85236AB51"><keyword>source</keyword>*</li>
          <li id="li_AFAAFE18142743FBAC1ED8C496144A45"><keyword>publisher</keyword>*</li>
          <li id="li_F735077886214064B1C9DC351E0F6FAC"><keyword>copyrholder</keyword>*</li>
          <li id="li_6076C79D88164C5FA065793FE1F7BAAA"><keyword>category</keyword>*</li>
          <li id="li_DAA9BFB343F24F67B0F6FE8EC78BD569"><keyword>prodname</keyword>*</li>
          <li id="li_0F44DE80E38F4E9382FCAF78556AD2C5"><keyword>brand</keyword>*</li>
          <li id="li_7E5229DF049340069E8CF1EAB5C1916E"><keyword>series</keyword>*</li>
          <li id="li_D837FCA69BBB41EF8AA59936F81CE99E"><keyword>platform</keyword>*</li>
          <li id="li_88079AD03A30489FB731DBD5580EAF0F"><keyword>prognum</keyword>*</li>
          <li id="li_6FBBA75946784B8B98B21923057FBD70"><keyword>featnum</keyword>*</li>
          <li id="li_7F1BFB902A3742519F72947F152746EB"><keyword>component</keyword>*</li>
        </ul>
        Items marked with an asterisk can contain <keyword>keyword</keyword> but not <keyword>ph</keyword>.</p>
      <p>The <keyword>text</keyword> element should also be added to the content model of <keyword>ph</keyword> so that text can be conreffed into a phrase without additional semantics.</p>
      <p>Specializations of these elements should also decide whether to include <keyword>text</keyword>.
In DITA 1.1, these elements are
        <dl>
          <dlentry>
            <dt>Bookmap</dt>
            <dd><keyword>revisionid</keyword>, <keyword>year</keyword>, <keyword>month</keyword>, <keyword>day</keyword>, <keyword>edition</keyword>, <keyword>isbn</keyword>, <keyword>volume</keyword>, <keyword>person</keyword>, <keyword>organization</keyword>, <keyword>summary</keyword>, <keyword>printlocation</keyword>, <keyword>bookpartno</keyword>, <keyword>booknumber</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>Indexing domain</dt>
            <dd><keyword>index-see</keyword>, <keyword>index-see-also</keyword>, <keyword>index-sort-as</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>Programming domain</dt>
            <dd><keyword>option</keyword>, <keyword>parmname</keyword>, <keyword>synph</keyword>, <keyword>apiname</keyword>, <keyword>kwd</keyword>, <keyword>var</keyword>, <keyword>oper</keyword>, <keyword>delim</keyword>, <keyword>sep</keyword>, <keyword>repsep</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>Software domain</dt>
            <dd><keyword>msgph</keyword>, <keyword>msgblock</keyword>, <keyword>filepath</keyword>, <keyword>userinput</keyword>, <keyword>systemoutput</keyword>, <keyword>msgnum</keyword>, <keyword>cmdname</keyword>, <keyword>varname</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>UI domain</dt>
            <dd><keyword>wintitle</keyword>, <keyword>shortcut</keyword>, <keyword>uicontrol</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>Utilities domain</dt>
            <dd><keyword>coords</keyword>, <keyword>shape</keyword></dd>
          </dlentry>
          <dlentry>
            <dt>xNAL domain</dt>
            <dd><keyword>honorific</keyword>, <keyword>firstname</keyword>, <keyword>middlename</keyword>, <keyword>lastname</keyword>, <keyword>generationidentifier</keyword>, <keyword>postalcode</keyword>, <keyword>country</keyword>, <keyword>contactnumber</keyword>, <keyword>otherinfo</keyword>, <keyword>addressdetails</keyword>, <keyword>locality</keyword>, <keyword>localityname</keyword>, <keyword>administrativearea</keyword>, <keyword>thoroughfare</keyword>, <keyword>emailaddress</keyword>, <keyword>url</keyword></dd>
          </dlentry>
        </dl>
      </p>
      <p>The
content model of <keyword>text</keyword> is
<codeblock>&lt;!ELEMENT text (#PCDATA | text)*&gt;</codeblock>
        <keyword>text</keyword> contains all universal DITA attributes.</p>
    </section>
    <section id="section_D3C1B9A9A4A044D6BEC2B75EA17843F7">
      <title>New or Changed Specification Language</title>
      <p>This element requires no addition to the architectural specification.</p>
      <p>The language specification should have an entry in the element reference for <keyword>text</keyword>. Here is a suggested description:
        <lq>
          <p>The <keyword>text</keyword> element associates no semantics with its content. It exists to serve as a container for text where a container is needed (e.g., for conref, or for restricted content models in specializations). Unlike <keyword>ph</keyword>, <keyword>text</keyword> cannot contain images. The <keyword>text</keyword> element contains only text data, or nested <keyword>text</keyword> elements. All universal attributes are available on <keyword>text</keyword>.</p>
          <p>For contexts where <keyword>ph</keyword> is available, authors should use that element. Where <keyword>ph</keyword> is not available, <keyword>text</keyword> can be used to pull content by conref.</p>
        </lq>
      </p>
    </section>
    <section id="section_9C191253350C40A5A286FDF7AB8D490E">
      <title>Costs</title>
      <p>DTDs and Schemas must be updated.</p>
      <p>Implementations will need to include processing for <keyword>text</keyword>. This may require implementations to handle XML fragments where they once needed to handle only strings.
Fallback behaviour (flattening the XML a la xsl:value-of) is not appropriate for <keyword>text</keyword> if attributes like <keyword>dir</keyword> or <keyword>translate</keyword> or filtering properties are present.<draft-comment>Flattening is also not appropriate for specializations of <keyword>text</keyword>, should they be allowed.</draft-comment></p>
    </section>
    <section id="section_C7EF8D6AF0FF4AAF9FDDA3F9412775EB">
      <title>Benefits</title>
      <p>Greater re-use of boilerplate text by users and fewer (to users) arbitrary limitations. Specializers can have more control over content models by avoiding mixed content.</p>
    </section>
  </refbody>
</reference>
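
The processing concern raised under Costs in the attached proposal (flattening a la xsl:value-of) can be sketched in XSLT 1.0. This is an illustration only: the class token ' topic/text ' assumes the proposed element would follow the usual DITA class-attribute convention, and the HTML <span> output is an arbitrary choice, not something the proposal specifies.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Approach the proposal warns against: xsl:value-of keeps only the string
       value, so dir, translate, and filtering attributes on nested <text>
       elements are silently lost.
       <xsl:template match="keyword"><xsl:value-of select="."/></xsl:template>
  -->

  <!-- Hypothetical handling for the proposed <text> element. The class token
       ' topic/text ' is an assumption, not a published value. -->
  <xsl:template match="*[contains(@class, ' topic/text ')]">
    <span>
      <!-- Carry the text direction through to the output when it is set.
           Values other than ltr and rtl would need separate handling. -->
      <xsl:copy-of select="@dir"/>
      <!-- Recurse instead of flattening, so nested <text> elements get the
           same treatment. -->
      <xsl:apply-templates/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

The point of the sketch is the use of xsl:apply-templates rather than xsl:value-of: an implementation that previously treated keyword content as a plain string would need to recurse into child elements once <text> is allowed there.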

issue_12020_pickett.pdf

Follow-Ups:
- Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)
  - From: Deborah_Pickett@moldflow.com

References:
- Re: [dita] RE: Updated Strategy for DITA 1.2 Proposal 12020 (Reuse of small text)
  - From: Stan Doherty <Stanley.Doherty@Sun.COM>