Re: [dita-lightweight-dita] Request for exemplar templates

Subject: Re: [dita-lightweight-dita] Request for exemplar templates

From: Don Day <donday@donrday.com>

To: dita-lightweight-dita@lists.oasis-open.org

Date: Tue, 9 Feb 2016 13:43:56 -0600

Thank you, Michael. The pointer helped me, and at this point, I'd say it is good enough considering that we may be modifying the "spec" considerably as things progress.

I took a shot at an XSLT-based discombobulator and have some initial thoughts:

There is a range of possible implementations to consider. The ultimate might be a DITA-aware DTD in respecializable syntax, but the minimum might well be simply an identity transform that strips out the hints, yielding a result that is ready for use as a field-based form. I consider both of those examples to be illustrative endpoints, but:
a) we need to decide if there are any in-between use cases that are goals for this group (I can imagine an editor driven solely by Schematron rules that enshrine the template's intent, for example),
b) we need a story about how such a template can be created in the first place (e.g., with a validating editor I can create an Erik Hennum-style "specialization by example" -- http://svdig.ditamap.com/SpecializationWhyWhenHow.ppt). That process is not possible with Michael's template, which contains special attributes.
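
A minimal sketch of the "strip out the hints" endpoint, written in Python rather than XSLT so that it can be run standalone. It assumes the hints are the specrole and specmodel attributes and the specmeta element seen in the attached sample template; those names are taken from that sample, not from any finalized specification, and strip_hints is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: these names are the template "hints"; they come from the
# attached sample template, not from a finalized specification.
HINT_ATTRIBUTES = ("specrole", "specmodel")
HINT_ELEMENTS = ("specmeta",)


def strip_hints(element):
    """Remove hint attributes and hint elements in place, keeping everything else."""
    for name in HINT_ATTRIBUTES:
        element.attrib.pop(name, None)
    for child in list(element):
        if child.tag in HINT_ELEMENTS:
            element.remove(child)
        else:
            strip_hints(child)
    return element


TEMPLATE = """<topic id="termdef_term" outputclass="tlotermtopic">
  <title outputclass="tloterm">Structured Content</title>
  <prolog outputclass="tlotermprolog">
    <data outputclass="tlotermauthor" specrole="prompt">author name here</data>
    <specmeta><ph outputclass="tlophrase" specrole="doc">A new global phrase element</ph></specmeta>
  </prolog>
  <body outputclass="tlotermbody" specmodel="sequence">
    <section outputclass="tlowhat"><title specrole="generate">What is it?</title><p>...</p></section>
  </body>
</topic>"""

root = strip_hints(ET.fromstring(TEMPLATE))
result = ET.tostring(root, encoding="unicode")
print(result)
```

Everything that is not a hint, including the outputclass values that carry the specialization names, passes through unchanged, which is the sense in which this behaves like an identity transform.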

Phase 1: Use an identity transform to parse the template into a web-based form using contentEditable sections. This is eminently doable and gets the resulting implementation into a useful form in the fastest manner.
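
One way the Phase 1 idea might look, sketched in Python for illustration. The mapping to div and span elements, the use of outputclass as the HTML class, and the treatment of specrole="generate" as non-editable text are all assumptions made for this sketch, and to_form is a hypothetical name; text that follows a child element (mixed content) is ignored to keep the example short.

```python
import xml.etree.ElementTree as ET
from html import escape


def to_form(element):
    """Render an element as nested <div> blocks with editable leaf text."""
    css_class = escape(element.get("outputclass", element.tag), quote=True)
    text = (element.text or "").strip()
    parts = [f'<div class="{css_class}">']
    if text and element.get("specrole") == "generate":
        # Assumption: "generate" marks fixed text that authors do not edit.
        parts.append(escape(text))
    elif text:
        parts.append(f'<span contenteditable="true">{escape(text)}</span>')
    parts.extend(to_form(child) for child in element)
    parts.append("</div>")
    return "".join(parts)


SECTION = (
    '<section outputclass="tlowhat">'
    '<title specrole="generate">What is it?</title><p>...</p></section>'
)
form_html = to_form(ET.fromstring(SECTION))
print(form_html)
```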

Phase 2: Allocate some time to explore how far we can go with an XSLT-based analyzer. This could be open-ended, which is why I think we need a way to budget the time and at least come away with recommendations. The difficulty lies in using XSLT for purely algorithmic work rather than simply as a transform. XSLT 2.0 and 3.0 offer increasing capability here (some examples: http://www.saxonica.com/papers/ideadb-1.1/mhk-paper.xml).

Phase 3: Allocate some time to explore developing a separate parser in the programming language that will be used for the other algorithmic function. We'd be trading "easy to parse; hard to compute" for "hard to parse; easy to compute" as it were. I don't know if one is better than the other; it may be a case of which skills one is most productive with.

Yes, let's set up a Git home at OASIS. I'd like to ask for someone's help with this. And to seed it, attached are the quasi-Phase 2 starter files I've worked with. All I did was make the template well-formed XML and extend an identity transform to the point where I could see the flat decomposition of all the node types (see the HTML result, where the numbers inside the square brackets indicate the string length, so that I could see "whitespace" contributions to the view). From this one would have to count occurrences, capture nesting relationships, etc. to build up a text-based, DTD-like output rendition of the model (not a transformation, per se). I added my PHP-based transform "ANT" file--you can use your preferred approach from here on. Ask me offline if you want a how-to, which was not my intent in this note.
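
The counting and nesting step could be sketched roughly as follows, in Python for illustration. The rules here are assumptions chosen for the sketch, not a vetted design: an element is named by its outputclass when it has one, a child seen more than once under a single parent gets "+", children are listed in order of first appearance, and any element with text mixed among its children becomes a repeatable choice. The output is DTD-like text only and has not been validated as a DTD.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def spec_name(element):
    # Assumption: outputclass carries the specialized name when present.
    return element.get("outputclass", element.tag)


def has_text(element):
    """True if the element holds non-whitespace text directly."""
    chunks = [element.text] + [child.tail for child in element]
    return any(chunk and chunk.strip() for chunk in chunks)


def collect_models(root):
    """Record, per element name, its child names and the max count under one parent."""
    models = defaultdict(lambda: {"children": {}, "text": False})
    for parent in root.iter():
        model = models[spec_name(parent)]
        model["text"] = model["text"] or has_text(parent)
        counts = Counter(spec_name(child) for child in parent)
        for name, n in counts.items():
            model["children"][name] = max(model["children"].get(name, 0), n)
    return models


def render(models):
    """Emit one DTD-like declaration per element name."""
    lines = []
    for name, model in models.items():
        children = model["children"]
        if model["text"]:
            content = " | ".join(["#PCDATA", *children])
            suffix = "*" if children else ""
            lines.append(f"<!ELEMENT {name} ({content}){suffix}>")
        elif children:
            content = ", ".join(c + ("+" if n > 1 else "") for c, n in children.items())
            lines.append(f"<!ELEMENT {name} ({content})>")
        else:
            lines.append(f"<!ELEMENT {name} EMPTY>")
    return "\n".join(lines)


SAMPLE = (
    '<body outputclass="tlotermbody"><section outputclass="tlowhat">'
    "<title>What is it?</title><p>a</p><p>b <ph>c</ph></p></section></body>"
)
declarations = render(collect_models(ET.fromstring(SAMPLE)))
print(declarations)
```

Because whitespace-only text is ignored by has_text, the indentation that shows up as [3: ] and similar entries in the flat listing does not affect the inferred model.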

I will turn my own work to the Phase 1 approach since I'm much closer to having that as a working demo in the application that spawned the template suggestion.

Phase 3 (if someone wants to try it) could be implemented in one's favorite language. I have a very simple state machine pseudocode for parsing the well-formed XML syntax that you can implement in Python, Java, whatever; the rest of the algorithm follows the Phase 2 approach.
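
For anyone curious what such a state machine could look like, here is a hypothetical two-state sketch in Python. It is not the pseudocode mentioned above, which is not reproduced in this message. It assumes well-formed input, discards attributes, skips declarations and processing instructions, and does not cope with comments, CDATA sections, or a ">" inside an attribute value.

```python
def tag_events(tag):
    """Turn the text between "<" and ">" into start/end events."""
    if tag.startswith("/"):
        yield ("end", tag[1:].strip())
        return
    if tag.startswith(("?", "!")):
        return  # XML declaration, DOCTYPE, etc. are skipped in this sketch
    self_closing = tag.endswith("/")
    name = tag.rstrip("/").split()[0]  # attributes are discarded
    yield ("start", name)
    if self_closing:
        yield ("end", name)


def tokenize(xml):
    """Yield ("start", name), ("end", name), and ("text", value) events."""
    state = "TEXT"
    buffer = []
    for ch in xml:
        if state == "TEXT":
            if ch == "<":
                if buffer:
                    yield ("text", "".join(buffer))
                buffer = []
                state = "TAG"
            else:
                buffer.append(ch)
        else:  # state == "TAG"
            if ch == ">":
                tag = "".join(buffer)
                buffer = []
                state = "TEXT"
                yield from tag_events(tag)
            else:
                buffer.append(ch)
    if state == "TEXT" and buffer:
        yield ("text", "".join(buffer))


events = list(
    tokenize('<p>text node with <ph>allowed</ph> phrase.</p><section conref="a/b"/>')
)
for event in events:
    print(event)
```

The event stream carries the same information as the flat listing from the stylesheet (element names and text runs), so the counting and nesting analysis could be driven from it.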

Hoping this helps kick things off.

Don R. Day
Founding Chair, OASIS DITA Technical Committee
LinkedIn: donrday Twitter: @donrday
About.me: Don R. Day Skype: don.r.day

"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"
--T.S. Eliot

On 2/5/2016 12:45 PM, Michael Priestley wrote:

Hi Don,

Sorry for the slow reply.

In the example I linked to, I just adapted your tlotermtopic specialization, with the addition of a conref to another file to pull in the definition for some section or other - we could leave that off for now, but do want to get to conref-based specialization reuse just because I think it's cool ;-)

The email with my summary/example is the last in the thread - can you let me know what your concerns are?

Maybe we need a formal spec - I was hoping that my last email in the thread was specific enough to code from, but I'm not the right judge of that. Please take a look at this email in detail and let me know if there's enough info to code a rapid prototype, or if you have concerns that need to be addressed first (which is fine, I just want specifics, relative to the existing proposal):

https://lists.oasis-open.org/archives/dita-lightweight-dita/201509/msg00027.html

Michael Priestley, Senior Technical Staff Member (STSM)
Enterprise Content Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/michael-priestley

From: Don Day <donday@donrday.com>
To: dita-lightweight-dita@lists.oasis-open.org
Date: 01/26/2016 02:00 PM
Subject: [dita-lightweight-dita] Request for exemplar templates
Sent by: <dita-lightweight-dita@lists.oasis-open.org>

Michael, of the templates that have been discussed, are any of them a reasonably simple exemplar that we should start with? Perhaps we need to start off with a "should do this" dummy example that does not yet carry extra requirements that would distract from getting to a proof point quickly. I seem to recall that parts of the syntax we'll be parsing may not be expressible as valid or perhaps even well-formed XML, so we need to get past that hurdle first, for example.

Also, for the templates that have been discussed, has each one's final design been documented, or will we need to read the discussion to arrive at the vetted design? I just don't have a sense of completion on those discussions. (And if this was covered at Monday's meeting, I deserve being lashed with wet angle brackets.)
--
Don

On 1/26/2016 10:29 AM, Michael Priestley wrote:
Should we use the new OASIS open repository process for this work?

https://lists.oasis-open.org/archives/dita/201601/msg00011.html

Lets us do this as an officially sanctioned OASIS open source project at GitHub.

Re inputs/outputs:

- input would be a topic that uses the updated doctype (with new elems/atts) as described in the email summary linked below
- output would be (at least initially) a single-file doctype definition using either DTD syntax, RNG syntax, or XSD syntax that defines the new specialization and provides the validation rules for it

Michael Priestley, Senior Technical Staff Member (STSM)
Enterprise Content Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/michael-priestley

----- Original message -----
From: Mark Giffin <mark@markgiffin.com>
Sent by: <dita-lightweight-dita@lists.oasis-open.org>
To: Don Day <donday@donrday.com>
Cc: Carlos Evia <cevia@vt.edu>, tgrantham@timgrantham.com, "dita-lightweight-dita@lists.oasis-open.org"<dita-lightweight-dita@lists.oasis-open.org>
Subject: [dita-lightweight-dita] Generating schemas from templates
Date: Mon, Jan 25, 2016 12:23 PM

Hi Don,

I'm the culprit who volunteered you, I figured you would be interested. Also involved in this are Tim Grantham and Carlos Evia. In the meeting Michael P talked about making the tool that would convert a template to a specialization, based on the info at the bottom of this email and especially this thread:

http://markmail.org/message/pd4u5kfg44xp5x5c

Michael wanted XSLT that would run on the template file and output a specialization. I believe this output includes a topic and also a schema, and I mentioned making a RelaxNG schema. But I'm a bit muddy on what the output would be exactly, please correct me or verify.

Mark Giffin
Mark Giffin Consulting, Inc.
http://markgiffin.com/

On 1/25/2016 9:03 AM, Don Day wrote:
My apologies to all; I intended to meet but had the wrong time in mind. I deserve to have been volunteered for far worse; this fate looks acceptable. If anyone captured some insights that would be helpful, please put those in an email to the list under a clear subject line so that we can discuss the approach (e.g., "Generating schemas from templates"). Thanks!
--
Don

On 1/25/2016 10:53 AM, Michael Priestley wrote:
Per our call today - there's a link below to the draft specialization architecture.

Mark, Carlos, Tim, and Don have (been) volunteered to take that email and implement it:

- adding the required elements/attributes to the base topic/map DTDs
- creating XSLT to generate RNG (or DTD, XSD...) from a template topic (ie any DITA topic, with/without use of the new template attributes/elements)

Michael Priestley, Senior Technical Staff Member (STSM)
Enterprise Content Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/michael-priestley
----- Forwarded by Michael Priestley/Toronto/IBM on 01/25/2016 11:48 AM -----

From: Michael Priestley/Toronto/IBM@IBMCA
To: dita-lightweight-dita@lists.oasis-open.org
Date: 01/11/2016 09:59 AM
Subject: [dita-lightweight-dita] Current status of lightweight DITA spec
Sent by: <dita-lightweight-dita@lists.oasis-open.org>

Here's where I think we are:

Draft doctypes for topic/map:
https://tools.oasis-open.org/version-control/svn/dita/subcommittees/LightweightDITA/org.oasis.lwdita/

Draft specialization architecture:
http://markmail.org/message/pd4u5kfg44xp5x5c

Industry/domain validation:
- learning and training - analyzed scenarios, proposed new specializations adapting to lightweight model
- marketing - analyzed scenarios, working towards new specialization proposals
- software development - analyzed scenarios
- machine industries - analyzed scenarios

Cross-format mappings:
- markdown and HTML5 in progress, need updating
- JSON investigated
- others proposed: PPT, other slide formats, Word

Michael Priestley, Senior Technical Staff Member (STSM)
Enterprise Content Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/michael-priestley


<topic id="termdef_term" outputclass="tlotermtopic">
  <title outputclass="tloterm">Structured Content</title>
  <prolog outputclass="tlotermprolog">
    <data outputclass="tlotermauthor" specrole="prompt">author name here</data>
    <specmeta>
      <ph outputclass="tlophrase" specrole="doc">A new global phrase element</ph>
      <data outputclass="tlodata" specmodel="choice" specrole="modelonly">
        Simple text only for this global data specialization, but with a different specmodel you could do anything
      </data>
      <specatt outputclass="tloatt" specrole="doc">
        A conditional processing attribute called tloatt
      </specatt>
    </specmeta>
  </prolog>
  <body outputclass="tlotermbody" specmodel="sequence">
    <section outputclass="tlowhat">
      <title specrole="generate">What is it?</title>
      <p>...</p>
    </section>
    <section outputclass="tlowhy">
      <title specrole="generate">Why is it important?</title>
      <p>...</p>
    </section>
    <section outputclass="tloessay">
      <title specrole="generate">Why does a technical writer need to know this?</title>
      <p>...</p>
    </section>
    <section outputclass="tlosummary" collection-type="sequence">
      <title specrole="generate">Summary:</title>
      <p>text node with <ph>allowed</ph> phrase.</p>
    </section>
    <section conref="task-spec.dita/task-def/postreqs"/>
  </body>
</topic>

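<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl">

  <!-- Written with the xml: prefix rather than xsl:, so this is not an XSLT
       instruction; whitespace-only text nodes are kept, which matches the
       result listing. -->
  <xml:strip-space elements="*"/>
  <xsl:output method="xml" indent="no"/>

  <!-- Wrap the whole decomposition in a list. -->
  <xsl:template match="/">
    <ul>
      <xsl:apply-templates/>
    </ul>
  </xsl:template>

  <!-- One list item per element: its name in bold, then its children. -->
  <xsl:template match="*">
    <li>
      <xsl:if test="not(name(.) = '')">
        <b><xsl:value-of select="name(.)"/></b>
      </xsl:if>
      <xsl:apply-templates/>
    </li>
  </xsl:template>

  <!-- Only applies if attributes are explicitly selected; the templates above do not select them. -->
  <xsl:template match="@*">
    <xsl:value-of select="local-name()"/>='<xsl:value-of select="."/>'<br/>
  </xsl:template>

  <!-- Show each text node as [length:content]. -->
  <xsl:template match="text()">
    <xsl:if test="not(string-length(.) = 0)">
      [<xsl:value-of select="string-length(.)"/>:<xsl:value-of select="."/>]
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>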

topic [2: ]

title [18:Structured Content]

[3: ]

prolog [3: ]

data [16:author name here]

[3: ]

specmeta [4: ]

ph [27:A new global phrase element]

[4: ]

data [120: Simple text only for this global data specialization, but with a different specmodel you could do anything ]

[4: ]

specatt [57: A conditional processing attribute called tloatt ]

[3: ] [2: ] [2: ]

body [4: ]

section [5: ]

title [11:What is it?]

[5: ]

p [3:...]

[3: ] [4: ]

section [5: ]

title [20:Why is it important?]

[5: ]

p [3:...]

[3: ] [4: ]

section [5: ]

title [46:Why does a technical writer need to know this?]

[5: ]

p [3:...]

[3: ] [4: ]

section [5: ]

title [8:Summary:]

[5: ]

p [15:text node with ]

ph [7:allowed]

[8: phrase.] [3: ] [4: ]

section

[2: ] [2: ]