Subject: Re: [dita] Supporting LwDITA Implementors Quickly

From: Kristen James Eberlein <kris@eberleinconsulting.com>
To: dita@lists.oasis-open.org
Date: Fri, 8 Feb 2019 08:16:32 -0500

I want to address a few points that Alan raises:

Role of the LwDITA committee note

An OASIS committee note is (by definition) a non-standards track work product. As such, a committee note cannot provide a "canonical definition." However, I do think the LwDITA committee note presents a clear description of the LwDITA design that is more than adequate for implementers to use to develop LwDITA applications and implementations.

Terminology that the DITA TC uses in formal OASIS documents

What terminology we can use is very context-driven. OASIS publications are divided into two tracks, and the terminology that we can use differs, I think, based on the track:

Standards track
OASIS specifications define an OASIS standard. They must follow the OASIS-provided template that covers both what a specification must contain and how it is formatted. Specifications must contain normative references, use keywords from either [RFC2119] or [ISO/IEC Directives], and include conformance clauses that refer to the normative statements (made using [RFC2119] or [ISO/IEC Directives]) that appear in the body of the specification.

What terminology can (or perhaps must) we use in specifications? This has not been explored or tested. One approach might be that we must use the vocabulary that is used in the specifications that are the normative references: DITA, XHTML, and XML specifications. We may well need to get the OASIS Technical Advisory Board involved to help us here.

Non-standards track
OASIS committee notes provide ancillary information to assist in understanding and implementing the Standards Track work. They must follow the OASIS-provided template that covers both what a committee note must contain and how it is formatted. Committee notes must not contain normative information.

I'd argue that committee notes have greater latitude in the vocabulary used. If LwDITA advocates want to use the generic term "component," that is OK, although I wonder if the term "structural component" might be more precise.

Note that I am avoiding the term "vocabulary," since I think it overlaps with the meaning of which structural components (elements and attributes) are permitted in LwDITA.

Best,
Kris

Kristen James Eberlein
Chair, OASIS DITA Technical Committee
Principal consultant, Eberlein Consulting
www.eberleinconsulting.com
+1 919 622-1501; kriseberlein (skype)

On 2/7/2019 9:37 AM, Alan Houser wrote:

I'm always grateful for Eliot's insights, and I think the TC's recent discussion of LwDITA has been helpful in framing the vision of the LwDITA spec editors and SC and (ultimately and hopefully) aligning the SC's vision with that of the full TC. To address these points:

- I disagree that the TC should defer a formal LwDITA spec, and position the committee note as the canonical definition of LwDITA. But that's probably a longer discussion. :-)

- Re: the SC's confidence in the LwDITA vocabulary -- I'll posit that Eliot has been in the standards business long enough to know that this is impossible to answer. Having said that:

    + The LwDITA vocabulary was chosen carefully, with input from representatives of the target audiences.

    + At the risk of speaking for other SC members, we have high confidence in the LwDITA vocabulary.

    + LwDITA avoids the strategic mistake made by other vocabulary subsets, of omitting a major use case explicitly to encourage use of the "full" vocabulary. I have in mind Simplified DocBook, which punted on content aggregation, which it could have very easily supported. I'll also acknowledge that I was not on the DocBook TC, but this was my perception from the outside.

    + Any subset will have feature boundaries, and we expect (and have begun to see) tension around this: "Why doesn't LwDITA support X?" To this question, our answer should be "You should use DITA." I'll note that these questions tend to come from adopters who have DITA experience, not members of our newer LwDITA target audiences.

- I would caution against efforts to align or modify LwDITA terminology to match any particular authoring format. LwDITA has the goal of being source-format agnostic. We chose the term "components" carefully. This term was tested pretty heavily in the committee note reviews. We got some pushback from "XML-centric" reviewers, but these individuals appeared to come on board when we explained the term. The term "works" for the three initial authoring formats, and will work for other mappings as they come.

HTH!

-Alan

On 2/5/19 12:37 PM, Eliot Kimber wrote:

Given that Lightweight DITA is a proper subset of DITA plus a mapping from HTML and markdown syntax, the important definition of LwDITA is:

- The set of elements and attributes used from full DITA
- The syntactic and semantic mapping from HTML and markdown to those elements (including illustrative examples)

Both of these things appear to be adequately defined in the published committee note.

As long as both the subset and the mappings are stable then the committee note is sufficient for implementors and the TC can communicate that to the community.

So that raises the question: How confident are we that the subset and mappings are stable? I'm assuming that the SC is happy with the current state of the LwDITA definition and does not intend to change it, but is that true? We have working implementations of MDITA and HDITA processing, which suggests that the mapping as defined in the committee note is good.

The committee note is not a formal specification, so we still need that, but I don't see the existence of that specification as being a prerequisite for implementations as long as what is specified in the committee note is stable. The committee note certainly presents itself as "this is what LwDITA is", not "this is kind of what we have in mind for LwDITA".

I will also observe that the GitHub markdown specification referenced in the committee note is defined in terms of the mapping from markdown to HTML, so it certainly presumes an understanding of angle-bracketed things by the target audience of that document, which clearly includes markdown processing implementors as well as authors (there's a lot of language in that spec about processing, including a whole section on parsing strategies). Likewise the HTML5 recommendation uses the term "element". I've always understood the markdown community to understand that it is fundamentally a syntax for representing HTML, not a thing that exists in isolation.

One of the challenges with discussing markdown is that it does not have a one-to-one correspondence between significant syntax strings and the HTML elements they imply. Given that, there is no single term that will work for markdown syntax as well as for HTML and XML elements.
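
As a small illustration of that lack of one-to-one correspondence, here is a minimal sketch. The table and function names are made up for this example; the pairs shown reflect CommonMark/GFM behavior, where alternative syntaxes yield the same HTML element.

```python
# Illustrative table only: a few markdown constructs and the HTML element
# each one implies. All names here are hypothetical, chosen for this example.
MARKDOWN_TO_HTML = {
    "*text*": "em",
    "_text_": "em",
    "**text**": "strong",
    "__text__": "strong",
    "# Title": "h1",        # ATX-style heading
    "Title\n=====": "h1",   # setext-style heading
}


def syntaxes_for(element):
    """Return every markdown construct in the table that implies `element`."""
    return [md for md, html in MARKDOWN_TO_HTML.items() if html == element]


def is_one_to_one(mapping):
    """True only if no two markdown constructs imply the same element."""
    return len(set(mapping.values())) == len(mapping)


print(syntaxes_for("em"))
print(is_one_to_one(MARKDOWN_TO_HTML))
```

Because several syntax strings collapse onto one element, a term that names the element does not name any single markdown construct.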

That is, markdown is *not* a markup language; it's a set of keyboarding shortcuts for HTML (what in SGML were "short tags", which we explicitly removed from XML to make parsing it simple enough and to make the syntax invariant). I have never seen any discussion of any form of abstract data model representation of markdown, only its mapping to HTML.

So it seems reasonable to me to continue to use the term "element" in general and find more precise language to describe the markdown representation of LwDITA documents.

It might make sense, for example, to have a standalone specification or committee note that is only the MDITA specification, more in line with the GitHub markdown spec, rather than having only an all-in-one specification.

That is, if one of the goals is to make LwDITA most accessible to markdown-primary implementors and authors, it seems like a markdown-specific specification or committee note would be the best way to do that. That specification or committee note could then simply make reference to the separate full DITA specification or to a separate XDITA/HDITA specification that provides the LwDITA-specific reference entries for the LwDITA elements. The main LwDITA specification would still need to formally define the mapping from markdown strings to elements and attributes, as well as additional mechanisms required to capture things that markdown can't represent directly.

Cheers,

E.
--
Eliot Kimber
http://contrext.com

References:
- Supporting LwDITA Implementors Quickly
  - From: Eliot Kimber <ekimber@contrext.com>
- Re: [dita] Supporting LwDITA Implementors Quickly
  - From: Alan Houser <arh@groupwellesley.com>