I want to address a few points that Alan raises:
Role of the LwDITA committee note
An OASIS committee note is (by definition) a non-standards track
work product. As such, a committee note cannot provide a
"canonical definition." However, I do think the LwDITA committee
note presents a clear description of the LwDITA design that is
more than adequate for implementers to use to develop LwDITA
applications and implementations.
Terminology that the DITA TC uses in formal OASIS documents
What terminology we can use is very context-driven. OASIS
publications are divided into two tracks, and the terminology that
we can use differs, I think, based on the track:
- Standards track
OASIS specifications define an OASIS standard. They must follow
the OASIS-provided template that covers both what a
specification must contain and how it is formatted.
Specifications must contain normative references, use keywords
from either [RFC2119] or [ISO/IEC Directives], and include
conformance clauses that refer to the normative statements (made
using [RFC2119] or [ISO/IEC Directives]) that appear in the body
of the specification.
What terminology can (or perhaps must) we use in specifications?
This has not been explored or tested. One approach might be that
we must use the vocabulary that is used in the specifications
that are the normative references: DITA, XHTML, and XML
specifications. We may well need to get the OASIS Technical
Advisory Board involved to help us here.
- Non-standards track
OASIS committee notes provide ancillary information to assist in
understanding and implementing the Standards Track work. They
must follow the OASIS-provided template that covers both what a
committee note must contain and how it is formatted. Committee
notes must not contain normative information.
I'd argue that committee notes have greater latitude in the
vocabulary used. If LwDITA advocates want to use the generic
term "component," that is OK, although I wonder if the term
"structural component" might be more precise.
Note that I am avoiding the term "vocabulary," since I think
it overlaps with the meaning of what structural components (elements
and attributes) are permitted in LwDITA.
Kristen James Eberlein
Chair, OASIS DITA Technical Committee
Principal consultant, Eberlein Consulting
+1 919 622-1501; kriseberlein (skype)
On 2/7/2019 9:37 AM, Alan Houser wrote:
I'm always grateful for Eliot's insights, and I think the TC's recent
discussion of LwDITA has been helpful in framing the vision of the
LwDITA spec editors and SC and (ultimately and hopefully) aligning
the SC's vision with that of the full TC. To address these points:
- I disagree that the TC should defer a formal LwDITA spec, and
position the committee note as the canonical definition of LwDITA.
But that's probably a longer discussion. :-)
- Re: the SC's confidence in the LwDITA vocabulary -- I'll posit
that Eliot has been in the standards business long enough to know
that this is impossible to answer. Having said that:
   + The LwDITA vocabulary was chosen carefully, with input from
representatives of the target audiences.
   + At the risk of speaking for other SC members, we have high
confidence in the LwDITA vocabulary.
   + LwDITA avoids the strategic mistake made by other vocabulary
subsets, of omitting a major use case explicitly to encourage use
of the "full" vocabulary. I have in mind Simplified DocBook, which
punted on content aggregation, which it could have very easily
supported. I'll also acknowledge that I was not on the DocBook TC,
but this was my perception from the outside.
   + Any subset will have feature boundaries, and we expect (and
have begun to see) tension around this. "Why doesn't LwDITA
support X?" To this question, our answer should be "You should
use DITA". I'll note that these questions tend to come from
adopters who have DITA experience, not members of our newer LwDITA
- I would caution against efforts to align or modify LwDITA
terminology to match any particular authoring format. LwDITA has
the goal of being source-format agnostic. We chose the term
"components" carefully. This term was tested pretty heavily in the
committee note reviews. We got some pushback from "XML-centric"
reviewers, but these individuals appeared to come on board when we
explained the term. The term "works" for the three initial
authoring formats, and will work for other mappings as they come.
On 2/5/19 12:37 PM, Eliot Kimber wrote:
Given that Lightweight DITA is a proper
subset of DITA plus a mapping from HTML and markdown syntax, the
important definition of LwDITA is:
- The set of elements and attributes used from full DITA
- The syntactic and semantic mapping from HTML and markdown to
those elements (including illustrative examples)
Both of these things appear to be adequately defined in the
published committee note.
As long as both the subset and the mappings are stable, the
committee note is sufficient for implementors and the TC can
communicate that to the community.
So that raises the question: How confident are we that the
subset and mappings are stable? I'm assuming that the SC is
happy with the current state of the LwDITA definition and does
not intend to change it, but is that true? We have working
implementations of MDITA and HDITA processing, which suggests
that the mapping as defined in the committee note is good.
The committee note is not a formal specification, so we still
need that, but I don't see the existence of that specification
as being a prerequisite for implementations as long as what is
specified in the committee note is stable. The committee note
certainly presents itself as "this is what LwDITA is", not "this
is kind of what we have in mind for LwDITA".
I will also observe that the GitHub markdown specification
referenced in the committee note is defined in terms of the
mapping from markdown to HTML, so it certainly presumes an
understanding of angle-bracketed things by the target audience
of that document, which clearly includes markdown processing
implementors as well as authors (there's a lot of language in
that spec about processing, including a whole section on parsing
strategies). Likewise the HTML5 recommendation uses the term
"element". I've always understood the markdown community to
understand that it is fundamentally a syntax for representing
HTML, not a thing that exists in isolation.
One of the challenges with discussing markdown is that it does
not have a one-to-one correspondence between significant syntax
strings and the HTML elements they imply. Given that, there is
no single term that will work for both markdown syntax and
HTML and XML elements.
That is, markdown is *not* a markup language, it's a set of
keyboarding shortcuts for HTML (what in SGML were "short tags,"
which we explicitly removed from XML to make parsing it
simple enough and to make the syntax invariant). I have never
seen any discussion of any form of abstract data model
representation of markdown, only its mapping to HTML.
So it seems reasonable to me to continue to use the term
"element" in general and find more precise language to
describe the markdown representation of LwDITA documents.
It might make sense, for example, to have a standalone
specification or committee note that is only the MDITA
specification, more in line with the GitHub markdown spec,
rather than having only an all-in-one specification.
That is, if one of the goals is to make LwDITA most accessible
to markdown-primary implementors and authors, it seems like a
markdown-specific specification or committee note would be the
best way to do that. That specification or committee note could
then simply make reference to the separate full DITA
specification or to a separate XDITA/HDITA specification that
provides the LwDITA-specific reference entries for the LwDITA
elements. The main LwDITA specification would still need to
formally define the mapping from markdown strings to elements
and attributes, as well as additional mechanisms required to
capture things that markdown can't represent directly.