Re: [dita-lightweight-dita] HDITA and XDITA just got easier... I think?

Subject: Re: [dita-lightweight-dita] HDITA and XDITA just got easier... I think?

From: Don Day <donday@donrday.com>

To: dita-lightweight-dita@lists.oasis-open.org

Date: Fri, 29 Jul 2016 16:44:15 -0500

Nice, Carlos. May I ask what keyword in MDITA indicates the task type?

This got me thinking about the dictation-oriented approach I tried a while back for tasks. I coded your example using my dictation notation and, even without adapting the parser to be aware of ol, I still got a reasonable result:

http://learningbywrote.com/demo/pt.php

The Ingredients section needs a semantic identifier to map it to a prereqs section with an appropriate title (which can be generated if using a recipe specialization). Because my converter anticipated only going from a context to the steps, it could not contain the list of ingredients properly, hence their odd placement. In a general topic, I think they would have mapped properly.
This converter assumes word-processor-like "styled paragraphs" as input, which scope the content by contextual assumptions (if a results section has started, then the step items are done and we can wrap them in their correct container, and so forth).
In effect, it is possible to have a simple, declarative format that fits between Markdown and XDITA in semantic richness while still being free of concrete markup. There is nothing sacred in the exact syntax I used originally--modern voice parsers can use a trigger word to interact with a voice-enabled editor. Someone who is just typing directly would probably use a heading style or reserved title string to indicate semantic sections. This could totally be enabled in Word using named styles.
As such, this declarative approach is actually format-insensitive, yet intrinsically DITA-aware, because the "schema" encoded into the logic of the parser is able to enforce DITA type and specialization rules based on styles and context.
An internalized version of this "schema" would say things like:

any first paragraph is the shortdesc;
any second paragraph in what we know is a task becomes a context;
any step allows another next block of step, but any non-step block delimits the sequence and we can wrap it as a steps section;
any result chunk is intrinsically the last thing in the task, so we can close and output the new DITA task syntax for the represented structure;
any first sentence (delimited by full stop '. ') in a step is a cmd; remaining content is the info.
And one that is missing now: any list item prior to a step pushes a new <prereq><ol> context onto the parse stack; when we hit any step, we close and wrap that context and pop the structure stack back to the body level (ready to set up the "required" <steps> context).
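
The rules above can be sketched in a few lines of Python. This is only an illustration, not the converter behind the demo link: the style names ('p', 'step', 'result'), the function names, and the (style, text) input shape are all assumptions, and the missing prereq rule is left out.

```python
from xml.sax.saxutils import escape, quoteattr

def split_step(text):
    """Split a step at the first full stop followed by a space: (cmd, info)."""
    head, sep, tail = text.partition(". ")
    if not sep:
        return text.strip(), ""
    return head.strip() + ".", tail.strip()

def to_task(topic_id, title, blocks):
    """Turn styled paragraphs into DITA task markup.

    blocks is a list of (style, text) pairs; the style names are hypothetical.
    """
    out = [f"<task id={quoteattr(topic_id)}>", f"<title>{escape(title)}</title>"]
    state = {"paras": 0, "in_steps": False, "body": False}

    def open_body():
        if not state["body"]:
            out.append("<taskbody>")
            state["body"] = True

    def close_steps():
        if state["in_steps"]:
            out.append("</steps>")
            state["in_steps"] = False

    for style, text in blocks:
        if style == "step":
            open_body()
            if not state["in_steps"]:
                out.append("<steps>")
                state["in_steps"] = True
            cmd, info = split_step(text)
            step = f"<step><cmd>{escape(cmd)}</cmd>"
            if info:
                step += f"<info>{escape(info)}</info>"
            out.append(step + "</step>")
            continue
        close_steps()  # any non-step block delimits the step sequence
        if style == "result":
            open_body()
            out.append(f"<result>{escape(text)}</result>")
            break  # a result is the last thing in the task
        state["paras"] += 1
        if state["paras"] == 1:
            out.append(f"<shortdesc>{escape(text)}</shortdesc>")
        elif state["paras"] == 2:
            open_body()
            out.append(f"<context>{escape(text)}</context>")
        else:
            raise ValueError("the rules above do not cover this paragraph")
    close_steps()
    if state["body"]:
        out.append("</taskbody>")
    out.append("</task>")
    return "\n".join(out)
```

For example, a step styled as "Heat olive oil. Use medium heat." would come out with "Heat olive oil." as the cmd and "Use medium heat." as the info, while a one-sentence step gets a cmd only.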

The schema would be based on assertions like these:
$closedby['p'] = 'p ul ol dl lq section'; (a 'p' context is closed by any of this list)
$closedby['section'] = 'section /body /section /topic'; (a 'section' context ends with any of these start or implied end cues)
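
As a rough sketch of how assertions like these could drive a parse stack, here is a Python equivalent. The table holds only the two entries quoted above; the feed function, its cue format, and the handling of an explicit end cue for the element on top of the stack are assumptions made for the illustration.

```python
# The two assertions quoted above, as a lookup table of closing cues.
CLOSED_BY = {
    "p": "p ul ol dl lq section".split(),
    "section": "section /body /section /topic".split(),
}

def feed(stack, cue, out):
    """Process one cue: close the contexts it terminates, then open or end it.

    stack holds the names of open contexts, out collects emitted tags.
    """
    # Implied ends: pop every open context that lists this cue as a closer.
    while stack and cue in CLOSED_BY.get(stack[-1], ()):
        out.append(f"</{stack.pop()}>")
    if cue.startswith("/"):
        # Explicit end cue: close the named element if it is on top (assumption).
        if stack and stack[-1] == cue[1:]:
            out.append(f"</{stack.pop()}>")
    else:
        stack.append(cue)
        out.append(f"<{cue}>")

stack, out = [], []
for cue in ["body", "section", "p", "p", "section", "/body"]:
    feed(stack, cue, out)
print("".join(out))
```

Contexts with no entry in the table (ul, ol, and so on) are never closed implicitly in this sketch, so a fuller table would be needed before it could handle lists.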
The processing is identical for steps and list items, as long as the context allows a list. Hence the "specialization" is a matter of mapping name equivalence somehow. I haven't thought about how to declare this relationship, but it would be a component of the externalized schema, just as with regular DITA. In this sense, I think (not tested) that the parser would fall back to default mappings. But it is strongly topic typed in the sense that you can't use modules outside of any prescribed contextual order (since the mappings are intended to result in valid DITA). You can see how a final numbered list item creates its own list outside of the result context--the rules did not anticipate ol as a content model of result.
I also inserted a generated prolog here. The problem with metadata is that it is really far simpler than most required markup; for the most part, it's just name/value assignments with no required structure, as long as we know it is metadata. But we do have to plug it into the required XML syntax. .ini file syntax is about as simple as it gets, IMO.
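
To make the .ini idea concrete, here is a minimal Python sketch that turns name/value assignments into prolog markup. The [metadata] section name, the ini_to_prolog function, and the choice of data elements with name and value attributes are assumptions for illustration; the thread leaves open what the prolog should actually contain.

```python
import configparser
from xml.sax.saxutils import quoteattr

def ini_to_prolog(ini_text, section="metadata"):
    """Render each name/value pair in one .ini section as a <data> element."""
    # interpolation=None keeps literal '%' characters in values from raising.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    lines = ["<prolog>"]
    for name, value in parser[section].items():
        lines.append(f"  <data name={quoteattr(name)} value={quoteattr(value)}/>")
    lines.append("</prolog>")
    return "\n".join(lines)

sample = """\
[metadata]
author = Unknown
tags = Italian
"""
print(ini_to_prolog(sample))
```

Note that configparser lowercases key names by default and needs a section header, so a truly header-free file would need a small wrapper.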

The point of mentioning this is that a styles-based approach for describing semantic information types can be compatible with DITA. The question is whether this falls into scope or not. It is out of scope in the sense of having no traction at the moment; it is in scope in that we demonstrate here that it works in concept. The principle of styles-based schemas may be of interest as a way to support DITA awareness in a word processor, in a dictation scenario (this could totally be done as an app on a smart phone, with more verbal editor controls added to the syntax, 'up two; delete' for example), or as drag-and-drop chunks in an HTML interface.
--
Don

On 7/26/2016 8:18 AM, Carlos Evia wrote:

Dear all,

As I was working on the outline for our LwDITA spec, I noticed in our current version we only have a generic topic. We are not, as you all probably know, pushing for default specialized content types. This means LwDITA does not come, in this first version, with concept, task, reference, troubleshooting, and glossary.

However, all our LwDITA examples so far (from the STC paper and our DITA NA presentations from 2015 and 2016) assume that in HDITA and MDITA there will be concept, task, and reference types... but... that would be a mess for XDITA. If there are no specialized types in the spec, then we do not need custom elements in HTML5, as Michael's original HDITA model uses existing HTML5 tags (article, section, ul, ol, etc.). Jarno’s plugins for XDITA and HDITA also assume we will have concept, task, and reference, but the implementation is problematic.

I present here a recipe for Marinara sauce (borrowed from a Sarah O'Keefe XML example) in what we currently have for XDITA, HDITA, and MDITA. As you can see, the markup is very simple in all cases, and then our concerns should be the following:

- Reuse for XDITA and HDITA... keys, conrefs... whatever we need

- Topic-level metadata (prolog equivalent) in XDITA and HDITA. I am using YAML headers in my examples, but Don pointed out that they duplicate the "title" element I already have in the H1. Should we use YAML or JSON?

- Specialization by template in the LwDITA model. We are only allowing that in XDITA, right??

This makes life way easier for HDITA and XDITA. Some people at conferences have asked me if LwDITA will allow any well-formed HTML5 or Markdown file to play as HDITA or MDITA, and I think we can propose that. Please let me know if this is crazy.

Here are the examples:

In XDITA,

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE topic PUBLIC "-//OASIS//DTD LW DITA Topic//EN" "topic.dtd">

<topic id="t-marinara">

<title>Marinara sauce</title>

<prolog>

<data>Some sort of metadata for author and type??? What is allowed in the XDITA prolog?</data>

</prolog>

<body>

<p>Prepare a crowd-pleasing red sauce for pasta in about 30 minutes.</p>



<ul>

<li>2 tbsp. of olive oil</li>

<li>2 cloves of garlic, minced</li>

<li>1/2 tsp. of hot red pepper</li>

<li>28 oz. of canned tomatoes, preferably San Marzano</li>

<li>2 tbsp. of parsley, chopped</li>

</ul>



<ol>

<li>Heat olive oil in a large saucepan on medium</li>

<li>Add garlic and hot red pepper and sweat until fragrant</li>

<li>Add tomatoes, breaking up into smaller pieces</li>

<li>Simmer on medium-low heat for at least 20 minutes</li>

<li>Add parsley</li>

<li>Simmer for another five minutes</li>

<li>Serve over long pasta.</li>

</ol>



</body>

</topic>

------------------------

In HDITA,

---

author: Unknown

tags:

- Italian

---

<article id="t-marinara">

<h1>Marinara sauce</h1>

<p>Prepare a crowd-pleasing red sauce for pasta in about 30 minutes.</p>

<ul>

<li>2 tbsp. of olive oil</li>

<li>2 cloves of garlic, minced</li>

<li>1/2 tsp. of hot red pepper</li>

<li>28 oz. of canned tomatoes, preferably San Marzano</li>

<li>2 tbsp. of parsley, chopped</li>

</ul>

<ol>

<li>Heat olive oil in a large saucepan on medium</li>

<li>Add garlic and hot red pepper and sweat until fragrant</li>

<li>Add tomatoes, breaking up into smaller pieces</li>

<li>Simmer on medium-low heat for at least 20 minutes</li>

<li>Add parsley</li>

<li>Simmer for another five minutes</li>

<li>Serve over long pasta.</li>

</ol>

</article>

--------------------------------

In MDITA,

---

author: Unknown

tags:

- Italian

---

# Marinara Sauce {#t-marinara}

Prepare a crowd-pleasing red sauce for pasta in about 30 minutes.

- 2 tbsp. of olive oil

- 2 cloves of garlic, minced

- 1/2 tsp. of hot red pepper

- 28 oz. of canned tomatoes, preferably San Marzano

- 2 tbsp. of parsley, chopped

1. Heat olive oil in a large saucepan on medium

2. Add garlic and hot red pepper and sweat until fragrant

3. Add tomatoes, breaking up into smaller pieces

4. Simmer on medium-low heat for at least 20 minutes

5. Add parsley

6. Simmer for another five minutes

7. Serve over long pasta.

Best,

Carlos

--

Carlos Evia, Ph.D.

Director of Professional and Technical Writing

Associate Professor of Technical Communication

Department of English

Center for Human-Computer Interaction

Virginia Tech

Blacksburg, VA 24061-0112

(540)200-8201

Don R. Day
Founding Chair, OASIS DITA Technical Committee (current version: DITA 1.3)
LinkedIn: donrday Twitter: @donrday
About.me: Don R. Day Skype: don.r.day

"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"
--T.S. Eliot
