Re: [dita-lightweight-dita] Re:Mark Baker's article about XML

I thought there were some interesting echoes of lightweight DITA in his post.

Here's my presentation from a year ago, with slides "Why XML sucks" and (in backup) equivalent slides for markdown, calling for a mapping of lightweight DITA capabilities across HTML5/XML/markdown:

http://www.slideshare.net/mpriestley/does-dita-need-xml-lightweight-dita-and-html5

Here's my co-presentation with Carlos Evia and Lui Ai on lightweight DITA with markdown and JSON:

http://www.slideshare.net/mpriestley/does-dita-need-tags

The lightweight DITA subcommittee charter explicitly mentions markdown authoring:

https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita-lightweight-dita

As is often the case with Mark, I find I agree with at least some of his premises, but not at all with his conclusions.

If structured content is facing adoption challenges with some audiences, markdown authoring will be the answer for some of them but not all (I wouldn't want to tell a marketing agency their content has to be authored in markdown, for example). But if we want content to flow across the silos, and to have common services/capabilities across different authoring cultures, then we need standards that map across those boundaries, rather than yet another standard that requires yet another tool stack.

One of the values of lightweight DITA is that it aims to preserve diverse existing investments and attachments to authoring tools and formats, while focusing on the important commonalities of content typing and portability.

If we didn't care about existing investments/infrastructure/authoring cultures, we'd have the absolute freedom to make something up from scratch. But who would we be making it for?

Michael Priestley, Senior Technical Staff Member (STSM)
Enterprise Content Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/michael-priestley

----- Original message -----
From: Noz Urbina <noz.urbina@urbinaconsulting.com>
Sent by: <dita-lightweight-dita@lists.oasis-open.org>
To: Fredrik Geers <fgeers@sdl.com>
Cc: "Mr. Jang Graat" <jang@jang.nl>, dita-lightweight-dita@lists.oasis-open.org
Subject: Re: [dita-lightweight-dita] Re:Mark Baker's article about XML
Date: Fri, Feb 5, 2016 12:33 PM

Sigh... I can't do another analysis of one of Mark's controversy-baiting articles.

On 2 February 2016 at 11:22, Fredrik Geers <fgeers@sdl.com> wrote:
The first point I get from this blogpost is that XML is not easy enough to author in a text editor, while Markdown is pretty easy to work with. And I can totally relate to that.
The second point is that working with Markdown is easier than working with XML in an XML editor, although that is an unfair comparison in my opinion, since the content model of Markdown is much simpler and less expressive than the average XML vocabulary.

So what can we take from both approaches to get to the best of both worlds? Should that be the focus of Lightweight DITA? Or should the focus be on making something that is as easy as possible in situations where control over the structure is needed, but acknowledge the fact that Markdown has its place in more free-formed content?
Or do we want to create something that can be as easy as Markdown, and can be extended to be as rich as specialized DITA? So that we can gradually transition from one type of content to another? Like having base topic types that can be fully expressed in Markdown, but specializations done in XML?

What is best in a situation comes down to what you need the markup for. For example: if I'm writing a user story in our backlog, I'm happy I can use something like Markdown to structure my description and create some headers, instead of having to create a specialization for the different sections I want to write, and then author in XML. This way I am very flexible in how I structure my writing, and I like that. I could agree with my colleagues to use the same format, but it is not enforced in any way, and we all have the freedom to adapt where needed.
I could also go the XML route, create a specialization of section for each header I had in my Markdown source, and define if it's optional or required. That way I lose flexibility, but gain control. The way I see it, that's one of the main factors for choosing something XML-based: you want control over the content – either because you care about what content is captured, or because you want to have more control over how the content is displayed.
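A minimal sketch of that trade-off, assuming a user story written with Markdown headings. The section names and the check_structure helper are hypothetical, invented for illustration; they are not part of any DITA or Lightweight DITA specification. The check reports what a schema with required and optional sections would reject, which plain Markdown accepts silently:

```python
import re

# Hypothetical section names for a user story (illustrative only).
REQUIRED_SECTIONS = ["Summary", "Acceptance criteria"]
OPTIONAL_SECTIONS = ["Notes"]


def markdown_headings(text):
    """Return the titles of ATX-style headings (lines starting with 1-6 '#')."""
    headings = []
    for line in text.splitlines():
        match = re.match(r"#{1,6}[ \t]+(.*)", line)
        if match:
            headings.append(match.group(1).strip())
    return headings


def check_structure(text):
    """Return (missing required sections, sections outside the agreed model)."""
    found = markdown_headings(text)
    allowed = set(REQUIRED_SECTIONS) | set(OPTIONAL_SECTIONS)
    missing = [name for name in REQUIRED_SECTIONS if name not in found]
    unexpected = [name for name in found if name not in allowed]
    return missing, unexpected


story = """## Summary
As a writer I want headings.

## Open questions
Do we need a glossary?
"""

missing, unexpected = check_structure(story)
print("missing:", missing)
print("unexpected:", unexpected)
```

Markdown itself is happy with the story above. Only the added check notices that a required section is absent and that an unplanned one has appeared, which is the kind of control a specialization would enforce at authoring time.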

To get that control, I still believe the best option would be to use XML. Trying to put too much into a Markdown-like syntax defeats the purpose: working with something like that will rely heavily on recalling the syntax. Authoring documents will then involve frequently checking a syntax reference to see how much whitespace and which special characters you need to accomplish something, and toggling a preview to check whether the processor indeed picked up your syntax correctly. I already see that happening with features like adding links in the different flavors of Markdown.
But indeed, there is no perfect solution here, and there is a cost to choosing XML as well, no matter what flavor. Mark did a good job of surfacing that cost, both when editing manually (verbosity) and when using an XML editor (the problems Mark mentioned, like copy/paste). But at least with XML you have the option to express almost anything. So my take on this is that we have to try to minimize that cost. And I believe there is a role for us (creators of XML editing tools) in solving the issues that arise when editing XML.
Some of the problems mentioned in his blogpost are already things of the past for our Knowledge Center/Xopus products, for example. By understanding more about the semantics of the different elements, we can give better editing behavior for those elements. Obscure error messages can also be prevented in a lot of situations. But there is more to be done, and we (and our competitors) are working to further improve the usability of XML editing.

And what can be done in Lightweight DITA to make XML "suck" less? A very consistent base vocabulary, for example, without the ambiguity of elements that can contain both text and paragraphs, as already proposed in Michael Priestley's first plans. So I think we're already on the right path there.
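To illustrate the ambiguity mentioned above, here is a small sketch using Python's standard xml.etree.ElementTree. The element names (li, p) and the helper function are illustrative assumptions, not taken from the Lightweight DITA drafts. It flags an element that holds loose text alongside paragraph children, the case a stricter base vocabulary would rule out:

```python
import xml.etree.ElementTree as ET


def has_mixed_text_and_paragraphs(element):
    """True if the element contains <p> children and also loose text."""
    has_paragraph = any(child.tag == "p" for child in element)
    # Text directly inside the element, before its first child.
    loose_text = (element.text or "").strip()
    # Text that follows any child element (its "tail").
    loose_tail = any((child.tail or "").strip() for child in element)
    return has_paragraph and bool(loose_text or loose_tail)


# Ambiguous: the list item mixes bare text with a paragraph.
ambiguous = ET.fromstring("<li>Press Enter.<p>The dialog closes.</p></li>")

# Consistent: all text lives inside paragraphs.
consistent = ET.fromstring("<li><p>Press Enter.</p><p>The dialog closes.</p></li>")

print(has_mixed_text_and_paragraphs(ambiguous))
print(has_mixed_text_and_paragraphs(consistent))
```

With a content model that allows only the second form, an editor never has to guess whether typed text belongs in the list item itself or in a paragraph inside it.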

Fredrik Geers | SDL | (t) +31 (0)20 203 2094 | (e) fgeers@sdl.com

www.sdl.com

-----Original Message-----
From: dita-lightweight-dita@lists.oasis-open.org [mailto:dita-lightweight-dita@lists.oasis-open.org] On Behalf Of Mr. Jang Graat
Sent: Monday, February 1, 2016 10:47 AM
To: dita-lightweight-dita@lists.oasis-open.org
Subject: [dita-lightweight-dita] Re:Mark Baker's article about XML

Thanks, Mark (Giffin), for suggesting that we read Mark Baker's article. I also think that Mark Baker is knowledgeable and experienced, and even though his book Every Page is Page One does repeat the same mantra just a little too much, there are some good points that anyone new to structured content could and should take away.

But reading his article on XML has pretty much ruined his reputation for me, as he is so obviously off the mark (yes, that is a bad pun) on so many aspects. Following his line of thought, the only good system to create meaningful text with is an old-fashioned mechanical typewriter. Just use whitespace to add all the meaning you need. I have done that, many times, in my student days, and yes, I use whitespace in this email to separate the paragraphs and pause the thinking of the reader before moving on to a new point.

But this has nothing at all to do with meaningful markup, and it does not mean that Mark has a valid point about the supposedly bad move of XML doing away with whitespace.

Mark is utterly confused about the difference between the notation format (XML) and the tools that present the information in a meaningful way to a human reader or author. It is bad tooling that Mark is writing about, not a bad notation standard. If Mark had looked beyond the surface of well-known "easy" word processors like MS Word, he would know that almost every computer file uses some kind of XML for its notation format.

The statement that XML is void of semantics is hilarious: all of XML is semantics. But semantics only has meaning in context, which is true for any signals humans use to communicate. It is exactly the whitespace that does NOT have meaning. And after his rant on the meaninglessness of XML and the value of systems without markup but lots of meaningful whitespace, he introduces what? Another markup language called SAM. Another markup language some poor author will have to learn to constrain all the round semantics in his mind and try to make it fit in the square holes that Mark's SAM will come up with.

Mark's main problem is that he cannot imagine other people having a different set of semantic labels in their mind with which they want or need to create a model of their world in writing. The learning curve in understanding someone who speaks Slovenian might be steeper than just sticking with your country folk, but it does not do justice to anyone to simply state that their view of the world - and the words or labels they choose to describe it - is the problem.

In my opinion, laziness is the real problem. Laziness of a lot of authors who do not want to learn about the semantics in the XML that they are offered, laziness of the software developers whose tools cannot make those semantics manageable without all the ugly side effects (angle brackets, verbose labels, etc.), and laziness of the people who are supposed to configure the systems that do offer writer-friendly ways of presenting the semantics from the XML labels to the authors who have to work with them.

A lightweight markup language without the option to specialise and add my own meaningful labels is a dead horse, as it will never go anywhere beyond the limited horizon of the use case for which it was intended. And for Mark (Baker), I suggest he replace his SAM with a good old Remington.

Now let's hear it from the others in this think tank.

Jang

--
Noz

Content Strategist, www.urbinaconsulting.com
Co-Author of "Content Strategy: Connecting the dots between business, brand and benefits"
