dita-lightweight-dita message

Subject: Re: [dita-lightweight-dita] Tables and more


We have two general design routes before us:

1. To design a source model that aligns very closely with the Web model, meaning that it can intersect with the tools commonly used for Web authoring and publishing.
2. To design a source model that represents a CMS-informed way of managing parts of content, meaning that special departures from the Web Way can only be handled by vendors who invest in the special structures we invent.

The second line of thinking seems to be dominating the discussion. The final result will likely be a compromise between the two informing principles, and for each feature we'll need to justify the benefit of whichever of the two approaches we select.

The table discussion is a case in point. From an SGML CMS point of view, CALS is valuable because general XML tools already support the common OASIS-defined behaviors including silent computation of colspec and span metadata. If we plan to turn LwD into a CMS-supportive design, CALS is the way to go. If the plan is to engage Web-based tools in doing the authoring work, then the HTML model should dominate our justifications. If we modify the default expectations of existing tools for handling either model, we risk relegating our design to the pile of other well-intended CMS-based designs that had no intersection with the knowledge and tools preferred by the intended community. On this matter, I strongly urge consideration of the HTML table as the forward representation for both LwD and D2.0.

However, DITA has always afforded a way for its designers to fix things that are broken in the Web's implicit information models. HTML5 in particular has lots of semantic markup but provides no mechanism for managing repeatable structures, and often ends up casting existing markup into HTML structures that mimic what DITA easily manages. In general, the best role for LwD is for fixing what is wrong with the Web's data model. In any design case that involves departing from the Web architecture (such as semantic naming, data models, processing, inter-source dependencies like conref, etc.), XML is our way of fixing what's broken and thereby bringing value that we (I for one!) hope to offer to the Web community to induce greater uptake. I call those Web deficiencies "Holes in the Web" in order to brand the discussion--if DITA's non-Web-aligned models don't actually fix those Holes in the Web, we are only adding ignorable complexity.

One case in point is relational data management. HTML5 offers no conventional data structure for relational data, so authors typically press the presentation-intended HTML table into that role. Simpletable is not just a simpler table, although it can be used as such. Its intent from early DITA days has been to fix this particular Hole in the Web for exported relational data (particularly as data islands that local logic can manipulate). As such, simpletable is sufficient as designed because the "title," if anything, is the SQL request used to populate it as a result view ("Users by age" or "Sprockets sorted by nub count"). The standard way to apply a human version of a title to query-oriented data is to embed the title with the data in a referenceable container, which is exactly what DITA's fig/title/whatever model enables (and which HTML5 now somewhat captures with its new figcaption element, except it was NAMED FIGCAPTION instead of CAPTION, so its use has been too narrowly predetermined--fie on them!). To label a simpletable, simply include it in a fig wrapper with a title element and you potentially gain the shared specialization handling of figure list naming/referencing/collecting in lookup tables, etc.
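
As a rough illustration of the fig wrapper described above (a sketch only, not taken from any DITA toolkit): the fragment below labels a simpletable by wrapping it in a fig with a title, and a few lines of Python collect the titles into a lookup table of the kind a figure list might use. The sample data, the id value, and the helper name collect_figure_titles are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA fragment: a simpletable labeled by wrapping it in <fig>.
SAMPLE = """
<body>
  <fig id="users-by-age">
    <title>Users by age</title>
    <simpletable>
      <sthead><stentry>Name</stentry><stentry>Age</stentry></sthead>
      <strow><stentry>Ada</stentry><stentry>36</stentry></strow>
      <strow><stentry>Linus</stentry><stentry>45</stentry></strow>
    </simpletable>
  </fig>
</body>
"""


def collect_figure_titles(xml_text):
    """Return {fig id: title text} for every titled fig that wraps a simpletable."""
    root = ET.fromstring(xml_text)
    titles = {}
    for fig in root.iter("fig"):
        title = fig.find("title")
        if title is not None and fig.find("simpletable") is not None:
            titles[fig.get("id")] = "".join(title.itertext()).strip()
    return titles


print(collect_figure_titles(SAMPLE))
```

The simpletable itself stays untitled; the label lives on the referenceable container, which is the point being made above.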

What I would change in DITA to make this fix more general for all instances of this particular Hole in the Web is rename fig to exhibit (and then specialize it to wrap semantically-typed content like audio, video, data tables, images, equations, SVG, etc.).

I would then fix another related hole in both DITA and the Web: we use <title> for labels and captions, and these are not at all the same as headings. HTML5's figcaption is at least a step in the right direction, as is the existing caption element in a conventional HTML table. But both architectures end up pressing heading elements into other label-like roles, which only confounds authors and contributes to convoluted rules for overriding presentation and nesting intent in both languages. If DITA's section had <label> as its first child, I contend we could more easily argue why sections don't contribute to hierarchy. The fix in both architectures is to augment a generic exhibit element with a <caption> or <label> replacement for all the places where <title> (DITA's overloaded solution) and <h5> (HTML's most common fallback for "ignore this as a heading") end up being used.

TL;DR summary: All that to say that we should consider preferring HTML5 tables over CALS for presentational cases, keep simpletable as is for data structures that the Web model fails at managing as views outside of a database, change LwD's base fig/title to exhibit/caption (and then specialize fig for its resource-specific role, and the same for audio, video, etc.), and replace title in section with caption (I see caption and label as semantically identical, but caption may help authors better internalize why "a caption is not a hierarchical title").

Why this matters:
The web manages 99% of its content as "title/content" fields. Truth. The data models behind practically every piece of titled/labeled content on the Web are stored as "title" and "content" fields or queried into widgets as such. A typical widget (now practically identical with HTML5's full figure structure) is:

<div class="widget">
<h5 class="title">Archives</h5>
<?php get_payload('table_of_posts_by_row_rendered_as_list');?>
</div>

And this scales up to posts and pages as data models as well, just adorned with more metadata. This is greatly at odds with how we generally process DITA to place into Web templates--the compile model makes the entire result of a transformed topic the payload part of a widget or post, meaning that the template or Web publishing CMS cannot actually manage the main title as its own field. Effectively, standard DITA processing creates this tension--note the empty widget title:

<div class="widget">
<h5 class="title"></h5>
<?php get_payload('topic_rendered_in_full_as_html');?>
</div>

where the topic's transformation generates and embeds the heading element as part of the payload, in contrast to the CMS's separation of title from payload. Even a slug (think topic id) is merely a version of the title used as the query for retrieving the actual title and content of a specific resource.

The degree to which we can help maintain this admittedly simplistic but ubiquitous Web publishing data model is key to involving DITA more directly in this publishing world (particularly for dynamic DITA processing scenarios where marketers may want to dynamically modify the caption's content for specific SEO campaigns). The key innovation in expeDITA is early retrieval of topics in order to separate the title from the rest of the presentational payload so that each can be dropped into its respective view template with whatever dynamic processing it may need (vs. dropping the entire compiled result into the payload spot with no chance for dynamic changes to the container into which it is queried).
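
To make the title/payload separation concrete, here is a minimal sketch in Python. It is not expeDITA's code; the sample topic and the function names split_topic and render_widget are assumptions for illustration, and a real pipeline would transform the body to HTML rather than serialize it as-is. It only shows the shape of the idea: pull the title out early so the template can manage it as its own field.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical minimal topic; real topics carry more structure and metadata.
TOPIC = """
<topic id="archives">
  <title>Archives</title>
  <body><p>Posts listed by month.</p></body>
</topic>
"""


def split_topic(xml_text):
    """Return the topic as separate fields, the way a Web CMS stores a post."""
    topic = ET.fromstring(xml_text)
    title = "".join(topic.find("title").itertext()).strip()
    body = topic.find("body")
    # Serialize only the body's children; the title is NOT part of the payload.
    payload = "".join(ET.tostring(child, encoding="unicode") for child in body)
    return {"slug": topic.get("id"), "title": title, "content": payload.strip()}


def render_widget(fields):
    """Drop each field into its own slot of the widget template."""
    return (
        '<div class="widget">\n'
        f'<h5 class="title">{escape(fields["title"])}</h5>\n'
        f'{fields["content"]}\n'
        "</div>"
    )


fields = split_topic(TOPIC)
print(render_widget(fields))
```

Because the title arrives as its own field, the template (or a marketer's SEO rule) can change it without touching the payload, and the topic id doubles as the slug.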

This should have been a blog post or an article in A List Apart because I'm pointing fingers at the Web designers as well. There are other Holes in the Web, but tables happened to drive a long and related chain of concerns that I've had to consider in fitting present DITA into dynamic content delivery that intersects easily with the Web's expectations and limitations such as they are. expeDITA as a project will continue exploring this bifurcated world of queries, collections, and resources. I hope that the principles themselves may help inform LwD design discussions and decisions, tough as they are.
--
Don

P.S.: Serendipitously, this related article was published yesterday on A List Apart--note the title/content nature of practically all the content on a home page and the author's willingness to enshrine that design principle in a simple-to-manage CMS form structure:
http://alistapart.com/article/homepage-exception
We would recognize each row of presented widgets as being a map of topics under the covers. Maps comprised of query results make the model even more dynamic and adaptable to particular audiences.


On 6/17/2015 3:44 AM, Jang F.M. Graat wrote:
I was in too much of a hurry to check where this mail went to, so here it is again, now addressed to the list instead of only to Fredrik.

Kind regards from Amsterdam

Jang

The Content Era, LLC
EMEA Office
Amsterdam - Netherlands
+31 646 854 996
www.thecontentera.com



Begin forwarded message:

From: "Jang F.M. Graat" <jang@jang.nl>
Subject: Re: [dita-lightweight-dita] Tables and more
Date: 16 Jun 2015 17:11:11 GMT+2
To: Fredrik Geers <fgeers@sdl.com>

Hi,

I was in a car racing back to Amsterdam yesterday so I missed another call. I do want to give my reaction to the points noted by Fredrik and throw in my 2 cents, as I have just been struggling with the table versus simpletable and do find the simpletable too simple and the CALS table too complicated.

The only thing I would like to add to a simpletable is a title. In my opinion, that was a serious mistake in the original design - stating that simple tables do not need table groups and straddling is one thing, but kicking out a table title in the same broad sweep was restricting the model way too much. Now I have to use the full, bloated, CALS model even if my simple table is a straightforward shopping list, just because I want to call it a shopping list and make sure the reader interprets it as such. The other option would be to place the table in a section which then gets a title and does not contain anything else, which is a totally ugly solution with unwanted side-effects.

I would NOT allow any kind of straddling, as this immediately brings lots of complexity into the model, as well as in the software that supports it. How are you going to identify which cells are combined into a single cell via straddling? Can you allow column straddling but disallow row straddling? And what if the user wants to straddle one cell from the header row together with one cell in the first body row? I can think of many more scenarios that would be unwanted for a simplified authoring AND processing experience. If people really need straddling in tables they should choose the CALS table with all its features.

That was 1 cent, now the other one...

About sections:

I completely agree that, once you have used a section in a topic body, you should not be able to add content above that section level, as this is causing ambiguity in all cases where full content indenting is not forced in the output. Also, to further reduce ambiguity, I would make a title mandatory in a section, instead of optional. If you want to introduce a section, give it a title to tell the reader what the section is about.


Kind regards from Amsterdam

Jang




On 16 Jun 2015, at 16:48, Fredrik Geers <fgeers@sdl.com> wrote:

Hi all,

After yesterday’s call I have some more thoughts on the subjects we talked about.

First of all, the discussion of the table model: simpletable vs constrained CALS table model. You can argue that a table is like svg or mathml, and a simpler model like LW DITA doesn’t have to mean simpler tables. From an author’s perspective having cell spanning doesn’t add a lot of complexity – it’s a very common concept, exposed in MS Word and other common tools for years.
Maybe we have to look at it from this angle:  why do we want to have a simpler table model than the full CALS model?
To enable a markdown representation of a document with a table? To make it easier to process content? To make it easier to author content in an editor?
Or we can look at it from the perspective of the “complex” features in CALS, what is it we do not want?
Cell spanning? Separators/frames? Cell alignment? Multiple tgroups? Pgwide?

Another thing I noted down was the comment someone made: “nested tables are not possible in DITA, at least not directly”. I haven’t found any of these situations in the current proposal yet, but I think we should really try to prevent these situations in LW DITA: that something is not valid, but as soon as you wrap it in a ph (for example) it is. I’ll be looking for loopholes like that in any future proposals, but I’d like to invite everyone to help me in catching situations like that.

Related to Jang’s comment on reducing ambiguity and the subject of content that’s not in a section: the current DITA topic not only allows content before sections or without sections at all, but it also allows content after and in between sections. Do I understand correctly that in LW DITA we’re changing that, and only allow other sections after a section? (I hope so?)
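
The constraint being asked about can be stated as a small check. This is only a sketch of one reading of the rule (content may precede the first section, but nothing other than sections may follow it), not anything from the LW DITA proposal; the function name sections_come_last is made up for the example.

```python
import xml.etree.ElementTree as ET


def sections_come_last(body_xml):
    """True if no non-section element follows the first <section> in the body."""
    body = ET.fromstring(body_xml)
    seen_section = False
    for child in body:
        if child.tag == "section":
            seen_section = True
        elif seen_section:
            # Found body-level content after or between sections.
            return False
    return True


ok = "<body><p>Intro</p><section><title>A</title></section></body>"
bad = "<body><section><title>A</title></section><p>Stray</p></body>"
print(sections_come_last(ok), sections_come_last(bad))
```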

And finally, the more I think about it, the more I realize that for this initiative to really work well we should make creating specializations as simple as possible. Similar to how Web CMSs allow users to create different page types by combining different component types in some kind of UI, this should be a task that anyone can do without DTD/XSD/RNG knowledge. Perhaps this is an area where we tool vendors can help out, but for that to be successful, it helps if it is conceptually as straightforward as possible.

Fredrik Geers | Product Owner SDL LiveContent Create/SDL Xopus | SDL |  (t) +31 (0)20 201 0500 | (e) fgeers@sdl.com



www.sdl.com 


--
Don R. Day
Founding Chair, OASIS DITA Technical Committee
LinkedIn: donrday   Twitter: @donrday
About.me: Don R. Day   Skype: don.r.day
"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"
--T.S. Eliot


