OASIS Mailing List Archives


docstandards-interop-discuss message



Subject: Re: [docstandards-interop-discuss] Clarifications / Scope of the intended work?


David,

It sounds like you are still advocating a page-based metaphor, but the document standards we are working with are not page-based! Neither DocBook nor DITA has a concept of a "page". There are topics, sections, chapters and the like. They are semantic units or blocks of information, not tied to a particular presentation format. You can take the same XML content in DocBook or DITA and transform it to various representations: PDF, HTML, HTMLHelp, JavaHelp, etc.

Your page representation would not capture a complete unit of information, as some units would need to span multiple pages.

I would disagree that DocBook, DITA and ODF are representing paper documents. Rather, they are representing semantic blocks of information. True, they can be rendered as a paper document, but you are not locked into that paradigm. We have enough issues getting people to think about separation of content and presentation without perpetuating the paper-based, WYSIWYG paradigm.

I honestly do not think Faxes, email, memos or letters have to be locked into a paper paradigm either. Sure, you may want a cover page for a FAX, but the output of that can be controlled via stylesheets, not in the source!

I don't see how the iText solution preserves the semantic distinctions that have been mentioned in previous emails on this thread. How would you handle conditional text in this model, where you want to produce multiple versions of a document based on certain parameters in the content (Linux vs. Mac vs. Windows versions of a maintenance manual)?

--Scott



David RR Webber (XML) wrote:
Michael,
 
Because the users and business functional people think in terms of page content - and they are writing their processing and handling requirements in that way.
 
I do not see that changing to a browser model even if theoretically they could!
 
Those XML documents are REPRESENTING PAPER documents.
 
If you are saying paper documents and paper representation are out of scope for your work - then fine - but then that kinda defeats the need to have interoperability between Word and PDF and ODF and DocBook, et al.  If you are simply creating an entirely new XML representation for documents that is not intended to be presented via page-based document interfaces - then that needs to be stated very clearly in the scope.
 
Obviously each document creator vendor then provides templates to store to your electronic document format - just like we have Create New / Fax / EMail / Memo / Letter - now you have DITA / Technical Doc added to that list.
 
Again this sounds like a very manual process - content creation / re-assembly - rather than an automated, scripted process via handling functions that include verification and validation.
 
Good that we're getting this clarified.
 
DW

"The way to be is to do" - Confucius (551-472 B.C.)


-------- Original Message --------
Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
From: Michael Priestley <mpriestl@ca.ibm.com>
Date: Tue, April 10, 2007 1:51 pm
To: "David RR Webber (XML)" <david@drrw.info>
Cc: Dave Pawson <dave.pawson@gmail.com>, docstandards-interop-discuss@lists.oasis-open.org, "Earley,Jim" <Jim.Earley@flatironssolutions.com>


So why is that model preferable to the XML DOM? Keeping in mind we are working specifically with XML documents here.

Michael Priestley
IBM DITA Architect and Classification Schema PDT Lead
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25



"David RR Webber \(XML\)" <david@drrw.info>
04/10/2007 01:46 PM
To
"Earley,Jim" <Jim.Earley@flatironssolutions.com>
cc
Dave Pawson <dave.pawson@gmail.com>, docstandards-interop-discuss@lists.oasis-open.org
Subject
RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?







Jim,
 
NO NO NO - it's not PDF that is the answer!!!
 
Stop thinking PDF please.
 
Yes we know that iText is built for PDF - but that's just the start point here.
 
What I'm saying is use the model - not the specific rendering.
 
Our XML-script syntax is neutral - will work with ANY document syntax.  
 
It is just that PDF already has one implemented - so they are ahead of the game at this point - but it should not take long for people's development teams to adapt the iText code base to work for ODF and more as well.
 
So the real abstraction is the in-memory page object model that iText is using when it runs - just as the DOM is for XML inside a browser....
 
DW

"The way to be is to do" - Confucius (551-472 B.C.)



-------- Original Message --------
Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
From: "Earley, Jim" <Jim.Earley@flatironssolutions.com>
Date: Tue, April 10, 2007 1:33 pm
To: "David RR Webber (XML)" <david@drrw.info>
Cc: "Dave Pawson" <dave.pawson@gmail.com>, <docstandards-interop-discuss@lists.oasis-open.org>

Dave,


Herein lies the rub, methinks:

>> By creating a standard around the functions and the processing - we establish that "lingua franca" at the level of the processing required - not the underlying vendor specific document syntax goup - that will change every time they release a new product.

This is why we've proposed going to an abstraction layer to enable the respective standards to keep evolving to meet their constituents' needs, yet allowing interoperability at the markup level, which I contend is much more robust with respect to content reusability, which is what I hear repeatedly from authors (and not just in the "standard" software Tech Pubs space).

--

I would argue that by going to PDF and then to iText to produce interoperability markup, you impose the presentation (and inherently, the presentational structure, which may or may not be equivalent to the originating markup structure) on the recipient, lock, stock and barrel. DITA topics do not necessarily have to begin on a new page, and neither do DocBook sections. These are presentation-specific details that are intentionally left out of the markup to enable reuse across documents and output formats. Formatting is fluid based on many different factors: company branding, output format, localization, even audience, to name a few.  This is why I believe that separating the presentation from the data is absolutely critical.


In my experience at a Fortune 50 company that changed their branding every 18-24 months, having the content in structured, semantic markup saved our skins in more ways than you can possibly imagine. We could in a few weeks rebrand thousands of manuals formatted in HTML and PDF in over 9 languages (_because_ the content was in XML) that would otherwise take months if we had to tinker with the formatting. I've been down that road with things like MS Word or FrameMaker (both structured and unstructured) and rapidly run the other way when the topic comes up.

What about effectivity (conditional processing) attributes?  What if I create an XML document that contains content embedded for different operating systems, each of which is rendered into separate outputs? How do I capture these at the presentation layer and then enable authors to leverage the content appropriately?  These are things that XML markup is very effective at.
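
A minimal sketch of the conditional processing described above, assuming content is flagged with a space-separated `platform` attribute (DITA defines such an attribute; the sample content, the `filter_platform` helper and the platform values are hypothetical and only for illustration). It shows one source producing a different output per operating system:

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical single-source content with platform-specific paragraphs.
SOURCE = """<section>
  <p>Open a terminal.</p>
  <p platform="linux mac">Run ./install.sh</p>
  <p platform="windows">Run install.bat</p>
</section>"""


def filter_platform(root, target):
    """Return a copy of root without elements flagged for other platforms."""
    result = copy.deepcopy(root)
    for parent in list(result.iter()):
        for child in list(parent):
            platforms = child.get("platform")
            # Unflagged elements apply to every platform and are kept.
            if platforms is not None and target not in platforms.split():
                parent.remove(child)
    return result


for target in ("linux", "windows"):
    variant = filter_platform(ET.fromstring(SOURCE), target)
    print(target, [p.text for p in variant.iter("p")])
```

The flags live in the source markup, so the choice of variant is made before any rendering step. A page-oriented model would see only the already-filtered result.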

It's also been my experience that authors are embracing XML markup now because the tools support is now readily available from numerous vendors (and they don't have to get their hands dirty with the actual markup!). They are seeing the benefits of working with structured markup with respect to content reuse and single sourcing. Now their biggest problem is pulling in content from other sources using different XML standards into their content.  I believe that what we've proposed enables authors to do this without a significant amount of retooling.

Jim



================
Jim Earley
XML Developer/Consultant
Flatirons Solutions
4747 Table Mesa Drive
Boulder, CO 80301

Voice: 303.542.2156
Fax:   303.544.0522
Cell:  303.898.7193

Yahoo.IM: jmearley
MSN.IM:
jearley22@hotmail.com

jim.earley@flatironssolutions.com
-----Original Message-----
From: David RR Webber (XML) [mailto:david@drrw.info]
Sent: Tuesday, April 10, 2007 10:11 AM
To: Earley, Jim
Cc: Dave Pawson; docstandards-interop-discuss@lists.oasis-open.org
Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?

Jim,

I'd argue that you are making my point for me!!!

What we need are FUNCTIONS that match the business requirements you state here.

Your example - "In these cases, the structural and semantic characteristics are equally important:  a procedure may appear as a numbered list presentationally, but semantically it is very different than a set of items in a sequenced list."

So - if I was using iText to do this - I can handle this both ways - either get the XML from wherever - and then produce the numbered list (and embed matching XML metacontent) into PDF - or the reverse - find the numbered list in the PDF - extract it out - create the XML.

By creating a standard around the functions and the processing - we establish that "lingua franca" at the level of the processing required - not the underlying vendor specific document syntax goup - that will change every time they release a new product.

The vendors then simply provide implementations to our functional set - and anyone can then create XML-script handling of their documents - inbound or outbound - in a consistent way to our specification.

Bottom line is - it's the functional handling equivalence we are wanting.

This may ultimately drive syntax alignment - but we do not have to get into that ourselves.

DW

"The way to be is to do" - Confucius (551-472 B.C.)




       -------- Original Message --------
       Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
       From: "Earley, Jim" <Jim.Earley@flatironssolutions.com>
       Date: Tue, April 10, 2007 11:49 am
       To: "David RR Webber (XML)" <david@drrw.info>
       Cc: "Dave Pawson" <dave.pawson@gmail.com>, <docstandards-interop-discuss@lists.oasis-open.org>
       
       
       David,
       
       Respectfully, I believe the issue isn't at the presentation layer but more at the content layer:  How do I leverage/reuse/repurpose content in one XML Standard (say DITA) in my content (say DocBook)? Here the question is more targeted at content interoperability. For example, Vendor A provides content to an OEM partner who will rebrand it and integrate Vendor A's content into their own doc set (could be PDF, HTML, HTML Help, JavaHelp, or any number of formats).  Further down the pipeline, the content is reused in Training material by a different group using TEI.
       
       In these cases, the structural and semantic characteristics are equally important:  a procedure may appear as a numbered list presentationally, but semantically it is very different than a set of items in a sequenced list.
       
       By abstracting each XML standard's specific content models to a common denominator, you can preserve structure along with semantics in a way that enables other XML standards to leverage the content using their grammar with minimal loss to semantics from the original.
       
       Certainly, there are cases as you mentioned that require the presentational functionality to be preserved "as submitted" that do not apply here. And in these cases, your approach to maintaining the presentational semantics is very interesting. I've used iText for personal projects, and yes, it is very mature.
       
       Cheers,
       
       Jim
       
       ================
       Jim Earley
       XML Developer/Consultant
       Flatirons Solutions
       4747 Table Mesa Drive
       Boulder, CO 80301
       
       Voice: 303.542.2156
       Fax:   303.544.0522
       Cell:  303.898.7193
       
       Yahoo.IM: jmearley
       MSN.IM:
jearley22@hotmail.com
       
       
jim.earley@flatironssolutions.com
       -----Original Message-----
       From: David RR Webber (XML) [mailto:david@drrw.info]
       Sent: Tuesday, April 10, 2007 9:02 AM
       To: Earley, Jim
       Cc: Dave Pawson; docstandards-interop-discuss@lists.oasis-open.org
       Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
       
       Jim,
       
       Why not focus on the handling functions instead?  That way you are an abstraction layer above the low-level representation syntax.
       
       The xhtml is problematic - especially when it comes to page counts and page content.  Legally also - you need to leave things "as submitted" - because you may reject a submission as say not having content in the right place on a page, or total pages - and yet the original was OK when viewed in the native format.
       
       Also - by going with functions - you put the onus on the individual tool vendors to support those functions consistently - without having to get into the lower level syntax ourselves of how that occurs, either now or in future new formats.
       
       At the end of the day it is the BUSINESS FUNCTIONALITY that you want interoperability around - not the raw document.
       
       So from the business stance - if I need to check for certain bookmarks, sections, text strings, page counts, word counts, etc - I can do that.
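
A rough sketch of what such a format-neutral function set might look like. Every name here (`DocumentHandler`, `PlainTextHandler`, `validate_submission`) is hypothetical and not taken from iText or any proposed specification; the toy handler assumes plain text with form-feed page breaks purely so the example runs:

```python
from typing import Protocol


class DocumentHandler(Protocol):
    """Hypothetical function set a vendor would implement per format."""

    def page_count(self) -> int: ...
    def word_count(self) -> int: ...
    def contains_text(self, needle: str) -> bool: ...


class PlainTextHandler:
    """Toy implementation: pages are separated by form feeds."""

    def __init__(self, text: str) -> None:
        self.pages = text.split("\f")

    def page_count(self) -> int:
        return len(self.pages)

    def word_count(self) -> int:
        return sum(len(page.split()) for page in self.pages)

    def contains_text(self, needle: str) -> bool:
        return any(needle in page for page in self.pages)


def validate_submission(doc: DocumentHandler, max_pages: int, required: str) -> list[str]:
    """Return a list of rule violations; an empty list means the submission passes."""
    problems = []
    if doc.page_count() > max_pages:
        problems.append(f"too many pages: {doc.page_count()} > {max_pages}")
    if not doc.contains_text(required):
        problems.append(f"missing required text: {required!r}")
    return problems


doc = PlainTextHandler("Cover sheet\fDisclaimer: draft only")
print(validate_submission(doc, max_pages=2, required="Disclaimer"))
```

The business rule in `validate_submission` is written once against the function set; each document format would supply its own handler behind the same three calls.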
       
       DW
       
       "The way to be is to do" - Confucius (551-472 B.C.)
       
       
       
       
               -------- Original Message --------
               Subject: RE: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
               From: "Earley, Jim" <Jim.Earley@flatironssolutions.com>
               Date: Tue, April 10, 2007 10:46 am
               To: "Dave Pawson" <dave.pawson@gmail.com>, <docstandards-interop-discuss@lists.oasis-open.org>
               
               
               Dave,
               
               The current thinking with regard to a solution uses XHTML Microformats as the abstraction layer. All of the standards (DITA, DB, ODF) share the same structural characteristics (Headings, paragraphs, lists, tables, images, etc.) albeit in different ways.
               
               The premise thus far is:
               
               1. use standard XHTML markup for common semantic/structural components (table, img, p, ol, acronym, strong, em, etc)
               2. For structural components that do not have an equivalent XHTML mapping, use <div>
               3. For inline semantics that do not have an equivalent XHTML mapping, use <span>
               
               - use the title attribute (available on any XHTML element) to store the original element name
               - use the class attribute to store the "semantic category": e.g., "procedural" vs. "list" to delineate between a procedural set of steps compared to a numbered list
               
               - there are a couple of ideas that we're playing with with regard to capturing the attribute values from the original source:
               
               a) Use the object tag (with child param tags to capture the name/value pairs)
               b) Use a declared namespace to embed the attributes on the element
               
               These are, of course, open for discussion.
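
The premise above can be sketched roughly as follows. The lookup tables, the category names other than "procedural" and "list", and the `to_xhtml` helper are all hypothetical, chosen only to illustrate the three rules plus the `title` and `class` conventions; capturing the original attributes (options a and b) is left out:

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup: source element -> (XHTML element, semantic category).
KNOWN = {
    "p": ("p", "paragraph"),
    "ol": ("ol", "list"),
    "li": ("li", "list"),
    "steps": ("ol", "procedural"),
    "step": ("li", "procedural"),
}
# Hypothetical set of inline elements that have no XHTML equivalent.
INLINE = {"uicontrol", "cmdname"}


def to_xhtml(src):
    """Recursively map one source element to an XHTML element."""
    if src.tag in KNOWN:
        tag, category = KNOWN[src.tag]      # rule 1: reuse standard XHTML
    elif src.tag in INLINE:
        tag, category = "span", "inline"    # rule 3: unmapped inline -> span
    else:
        tag, category = "div", "block"      # rule 2: unmapped structure -> div
    # title keeps the original element name, class the semantic category.
    out = ET.Element(tag, {"title": src.tag, "class": category})
    out.text = src.text
    out.tail = src.tail
    for child in src:
        out.append(to_xhtml(child))
    return out


source = ET.fromstring(
    "<steps><step>Click <uicontrol>Send</uicontrol>.</step></steps>"
)
print(ET.tostring(to_xhtml(source), encoding="unicode"))
```

Because the original element name survives in `title`, a receiving grammar can in principle map the content back to something more specific than a plain ordered list.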
               
               Jim
               
               
               ================
               Jim Earley
               XML Developer/Consultant
               Flatirons Solutions
               4747 Table Mesa Drive
               Boulder, CO 80301
               
               Voice: 303.542.2156
               Fax:   303.544.0522
               Cell:  303.898.7193
               
               Yahoo.IM: jmearley
               MSN.IM:
jearley22@hotmail.com
               
               
jim.earley@flatironssolutions.com
               -----Original Message-----
               From: Dave Pawson [mailto:dave.pawson@gmail.com]
               Sent: Tuesday, April 10, 2007 8:12 AM
               To: docstandards-interop-discuss@lists.oasis-open.org
               Subject: Re: [docstandards-interop-discuss] Clarifications / Scope of the intended work?
               
               On 10/04/07, Michael Priestley <mpriestl@ca.ibm.com> wrote:
               
               > - govt worker begins drafting a policy note in ODF with the subject "the use of personal data received via email"
               > - govt worker pulls in the text of the relevant statute, which is in a DITA specialization
               > - govt worker pulls in the legal disclaimer which must now be included in every government email reply, from a different DITA specialization
               > - govt worker pulls in the instructions on how to include the text of the disclaimer in emails, from documentation of the email software written in DocBook
               
               > - technical author 2, using DocBook, creates a customized version of the email software documentation
               > - and pulls in portions of the procedures web site, in the form of DITA topics and ODF policy notes
               
               OK, you've described the problem Michael. I hope we can all sympathise with that!
               
               Ignoring how, what do you see as a solution?
               
               A means of 'integrating' n streams?
               A way of reading n streams?
               A means of generating .... something readable by all.... (lcd solution)
               
               What class of solution is the goal please?
               
               
               regards
               
               
               --
               Dave Pawson
               XSLT XSL-FO FAQ.
               
               http://www.dpawson.co.uk
               
               
       
               ---------------------------------------------------------------------
               To unsubscribe, e-mail: docstandards-interop-discuss-unsubscribe@lists.oasis-open.org
               For additional commands, e-mail: docstandards-interop-discuss-help@lists.oasis-open.org
               
       



begin:vcard
fn:Scott Hudson
n:Hudson;Scott
org:Flatirons Solutions
adr:Suite 200;;4747 Table Mesa Drive ;Boulder;CO;80305;USA
email;internet:scott.hudson@flatironssolutions.com
title:Consultant
tel;work:303-542-2146
tel;fax:303-544-0522
tel;cell:303-332-1883
url:http://www.flatironssolutions.com
version:2.1
end:vcard


