Michael, thank you for your responses and clarifications.
David and Dave, thank you for kicking off such a lively discussion!
We are really targeting the semantic interoperability of the content,
not passing around chunks of existing content "as is" and assembled
into new chunks. We want to make it easier for organizations to reuse
content from other organizations that might be using different
XML-based standards. We are specifically talking about the exchange of
document content, not data. There are plenty of data exchange
formats and services out there; they are out of scope for this effort.
PDF is not really in scope for this effort either. We want to address
interoperability of the source content - before the PDF is generated.
Because XHTML is an XML standard, it can be easily processed with
XML-based tools and technologies. Working with PDF is more challenging
in that regard.
We are not limited to ODF, DITA and DocBook either! We could easily
bring in MS OOXML, TEI and other XML document standards as we move
forward. Our initial attempt is to align those standards housed under
the OASIS umbrella.
Best regards,
--Scott
Michael Priestley wrote:
Ah, here may be part of the confusion: we are talking about potentially
thousands of topics/documents. We purposefully picked a simple scenario
for the sake of understanding, but certainly the problem scales, e.g. to
a component provider using one standard providing docs for the
components to a solution provider using another standard. Solutions with
hundreds of components, each with hundreds of topics, scale quickly in
complexity.
We are also not talking about cut and paste. This is reuse by reference.
See my other note for a discussion of the degrees of interoperability
involved.
Michael Priestley
IBM DITA Architect and Classification Schema PDT Lead
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25
Dave / Michael,
This type of problem looks weak for automation - it's one-off stuff,
desktop-editing focused.
And really this falls under what I'd set as 1) in my use cases -
setting up an MoU / CPA for the who, what, why rules of engagement.
Each vendor is going to provide tools within their editors to cope
with these "imports" / cut-and-paste assembly - and a lot of this
is human-facing. Print, sign - return to sender.
Where I see the bigger need occurring is mass processing.
The type of processing I'd described involves 1,000s of documents
daily, from 100s of submitters.
And more importantly, it is applied AFTER the type of content
cut-and-paste occurs that Michael described. In fact that is EXACTLY
the type of thing we have already seen over the past year with these
PDF submissions to eGov - people get content from all kinds of places -
scanners, PDF, Word docs, web pages - then they assemble it and output
to PDF for submission. This leaves all kinds of hidden traces and
formatting issues that need to be resolved at the central receiving
site.
And each agency has its own discrete rules and submission type, so
scripting is essential to reduce the cost of doing this.
Fortunately for PDFs, the iText library has evolved over the past 4
years to be wonderfully adept for this purpose. But right now you have
to hand-code Java to be able to use iText - and it's then static code
that has to be re-compiled every time a new document or rule change
happens.
Of course what is needed is a more general-purpose solution - learning
from the iText experience - but that's where we come in...?!
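As a rough illustration of the point above about static compiled rules, here is a minimal sketch (plain Python rather than Java/iText, with invented agency names and checks) of keeping per-agency submission rules as data, so a rule change means editing a table instead of recompiling code:

```python
# Hypothetical sketch: per-agency submission rules kept as plain data.
# The agencies, rule names, and checks below are invented for illustration.

RULES = {
    "agency-a": [
        # Each rule is (name, predicate-on-document-description).
        ("max_pages", lambda doc: doc["pages"] <= 50),
        ("no_attachments", lambda doc: not doc["attachments"]),
    ],
    "agency-b": [
        ("metadata_stripped", lambda doc: not doc["metadata"]),
    ],
}

def validate(agency, doc):
    """Return the names of the rules this document fails for the agency."""
    return [name for name, check in RULES.get(agency, []) if not check(doc)]

# An 80-page submission violates agency-a's page limit.
doc = {"pages": 80, "attachments": [], "metadata": {}}
print(validate("agency-a", doc))  # -> ['max_pages']
```

Adding an agency or changing a threshold only touches the `RULES` table; in practice the checks would call into a PDF toolkit such as iText instead of reading a dict.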
DW
"The way to be is to do" - Confucius (551-472 B.C.)
-------- Original Message --------
Subject: Re: [docstandards-interop-discuss] Clarifications / Scope of
the intended work?
From: "Dave Pawson" <dave.pawson@gmail.com>
Date: Tue, April 10, 2007 10:12 am
To: docstandards-interop-discuss@lists.oasis-open.org
On 10/04/07, Michael Priestley <mpriestl@ca.ibm.com> wrote:
> - govt worker begins drafting a policy note in ODF with the subject
> "the use of personal data received via email"
> - govt worker pulls in the text of the relevant statute, which is in
> a DITA specialization
> - govt worker pulls in the legal disclaimer which must now be
> included in every government email reply, from a different DITA
> specialization
> - govt worker pulls in the instructions on how to include the text
> of the disclaimer in emails, from documentation of the email
> software written in DocBook
> - technical author 2, using DocBook, creates a customized version of
> the email software documentation
> - and pulls in portions of the procedures web site, in the form of
> DITA topics and ODF policy notes
OK, you've described the problem Michael. I hope we can all sympathise
with that!
Ignoring how, what do you see as a solution?
A means of 'integrating' n streams?
A way of reading n streams?
A means of generating .... something readable by all.... (lcd solution)
What class of solution is the goal please?
regards
--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
---------------------------------------------------------------------
To unsubscribe, e-mail:
docstandards-interop-discuss-unsubscribe@lists.oasis-open.org
For additional commands, e-mail:
docstandards-interop-discuss-help@lists.oasis-open.org
---------------------------------------------------------------------