OASIS Mailing List Archives

xliff message


Subject: Opinion: importance of preserving XML markup and nodes


Hello all,
 
I have an opinion to share regarding the representation of inline elements. This opinion is relevant to the discussion between the XLIFF TC and our TMX partners in our exploration of moving toward a common markup strategy.
 
I took a better look at the proposed TMX 2.0 specification (http://www.lisa.org/fileadmin/standards/tmx2_2009_03_09.pdf), and I'd like to comment on section 4.2 (Representing Inline Elements).
 
I strongly believe that when representing XML, it is critical that the XML is preserved as XML and that nodes are preserved as nodes. In my opinion, XLIFF and TMX elements need to be processable natively and directly as XML, by XML means (e.g., XSLT).
 
My difficulty begins in the second paragraph of 4.1 (Overview):
 
"At present, the best way to deal with these native codes in general is to delimit them by a specific set of elements that convey where they begin and end, and possibly additional information about what they are (bold, italic, footnote, etc.). (Note, however, that in some cases inline content markup may be left unencapsulated to meet specific needs. Guidance about how best to represent markup for specific needs and cases is beyond the scope of this standard.)"
 
This strikes me as being too accommodating to non-XML-aware TM tools, at the expense of enabling reasonable XML processing by XML-aware tools.
 
So when I read section 4.2, it seemed to me that the spec is saying there are basically two ways of representing inline elements in TMX (I know there are three variations in 4.2.1 and four variations in 4.2.2, but I think they boil down to the following two instructions):
 
If this is my source
 
<p>
  <b>XML</b> is a general-purpose  <i>specification</i>
  for creating custom markup languages.
</p>
 
It seems to me the new TMX standard prescribes only the following two recommendations:
 
<seg>
  <itag pos="start" x="1" type="b">&lt;b></itag>XML
  <itag pos="end" x="1" />&lt;/b></itag> is a general-purpose 
  <itag pos="start" x="2" type="i">&lt;i>specification
  <itag pos="end" x="2" />&lt;/i></itag>
  for creating custom markup languages.
</seg>
 

<seg>
  <itag pos="start" x="1" type="b" />XML
  <itag pos="end" x="1" /> is a general-purpose 
  <itag pos="start" x="2" type="i" />specification
  <itag pos="end" x="2" />
  for creating custom markup languages.
</seg>
 
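I really think each of them is bad. In both, the start and end of an inline element become separate sibling markers, so the original element no longer exists as a node that contains its content.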
 
I think the following is a much better way (and, at least on the XLIFF side, I will advocate strongly for this):
 
<seg>
  <itag type="b" x="1">XML</itag> is a general-purpose 
  <itag type="i" x="2">specification</itag>
  for creating custom markup languages.
</seg>
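 
To make the difference concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree as a stand-in for "XML means" (the same point holds for an XSLT match such as itag[@type='b']). The paired-marker segment is written as well-formed XML with empty start and end markers, the x values are illustrative, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Paired empty markers: start and end are separate sibling elements.
MILESTONE = (
    '<seg><itag pos="start" x="1" type="b"/>XML<itag pos="end" x="1"/>'
    ' is a general-purpose <itag pos="start" x="2" type="i"/>specification'
    '<itag pos="end" x="2"/> for creating custom markup languages.</seg>'
)

# Node-preserving form: each inline element wraps its own content.
NESTED = (
    '<seg><itag type="b" x="1">XML</itag> is a general-purpose '
    '<itag type="i" x="2">specification</itag>'
    ' for creating custom markup languages.</seg>'
)


def bold_text_nested(seg):
    """The bold run is the content of a node, so one path expression finds it."""
    return [el.text for el in seg.findall("itag[@type='b']")]


def bold_text_milestone(seg):
    """The bold run is loose text between two siblings, so the start and end
    markers must be paired by hand and the text between them collected."""
    runs = []
    open_x = None  # x value of the bold start marker currently open, if any
    for el in seg:
        if el.get("pos") == "start" and el.get("type") == "b":
            open_x = el.get("x")
            runs.append(el.tail or "")
        elif el.get("pos") == "end" and el.get("x") == open_x:
            open_x = None
        elif open_x is not None:
            # Some other marker inside the bold run: keep the text after it.
            runs[-1] += el.tail or ""
    return runs


print(bold_text_nested(ET.fromstring(NESTED)))        # ['XML']
print(bold_text_milestone(ET.fromstring(MILESTONE)))  # ['XML']
```

Both calls recover the same text, but only the node-preserving form lets a generic XML tool do it with a single selection; the marker form needs custom pairing logic in every consumer.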
 
Let me be clear.  My point is strictly about preserving nodes here.  It is not related to my other favorite topics, like my dislike for escaping XML (&lt;p&gt;), or my dislike for shaping a standard to inordinately accommodate tools that create malformed XML. I'd be happy to talk about those topics, but not as part of this thread.
 
Maybe I'm the only one who feels so strongly about preserving XML nodes. Let's see what others think.
 
Thanks,
 
Bryan

