Subject: Preferred method of representing invalid XML chars in <source>?

From: Kristian Walsh <listreader@byteform.com>
To: xliff-comment@lists.oasis-open.org
Date: Mon, 26 Jul 2004 11:27:43 +0100

Hi,

I am developing an application that creates XLIFF 1.0 documents from
source data. Unfortunately, this source data sometimes contains
character codes below U+0020, and all of these except tab (U+0009),
line feed (U+000A) and carriage return (U+000D) are invalid in an
XML 1.0 document.
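
To make the problem concrete, here is a minimal Python sketch of the
check involved. It assumes the XML 1.0 "Char" production; the function
name find_invalid_xml_chars is hypothetical and only for illustration.

```python
import re

# Code points that the XML 1.0 "Char" production excludes:
# C0 controls other than tab/LF/CR, surrogates, U+FFFE and U+FFFF.
_INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)

def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 disallows."""
    return [(m.start(), ord(m.group()))
            for m in _INVALID_XML_CHARS.finditer(text)]
```

Tab, line feed and carriage return are deliberately not matched, since
they can appear in an XML 1.0 document as they are.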

I am unsure of the "canonical" way to deal with this in XLIFF 1.0
(version 1.1 is not an option for this application). As far as I can
see, <x/>, <g> and <ph> can all be used for this purpose, as below:


Form 1: <x>

<trans-unit id="a920cf">
	<source xml:lang="en">Three backspaces follow<x id="a920d0"
ctype="character" clone="yes" ts="MyTool:chars=0008,0008,0008"/> then
the text continues</source>
</trans-unit>


Form 2: <g>

<trans-unit id="a920cf">
	<source xml:lang="en">Three backspaces follow<g id="a920d0"
ctype="character" clone="yes" ts="MyTool:chars">0008,0008,0008</g> then
the text continues</source>
</trans-unit>


Form 3: <ph>

<trans-unit id="a920cf">
	<source xml:lang="en">Three backspaces follow<ph id="a920d0"
ctype="character" ts="MyTool:chars">0008,0008,0008</ph> then the text
continues</source>
</trans-unit>
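
For illustration, the content of a Form 3 <source> could be generated
along these lines. This is only a sketch in Python, not a definitive
implementation: the function name to_source_markup and the simple
integer id scheme are assumptions, and ctype="character" and
ts="MyTool:chars" are the same tool-specific values used in the forms
above, not values confirmed by the XLIFF specification.

```python
import re
from xml.sax.saxutils import escape

# Runs of C0 control characters that XML 1.0 forbids
# (tab, LF and CR are legal and are left in place).
_INVALID_RUN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]+")

def to_source_markup(text, first_id=1):
    """Escape text for <source>, wrapping each run of invalid
    characters in a <ph> that lists their hex code points."""
    parts = []
    pos = 0
    next_id = first_id  # hypothetical id scheme: consecutive integers
    for m in _INVALID_RUN.finditer(text):
        # Ordinary text still needs &, < and > escaped.
        parts.append(escape(text[pos:m.start()]))
        codes = ",".join("%04X" % ord(ch) for ch in m.group())
        parts.append(
            '<ph id="%d" ctype="character" ts="MyTool:chars">%s</ph>'
            % (next_id, codes)
        )
        next_id += 1
        pos = m.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)
```

Calling to_source_markup("A\x08\x08\x08 B") would give the text "A",
one <ph> element containing 0008,0008,0008, and then " B". The same
loop could emit Form 1 or Form 2 by changing only the element template.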


So my two questions are:

  1. Which of the above forms is preferred in XLIFF 1.0 for representing 
non-XML characters inside source (and/or target) data?

  2. Is there a standard ctype attribute value for "raw character codes"?

Any ideas would be greatly appreciated,
--
Kristian
