OASIS Mailing List Archives

 



xliff message



Subject: Resend: [xliff] more comments on the XLIFF DTD


Hi there,

In one of our recent conference calls, I was asked to resend
the notes on the XLIFF DTD that I put together at a very
early stage of our TC. Here they are ...

Best,
Christian

-----Original Message-----
From: Lieske, Christian 
Sent: Friday, 25 January 2002 14:41
To: 'xliff@lists.oasis-open.org'
Subject: [xliff] more comments on the XLIFF DTD


Hi All,

Encouraged by the positive replies which David Leland received for
his comments on the DTD (and the feeling that questions about the
specification and DTD are welcome at this very early stage of the
TC's work), I dare to send out some comments of my own which I
recently jotted down. 

I am aware of the risk that some active members of the dataDefinition
group may be somewhat bored (you know, groans and "oh, not that again!")
because they may already have spent several hours on the topics to
which my comments pertain. However, I see the chance that explanations
at the very start might be beneficial to the overall process. The
specification will have to go through the approval process and presumably
will be exposed to the same types of questions anyway, so we might
as well address them now.

For your convenience, I communicate my comments 

1. as an integral part of this mail (extracted from the DTD;
   see bottom of the mail; digits at the beginning of the
   lines are line numbers)

2. interspersed in the DTD 

and classify each comment as belonging to one of three categories

R: remark that may not need to be addressed in the current
   specification but could be used for later work
Q: question related to explanations in the specification/DTD that
   were unclear to me
I: idea that may not need to be addressed in the current
   specification but could be used for later work

Best regards,
Christian
-----------------------------------------------------
			Christian Lieske
		  Christian.Lieske@sap.com
	     SAP AG - MultiLingual Technology
                     www.sap.com
-----------------------------------------------------

   24:<!-- R: naming conventions: for multiwords sometimes mix of uppercase
and lowercase (eg. CodeContent), sometimes hyphenated form (eg.
source-language), sometimes all lowercase (eg. minheight) -->
   41:<!-- Q: What's the semantics of the attribute 'xml:lang' for the
element 'xliff'? Does it identify the language of comments found in the
XLIFF file? -->
   45:<!-- I: Bring attribute 'original' for element 'file' in sync with
information that can be given for external files (for them, you can give
not only a simple 'name' but also, for example, an 'href'). -->
   47:<!-- I: Rename attribute 'source-language' for element 'file' to
'source-lang' to improve consistency related to language identification
(it's 'xml:lang', not 'xml:language'). -->
   48:<!-- I: Some companies distinguish original language (authoring
happens in this language), source language (that's the language from which
translation starts), and target language(s). Accordingly, an optional
attribute 'orig-lang' may be desirable. Consequently, one may not only have
references to source files, but to original files as well. -->
   50:<!-- R: The values for attribute 'datatype' for element 'file' may
have to be defined more rigorously, since for example one may have to
differentiate between types of Java resourceBundles (listResourceBundle vs.
propertyResourceBundle). -->
   52:<!-- I: In addition to the attribute 'tool' for the element 'file' one
may want an attribute 'toolVersion' since for example the capabilities of
Integrated Development Environment 'SuperJavaIDE' related to XLIFF may
change over time. -->
   54:<!-- I: Specify if the attribute 'date' for the element 'file' refers
to a creation or an update. -->
   55:<!-- I: Additional attribute related to dates: one that captures the
deadline for a certain phase in the localization process. -->
   58:<!-- I: Harmonizing the attribute 'ts' for element 'file' with the
'prop' element. Possibilities for harmonizing: renaming the attribute to
'prop', or doing away with either the attribute 'ts' or the element 'prop'.
-->
   60:<!-- I: A name like 'subjField' or 'domain' instead of 'category' for
element 'file' may reflect the semantics of the data category more
clearly. -->
   62:<!-- I: See remark on attribute 'source-language' for element 'file'.
-->
   70:<!-- R: I do not see a strong need for using an abbreviated name for
element 'skl' since compared to other elements we will not see many
occurrences of the element. Thus, using a long form like 'skeleton-file'
should not increase file size too much but would increase readability (we
already have elements 'internal-file' and 'external-file'). -->
   74:<!-- I: Rename attribute 'form' for element 'internal-file' to
'mime-type', since the values are mime-types. -->
   77:<!-- Q: What is the long form of attribute name 'crc'? -->
   85:<!-- Q: The name of attribute 'uid' for element 'external-file' by
means of the starting letter 'u' indicates some kind of universal scope.
Does this hold (ie. are identifiers in skeleton files universal/unique)? In
any case: How about an alternative name like 'skeleton-file-id'? -->
   88:<!-- I: An attribute for subclassing references would be handy
(resulting in something like <reference refType="mandatory
TM">...</reference>). It would allow you to say 'Hey, here is a reference to
a translation memory that you must use.' -->
   95:<!-- I: An attribute 'to' for element 'note' (in addition to attribute
'from')? -->
  137:<!-- Q: What's the semantics of attribute 'phase-name' for element
'phase'? I am not really able to distinguish it from attribute
'process-name'. -->
  138:<!-- I: An additional attribute 'state' for element 'phase'. It could
for example be used to reveal that the pre-editing of the contents has been
started but has not yet been finished (resulting in something like <phase
phase-name="y" process-name="proofreading" state="started">). Possible values
could be: 'not executable', 'executable', 'started', 'suspended', 'aborted',
'preliminary finished', and 'finished'. -->
  164:<!-- R: Attribute 'resname' may tie XLIFF too much to the Windows world
(I take 'control' to be an explicit reference to Windows controls) -->
  167:<!-- Q: What is the semantics of attribute 'extradata' of element
'group'? -->
  175:<!-- I: References to css files in addition to in-place css
instructions via the attribute 'css-style' of element 'group'. -->
  178:<!-- Q: What is the semantics of attribute 'exstyle' of element
'group'? -->
  208:<!-- Q: Why are the values of attribute 'maxbytes' and 'minbytes' of
element 'trans-unit' tool-specific? Isn't it desirable to have uniform
values? -->
  209:<!-- I: I have seen the need to encode relationships between for
example message strings (eg. to ensure consistency). Thus, a list-valued
attribute like 'relationships' which references other 'trans-unit' elements
might be handy. -->
  217:<!-- R: Since the 'phase' element allows several processing steps for
the source as well (eg. 'pre-edit for MT', 'general QA' ...) one may want
more overlap between attributes for element 'target' and for element
'source'. -->
  218:<!-- R: I wonder if element 'source' could not benefit from attributes
like 'css-style' as well, since they might be used during rendering XLIFF
data to end-users (and for example Yves' marvellous book taught us that
formatting styles for source and target may differ). -->
  241:<!-- I: See remarks on attribute 'tool' for element 'file' -->
  245:<!-- I: Harmonize the name of attribute 'origin' for element
'alt-trans' with that of attribute 'match-quality' (resulting in something
like 'match-origin'). -->
  273:<!-- Q: Don't we need attributes like 'maxbytes' (cf. element
'trans-unit') for element 'bin-unit' as well? -->
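
The naming-convention remark (line 24) could be checked mechanically. What
follows is a minimal sketch and not part of the original notes: it sorts a
handful of names copied from the DTD below into the three styles mentioned
in the remark. The function name `classify_name`, the style labels, and the
selection of names are assumptions made purely for illustration.

```python
import re
from collections import defaultdict


def classify_name(name: str) -> str:
    """Return the naming style of a DTD name (the labels are illustrative)."""
    if "-" in name:
        return "hyphenated"
    if re.search(r"[A-Z]", name):
        return "mixed-case"
    return "lowercase"


# Names copied from the DTD below; the selection is arbitrary.
names = [
    "CodeContent",
    "TextContent",
    "source-language",
    "minheight",
    "maxbytes",
    "extradata",
    "help-id",
]

# Group the names by style so that the mix of conventions becomes visible.
by_style = defaultdict(list)
for name in names:
    by_style[classify_name(name)].append(name)

for style, members in sorted(by_style.items()):
    print(f"{style}: {', '.join(members)}")
```

A run over all element and attribute names of the DTD would show how widely
each of the three styles is used, which may help when deciding on a single
convention.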

----------------------------------------------------------------
<!-- XLIFF
CAVEAT: This is not the original XLIFF DTD!
Public Identifier: "-//XLIFF//DTD XLIFF//EN"
Namespace URI: "http://www.xliff.org/xliff_1_0";

History of modifications (latest first):

May-15-2001 by YS: Add phase-name to <trans-unit> and <bin-unit>
May-15-2001 by YS: Reverse id for <trans-unit> to required
Apr-19-2001 by YS: Enda+JohnR last changes
Apr-18-2001 by YS: Removed empty ATTLISTs
Apr-12-2001 by YS: Changed target* to target+ in trans-match
Apr-11-2001 by YS: Fixed DOCTYPE id
Apr-10-2001 by YS: Synchronize from conference call 
Apr-05-2001 by YS: Synchronize with latest specs
Apr-04-2001 by YS: Synchronize with latest specs
Apr-03-2001 by YS: Added name in <prop-group>
Apr-02-2001 by YS: Implemented JR fixes
Mar-29-2001 by JC: fixes for xml:space and bin-unit
Mar-28-2001 by YS: First draft version

-->

<!-- R: naming conventions: for multiwords sometimes mix of uppercase and
lowercase (eg. CodeContent), sometimes hyphenated form (eg.
source-language), sometimes all lowercase (eg. minheight) -->

<!ENTITY % CodeContent "#PCDATA|sub">
<!ENTITY % TextContent "#PCDATA|g|bpt|ept|ph|it|mrk|x|bx|ex">
<!ENTITY lt "&#38;#60;">
<!ENTITY amp "&#38;#38;">
<!ENTITY gt "&#62;">
<!ENTITY apos "&#39;">
<!ENTITY quot "&#34;">
<!-- ***************************************************************** -->
<!-- Structural Elements                                               -->
<!-- ***************************************************************** -->
<!ELEMENT xliff (file)+>
<!ATTLIST xliff
	version CDATA #FIXED "1.0"
	xml:lang CDATA #IMPLIED
>
<!-- Q: What's the semantics of the attribute 'xml:lang' for the element
'xliff'? Does it identify the language of comments found in the XLIFF file?
-->
<!ELEMENT file (header, body)>
<!ATTLIST file
	original CDATA #REQUIRED 
<!-- I: Bring attribute 'original' for element 'file' in sync with
information that can be given for external files (for them, you can give
not only a simple 'name' but also, for example, an 'href'). -->
	source-language CDATA #REQUIRED
<!-- I: Rename attribute 'source-language' for element 'file' to
'source-lang' to improve consistency related to language identification
(it's 'xml:lang', not 'xml:language'). -->
<!-- I: Some companies distinguish original language (authoring happens in
this language), source language (that's the language from which translation
starts), and target language(s). Accordingly, an optional attribute
'orig-lang' may be desirable. Consequently, one may not only have references
to source files, but to original files as well. -->
	datatype CDATA #REQUIRED
<!-- R: The values for attribute 'datatype' for element 'file' may have to
be defined more rigorously, since for example one may have to differentiate
between types of Java resourceBundles (listResourceBundle vs.
propertyResourceBundle). -->
	tool CDATA #IMPLIED
<!-- I: In addition to the attribute 'tool' for the element 'file' one may
want an attribute 'toolVersion' since for example the capabilities of
Integrated Development Environment 'SuperJavaIDE' related to XLIFF may
change over time. -->
	date CDATA #IMPLIED
<!-- I: Specify if the attribute 'date' for the element 'file' refers to a
creation or an update. -->
<!-- I: Additional attribute related to dates: one that captures the
deadline for a certain phase in the localization process. -->
	xml:space (default | preserve) "default"
	ts CDATA #IMPLIED
<!-- I: Harmonizing the attribute 'ts' for element 'file' with the 'prop'
element. Possibilities for harmonizing: renaming the attribute to 'prop', or
doing away with either the attribute 'ts' or the element 'prop'. -->
	category CDATA #IMPLIED
<!-- I: A name like 'subjField' or 'domain' instead of 'category' for
element 'file' may reflect the semantics of the data category more
clearly. -->
	target-language CDATA #IMPLIED
<!-- I: See remark on attribute 'source-language' for element 'file'. -->
	product-name CDATA #IMPLIED
	product-version CDATA #IMPLIED
	build-num CDATA #IMPLIED
>
<!-- tool default = "manual" -->
<!ELEMENT header (skl?, phase-group?, (prop-group | glossary | reference |
note | count-group)*)>
<!ELEMENT skl (internal-file | external-file)>
<!-- R: I do not see a strong need for using an abbreviated name for element
'skl' since compared to other elements we will not see many occurrences of
the element. Thus, using a long form like 'skeleton-file' should not
increase file size too much but would increase readability (we already have
elements 'internal-file' and 'external-file'). -->
<!ELEMENT internal-file (#PCDATA)>
<!ATTLIST internal-file
	form CDATA #IMPLIED
<!-- I: Rename attribute 'form' for element 'internal-file' to 'mime-type',
since the values are mime-types. -->
	crc NMTOKEN #IMPLIED
>
<!-- Q: What is the long form of attribute name 'crc'? -->
<!-- text|base64 (text is default) -->
<!ELEMENT external-file EMPTY>
<!ATTLIST external-file
	href CDATA #REQUIRED
	crc NMTOKEN #IMPLIED
	uid NMTOKEN #IMPLIED
>
<!-- Q: The name of attribute 'uid' for element 'external-file' by means of
the starting letter 'u' indicates some kind of universal scope. Does this
hold (ie. are identifiers in skeleton files universal/unique)? In any case:
How about an alternative name like 'skeleton-file-id'? -->
<!ELEMENT glossary (internal-file | external-file)>
<!ELEMENT reference (internal-file | external-file)>
<!-- I: An attribute for subclassing references would be handy (resulting in
something like <reference refType="mandatory TM">...</reference>). It would
allow you to say 'Hey, here is a reference to a translation memory that you
must use.' -->
<!ELEMENT note (#PCDATA)>
<!ATTLIST note
	xml:lang CDATA #IMPLIED
	priority (1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10) "1"
	from CDATA #IMPLIED
>
<!-- I: An attribute 'to' for element 'note' (in addition to attribute
'from')? -->
<!ELEMENT prop-group (prop)+>
<!ATTLIST prop-group
	name CDATA #IMPLIED
>
<!ELEMENT prop (#PCDATA)>
<!ATTLIST prop
	prop-type CDATA #REQUIRED
	xml:lang CDATA #IMPLIED
>
<!ELEMENT context-group (context)+>
<!ATTLIST context-group
	name CDATA #REQUIRED
	crc NMTOKEN #IMPLIED
>
<!-- Processing instructions related to <context-group>:

<?xliff-show-context-group name='value' ?>

Indicates that any <context-group> element with a name set to 'value' should
be displayed to the end-user.

-->
<!ELEMENT context (#PCDATA)>
<!ATTLIST context
	context-type CDATA #REQUIRED
	match-mandatory (yes | no) "no"
	crc NMTOKEN #IMPLIED
>
<!-- Processing instructions related to <context>:

<?xliff-show-context context-type='value' ?>

Indicates that any <context> element with a context-type set to 'value'
should be displayed to the end-user.

-->
<!ELEMENT phase-group (phase)+>
<!ELEMENT phase (note)*>
<!ATTLIST phase
	phase-name CDATA #REQUIRED
	process-name CDATA #REQUIRED
<!-- Q: What's the semantics of attribute 'phase-name' for element 'phase'?
I am not really able to distinguish it from attribute 'process-name'. -->
<!-- I: An additional attribute 'state' for element 'phase'. It could for
example be used to reveal that the pre-editing of the contents has been
started but has not yet been finished (resulting in something like <phase
phase-name="y" process-name="proofreading" state="started">). Possible values
could be: 'not executable', 'executable', 'started', 'suspended', 'aborted',
'preliminary finished', and 'finished'. -->
	company-name CDATA #IMPLIED
	tool CDATA #IMPLIED
	date CDATA #IMPLIED
	job-id CDATA #IMPLIED
	contact-name CDATA #IMPLIED
	contact-email CDATA #IMPLIED
	contact-phone CDATA #IMPLIED
>
<!ELEMENT count-group (count)*>
<!ATTLIST count-group
	name CDATA #REQUIRED
>
<!ELEMENT count (#PCDATA)>
<!ATTLIST count
	count-type CDATA #IMPLIED
	unit CDATA #IMPLIED
>
<!ELEMENT body (group | trans-unit | bin-unit)*>
<!ELEMENT group ((context-group*, count-group*, prop-group*, note*), (group
| trans-unit | bin-unit)*)>
<!ATTLIST group
	id NMTOKEN #IMPLIED
	datatype CDATA #IMPLIED
	xml:space (default | preserve) "default"
	ts CDATA #IMPLIED
	restype CDATA #IMPLIED
<!-- R: Attribute 'resname' may tie XLIFF too much to the Windows world (I
take 'control' to be an explicit reference to Windows controls) -->
	resname NMTOKEN #IMPLIED
	extradata CDATA #IMPLIED
<!-- Q: What is the semantics of attribute 'extradata' of element 'group'?
-->
	help-id NMTOKEN #IMPLIED
	menu CDATA #IMPLIED
	menu-option CDATA #IMPLIED
	menu-name CDATA #IMPLIED
	coord CDATA #IMPLIED
	font CDATA #IMPLIED
	css-style CDATA #IMPLIED
<!-- I: References to css files in addition to in-place css instructions via
the attribute 'css-style' of element 'group'. -->
	style NMTOKEN #IMPLIED
	exstyle NMTOKEN #IMPLIED
<!-- Q: What is the semantics of attribute 'exstyle' of element 'group'? -->

>
<!ELEMENT trans-unit (source, target?, (count-group | note | context-group |
prop-group | alt-trans)*)>
<!ATTLIST trans-unit
	id NMTOKEN #REQUIRED
	approved (yes | no) #IMPLIED
	translate (yes | no) "yes"
	reformat (yes | no) "yes"
	xml:space (default | preserve) "default"
	datatype CDATA #IMPLIED
	ts CDATA #IMPLIED
	restype CDATA #IMPLIED
	resname NMTOKEN #IMPLIED
	extradata CDATA #IMPLIED
	help-id NMTOKEN #IMPLIED
	menu CDATA #IMPLIED
	menu-option CDATA #IMPLIED
	menu-name CDATA #IMPLIED
	coord CDATA #IMPLIED
	font CDATA #IMPLIED
	css-style CDATA #IMPLIED
	style NMTOKEN #IMPLIED
	exstyle NMTOKEN #IMPLIED
	size-unit CDATA #IMPLIED
	maxwidth NMTOKEN #IMPLIED
	minwidth NMTOKEN #IMPLIED
	maxheight NMTOKEN #IMPLIED
	minheight NMTOKEN #IMPLIED
	maxbytes NMTOKEN #IMPLIED
	minbytes NMTOKEN #IMPLIED
<!-- Q: Why are the values of attribute 'maxbytes' and 'minbytes' of element
'trans-unit' tool-specific? Isn't it desirable to have uniform values? -->
<!-- I: I have seen the need to encode relationships between for example
message strings (eg. to ensure consistency). Thus, a list-valued attribute
like 'relationships' which references other 'trans-unit' elements might be
handy. -->
	phase-name CDATA #IMPLIED
>
<!-- size-unit: char|byte|pixel|glyph|dlgunit default='pixel' -->
<!ELEMENT source (%TextContent;)*>
<!ATTLIST source
	xml:lang CDATA #IMPLIED
	ts CDATA #IMPLIED
<!-- R: Since the 'phase' element allows several processing steps for the
source as well (eg. 'pre-edit for MT', 'general QA' ...) one may want more
overlap between attributes for element 'target' and for element 'source'.
-->
<!-- R: I wonder if element 'source' could not benefit from attributes like
'css-style' as well, since they might be used during rendering XLIFF data to
end-users (and for example Yves' marvellous book taught us that formatting
styles for source and target may differ). -->
>
<!-- coord = "x;y;cx;cy"
   font= "fontname[;size[;weight]]"
-->
<!ELEMENT target (%TextContent;)*>
<!ATTLIST target
	state NMTOKEN #IMPLIED
	phase-name NMTOKEN #IMPLIED
	xml:lang CDATA #IMPLIED
	ts CDATA #IMPLIED
	restype CDATA #IMPLIED
	resname NMTOKEN #IMPLIED
	coord CDATA #IMPLIED
	font CDATA #IMPLIED
	css-style CDATA #IMPLIED
	style NMTOKEN #IMPLIED
	exstyle NMTOKEN #IMPLIED
>
<!ELEMENT alt-trans (source?, target+, (note | context-group |
prop-group)*)>
<!ATTLIST alt-trans
	match-quality CDATA #IMPLIED
	tool CDATA #IMPLIED
<!-- I: See remarks on attribute 'tool' for element 'file' -->
	crc NMTOKEN #IMPLIED
	xml:lang CDATA #IMPLIED
	origin CDATA #IMPLIED
<!-- I: Harmonize the name of attribute 'origin' for element 'alt-trans'
with that of attribute 'match-quality' (resulting in something like
'match-origin'). -->
	datatype CDATA #IMPLIED
	xml:space (default | preserve) "default"
	ts CDATA #IMPLIED
	restype CDATA #IMPLIED
	resname NMTOKEN #IMPLIED
	extradata CDATA #IMPLIED
	help-id NMTOKEN #IMPLIED
	menu CDATA #IMPLIED
	menu-option CDATA #IMPLIED
	menu-name CDATA #IMPLIED
	coord CDATA #IMPLIED
	font CDATA #IMPLIED
	css-style CDATA #IMPLIED
	style NMTOKEN #IMPLIED
	exstyle NMTOKEN #IMPLIED
>
<!ELEMENT bin-unit (bin-source, bin-target?, (note | context-group |
prop-group | trans-unit)*)>
<!ATTLIST bin-unit
	id NMTOKEN #REQUIRED
	mime-type NMTOKEN #REQUIRED
	approved (yes | no) #IMPLIED
	translate (yes | no) "yes"
	reformat (yes | no) "yes"
	ts CDATA #IMPLIED
	restype CDATA #IMPLIED
	resname NMTOKEN #IMPLIED
	phase-name CDATA #IMPLIED
<!-- Q: Don't we need attributes like 'maxbytes' (cf. element 'trans-unit')
for element 'bin-unit' as well? -->
>
<!ELEMENT bin-source (internal-file | external-file)>
<!ATTLIST bin-source
	ts CDATA #IMPLIED
>
<!ELEMENT bin-target (internal-file | external-file)>
<!ATTLIST bin-target
	mime-type NMTOKEN #IMPLIED
	ts CDATA #IMPLIED
	state NMTOKEN #IMPLIED
	phase-name NMTOKEN #IMPLIED
	restype CDATA #IMPLIED
	resname NMTOKEN #IMPLIED
>
<!-- ***************************************************************** -->
<!-- In-Line Elements                                                  -->
<!-- ***************************************************************** -->
<!ELEMENT g (%TextContent;)*>
<!ATTLIST g
	id CDATA #REQUIRED
	ctype CDATA #IMPLIED
	clone (yes | no) "yes"
	ts CDATA #IMPLIED
>
<!ELEMENT x EMPTY>
<!ATTLIST x
	id CDATA #REQUIRED
	ctype CDATA #IMPLIED
	clone (yes | no) "yes"
	ts CDATA #IMPLIED
>
<!ELEMENT bx EMPTY>
<!ATTLIST bx
	id CDATA #REQUIRED
	rid NMTOKEN #IMPLIED
	ctype CDATA #IMPLIED
	clone (yes | no) "yes"
	ts CDATA #IMPLIED
>
<!ELEMENT ex EMPTY>
<!ATTLIST ex
	id CDATA #REQUIRED
	rid NMTOKEN #IMPLIED
	ts CDATA #IMPLIED
>
<!ELEMENT ph (%CodeContent;)*>
<!ATTLIST ph
	id CDATA #REQUIRED
	ctype CDATA #IMPLIED
	ts CDATA #IMPLIED
	crc CDATA #IMPLIED
	assoc CDATA #IMPLIED
>
<!ELEMENT bpt (%CodeContent;)*>
<!ATTLIST bpt
	id CDATA #REQUIRED
	rid NMTOKEN #IMPLIED
	ctype CDATA #IMPLIED
	ts CDATA #IMPLIED
	crc CDATA #IMPLIED
>
<!ELEMENT ept (%CodeContent;)*>
<!ATTLIST ept
	id CDATA #REQUIRED
	rid NMTOKEN #IMPLIED
	ts CDATA #IMPLIED
	crc CDATA #IMPLIED
>
<!ELEMENT it (%CodeContent;)*>
<!ATTLIST it
	id CDATA #REQUIRED
	pos (open | close) #REQUIRED
	rid NMTOKEN #IMPLIED
	ctype CDATA #IMPLIED
	ts CDATA #IMPLIED
	crc CDATA #IMPLIED
>
<!ELEMENT mrk (%TextContent;)*>
<!ATTLIST mrk
	mtype CDATA #REQUIRED
	mid NMTOKEN #IMPLIED
	comment CDATA #IMPLIED
	ts CDATA #IMPLIED
>
<!ELEMENT sub (%TextContent;)*>
<!ATTLIST sub
	datatype CDATA #IMPLIED
	ctype CDATA #IMPLIED
>
<!-- ***** End of DTD ************************************************ -->
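
The comments interspersed in the DTD above all start with one of the three
category letters (R, Q, I) introduced at the top of this mail, so they could
be pulled out and tallied automatically. The following is only a sketch
under that assumption; the names `COMMENT_RE` and `extract_review_comments`
and the short sample string are made up for illustration and are not part of
the original notes.

```python
import re
from collections import Counter

# Matches comments of the form "<!-- Q: ... -->" and captures the category
# letter and the comment text. Ordinary DTD comments without a leading
# "R:", "Q:" or "I:" are ignored.
COMMENT_RE = re.compile(r"<!--\s*([RQI]):\s*(.*?)\s*-->", re.DOTALL)


def extract_review_comments(dtd_text: str) -> list[tuple[str, str]]:
    """Return (category, text) pairs with line breaks in the text collapsed."""
    return [
        (category, " ".join(body.split()))
        for category, body in COMMENT_RE.findall(dtd_text)
    ]


# A short excerpt from the annotated DTD above, used as sample input.
sample = """
<!ATTLIST internal-file
	form CDATA #IMPLIED
<!-- I: Rename attribute 'form' for element 'internal-file' to 'mime-type',
since the values are mime-types. -->
	crc NMTOKEN #IMPLIED
>
<!-- Q: What is the long form of attribute name 'crc'? -->
<!-- text|base64 (text is default) -->
"""

comments = extract_review_comments(sample)
tally = Counter(category for category, _ in comments)

for category, text in comments:
    print(f"{category}: {text}")
print(dict(tally))
```

Applied to the whole annotated DTD, the tally would show at a glance how
many remarks, questions, and ideas are still open.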

