Subject: Resend: [xliff] more comments on the XLIFF DTD
Hi there,

in one of our last conference calls, I was asked to resend the notes on the XLIFF DTD which I put together at a very early stage of our TC. Here they come ...

Best,
Christian

-----Original Message-----
From: Lieske, Christian
Sent: Friday, 25 January 2002 14:41
To: 'xliff@lists.oasis-open.org'
Subject: [xliff] more comments on the XLIFF DTD

Hi All,

Encouraged by the positive replies which David Leland received for his comments on the DTD (and the feeling that questions about the specification and DTD are welcome at this very early stage of the TC's work), I dare to send out some comments of my own which I recently jotted down.

I am aware of the risk that some active members of the dataDefinition group may be somewhat bored (you know, groans and "oh, not that again!") because they might already have spent several hours on the topics to which my comments pertain. However, I see the chance that explanations at the very start might be beneficial to the overall process. The specification will have to go through the approval process and presumably will be exposed to the same types of questions anyway, so we might as well address them now.

For your convenience, I communicate my comments

1. as an integral part of this mail (extracted from the DTD; see bottom of the mail; digits at the beginning of the lines are line numbers)
2. interspersed in the DTD

and classify each comment as belonging to one of three categories

R: remark that may not need to be addressed in the current specification but could be used for later work
Q: question related to explanations in the specification/DTD that were unclear to me
I: idea that may not need to be addressed in the current specification but could be used for later work

Best regards,
Christian

-----------------------------------------------------
Christian Lieske
Christian.Lieske@sap.com
SAP AG - MultiLingual Technology
www.sap.com
-----------------------------------------------------

24:<!-- R: naming conventions: for multiwords sometimes mix of uppercase and lowercase (eg. CodeContent), sometimes hyphenated form (eg. source-language), sometimes all lowercase (eg. minheight) -->
41:<!-- Q: What's the semantics of the attribute 'xml:lang' for the element 'xliff'? Does it identify the language of comments found in the XLIFF file? -->
45:<!-- I: Bring attribute 'original' for element 'file' in sync with information that can be given for external files (for them, you can give not only a simple 'name' but, for example, an 'href'). -->
47:<!-- I: Rename attribute 'source-language' for element 'file' to 'source-lang' to improve consistency related to language identification (it's 'xml:lang', not 'xml:language'). -->
48:<!-- I: Some companies distinguish original language (authoring happens in this language), source language (that's the language from which translation starts), and target language(s). Accordingly, an optional attribute 'orig-lang' may be desirable. Consequently, one may have references not only to source files, but to original files as well. -->
50:<!-- R: The values for attribute 'datatype' for element 'file' may have to be defined more rigorously, since for example one may have to differentiate between types of Java resourceBundles (listResourceBundle vs. propertyResourceBundle).
-->
52:<!-- I: In addition to the attribute 'tool' for the element 'file' one may want an attribute 'toolVersion' since for example the capabilities of Integrated Development Environment 'SuperJavaIDE' related to XLIFF may change over time. -->
54:<!-- I: Specify if the attribute 'date' for the element 'file' refers to a creation or an update. -->
55:<!-- I: Additional attribute related to dates: one that captures the deadline for a certain phase in the localization process. -->
58:<!-- I: Harmonize the attribute 'ts' for element 'file' with the 'prop' element. Possibilities for harmonizing: renaming the attribute to 'prop', or doing away with either the attribute 'ts' or the element 'prop'. -->
60:<!-- I: A name like 'subjField' or 'domain' instead of 'category' for element 'file' may reflect the semantics of the data category more clearly. -->
62:<!-- I: See remark on attribute 'source-language' for element 'file'. -->
70:<!-- R: I do not see a strong need for using an abbreviated name for element 'skl' since compared to other elements we will not see many occurrences of the element. Thus, using a long form like 'skeleton-file' should not increase file size too much but would increase readability (we already have elements 'internal-file' and 'external-file'). -->
74:<!-- I: Rename attribute 'form' for element 'internal-file' to 'mime-type', since the values are mime-types. -->
77:<!-- Q: What is the long form of attribute name 'crc'? -->
85:<!-- Q: The name of attribute 'uid' for element 'external-file' by means of the starting letter 'u' indicates some kind of universal scope. Does this hold (ie. are identifiers in skeleton files universal/unique)? In any case: How about an alternative name like 'skeleton-file-id'? -->
88:<!-- I: An attribute for subclassing references would be handy (resulting in something like <reference refType="mandatory TM">...</reference>). It would allow you to say 'Hey, here is a reference to a translation memory that you must use.' -->
95:<!-- I: An attribute 'to' for element 'note' (in addition to attribute 'from')? -->
137:<!-- Q: What's the semantics of attribute 'phase-name' for element 'phase'? I am not really able to distinguish it from attribute 'process-name'. -->
138:<!-- I: An additional attribute 'state' for element 'phase'. It could for example be used to reveal that the pre-editing of the contents has been started but has not been finished yet (resulting in something like <phase phase-name="y" process-name="proofreading" state="started">). Possible values could be: 'not executable', 'executable', 'started', 'suspended', 'aborted', 'preliminary finished', and 'finished'. -->
164:<!-- R: Attribute 'resname' may tie XLIFF too much to the Windows world (I take 'control' to be an explicit reference to Windows controls) -->
167:<!-- Q: What is the semantics of attribute 'extradata' of element 'group'? -->
175:<!-- I: References to css files in addition to in-place css instructions via the attribute 'css-style' of element 'group'. -->
178:<!-- Q: What is the semantics of attribute 'exstyle' of element 'group'? -->
208:<!-- Q: Why are the values of attribute 'maxbytes' and 'minbytes' of element 'trans-unit' tool-specific? Isn't it desirable to have uniform values? -->
209:<!-- I: I have seen the need to encode relationships between for example message strings (eg. to ensure consistency). Thus, a list-valued attribute like 'relationships' which references other 'trans-unit' elements might be handy. -->
217:<!-- R: Since the 'phase' element allows several processing steps for the source as well (eg. 'pre-edit for MT', 'general QA' ...) one may want more overlap between attributes for element 'target' and for element 'source'.
-->
218:<!-- R: I wonder if element 'source' could not benefit from attributes like 'css-style' as well, since they might be used during rendering XLIFF data to end-users (and for example Yves' marvellous book taught us that formatting styles for source and target may differ). -->
241:<!-- I: See remarks on attribute 'tool' for element 'file' -->
245:<!-- I: Harmonize the name of attribute 'origin' for element 'alt-trans' with that of attribute 'match-quality' (resulting in something like 'match-origin'). -->
273:<!-- Q: Don't we need attributes like 'maxbytes' (cf. element 'trans-unit') for element 'bin-unit' as well? -->

----------------------------------------------------------------

<!--
XLIFF

CAVEAT: This is not the original XLIFF DTD!

Public Identifier: "-//XLIFF//DTD XLIFF//EN"
Namespace URI: "http://www.xliff.org/xliff_1_0"

History of modifications (latest first):
May-15-2001 by YS: Add phase-name to <trans-unit> and <bin-unit>
May-15-2001 by YS: Reverse id for <trans-unit> to required
Apr-19-2001 by YS: Enda+JohnR last changes
Apr-18-2001 by YS: Removed empty ATTLISTs
Apr-12-2001 by YS: Changed target* to target+ in trans-match
Apr-11-2001 by YS: Fixed DOCTYPE id
Apr-10-2001 by YS: Synchronize from conference call
Apr-05-2001 by YS: Synchronize with latest specs
Apr-04-2001 by YS: Synchronize with latest specs
Apr-03-2001 by YS: Added name in <prop-group>
Apr-02-2001 by YS: Implemented JR fixes
Mar-29-2001 by JC: fixes for xml:space and bin-unit
Mar-28-2001 by YS: First draft version
-->

<!-- R: naming conventions: for multiwords sometimes mix of uppercase and lowercase (eg. CodeContent), sometimes hyphenated form (eg. source-language), sometimes all lowercase (eg.
minheight) -->

<!ENTITY % CodeContent "#PCDATA|sub">
<!ENTITY % TextContent "#PCDATA|g|bpt|ept|ph|it|mrk|x|bx|ex">

<!ENTITY lt "&#38;#60;">
<!ENTITY amp "&#38;#38;">
<!ENTITY gt "&#62;">
<!ENTITY apos "&#39;">
<!ENTITY quot "&#34;">

<!-- ***************************************************************** -->
<!-- Structural Elements -->
<!-- ***************************************************************** -->

<!ELEMENT xliff (file)+>
<!ATTLIST xliff
  version CDATA #FIXED "1.0"
  xml:lang CDATA #IMPLIED
>
<!-- Q: What's the semantics of the attribute 'xml:lang' for the element 'xliff'? Does it identify the language of comments found in the XLIFF file? -->

<!ELEMENT file (header, body)>
<!ATTLIST file
  original CDATA #REQUIRED
  <!-- I: Bring attribute 'original' for element 'file' in sync with information that can be given for external files (for them, you can give not only a simple 'name' but, for example, an 'href'). -->
  source-language CDATA #REQUIRED
  <!-- I: Rename attribute 'source-language' for element 'file' to 'source-lang' to improve consistency related to language identification (it's 'xml:lang', not 'xml:language'). -->
  <!-- I: Some companies distinguish original language (authoring happens in this language), source language (that's the language from which translation starts), and target language(s). Accordingly, an optional attribute 'orig-lang' may be desirable. Consequently, one may have references not only to source files, but to original files as well. -->
  datatype CDATA #REQUIRED
  <!-- R: The values for attribute 'datatype' for element 'file' may have to be defined more rigorously, since for example one may have to differentiate between types of Java resourceBundles (listResourceBundle vs. propertyResourceBundle). -->
  tool CDATA #IMPLIED
  <!-- I: In addition to the attribute 'tool' for the element 'file' one may want an attribute 'toolVersion' since for example the capabilities of Integrated Development Environment 'SuperJavaIDE' related to XLIFF may change over time.
  -->
  date CDATA #IMPLIED
  <!-- I: Specify if the attribute 'date' for the element 'file' refers to a creation or an update. -->
  <!-- I: Additional attribute related to dates: one that captures the deadline for a certain phase in the localization process. -->
  xml:space (default | preserve) "default"
  ts CDATA #IMPLIED
  <!-- I: Harmonize the attribute 'ts' for element 'file' with the 'prop' element. Possibilities for harmonizing: renaming the attribute to 'prop', or doing away with either the attribute 'ts' or the element 'prop'. -->
  category CDATA #IMPLIED
  <!-- I: A name like 'subjField' or 'domain' instead of 'category' for element 'file' may reflect the semantics of the data category more clearly. -->
  target-language CDATA #IMPLIED
  <!-- I: See remark on attribute 'source-language' for element 'file'. -->
  product-name CDATA #IMPLIED
  product-version CDATA #IMPLIED
  build-num CDATA #IMPLIED
>
<!-- tool default = "manual" -->

<!ELEMENT header (skl?, phase-group?, (prop-group | glossary | reference | note | count-group)*)>

<!ELEMENT skl (internal-file | external-file)>
<!-- R: I do not see a strong need for using an abbreviated name for element 'skl' since compared to other elements we will not see many occurrences of the element. Thus, using a long form like 'skeleton-file' should not increase file size too much but would increase readability (we already have elements 'internal-file' and 'external-file'). -->

<!ELEMENT internal-file (#PCDATA)>
<!ATTLIST internal-file
  form CDATA #IMPLIED
  <!-- I: Rename attribute 'form' for element 'internal-file' to 'mime-type', since the values are mime-types. -->
  crc NMTOKEN #IMPLIED
>
<!-- Q: What is the long form of attribute name 'crc'?
-->
<!-- text|base64 (text is default) -->

<!ELEMENT external-file EMPTY>
<!ATTLIST external-file
  href CDATA #REQUIRED
  crc NMTOKEN #IMPLIED
  uid NMTOKEN #IMPLIED
>
<!-- Q: The name of attribute 'uid' for element 'external-file' by means of the starting letter 'u' indicates some kind of universal scope. Does this hold (ie. are identifiers in skeleton files universal/unique)? In any case: How about an alternative name like 'skeleton-file-id'? -->

<!ELEMENT glossary (internal-file | external-file)>

<!ELEMENT reference (internal-file | external-file)>
<!-- I: An attribute for subclassing references would be handy (resulting in something like <reference refType="mandatory TM">...</reference>). It would allow you to say 'Hey, here is a reference to a translation memory that you must use.' -->

<!ELEMENT note (#PCDATA)>
<!ATTLIST note
  xml:lang CDATA #IMPLIED
  priority (1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10) "1"
  from CDATA #IMPLIED
>
<!-- I: An attribute 'to' for element 'note' (in addition to attribute 'from')? -->

<!ELEMENT prop-group (prop)+>
<!ATTLIST prop-group
  name CDATA #IMPLIED
>

<!ELEMENT prop (#PCDATA)>
<!ATTLIST prop
  prop-type CDATA #REQUIRED
  xml:lang CDATA #IMPLIED
>

<!ELEMENT context-group (context)+>
<!ATTLIST context-group
  name CDATA #REQUIRED
  crc NMTOKEN #IMPLIED
>
<!-- Processing instructions related to <context-group>:
<?xliff-show-context-group name='value' ?>
Indicates that any <context-group> element with a name set to 'value' should be displayed to the end-user.
-->

<!ELEMENT context (#PCDATA)>
<!ATTLIST context
  context-type CDATA #REQUIRED
  match-mandatory (yes | no) "no"
  crc NMTOKEN #IMPLIED
>
<!-- Processing instructions related to <context>:
<?xliff-show-context context-type='value' ?>
Indicates that any <context> element with a context-type set to 'value' should be displayed to the end-user.
-->

<!ELEMENT phase-group (phase)+>

<!ELEMENT phase (note)*>
<!ATTLIST phase
  phase-name CDATA #REQUIRED
  process-name CDATA #REQUIRED
  <!-- Q: What's the semantics of attribute 'phase-name' for element 'phase'? I am not really able to distinguish it from attribute 'process-name'. -->
  <!-- I: An additional attribute 'state' for element 'phase'. It could for example be used to reveal that the pre-editing of the contents has been started but has not been finished yet (resulting in something like <phase phase-name="y" process-name="proofreading" state="started">). Possible values could be: 'not executable', 'executable', 'started', 'suspended', 'aborted', 'preliminary finished', and 'finished'. -->
  company-name CDATA #IMPLIED
  tool CDATA #IMPLIED
  date CDATA #IMPLIED
  job-id CDATA #IMPLIED
  contact-name CDATA #IMPLIED
  contact-email CDATA #IMPLIED
  contact-phone CDATA #IMPLIED
>

<!ELEMENT count-group (count)*>
<!ATTLIST count-group
  name CDATA #REQUIRED
>

<!ELEMENT count (#PCDATA)>
<!ATTLIST count
  count-type CDATA #IMPLIED
  unit CDATA #IMPLIED
>

<!ELEMENT body (group | trans-unit | bin-unit)*>

<!ELEMENT group ((context-group*, count-group*, prop-group*, note*), (group | trans-unit | bin-unit)*)>
<!ATTLIST group
  id NMTOKEN #IMPLIED
  datatype CDATA #IMPLIED
  xml:space (default | preserve) "default"
  ts CDATA #IMPLIED
  restype CDATA #IMPLIED
  <!-- R: Attribute 'resname' may tie XLIFF too much to the Windows world (I take 'control' to be an explicit reference to Windows controls) -->
  resname NMTOKEN #IMPLIED
  extradata CDATA #IMPLIED
  <!-- Q: What is the semantics of attribute 'extradata' of element 'group'? -->
  help-id NMTOKEN #IMPLIED
  menu CDATA #IMPLIED
  menu-option CDATA #IMPLIED
  menu-name CDATA #IMPLIED
  coord CDATA #IMPLIED
  font CDATA #IMPLIED
  css-style CDATA #IMPLIED
  <!-- I: References to css files in addition to in-place css instructions via the attribute 'css-style' of element 'group'.
  -->
  style NMTOKEN #IMPLIED
  exstyle NMTOKEN #IMPLIED
  <!-- Q: What is the semantics of attribute 'exstyle' of element 'group'? -->
>

<!ELEMENT trans-unit (source, target?, (count-group | note | context-group | prop-group | alt-trans)*)>
<!ATTLIST trans-unit
  id NMTOKEN #REQUIRED
  approved (yes | no) #IMPLIED
  translate (yes | no) "yes"
  reformat (yes | no) "yes"
  xml:space (default | preserve) "default"
  datatype CDATA #IMPLIED
  ts CDATA #IMPLIED
  restype CDATA #IMPLIED
  resname NMTOKEN #IMPLIED
  extradata CDATA #IMPLIED
  help-id NMTOKEN #IMPLIED
  menu CDATA #IMPLIED
  menu-option CDATA #IMPLIED
  menu-name CDATA #IMPLIED
  coord CDATA #IMPLIED
  font CDATA #IMPLIED
  css-style CDATA #IMPLIED
  style NMTOKEN #IMPLIED
  exstyle NMTOKEN #IMPLIED
  size-unit CDATA #IMPLIED
  maxwidth NMTOKEN #IMPLIED
  minwidth NMTOKEN #IMPLIED
  maxheight NMTOKEN #IMPLIED
  minheight NMTOKEN #IMPLIED
  maxbytes NMTOKEN #IMPLIED
  minbytes NMTOKEN #IMPLIED
  <!-- Q: Why are the values of attribute 'maxbytes' and 'minbytes' of element 'trans-unit' tool-specific? Isn't it desirable to have uniform values? -->
  <!-- I: I have seen the need to encode relationships between for example message strings (eg. to ensure consistency). Thus, a list-valued attribute like 'relationships' which references other 'trans-unit' elements might be handy. -->
  phase-name CDATA #IMPLIED
>
<!-- size-unit: char|byte|pixel|glyph|dlgunit default='pixel' -->

<!ELEMENT source (%TextContent;)*>
<!ATTLIST source
  xml:lang CDATA #IMPLIED
  ts CDATA #IMPLIED
  <!-- R: Since the 'phase' element allows several processing steps for the source as well (eg. 'pre-edit for MT', 'general QA' ...) one may want more overlap between attributes for element 'target' and for element 'source'. -->
  <!-- R: I wonder if element 'source' could not benefit from attributes like 'css-style' as well, since they might be used during rendering XLIFF data to end-users (and for example Yves' marvellous book taught us that formatting styles for source and target may differ).
  -->
>
<!-- coord = "x;y;cx;cy"
     font= "fontname[;size[;weight]]" -->

<!ELEMENT target (%TextContent;)*>
<!ATTLIST target
  state NMTOKEN #IMPLIED
  phase-name NMTOKEN #IMPLIED
  xml:lang CDATA #IMPLIED
  ts CDATA #IMPLIED
  restype CDATA #IMPLIED
  resname NMTOKEN #IMPLIED
  coord CDATA #IMPLIED
  font CDATA #IMPLIED
  css-style CDATA #IMPLIED
  style NMTOKEN #IMPLIED
  exstyle NMTOKEN #IMPLIED
>

<!ELEMENT alt-trans (source?, target+, (note | context-group | prop-group)*)>
<!ATTLIST alt-trans
  match-quality CDATA #IMPLIED
  tool CDATA #IMPLIED
  <!-- I: See remarks on attribute 'tool' for element 'file' -->
  crc NMTOKEN #IMPLIED
  xml:lang CDATA #IMPLIED
  origin CDATA #IMPLIED
  <!-- I: Harmonize the name of attribute 'origin' for element 'alt-trans' with that of attribute 'match-quality' (resulting in something like 'match-origin'). -->
  datatype CDATA #IMPLIED
  xml:space (default | preserve) "default"
  ts CDATA #IMPLIED
  restype CDATA #IMPLIED
  resname NMTOKEN #IMPLIED
  extradata CDATA #IMPLIED
  help-id NMTOKEN #IMPLIED
  menu CDATA #IMPLIED
  menu-option CDATA #IMPLIED
  menu-name CDATA #IMPLIED
  coord CDATA #IMPLIED
  font CDATA #IMPLIED
  css-style CDATA #IMPLIED
  style NMTOKEN #IMPLIED
  exstyle NMTOKEN #IMPLIED
>

<!ELEMENT bin-unit (bin-source, bin-target?, (note | context-group | prop-group | trans-unit)*)>
<!ATTLIST bin-unit
  id NMTOKEN #REQUIRED
  mime-type NMTOKEN #REQUIRED
  approved (yes | no) #IMPLIED
  translate (yes | no) "yes"
  reformat (yes | no) "yes"
  ts CDATA #IMPLIED
  restype CDATA #IMPLIED
  resname NMTOKEN #IMPLIED
  phase-name CDATA #IMPLIED
  <!-- Q: Don't we need attributes like 'maxbytes' (cf. element 'trans-unit') for element 'bin-unit' as well?
  -->
>

<!ELEMENT bin-source (internal-file | external-file)>
<!ATTLIST bin-source
  ts CDATA #IMPLIED
>

<!ELEMENT bin-target (internal-file | external-file)>
<!ATTLIST bin-target
  mime-type NMTOKEN #IMPLIED
  ts CDATA #IMPLIED
  state NMTOKEN #IMPLIED
  phase-name NMTOKEN #IMPLIED
  restype CDATA #IMPLIED
  resname NMTOKEN #IMPLIED
>

<!-- ***************************************************************** -->
<!-- In-Line Elements -->
<!-- ***************************************************************** -->

<!ELEMENT g (%TextContent;)*>
<!ATTLIST g
  id CDATA #REQUIRED
  ctype CDATA #IMPLIED
  clone (yes | no) "yes"
  ts CDATA #IMPLIED
>

<!ELEMENT x EMPTY>
<!ATTLIST x
  id CDATA #REQUIRED
  ctype CDATA #IMPLIED
  clone (yes | no) "yes"
  ts CDATA #IMPLIED
>

<!ELEMENT bx EMPTY>
<!ATTLIST bx
  id CDATA #REQUIRED
  rid NMTOKEN #IMPLIED
  ctype CDATA #IMPLIED
  clone (yes | no) "yes"
  ts CDATA #IMPLIED
>

<!ELEMENT ex EMPTY>
<!ATTLIST ex
  id CDATA #REQUIRED
  rid NMTOKEN #IMPLIED
  ts CDATA #IMPLIED
>

<!ELEMENT ph (%CodeContent;)*>
<!ATTLIST ph
  id CDATA #REQUIRED
  ctype CDATA #IMPLIED
  ts CDATA #IMPLIED
  crc CDATA #IMPLIED
  assoc CDATA #IMPLIED
>

<!ELEMENT bpt (%CodeContent;)*>
<!ATTLIST bpt
  id CDATA #REQUIRED
  rid NMTOKEN #IMPLIED
  ctype CDATA #IMPLIED
  ts CDATA #IMPLIED
  crc CDATA #IMPLIED
>

<!ELEMENT ept (%CodeContent;)*>
<!ATTLIST ept
  id CDATA #REQUIRED
  rid NMTOKEN #IMPLIED
  ts CDATA #IMPLIED
  crc CDATA #IMPLIED
>

<!ELEMENT it (%CodeContent;)*>
<!ATTLIST it
  id CDATA #REQUIRED
  pos (open | close) #REQUIRED
  rid NMTOKEN #IMPLIED
  ctype CDATA #IMPLIED
  ts CDATA #IMPLIED
  crc CDATA #IMPLIED
>

<!ELEMENT mrk (%TextContent;)*>
<!ATTLIST mrk
  mtype CDATA #REQUIRED
  mid NMTOKEN #IMPLIED
  comment CDATA #IMPLIED
  ts CDATA #IMPLIED
>

<!ELEMENT sub (%TextContent;)*>
<!ATTLIST sub
  datatype CDATA #IMPLIED
  ctype CDATA #IMPLIED
>

<!-- ***** End of DTD ************************************************ -->
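
As an illustration of the naming-convention remark (DTD line 24), the mix of styles could be made visible with a small script. The following is only a sketch, not part of the original notes: the function names `classify` and `group_by_convention` and the sample list `NAMES` are hypothetical, and the names in the list were picked by hand from the DTD above rather than parsed out of it.

```python
import re
from collections import defaultdict

# Hand-picked sample of entity/attribute names from the DTD above
# (an assumption for illustration; not extracted automatically).
NAMES = [
    "CodeContent", "TextContent",          # parameter entities
    "source-language", "target-language",  # attributes of <file>
    "css-style", "help-id",
    "minheight", "maxbytes", "extradata",
]


def classify(name: str) -> str:
    """Return a rough label for the naming style used by `name`."""
    if "-" in name:
        return "hyphenated"
    if re.search(r"[a-z][A-Z]", name):
        # a lowercase letter directly followed by an uppercase one
        return "mixed-case"
    if name.islower():
        return "lowercase"
    return "other"


def group_by_convention(names):
    """Group names by the label that classify() assigns to them."""
    groups = defaultdict(list)
    for name in names:
        groups[classify(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for convention, members in group_by_convention(NAMES).items():
        print(f"{convention}: {', '.join(members)}")
```

Running such a check over the full set of names would show how many declarations a move to a single convention would touch.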