
Subject: RE: [xliff] RE: [xliff-inline] Req 1.15 Representation of invalid XML characters

Hi Yves,


The draft for inline codes that is in SVN says:


  • hex - mandatory. Hexadecimal value of the character's code point.
    The value can be padded with zeros and in upper or lower case. Allowed values are between hexadecimal 0000 and 10FFFF, both included.

I would change the text to indicate that the following ranges, which are valid XML characters, are excluded:


#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
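As a rough sketch of what that check could look like (Python; the helper names `is_valid_xml_char` and `is_allowed_cp_hex` are hypothetical and not part of the draft), assuming the hex value may use either case and any amount of zero padding:

```python
import re

# Ranges of valid XML 1.0 characters, as listed above.
XML_CHAR_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_valid_xml_char(code_point: int) -> bool:
    """True if the code point can appear directly in XML 1.0 content."""
    return any(low <= code_point <= high for low, high in XML_CHAR_RANGES)


def is_allowed_cp_hex(hex_value: str) -> bool:
    """Hypothetical rule: accept only code points in 0000..10FFFF
    that are NOT valid XML characters."""
    # Reject anything that is not purely hexadecimal digits (e.g. 'qwerty').
    if re.fullmatch(r"[0-9A-Fa-f]+", hex_value) is None:
        return False
    code_point = int(hex_value, 16)
    if code_point > 0x10FFFF:
        return False
    return not is_valid_xml_char(code_point)
```

With this rule, hex="0001" and hex="FFFE" would be accepted, while hex="0041" (a valid XML character) and hex="qwerty" would be rejected.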


Notice that in the W3C recommendation for XML Schema the canonical representation for hex values uses upper case hexadecimal digits; lower case digits ([a-f]) are not allowed in that canonical form. See http://www.w3.org/TR/xmlschema-2/#hexBinary item 2, Canonical Representation. It would be easier to validate using XML Schema if XLIFF doesn’t allow lower case.
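To illustrate the upper-case-only option, here is a minimal sketch (Python; `is_upper_hex` and `to_upper_hex` are hypothetical names, and the pattern is my assumption rather than anything in the draft or in XML Schema itself):

```python
import re

# Assumed patterns: one or more hex digits, zero padding allowed.
UPPER_HEX = re.compile(r"[0-9A-F]+")
ANY_HEX = re.compile(r"[0-9A-Fa-f]+")


def is_upper_hex(value: str) -> bool:
    """True if the value uses only digits and upper case A-F."""
    return UPPER_HEX.fullmatch(value) is not None


def to_upper_hex(value: str) -> str:
    """Normalize a mixed-case hex value to upper case.

    Raises ValueError if the value is not hexadecimal at all.
    """
    if ANY_HEX.fullmatch(value) is None:
        raise ValueError(f"not a hexadecimal value: {value!r}")
    return value.upper()
```

A writer could normalize with `to_upper_hex` before output, so that readers only need the stricter `is_upper_hex` check.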





Rodolfo M. Raya   <rmraya@maxprograms.com>

Maxprograms      http://www.maxprograms.com



> -----Original Message-----

> From: Yves Savourel [mailto:ysavourel@enlaso.com]

> Sent: Monday, September 12, 2011 1:58 PM

> To: xliff-inline@lists.oasis-open.org

> Cc: xliff@lists.oasis-open.org

> Subject: [xliff] RE: [xliff-inline] Req 1.15 Representation of invalid XML characters


> Hi David, Steven, Helena, all


> In our discussion about how to represent characters invalid in XML in XLIFF

> we've adopted an element similar to LDML's cp.


> In the processing expectation we are trying to decide what the user agent is

> supposed to do when the hex attribute value is invalid (e.g. hex='qwerty').


> Christian suggested to reach out to LDML for some ideas as this may have

> been discussed there already.

> David, Stevens, Helena: Any thoughts?


> I'm guessing Stevens may be more involved with LDML than David or Helena

> (pure speculation from me).

> I'm adding the TC mailing list on the thread, so he can see and post an answer

> if needed. (joining the SC to be able to post there is the other option)


> Below is an extract of our latest exchange.

> You can see all the emails here:

> http://lists.oasis-open.org/archives/xliff-inline/

> (search for the one with "1.15 Representation of invalid XML characters" in

> their title)



> > Maybe: "If the value of the hex attribute is invalid,

> > the Readers MUST generate an error and MAY terminate

> > the process. This specification does not prescribe how

> > invalid <cp> values are represented in the parsed content."

> >

> > But I still think it would be better to have an expected

> > behavior: it helps interoperability. U+FFFD seems to be

> > applicable for such case according to

> >

> > http://en.wikipedia.org/wiki/Replacement_character#Replacement_character).

> >

> CL> I would be tempted to reach out to someone from LDML

> CL> (or general Unicode) to get guidance.



> Any pointer would be welcome,


> Cheers,

> -yves






