
Subject: Re: [dita-translation] Changes to documentation of xml:lang and translate attributes

Hi Kevin,

The reason we should not have a prescriptive list for xml:lang is the
same one that led the W3C, and for that matter RFC 3066 itself, not to
provide one. RFC 3066 is designed as an open-ended notation based on
ISO 3166 and ISO 639. Why doesn't RFC 3066 provide a defined list of
values? Because it would be too restrictive. There is little point in
trying to outsmart the W3C or IANA. They have been through this process
many, many times.

There is a clear distinction between what XML can do in terms of
validation, and what externally referenced standards can do. XML 1.0
specifies that the value of xml:lang is governed by RFC 3066. Therefore
"english_for_the_united_kingdom" is not a valid xml:lang value, even
though a DTD-validating parser will accept it.

In the end you have to provide your own validation for xml:lang based on 
RFC 3066. This is an XML fact of life.
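
As a rough illustration of what such a secondary check might look like,
here is a minimal Python sketch. It is not taken from any DITA tool: the
function names, the ASCII-only approximation of NMTOKEN, and the tiny
country-code sample are all assumptions made for the example. RFC 3066
treats tags as case-insensitive, so the sketch upper-cases the country
subtag before looking it up. It does not check the primary subtag
against ISO 639.

```python
import re

# Rough ASCII-only approximation of the XML NMTOKEN production (assumption:
# the real production also admits many non-ASCII name characters).
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")

# RFC 3066 syntax: a primary subtag of 1-8 letters, followed by zero or more
# hyphen-separated subtags of 1-8 letters or digits.
RFC3066 = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Illustrative subset only; a real check needs the full ISO 3166 list.
SAMPLE_ISO_3166 = {"GB", "US", "LV", "JP", "DE", "FR"}


def is_nmtoken(value: str) -> bool:
    """Approximate what a DTD declaring xml:lang as NMTOKEN would accept."""
    return NMTOKEN.fullmatch(value) is not None


def check_xml_lang(value: str) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    if RFC3066.fullmatch(value) is None:
        return ["not well-formed per the RFC 3066 syntax"]
    problems = []
    subtags = value.split("-")
    # RFC 3066 reads a two-letter second subtag as an ISO 3166 country code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        if subtags[1].upper() not in SAMPLE_ISO_3166:
            problems.append(f"unknown country code: {subtags[1]}")
    return problems
```

With this sketch, "english_for_the_united_kingdom" passes is_nmtoken()
but fails the RFC 3066 syntax check, while "en-uk" is well-formed but is
reported for its country code. "en-GB" produces no problems.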

Best Regards,


Farwell, Kevin wrote:
> Hi,
> Let me clarify. I never said the rules are not clear. I said the method
> for enforcing the rules is not clear. While "en-uk" is not a valid
> locale according to the rules, it is completely valid according to the
> DTD. "english_for_the_united_kingdom" is also completely valid. My point
> is not to fix the rules but to apply them. Incidentally, the list of
> allowed values in the DITA reference features lowercase country codes,
> which violates the capitalization rule listed below, for what it's
> worth.
> Using secondary tools to determine whether XML is correct seems risky to
> me. Validating a file, in that case, guarantees it's valid when tested
> against a content model but not necessarily that it is valid within a
> production environment. At the very least, doesn't that create the potential
> for a false sense of security? Must every file be validated twice by two
> different methods? I would think that arriving at a valid file should
> mean that the file can go through the rest of the system with no further
> work.
> Kevin
> -----Original Message-----
> From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
> Sent: Wednesday, March 08, 2006 2:11 PM
> To: gershon@tech-tav.com
> Cc: Farwell, Kevin; 'Felix Sasaki'; 'Robert D Anderson'; bhertz@sdl.com;
> 'Bryan Schnabel'; 'Charles Pau'; 'Lieske, Christian'; 'Dave A Schell';
> dita-translation@lists.oasis-open.org; dpooley@sdl.com; 'Richard
> Ishida'; 'Jennifer Linton'; mambrose@sdl.com; patrickk@scriptware.nl;
> pcarey@lexmark.com; Reynolds, Peter; rfletcher@sdl.com; Munshi, Sukumar;
> tony.jewtushenko@productinnovator.com; 'Yves Savourel'
> Subject: Re: [dita-translation] Changes to documentation of xml:lang and
> translate attributes
> Hi Gershon,
> I agree with you. There is little point in setting out a prescriptive
> list. The implementation guidelines should state the contents of the
> xml:lang attribute must follow the rules of Extensible Markup Language
> (XML) 1.0 (Third Edition) section 2.12, which mandates the use of IETF
> RFC 3066. There should not be a need for a full prescriptive list as
> this would be too restrictive and inflexible. The rules for RFC 3066 are
> well defined and not as free-form as Kevin's email suggests.
> The value 'en-uk' is not a valid RFC 3066 value as per Kevin's example
> for two reasons:
> 1) 'uk' is not a valid ISO 3166 country code. 'GB' is the ISO 3166
> country code for Great Britain.
> 2) It is in lower case. Country codes must be in upper case.
> I can see no benefit in trying to 'better' the XML standard itself. The
> only weakness in RFC 3066 is the inability to add script information to
> the locale as well as regional or variant settings. This is not going to
> be a problem for DITA in the near term. These issues are being addressed
> in RFC 3066bis, which is still in draft form. RFC 3066bis is backwards
> compatible with RFC 3066 and should not cause a problem for any DITA 1.1
> implementation anyway.
> Best Regards,
> AZ
> Gershon L Joseph wrote:
>>If we hard-code it in the DTD, we'll have a hard time keeping the set 
>>of allowable values up-to-date. Also, I've yet to find an accurate 
>>fully up-to-date list of values on the Web that's not draft or 
>>incomplete. I think it should be up to the implementation to ensure 
>>the value entered is valid, or to offer the user a list of options 
>>customized to the user's needs. I suspect offering a list of about 100
>>values will confuse the user almost as much as leaving them to
>>research it themselves.
>>I don't mind adding a link in the spec documentation to an accurate 
>>list that's always going to be kept updated. I have not found such a 
>>list (I'm sure it exists, but I couldn't find anything valuable via
>>Google).
>>What do others think?
>>Best Regards,
>>-----Original Message-----
>>From: Farwell, Kevin [mailto:Kevin.Farwell@lionbridge.com]
>>Sent: Wednesday, March 08, 2006 6:58 PM
>>To: gershon@tech-tav.com; Felix Sasaki; Robert D Anderson
>>Cc: bhertz@sdl.com; Bryan Schnabel; Charles Pau; Lieske, Christian; 
>>Dave A Schell; dita-translation@lists.oasis-open.org; dpooley@sdl.com;
>>Richard Ishida; Jennifer Linton; mambrose@sdl.com; 
>>patrickk@scriptware.nl; pcarey@lexmark.com; Reynolds, Peter; 
>>rfletcher@sdl.com; Munshi, Sukumar; 
>>tony.jewtushenko@productinnovator.com; Yves Savourel
>>Subject: RE: [dita-translation] Changes to documentation of xml:lang 
>>and translate attributes
>>I have a question about the values of the xml:lang attribute. With 
>>phrases like "The allowed xml:lang values..." from the DITA reference 
>>and "This attribute must be set to a language identifier, as
>>defined..."
>>from the email below, I don't understand why the values aren't set in 
>>the DTD and the users aren't given a list to pick from instead of a 
>>set of rules to follow. As an NMTOKEN, the value of the xml:lang 
>>attribute can be anything the user desires and still be valid. If 
>>something must be enforced, why leave it to users to enforce it? Why 
>>doesn't the content model enforce it?
>>Confusion surrounding the locale codes is fairly easy to understand. 
>>The textual description runs country-language, but the symbol runs 
>>language-country. If a user is trying to remember the symbol for UK 
>>English, gb-en is as likely as en-gb, and even if they remember the 
>>country comes first, why wouldn't UK English be en-uk? Latvian is 
>>lv-lv, so why isn't Japanese ja-ja or jp-jp? If what's "allowed" 
>>"must" be in the attribute value, why leave it to chance or leave it 
>>up to users doing research (which, in my opinion, are the same thing)?
>>-----Original Message-----
>>From: Gershon L Joseph [mailto:gershon@tech-tav.com]
>>Sent: Wednesday, March 08, 2006 8:38 AM
>>To: 'Felix Sasaki'; 'Robert D Anderson'
>>Cc: bhertz@sdl.com; 'Bryan Schnabel'; 'Charles Pau'; 'Lieske, 
>>Christian'; 'Dave A Schell'; dita-translation@lists.oasis-open.org;
>>dpooley@sdl.com; 'Richard Ishida'; 'Jennifer Linton'; 
>>mambrose@sdl.com; patrickk@scriptware.nl; pcarey@lexmark.com; 
>>Reynolds, Peter; rfletcher@sdl.com; Munshi, Sukumar; 
>>'Yves Savourel'
>>Subject: RE: [dita-translation] Changes to documentation of xml:lang 
>>and translate attributes
>>Thank you all for your input. I'm replying to all comments in a single
>>email to make it easier to follow this thread and where we're going...
>>Here are new proposals for the two attributes based on all the 
>>feedback I've received to-date, as well as our discussions during
>>Monday's SC meeting.
>>My previous proposal kept the original descriptions in the current 
>>spec as much as possible, and I'm glad I received the reactions I did
>>(e.g. English being the default language -- I felt uneasy about that one
>>too).
>>I took the default values from the spec, which I now see confused 
>>everyone; I've changed them to reflect their usage.
>>Name: translate
>>Description: Indicates whether the content of the element should be 
>>translated or not. The translate attribute setting applies to the 
>>element on which it is set, and is inherited by all child elements 
>>that do not specify the translate attribute. The translate attribute 
>>does not indicate whether attribute values of the element and its 
>>children should be translated; attribute values should never be 
>>translated. If this attribute is not specified on the document 
>>element, then processors must assume translate="yes".
>>Data Type: yes | no
>>Default Value: Not set
>>Required: #IMPLIED
>>Name: xml:lang
>>Description: Specifies the language and locale of the element content.
>>The intent declared with xml:lang is considered to apply to all 
>>attributes and content of the element where it is specified, unless 
>>overridden with an instance of xml:lang on another element within that
>>content. When no xml:lang value is supplied, the processor should
>>assume a default value.
>>This attribute must be set to a language identifier, as defined by 
>>RFC 3066 (http://www.ietf.org/rfc/rfc3066.txt) or successor.
>>Data Type: NMTOKEN
>>Default Value: Not set
>>Required: #IMPLIED


email - azydron@xml-intl.com
smail - c/o Mr. A.Zydron
	PO Box 2167
         Gerrards Cross
         Bucks SL9 8XF
	United Kingdom
Mobile +(44) 7966 477 181
FAX    +(44) 1753 480 465
www - http://www.xml-intl.com

