OASIS Mailing List Archives

dita-translation message



Subject: RE: [dita-translation] Changes to documentation of xml:lang and translate attributes


Hi,

I think we're talking about two different things. Every time I bring up
a problem of whether a valid XML file is correct, I am told my premise
is incorrect because the attribute values I suggest do not follow rules
external to the tools that validate the file. I'm not questioning the
work leading up to what constitutes a locale code in an XML file. I'm
questioning a plan that relies on authors to have read and understood
that work. I agree that the absurd locale codes I've offered are not
correct, but I contend once again that if you type them into the value
of the xml:lang attribute your file will be valid (and incorrect).

What I'm trying to get to is whether it's a good idea to have valid xml
files that can contain incorrect information and then rely on an output
process to find those incorrect values. I say it isn't. 
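To make the "valid but incorrect" point concrete, here is a minimal Python sketch. It assumes an ASCII-only approximation of the XML NMTOKEN production (the real production also admits many non-ASCII name characters), and the helper name `is_nmtoken` is made up for illustration. Under that assumption, every one of the locale codes discussed in this thread passes the DTD-level check.

```python
import re

# ASCII-only approximation of the XML 1.0 Nmtoken production (an assumption:
# the real production also allows many non-ASCII name characters).
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(value: str) -> bool:
    """Return True if value would pass an NMTOKEN check (ASCII subset only)."""
    return NMTOKEN.fullmatch(value) is not None


# Values taken from this thread: one correct code and three incorrect ones.
candidates = ["en-GB", "en-uk", "gb-en", "english_for_the_united_kingdom"]
results = {value: is_nmtoken(value) for value in candidates}

for value, ok in results.items():
    print(f"{value!r}: passes NMTOKEN check = {ok}")
```

The check only rejects values containing characters such as spaces, so it cannot tell a real locale code from a guess.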

Should you think my suggestion that authors might use incorrect values
is not reasonable, I assure you I'm not just dreaming it up. Do a survey
of your clients or colleagues about various language or country codes.
Some will guess, and some will have such confidence in their guess they
will not consider looking it up. 

Kevin  

-----Original Message-----
From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
Sent: Friday, March 10, 2006 7:22 AM
To: Farwell, Kevin
Cc: gershon@tech-tav.com; Felix Sasaki; Robert D Anderson;
bhertz@sdl.com; Bryan Schnabel; Charles Pau; Lieske, Christian; Dave A
Schell; dita-translation@lists.oasis-open.org; dpooley@sdl.com; Richard
Ishida; Jennifer Linton; mambrose@sdl.com; patrickk@scriptware.nl;
pcarey@lexmark.com; Reynolds, Peter; rfletcher@sdl.com; Munshi, Sukumar;
tony.jewtushenko@productinnovator.com; Yves Savourel
Subject: Re: [dita-translation] Changes to documentation of xml:lang and
translate attributes

Hi Kevin,

The reason we should not have a prescriptive list for xml:lang is the
same one that led the W3C, and for that matter RFC 3066 itself, to avoid
one. RFC 3066 is designed as an open-ended notation based on ISO 3166
and ISO 639. Why doesn't RFC 3066 provide a defined list of values?
Because it would be too restrictive. There is little point in
trying to outsmart the W3C or IANA. They have been through this process
many, many times.

There is a clear distinction between what XML can do in terms of
validation, and what externally referenced standards can do. XML 1.0
specifies that the value of xml:lang is governed by RFC 3066. Therefore
"english_for_the_united_kingdom" is not valid.

In the end you have to provide your own validation for xml:lang based on
RFC 3066. This is an XML fact of life.
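As a rough illustration of what such home-grown validation might look like, the Python sketch below checks a value against the RFC 3066 tag syntax (a primary subtag of 1 to 8 letters, followed by hyphen-separated subtags of 1 to 8 letters or digits) and then against code lists. The names `check_lang`, `SAMPLE_LANGUAGES` and `SAMPLE_COUNTRIES` are hypothetical, and the two sets are tiny illustrative samples, not the ISO 639 or ISO 3166 lists. The comparison is case-insensitive, which is how RFC 3066 treats tags; a stricter capitalization rule could be layered on top.

```python
import re

# RFC 3066 syntax: 1-8 letters, then any number of "-" plus 1-8 letters/digits.
RFC3066_SYNTAX = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Hypothetical, deliberately tiny samples; a real check would load the full
# ISO 639 and ISO 3166 code lists from a maintained source.
SAMPLE_LANGUAGES = {"en", "lv", "ja", "de", "fr"}
SAMPLE_COUNTRIES = {"GB", "US", "LV", "JP", "DE", "FR"}


def check_lang(tag: str) -> list:
    """Return a list of problems found in an xml:lang value (empty if none)."""
    if RFC3066_SYNTAX.fullmatch(tag) is None:
        return ["not RFC 3066 syntax"]
    problems = []
    subtags = tag.split("-")
    # A two-letter primary subtag is read as an ISO 639 language code.
    if len(subtags[0]) == 2 and subtags[0].lower() not in SAMPLE_LANGUAGES:
        problems.append(f"unknown language code '{subtags[0]}'")
    # A two-letter second subtag is read as an ISO 3166 country code.
    if len(subtags) > 1 and len(subtags[1]) == 2:
        if subtags[1].upper() not in SAMPLE_COUNTRIES:
            problems.append(f"unknown country code '{subtags[1]}'")
    return problems


for value in ["en-GB", "en-uk", "gb-en", "english_for_the_united_kingdom"]:
    print(value, "->", check_lang(value) or "ok")
```

A check of this kind would run alongside DTD validation, for example in an editor plug-in or a pre-publication step.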

Best Regards,

AZ


Farwell, Kevin wrote:
> Hi,
> 
> Let me clarify. I never said the rules are not clear. I said the 
> method for enforcing the rules is not clear. While "en-uk" is not a 
> valid locale according to the rules, it is completely valid according 
> to the DTD. "english_for_the_united_kingdom" is also completely valid.
> My point is not to fix the rules but to apply them. Incidentally, the 
> list of allowed values in the DITA reference features lowercase 
> country codes, which violates the capitalization rule listed below, 
> for what it's worth.
> 
> Using secondary tools to determine whether XML is correct seems risky 
> to me. Validating a file, in that case, guarantees it's valid when 
> tested against a content model but not necessarily that it is valid 
> within a production environment. At the very least, doesn't that create 
> the potential for a false sense of security? Must every file be 
> validated twice by two different methods? I would think that arriving 
> at a valid file should mean that the file can go through the rest of 
> the system with no further work.
> 
> Kevin
> 
> -----Original Message-----
> From: Andrzej Zydron [mailto:azydron@xml-intl.com]
> Sent: Wednesday, March 08, 2006 2:11 PM
> To: gershon@tech-tav.com
> Cc: Farwell, Kevin; 'Felix Sasaki'; 'Robert D Anderson'; 
> bhertz@sdl.com; 'Bryan Schnabel'; 'Charles Pau'; 'Lieske, Christian'; 
> 'Dave A Schell'; dita-translation@lists.oasis-open.org; 
> dpooley@sdl.com; 'Richard Ishida'; 'Jennifer Linton'; 
> mambrose@sdl.com; patrickk@scriptware.nl; pcarey@lexmark.com; 
> Reynolds, Peter; rfletcher@sdl.com; Munshi, Sukumar;
> tony.jewtushenko@productinnovator.com; 'Yves Savourel'
> Subject: Re: [dita-translation] Changes to documentation of xml:lang 
> and translate attributes
> 
> Hi Gershon,
> 
> I agree with you. There is little point in setting out a prescriptive 
> list. The implementation guidelines should state that the contents of
> the xml:lang attribute must follow the rules of Extensible Markup
> Language (XML) 1.0 (Third Edition) section 2.12, which mandates the use
> of IETF RFC 3066. There should not be a need for a full prescriptive
> list as this would be too restrictive and inflexible. The rules for
> RFC 3066 are well defined and not as free form as Kevin's email suggests.
> 
> The value 'en-uk' from Kevin's example is not a valid RFC 3066 value
> for two reasons:
> 
> 1) 'uk' is not a valid ISO 3166 country code. 'GB' is the ISO 3166 
> country code for Great Britain.
> 2) It is in lower case. Country codes must be in upper case.
> 
> I can see no benefit in trying to 'better' the XML standard itself. 
> The only weakness in RFC 3066 is the inability to add script 
> information to the locale as well as regional or variant settings. 
> This is not going to be a problem for DITA in the near term. These 
> issues are being addressed in RFC 3066bis, which is still in draft 
> form. RFC 3066bis is backwards compatible with RFC 3066 and should not
> cause a problem for any DITA 1.1 implementation anyway.
> 
> Best Regards,
> 
> AZ
> 
> Gershon L Joseph wrote:
> 
>>If we hard-code it in the DTD, we'll have a hard time keeping the set 
>>of allowable values up-to-date. Also, I've yet to find an accurate, 
>>fully up-to-date list of values on the Web that's not draft or 
>>incomplete. I think it should be up to the implementation to ensure 
>>the value entered is valid, or to offer the user a list of options 
>>customized to the user's needs. I suspect offering a list of about 100
>>values will confuse the user almost as much as leaving them to
>>research it themselves.
>>
>>I don't mind adding a link in the spec documentation to an accurate 
>>list that's always going to be kept updated. I have not found such a 
>>list (I'm sure it exists, but I couldn't find anything valuable via
>>Google).
>>
>>What do others think?
>>
>>
>>Best Regards,
>>Gershon
>>
>>-----Original Message-----
>>From: Farwell, Kevin [mailto:Kevin.Farwell@lionbridge.com]
>>Sent: Wednesday, March 08, 2006 6:58 PM
>>To: gershon@tech-tav.com; Felix Sasaki; Robert D Anderson
>>Cc: bhertz@sdl.com; Bryan Schnabel; Charles Pau; Lieske, Christian; 
>>Dave A Schell; dita-translation@lists.oasis-open.org; dpooley@sdl.com;
>>Richard Ishida; Jennifer Linton; mambrose@sdl.com; 
>>patrickk@scriptware.nl; pcarey@lexmark.com; Reynolds, Peter; 
>>rfletcher@sdl.com; Munshi, Sukumar; 
>>tony.jewtushenko@productinnovator.com; Yves Savourel
>>Subject: RE: [dita-translation] Changes to documentation of xml:lang 
>>and translate attributes
>>
>>Hi,
>>
>>I have a question about the values of the xml:lang attribute. With 
>>phrases like "The allowed xml:lang values..." from the DITA reference 
>>and "This attribute must be set to a language identifier, as
>>defined..." from the email below, I don't understand why the values
>>aren't set in the DTD and the users aren't given a list to pick from
>>instead of a set of rules to follow. As an NMTOKEN, the value of the
>>xml:lang attribute can be anything the user desires and still be valid.
>>If something must be enforced, why leave it to users to enforce it? Why 
>>doesn't the content model enforce it?
>>
>>Confusion surrounding the locale codes is fairly easy to understand. 
>>The textual description runs country-language, but the symbol runs 
>>language-country. If a user is trying to remember the symbol for UK 
>>English, gb-en is as likely as en-gb, and even if they remember the 
>>language comes first, why wouldn't UK English be en-uk? Latvian is 
>>lv-lv, so why isn't Japanese ja-ja or jp-jp? If what's "allowed"
>>"must" be in the attribute value, why leave it to chance or leave it 
>>up to users doing research (which, in my opinion, are the same thing)?
>>
>>Kevin
>>
>>-----Original Message-----
>>From: Gershon L Joseph [mailto:gershon@tech-tav.com]
>>Sent: Wednesday, March 08, 2006 8:38 AM
>>To: 'Felix Sasaki'; 'Robert D Anderson'
>>Cc: bhertz@sdl.com; 'Bryan Schnabel'; 'Charles Pau'; 'Lieske, 
>>Christian'; 'Dave A Schell'; dita-translation@lists.oasis-open.org;
>>dpooley@sdl.com; 'Richard Ishida'; 'Jennifer Linton'; 
>>mambrose@sdl.com; patrickk@scriptware.nl; pcarey@lexmark.com; 
>>Reynolds, Peter; rfletcher@sdl.com; Munshi, Sukumar; 
>>tony.jewtushenko@productinnovator.com;
>>'Yves Savourel'
>>Subject: RE: [dita-translation] Changes to documentation of xml:lang 
>>and translate attributes
>>
>>Thank you all for your input. I'm replying to all comments in a single
>>email to make it easier to follow this thread and where we're going...
>>
>>Here are new proposals for the two attributes based on all the 
>>feedback I've received to date, as well as our discussions during
>>Monday's SC meeting.
>>
>>My previous proposal kept the original descriptions in the current 
>>spec as much as possible, and I'm glad I received the reactions I did
>>(e.g. English being the default language -- I felt uneasy about that
>>one too).
>>
>>I took the default values from the spec, which I now see confused 
>>everyone; I've changed them to reflect their usage.
>>
>>PROPOSAL FOR translate ATTRIBUTE:
>>
>>Name: translate
>>
>>Description: Indicates whether the content of the element should be 
>>translated or not. The translate attribute setting applies to the 
>>element on which it is set, and is inherited by all child elements 
>>that do not specify the translate attribute. The translate attribute 
>>does not indicate whether attribute values of the element and its 
>>children should be translated; attribute values should never be 
>>translated. If this attribute is not specified on the document 
>>element, then processors must assume translate="yes".
>>
>>Data Type: yes | no
>>
>>Default Value: Not set
>>
>>Required: #IMPLIED
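
The inheritance rule in the translate proposal above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `effective_translate` and the sample document are hypothetical, and the standard-library ElementTree parser stands in for whatever a real DITA processor would use. The walk passes the parent's effective value down to each child and starts from "yes" when the document element does not set the attribute.

```python
import xml.etree.ElementTree as ET


def effective_translate(root):
    """Map every element to its effective translate value ("yes" or "no")."""
    resolved = {}

    def walk(element, inherited):
        # An explicit attribute wins; otherwise the parent's value is inherited.
        value = element.get("translate", inherited)
        resolved[element] = value
        for child in element:
            walk(child, value)

    # Per the proposal: assume "yes" if the document element leaves it unset.
    walk(root, "yes")
    return resolved


# Hypothetical sample document, for illustration only.
doc = ET.fromstring(
    "<topic><title>Setup</title>"
    '<body translate="no"><p>keep</p><p translate="yes">translate me</p></body>'
    "</topic>"
)
flags = effective_translate(doc)
by_text = {p.text: flags[p] for p in doc.iter("p")}
print(by_text)
```

Only element content is covered here; attribute values are left alone, in line with the proposal's statement that they should never be translated.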
>>
>>
>>PROPOSAL FOR xml:lang ATTRIBUTE:
>>
>>Name: xml:lang
>>
>>Description: Specifies the language and locale of the element content.
>>The intent declared with xml:lang is considered to apply to all 
>>attributes and content of the element where it is specified, unless 
>>overridden with an instance of xml:lang on another element within that
>>content. When no xml:lang value is supplied, the processor should
>>assume a default value.
>>
>>This attribute must be set to a language identifier, as defined by 
>>IETF RFC 3066 (http://www.ietf.org/rfc/rfc3066.txt) or successor.
>>
>>Data Type: NMTOKEN
>>
>>Default Value: Not set
>>
>>Required: #IMPLIED
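
The scoping rule in the xml:lang proposal above (a value applies to an element's content until a descendant overrides it) can be sketched in Python as follows. The function name `effective_lang`, the `default_lang` parameter and the sample document are hypothetical; the proposal does not say what the processor default should be, so the caller has to supply one.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, default_lang):
    """Map every element to the xml:lang value in effect for its content."""
    resolved = {}

    def walk(element, inherited):
        # An xml:lang on this element overrides whatever was in scope.
        lang = element.get(XML_LANG, inherited)
        resolved[element] = lang
        for child in element:
            walk(child, lang)

    # default_lang is the processor's assumed value when nothing is supplied.
    walk(root, default_lang)
    return resolved


# Hypothetical sample document, for illustration only.
doc = ET.fromstring(
    '<topic xml:lang="en-GB"><title>Colour</title>'
    '<body><p>Hello</p><p xml:lang="lv-LV">Sveiki</p></body></topic>'
)
langs = effective_lang(doc, default_lang="en-US")
by_text = {p.text: langs[p] for p in doc.iter("p")}
print(by_text)
```

Resolving the value this way says nothing about whether it is a correct RFC 3066 identifier; that check would be separate.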
>>
>>


-- 


email - azydron@xml-intl.com
smail - c/o Mr. A.Zydron
	PO Box 2167
         Gerrards Cross
         Bucks SL9 8XF
	United Kingdom
Mobile +(44) 7966 477 181
FAX    +(44) 1753 480 465
www - http://www.xml-intl.com








