
dita-translation message


Subject: RE: [dita-translation] Changes to documentation of xml:lang and translate attributes

I have a couple of questions about the locale. Perhaps I should not have
used that word in describing xml:lang -- the confusion I wish to clarify is
that it means more than simply language. For example, if we need to
generate text that describes the compartment on the back of a vehicle, a
setting of xml:lang="en-us" will generate "Trunk", while a setting of
xml:lang="en-gb" will generate "Boot".
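To illustrate the behavior described above, here is a minimal sketch of a
generated-text lookup keyed on the full xml:lang value. The table, the key
name "rear-compartment", and the function generated_text are hypothetical
and are not taken from DITA or the toolkit.

```python
# Hypothetical table of generated strings, keyed by lowercase xml:lang value.
GENERATED_TEXT = {
    "en-us": {"rear-compartment": "Trunk"},
    "en-gb": {"rear-compartment": "Boot"},
}


def generated_text(xml_lang, key):
    """Return the generated string for `key` under the given xml:lang value."""
    # Language tags are case-insensitive, so normalize before the lookup.
    strings = GENERATED_TEXT.get(xml_lang.lower())
    if strings is None:
        raise KeyError(f"no generated text for xml:lang={xml_lang!r}")
    return strings[key]
```

The point is only that the lookup uses the whole tag, not just the "en"
part, so two flavors of one language can yield different text.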

So, the first question is - how can that distinction be made in the Spec?
If we simply call it language, it is not clear that DITA is *in fact* able
to distinguish between different flavors of a single language.

Second - is there a need for a separate locale distinction in the DITA
source?  I would expect most locale settings to be based on system settings
(where local production rules are applied), rather than on something
described in the source. In the beer example, I would also expect this to
be coded differently. If you want to display a value that varies based on
locale, you would code it as <ptr value="germanybeerprice"/>, and look up
both the value and the unit of measurement. If you want to say that it
costs 5, then you would indicate the unit of measurement. If you do not
want to type the Euro character, DITA can still make this easy by letting
you specialize a <euro> element for ease of authoring.

I don't mean to say by this that there is no need to set the locale in the
source, though - I feel that I'm on the edge of coming up with a good use
case, but just can't quite do it. The closest I can come is a price list
from a German store. If they want to sell in the US, they will translate
the product descriptions to "en-us", but would keep the prices constant in
Euros. However, I think they would still want to code the Euro character
with the price, just like American companies going the other way would
hardcode the $. Otherwise, down the road, somebody will force the display
into their native locale, and the German store will get $500 US for a 500
Euro product.

Can anybody else come up with a better use case? One that would require
users to set the locale in the source, rather than relying on the system
settings? As with the dir attribute, I would hesitate to rush it in for 1.1
unless we see a real need for it.

Robert D Anderson
Authoring Tools Development
Chief Architect, DITA Open Toolkit

From: "Gershon L [...]" <gershon@tech-tav.com>
Date: 03/03/2006 04:04 AM
Please respond to: gershon
To: "'Felix Sasaki'" <fsasaki@w3.org>, Robert D Anderson/Rochester/IBM@IBMUS
cc: <bhertz@sdl.com>, "'Bryan [...] <bryan.s.schnabel@tek.com>, Charles
Pau/Cambridge/IBM@Lotus, "'Lieske, [...] <christian.lieske@sap.com>, Dave A
[...] org>, <dpooley@sdl.com>, "'Richard Ishida'" <ishida@w3.org>,
"'Jennifer Linton'" [...] com>, "'Yves Savourel'" [...]
Subject: RE: [dita-translation] Changes to documentation of xml:lang and
translate attributes

Felix, has the ITS defined a separate attribute for locale? If so, what did
you call it? If the ITS specifies separate attributes for language and
locale, then I think DITA should too. I suppose they'd be xml:lang and
xml:locale? Please could you confirm, and I'll add them to my proposal.

Thanks also for the links, I'll add them to my documentation proposal.

Best Regards,

-----Original Message-----
From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Friday, March 03, 2006 2:14 AM
To: Robert D Anderson
Cc: gershon@tech-tav.com; bhertz@sdl.com; 'Bryan Schnabel'; Charles Pau;
'Lieske, Christian'; Dave A Schell; dita-translation@lists.oasis-open.org;
dpooley@sdl.com; 'Richard Ishida'; 'Jennifer Linton'; mambrose@sdl.com;
patrickk@scriptware.nl; pcarey@lexmark.com; Peter.Reynolds@lionbridge.com;
rfletcher@sdl.com; Sukumar.Munshi@lionbridge.com;
tony.jewtushenko@productinnovator.com; 'Yves Savourel'
Subject: Re: [dita-translation] Changes to documentation of xml:lang and
translate attributes

Hi Robert, all,

Sorry for the sporadic participation. Below I have some comments.

Robert D Anderson wrote:
> Hi Gershon,
> I think it looks good, I just wanted to clarify a few points:
> 1. For translate, it says that the default is no. I just wanted to
> clarify that this is a processing default, rather than one set in the
> DTD or Schema. If the value for every element is defaulted to "no" in
> the doctype, then when you read the file in to a parser it will appear
> that the value is set everywhere. So, this would prevent the value from
> 2. For the first sentence of the xml:lang description, we should
> indicate that it is not only for the language, it also sets the
> locale. I'd suggest either "Specifies the locale of the element
> content."
> or
> "Specifies the language and locale of the element content."

There are problems with combining locale with language identification:

<p xml:lang="en-US">A beer in Germany costs <ptr value="5"/>.</p>

The @value should be spelled out as "5 Euro", but if you map the language
"en-US" to a locale, it would be "5 Dollar". So what you need here is a
separation between language and locale identification.
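A minimal sketch of the problem, assuming a hypothetical table that maps a
language tag to the currency of its usual locale (the table and both
function names are invented for illustration):

```python
# Hypothetical mapping from a language tag to the currency of its usual locale.
CURRENCY_BY_LANG = {"en-us": "Dollar", "de-de": "Euro"}


def spell_out_from_lang(amount, xml_lang):
    """Derive the currency from the language tag (the problematic approach)."""
    return f"{amount} {CURRENCY_BY_LANG[xml_lang.lower()]}"


def spell_out_with_currency(amount, currency):
    """Take the currency as separate information, independent of language."""
    return f"{amount} {currency}"
```

With xml:lang="en-US", the first function turns 5 into "5 Dollar", while the
second gives "5 Euro" because the currency is carried separately from the
language.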

> I realize that the current spec only uses the term language. I think
> this has led to some confusion in the past.
> 3. For xml:lang, I do not think that the spec should explicitly
> designate that the default is English. This should probably be up to
> the tools. The DITA Open Toolkit sets the default language with a
> parameter in the stylesheets, so that it is possible for users to
> change the default if needed. If we do want to suggest a default, then
> how about something like "When no xml:lang value is supplied and no
> external method is used to set a default, the default value of English is
> I also realize that the current spec already specifies a default of
> English, but I've heard people express the desire to set a different
> default when authoring in another language.

I think this is a very important point.
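One way to picture the fallback order Robert suggests is the sketch below.
It is an assumption-laden illustration, not the toolkit's implementation:
the function name effective_lang and the tool_default parameter are
hypothetical, and the final "en" fallback simply mirrors the wording
proposed above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target, tool_default=None):
    """Resolve the language that applies to `target` inside `root`."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang:
            return lang  # nearest xml:lang on the element or an ancestor
        node = parents.get(node)
    # No xml:lang in the source: use the externally set default if there is
    # one, otherwise fall back to English (assumed spec-level default).
    return tool_default or "en"
```

For example, with a tool default of "fr-fr", an element with no xml:lang on
itself or any ancestor resolves to "fr-fr" rather than English.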

> 4. I am not sure what is meant by this:
> "A list of supported values is given in xml:lang values."
> The current spec references ISO-3166 for Country Codes and RFC 3066
> for Language Codes. Different applications (as well as different
> specializations) may choose to support only a subset of all languages
> (for example, the DITA toolkit supports only 47 of the defined
> locales, and warns if users specify values it does not support). Since
> DITA was developed, the toolkit has added support for two additional
> locales (be-by and uk-ua). If we continue to reference the
> authoritative sources, then our description will remain current and
> correct at all times, even as new locales are created and new tool
> is added.

The value of xml:lang is defined by RFC 3066 or its successor; see
http://www.w3.org/TR/REC-xml/#sec-lang-tag . RFC 3066, see
http://www.ietf.org/rfc/rfc3066.txt , defines the first subtag as follows:
2-letter subtags are ISO 639 part 1 language codes, and 3-letter subtags are
ISO 639 part 2 language codes.
As the second subtag, 2-letter subtags are ISO 3166 country codes.
It might be useful to mention these sources directly.
There is also a successor of RFC 3066, see
http://www.ietf.org/internet-drafts/draft-ietf-ltru-registry-14.txt . It is
100% backward compatible with RFC 3066, but also allows for specifying new
kinds of subtags (esp. for script, region and variant).
That might be worth mentioning.
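As a rough sketch of the RFC 3066 subtag rules summarized above (the
function name split_lang_tag and the returned field names are made up for
illustration, and the sketch does not handle the newer script or variant
subtags from the successor draft):

```python
def split_lang_tag(tag):
    """Split an RFC 3066 style tag into its language and country parts."""
    subtags = tag.split("-")
    primary = subtags[0].lower()

    # The length of the first subtag tells which ISO 639 part it comes from.
    if len(primary) == 2:
        source = "ISO 639-1"
    elif len(primary) == 3:
        source = "ISO 639-2"
    else:
        source = "other"  # for example the "i" and "x" prefixes

    # A 2-letter second subtag is an ISO 3166 country code.
    region = None
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        region = subtags[1].upper()

    return {"language": primary, "language_source": source, "region": region}
```

Applied to "be-by", this reports the language "be" from ISO 639-1 and the
country code "BY".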
