Subject: FW: Language codes as an UBL code list
Regards
Paul Spencer
Director
Boynings Consulting Ltd
http://www.boynings.co.uk
-----Original Message-----
From: Colin Mackenzie [mailto:colin@elecmc.com]
Sent: 16 July 2004 16:24
To: Michael.Andrews@cabinet-office.gsi.gov.uk; Adam.Bailin@cabinet-office.gsi.gov.uk; paul.spencer@boynings.co.uk
Subject: Language codes as an UBL code list

Hi,

At yesterday's meeting I volunteered to knock up a UBL-format code list schema for the ISO 639-2 three-letter language codes. If I had realised how much copying and pasting was involved I never would have bothered.

Anyway, please find attached a first-cut attempt at producing the list (schema, test schema that imports it, sample XML file). The list has been taken from http://www.loc.gov/standards/iso639-2/langcodes.html, which is pointed to from the ISO site.

Paul, would it be possible for you to cast your eyes over it and also to consider the points below?

Some issues:

1/ Some language codes have more than one description, e.g. "chu" is described as "Church Slavic; Old Slavonic; Church Slavonic; Old Bulgarian; Old Church Slavonic". I have represented this as follows:
<xsd:enumeration value="chu">
<xsd:annotation>
<xsd:documentation>
<CodeName>Church Slavic</CodeName>
<CodeName>Old Slavonic</CodeName>
<CodeName>Church Slavonic</CodeName>
<CodeName>Old Bulgarian</CodeName>
<CodeName>Old Church Slavonic</CodeName>
</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>

Is it allowed to have more than one CodeName per code? Is this the recommended way? Who knows; that's the trouble with sticking elements inside xsd:documentation and not using a schema for them.

2/ Some CodeNames have two language codes, e.g. Welsh is "cym" AND "wel". Using the UBL schema, I have created two separate enumerations; this does not seem ideal. If I were a programmer creating a drop-down list for languages, I would only want one "Welsh" on the list.

3/ Ideally there would be a mapping from the three-letter codes to two-letter codes. Perhaps this could be added by someone putting another element in the xsd:documentation element.

4/ I do not know who will end up owning the document, so I have kept in the OASIS comments and copyright, added no eGIF metadata, and used their style of namespace URNs etc. This means that it does not follow the current guidelines.

5/ The types defined in the schema (which have been adapted from the UBL country codes code list) do not follow eGIF guidelines, e.g.
a) a complexType with a name ending in "Type" not "Structure" (I never liked that rule anyway, although I do follow it)
b) the use of attributes which are optional and fixed (although the guidelines do mention the UBL as a special case)

6/ As I do not know which agency will take control of this, the codeListAgencyID and codeListAgencyName may be wrong.

7/ I do not know the process of localising the list, e.g. if you want the language names in French.

I also did not know that Klingon is an ISO-recognised language.

Thanks
Colin
Colin Mackenzie
XML Consultant/Director
Electronic Media Consultants Ltd
17 North Wall, Cricklade, Wiltshire, SN6 6DU
Tel/Fax: +44 (0)1793 752193
Mobile: +44 (0)7974 422091
E-Mail: colin@elecmc.com
Web: http://www.elecmc.com
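
On point 2 above (wanting only one "Welsh" in a drop-down list even though the schema enumerates both "cym" and "wel"), the following is a minimal Python sketch of how a consumer of the code list might collapse duplicate names. It is not part of the attached files. The cut-down schema, the simpleType name LanguageCodeContentType, the function name dropdown_entries, and the choice of the first CodeName as the display label are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical cut-down code list in the style described in the message.
# The simpleType name is an assumption; CodeName is un-namespaced, as in
# the fragment quoted in the email.
SAMPLE_SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="LanguageCodeContentType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="chu">
        <xsd:annotation>
          <xsd:documentation>
            <CodeName>Church Slavic</CodeName>
            <CodeName>Old Slavonic</CodeName>
            <CodeName>Church Slavonic</CodeName>
            <CodeName>Old Bulgarian</CodeName>
            <CodeName>Old Church Slavonic</CodeName>
          </xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="cym">
        <xsd:annotation>
          <xsd:documentation>
            <CodeName>Welsh</CodeName>
          </xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="wel">
        <xsd:annotation>
          <xsd:documentation>
            <CodeName>Welsh</CodeName>
          </xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
"""


def dropdown_entries(schema_text):
    """Return {display label: [codes]} with one entry per distinct label.

    Assumption: the first CodeName of each enumeration is used as the
    label; any further CodeName elements are treated as aliases and ignored.
    """
    root = ET.fromstring(schema_text)
    entries = {}
    for enum in root.iter("{%s}enumeration" % XSD_NS):
        code = enum.get("value")
        names = [el.text.strip() for el in enum.iter("CodeName") if el.text]
        if not names:
            continue  # enumeration without a CodeName: nothing to display
        entries.setdefault(names[0], []).append(code)
    return entries


if __name__ == "__main__":
    for label, codes in dropdown_entries(SAMPLE_SCHEMA).items():
        print(label, codes)
```

With this grouping, "Welsh" appears once and carries both "cym" and "wel"; which of the two codes an application should actually emit is a separate decision that the sketch does not make.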