Subject: Re: [ubl] Code list metadata
http://www.unece.org/fileadmin/DAM/cefact/xml/XML-Naming-and-Design-Rules-V2.0.pdf

And I see the following rule matches the URI strings that I have been finding:

  [R 166] The namespace names for code list schema holding specification status MUST be of the form:

  urn:un:unece:uncefact:codelist:standard:<Code List. Agency Identifier|Code List Agency Name Text>:<Code List Identification. Identifier|Code List Name Text>:<Code List Version Identifier>

So I think that gives us the latitude to populate the <Version> field of the genericode file with the last field value of the URI string.
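A minimal sketch of that idea in Python, assuming the namespace URI follows the [R 166] form so that everything after the last colon is the version field. The function name split_code_list_urn and the dictionary keys (borrowed from the genericode element names) are hypothetical, chosen only for illustration; a URI in some other layout would need different handling.

```python
def split_code_list_urn(namespace_uri: str) -> dict:
    """Split a code list namespace URI into its version and version-less parts.

    Assumption: the final colon-separated field is the code list version,
    as in the [R 166] form quoted above.
    """
    canonical_uri, sep, version = namespace_uri.rpartition(":")
    if not sep or not canonical_uri or not version:
        raise ValueError(f"no version field found in {namespace_uri!r}")
    return {
        "Version": version,                    # last field of the URI
        "CanonicalUri": canonical_uri,         # URI without the version field
        "CanonicalVersionUri": namespace_uri,  # the full URI, unchanged
    }


meta = split_code_list_urn(
    "urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode:2006")
print(meta["Version"])  # 2006
```

The same call on a URI ending in "1996Rev2Final" would return that string as the version, with no attempt to reinsert spaces.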
. . . . . . . . Ken

At 2012-11-17 12:01 -0500, I wrote:
Thank you, Mark, for your comments.

Looking at NDR 3:

  http://www.unece.org/fileadmin/DAM/cefact/xml/UNCEFACT+XML+NDR+V3p0.pdf

... in section 18.104.22.168 specifying the components of the URI, pointing to example 8-44 as definitive, with line 2893 having the example:

  xmlns:clm64437="urn:un:unece:uncefact:codelist:common:D.04A:draft:6:4437"

Neither of the XSD files that I cited follows that convention:

  http://www.unece.org/fileadmin/DAM/uncefact/codelist/standard/UNECE_PackageTypeCode_2006.xsd
  xmlns:clm67065="urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode:2006"

  http://www.unece.org/fileadmin/DAM/uncefact/codelist/standard/UNECE_CargoTypeCode_1996Rev2Final.xsd
  xmlns:clm6Recommendation21AnnexI="urn:un:unece:uncefact:codelist:standard:UNECE:CargoTypeCode:1996Rev2Final"

And here is a July 2012 example that uses a different convention for version, but the same convention for the URI:

  http://www.unece.org/fileadmin/DAM/uncefact/codelist/standard/UNECE_AdjustmentReasonDescriptionCode_D10B.xsd
  xmlns:clm64465="urn:un:unece:uncefact:codelist:standard:UNECE:AdjustmentReasonDescriptionCode:D10B"

What I see in the NDR doesn't seem "simple" to me since it isn't being used. What I described as an approach appears to be what is being followed consistently in the XSD files that I'm able to find as "current".

Where can I find published examples that follow NDR 3?

Where is the specification for the URI format that is currently being used? It isn't in NDR 3. Where can I find the UN/CEFACT NDR 2? A Google search isn't finding me anything. There is no such guidance in the UBL NDR 2.

Lines 2875 and 2887 of NDR 3 are helpful in stating that the "version" of the code list is different than the "revision" of the code list. Since genericode is recording the "version", I think that gives me license to replace the <Version> element's past use of "Revision X" to instead be the last field of the URI associated with the namespace.

Thank you, again, for making a comment.
I look forward to finding the definitive rules to which you refer.

. . . . . . . . Ken

At 2012-11-17 15:03 +0000, Crawford, Mark wrote:
Ken,

The codelist conventions are specified in the appropriate NDR (V2 or V3) and are rather simple.

Mark

Mark Crawford
SAP Standards Strategist
Industry Standards & Open Source, TIP Governance
SAP Labs LLC, 1300 Pennsylvania Avenue Suite 600, Washington D.C.
email@example.com
T +17036700920
M +17034855232
Mobile: (703) 485-5232

----- Original Message -----
From: G. Ken Holman [mailto:gkholman@CraneSoftwrights.com]
Sent: Saturday, November 17, 2012 08:43 AM Eastern Standard Time
To: firstname.lastname@example.org <email@example.com>
Subject: Re: [ubl] Code list metadata

At 2012-11-17 15:20 +0800, Tim McGrath wrote:
>we have to tread carefully here because the maintenance procedures
>for these inside CEFACT are complex and we need to ensure the
>correct identification to the canonical versions.

Absolutely! But I wouldn't have to "tread carefully" if someone gave me a clear walking path to follow, which I do not have. I'm walking blindly here without guidance trying to come up with the most appropriate approach.

It seems that UN/CEFACT is not following the same conventions for versioning for all code lists. Today I will try to mechanically distill some kind of pattern out of the following sources of code lists, but I think a pattern to find "Revision" is impossible:

  http://www.unece.org/cefact/xml_schemas/index.html

>for example, i suspect the 66411 may be to reflect the UN/EDIFACT
>element number 6411 "Measurement unit code" with the extra 6
>prepended to denote UNECE as the agency responsible. following this
>model the Packaging type code number would be 7065 "Package type
>description code" prefixed with 6, so 67065.

I do see "67065" used as part of a namespace prefix, but not as part of the URI ...
and I see "2006" as the version at the end of the URI here:

  http://www.unece.org/fileadmin/DAM/uncefact/codelist/standard/UNECE_PackageTypeCode_2006.xsd

... where this is used:

  xmlns:clm67065="urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode:2006"

That URI, then, is very different than the URI found in:

  http://docs.oasis-open.org/ubl/os-UBL-2.0-update/xsd/common/CodeList_UnitCode_UNECE_7_04.xsd

... where this is used:

  xmlns:clm66411="urn:un:unece:uncefact:codelist:specification:66411:2001"

There is no pattern for "Revision" or for the URI! But I do see "Version" information at the end.

>so yes, there is duplication and inconsistency but staying aligned
>with UN/EDIFACT is a good idea.

How does one stay "aligned" when there is no pattern to follow? Those two URIs are from Rec 20 and Rec 21, so I would have expected them to have the same structure.

>This includes using the word 'Revision' to be really clear that
>Version is the same piece of metadata.

Note there is no revision information in the URI. The web site documentation for Recs 20 and 21 states which revision information and it is reflected somewhat in the name of the file if you know where to look. Such information is not found reliably.

Also, I note that "Revision X" also does not appear to be documented as the code list version. Note the following excerpt from the Rec 21 schema documentation:

  Code list name: Package Type Code
  Code list agency: UNECE
  Code list version: 2006

Even though there is no such comment in the Rec 20 schema documentation, the "Rev9e" of the Excel file name (!!) that would indicate "Revision 9" appears *not* to be used as the code list version.

So, based on the comment "Code list version:", what does appear to be a pattern is the final field of the URI appears always to reflect a version, but not the revision. Can we not simply use this last field as the version for the code list?
I think that appears to be more reliable and consistent than trying to glean "Revision" information from details that are outside of the file contents.

For Rec 20 this would be (where "CVUri" is Canonical Version Uri):

  Ver: 2001
  CUri: urn:un:unece:uncefact:codelist:specification:66411
  CVUri: urn:un:unece:uncefact:codelist:specification:66411:2001

For Rec 21:

  Ver: 2006
  CUri: urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode
  CVUri: urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode:2006

Then, for example, for

  http://www.unece.org/fileadmin/DAM/uncefact/codelist/standard/UNECE_CargoTypeCode_1996Rev2Final.xsd

I could use:

  Ver: 1996Rev2Final
  CUri: urn:un:unece:uncefact:codelist:standard:UNECE:CargoTypeCode
  CVUri: urn:un:unece:uncefact:codelist:standard:UNECE:CargoTypeCode:1996Rev2Final

That appears to be a consistent pattern that I could follow, and I note in the documentation that that last field *does* appear to be the code list version and not some "Revision" value:

  Schema agency: UN/CEFACT
  Schema version: 1.0
  Schema date: 18 July 2012
  Code list name: Cargo Type Code
  Code list agency: UNECE
  Code list version: 1996 Rev 2 Final

... note that I would not try to preserve the spaces ... I would just use the last field.

Then anyone seeing a UN/CEFACT URI would know from the URI and not from some filename or web site documentation exactly which version it is. And I think I am justified because of the word "version", not "revision", found in the comments in those files.

Would this be acceptable?

. . . . . . . . Ken

p.s. I've published the tool that converts CSV files to genericode equivalents here:

  http://www.CraneSoftwrights.com/resources/ubl/#csv2gc

>On 17/11/12 7:02 AM, G. Ken Holman wrote:
>>Fellow UBL TC members,
>>
>>Today I'm struggling with list-level metadata for our code lists for UBL 2.1.
>>
>>In UBL 2.0, we oriented our list-level metadata around UN/CEFACT
>>for those code lists that matched the enumerations baked into the
>>schemas.
>>Consider, for example, the Units of Measure list, UN/ECE
>>Recommendation 20:
>>
>>http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/cefact/UnitOfMeasureCode-2.0.gc
>>
>>  <Identification>
>>    <ShortName>UnitOfMeasureCode</ShortName>
>>    <LongName xml:lang="en">Unit Of Measure</LongName>
>>    <LongName Identifier="listID">UN/ECE rec 20</LongName>
>>    <Version>Revision 4</Version>
>>    <CanonicalUri>urn:un:unece:uncefact:codelist:specification:66411</CanonicalUri>
>>    <CanonicalVersionUri>urn:un:unece:uncefact:codelist:specification:66411:2001-update</CanonicalVersionUri>
>>    <LocationUri>http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/cefact/UnitOfMeasureCode-2.0.gc</LocationUri>
>>    <Agency>
>>      <LongName xml:lang="en">United Nations Economic Commission for Europe</LongName>
>>      <Identifier>6</Identifier>
>>    </Agency>
>>  </Identification>
>>
>>I cannot see where the version information came from in the schema.
>>And I note two concepts of version: "Revision 4" (in <Version>)
>>and "2001-update" (in <CanonicalVersionUri>).
>>
>>For UN/ECE Recommendation 21, the Packaging Type list, we used UBL metadata:
>>
>>http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/default/PackagingTypeCode-2.0.gc
>>
>>  <Identification>
>>    <ShortName>PackagingTypeCode</ShortName>
>>    <LongName xml:lang="en">Packaging Type</LongName>
>>    <LongName Identifier="listID">UN/ECE rec 21</LongName>
>>    <Version>Revision 5</Version>
>>    <CanonicalUri>urn:oasis:names:specification:ubl:codelist:gc:PackagingTypeCode</CanonicalUri>
>>    <CanonicalVersionUri>urn:oasis:names:specification:ubl:codelist:gc:PackagingTypeCode-2.0-update</CanonicalVersionUri>
>>    <LocationUri>http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/default/PackagingTypeCode-2.0.gc</LocationUri>
>>    <Agency>
>>      <LongName xml:lang="en">United Nations Economic Commission for Europe</LongName>
>>      <Identifier>6</Identifier>
>>    </Agency>
>>  </Identification>
>>
>>In both cases I think the version information should not include
>>the word "Revision", so I'm suggesting changing that.
>>
>>What do we do about identification? Both code lists come from
>>UN/ECE recommendations. One would think their identification would
>>be very similar. I cannot correlate on the UN/ECE web site the
>>UN/CEFACT schema reference to "66411" for the Units of Measure. Is
>>there a similar reference for the Packaging Type?
>>
>>Should we use the UN/ECE format for both? If so, for others that
>>did not have that format in UBL 2.0?
>>
>>Or should we use the UBL format for all code lists now that we
>>don't have any UN/CEFACT XSD files with enumerations? Then we
>>would change the identification approach for the other code list
>>files that used to come from XSD enumerations.
>>
>>There are no genericode files yet published on the UN/ECE web site,
>>so we have to create our own.
>>
>>Thank you for any discussion and guidance on this subject. I've
>>got the mechanics all working, but I need help to know what should
>>go into these files.
>>
>>. . . . . . . . Ken
>>
>>p.s. today I put together a utility to convert CSV files into
>>genericode files, so anyone finding code list information should be
>>able to easily create CSV without needing to think about the XML
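
To make the question of what should go into these files concrete, here is a rough Python sketch that builds an <Identification> fragment shaped like the ones quoted in this thread, taking the version from the last field of the namespace URI. It is not the CSV-to-genericode utility mentioned in this thread; the function build_identification and its parameters are hypothetical, the "listID" LongName and LocationUri are left out for brevity, and pairing the UBL short name with the UN/CEFACT URI is only an illustration.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_identification(short_name, long_name, namespace_uri,
                         agency_name, agency_id):
    """Return a genericode-style <Identification> fragment as a string.

    Assumption: the last colon-separated field of namespace_uri is the
    code list version.
    """
    canonical_uri, sep, version = namespace_uri.rpartition(":")
    if not sep or not canonical_uri or not version:
        raise ValueError(f"no version field found in {namespace_uri!r}")

    ident = ET.Element("Identification")
    ET.SubElement(ident, "ShortName").text = short_name
    ET.SubElement(ident, "LongName", {XML_LANG: "en"}).text = long_name
    ET.SubElement(ident, "Version").text = version
    ET.SubElement(ident, "CanonicalUri").text = canonical_uri
    ET.SubElement(ident, "CanonicalVersionUri").text = namespace_uri

    agency = ET.SubElement(ident, "Agency")
    ET.SubElement(agency, "LongName", {XML_LANG: "en"}).text = agency_name
    ET.SubElement(agency, "Identifier").text = str(agency_id)
    return ET.tostring(ident, encoding="unicode")


fragment = build_identification(
    "PackagingTypeCode", "Packaging Type",
    "urn:un:unece:uncefact:codelist:standard:UNECE:PackageTypeCode:2006",
    "United Nations Economic Commission for Europe", 6)
print(fragment)
```

With this input the <Version> element carries "2006" rather than "Revision 5", and the two URI elements differ only by that final field.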
--
Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/o/
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers: http://www.CraneSoftwrights.com/legal