Subject: Re: [ubl-lcsc] Re: UBL 0.81 CCT draft-9-mod
On Sun, 7 Sep 2003, Tim McGrath wrote:

>>what is the problem with making code and identifier (and every other
>>data type) as 'string'?

(Currently, in 0.81 CCT draft-9-mod, CodeType's content type is xsd:token, while IdentifierType's content type is xsd:normalizedString.)

I suppose it's not a question of whether the system will break down if we restrict the base type too narrowly (such as making CodeType an xsd:token as opposed to xsd:string). The way I look at it, it is a question of how far we can leverage the Stage (B) schema validation to filter out "easily" filtered syntactic problems with the data in the instance space.

Let's just look at an example. If an application receives an element that is supposed to be CodeType, it expects the element to contain a proper code value, that is, a string without CR, LF or TAB and without leading or trailing spaces. A code value of, say, " UN/CEFACT " may look the same on a printout as "UN/CEFACT", but the two compare differently in memory.

Thus, if CodeType has xsd:string as its base type, Stage (B) will pass both values on to the application as "OK". The application must then EITHER strip the leading and trailing spaces again and find that both values are all right (an action that xsd:token would have required the sender to perform), OR check both values and flag the first as incorrect while accepting the second.

On the other hand, if CodeType has base type xsd:token, then first of all the sender cannot generate " UN/CEFACT " if it is to interoperate with other systems (as the schema shows that CodeType should be xsd:token). On the receiving end, Stage (B) will flag this field as erroneous because it does not validate against a base type of xsd:token, saving the application from further syntactic checks so that it can focus on whether the values are semantically correct.

>>what are we trying to gain by enforcing patterns
>>in the data?
Similar reasons: tapping what the schema validator can already provide to ensure the values are syntactically right before processing. Again, this is not a make-or-break issue, but if we provide a lax type, such as changing everything to xsd:string, then applications will just have to duplicate some of the checking functions and work harder to find out whether values are OK.

For certain system- or processing-related types, such as GloballyUniqueIDType, I'd think the stricter the pattern the better, since it leaves less room for misinterpretation and for accommodating processing. GloballyUniqueIDType specifies the wire format for representing a GUID, which is a contiguous 128-bit ID. This is like (though not directly comparable to) having ISO 8601 specify the representation of what is conceptually a date and time.

The example given in the draft CCT was:

    2B93C220-E0C2-11D7-94FC-00E0290FEEC7

But without a pattern, a sender could send it in various forms:

    2B93C220:E0C2:11D7:94FC:00E0290FEEC7
    2B93C220 E0C211D794FC 00E0290FEEC7
    2B93 C220 E0C2 11D7 94FC 00E0 290F EEC7
    2B93C220E0C211D794FC00E0290FEEC7
    etc.

They're all intended to mean a GUID value, but either the receiving application has to be very smart and accommodating, or the sender has to know in advance which GUID format each receiving system expects. I don't think either case brings us closer to interoperability than specifying a clear format up front.

Just my opinions.

Best Regards,
Chin Chee-Kai
SoftML
Tel: +65-6820-2979
Fax: +65-6743-7875
Email: cheekai@SoftML.Net
http://SoftML.Net/
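
As a rough illustration of the two checks discussed above, here is a minimal Python sketch of what a receiving application would have to do for itself if CodeType and GloballyUniqueIDType were both plain xsd:string. The function names are hypothetical, and the 8-4-4-4-12 hexadecimal pattern is an assumption inferred from the single example value, not the pattern actually defined in the draft CCT schema. Accepting lower-case hex digits is likewise an assumption.

```python
import re

# Assumed wire format: 8-4-4-4-12 hex digits separated by hyphens, modelled
# on the example 2B93C220-E0C2-11D7-94FC-00E0290FEEC7. The real draft CCT
# pattern may differ (for instance, it may allow upper-case digits only).
GUID_PATTERN = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def is_token_form(value: str) -> bool:
    """Return True if value has no CR, LF or TAB, no leading or trailing
    space, and no run of two or more consecutive spaces."""
    if any(ch in value for ch in "\r\n\t"):
        return False
    if value != value.strip(" "):
        return False
    return "  " not in value


def is_canonical_guid(value: str) -> bool:
    """Return True only if the whole string matches the assumed pattern."""
    return GUID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for code in ("UN/CEFACT", " UN/CEFACT "):
        print(repr(code), is_token_form(code))
    for guid in (
        "2B93C220-E0C2-11D7-94FC-00E0290FEEC7",
        "2B93C220:E0C2:11D7:94FC:00E0290FEEC7",
        "2B93C220E0C211D794FC00E0290FEEC7",
    ):
        print(guid, is_canonical_guid(guid))
```

With a stricter base type and a pattern in the schema, checks of this kind would not need to be repeated in every receiving application.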