Subject: Re: [ubl-ndrsc] Absence of Data
(What is PESC?)

Thanks for sending this! I assume I can consider you the champion for this one... I doubt we'll have time to discuss this in tomorrow's call, but maybe we can get some reaction on the list for next time. I have some comments below:

At 02:48 PM 11/20/01 -0600, Mike Rawlins wrote:
>One thing that we'll have to deal with, particularly for those coming
>from an EDI background, is that absence of data (or absent elements) are
>not handled the same way. For example:
>
>In XML:
>
><Name>Mike</Name>
><Name> </Name>
><Name/>
>
>All satisfy the condition that the "Name" element be present, even
>though there may be no real data in it.
>
>In X12 or EDIFACT:
>
>N1*ST*Rawlins *91*1234567 - Element is present because data is present
>N1*ST**91*1234567 - Element is not present because data is not present.
>
>The PESC paper has a section which deals with this. This may be a bit
>long for our purposes, but I think it's something that needs to be
>addressed.
>
>Extract from PESC paper:
>
>
>2.3.9 Nulls, Zeroes, Spaces, and Absence of Data
>The following rules SHALL apply in designing schemas and interpreting
>instance documents:
>
>1. Absence of data - If an element is defined as OPTIONAL (minOccurs
>attribute value of zero) and the element does not occur in an instance
>document, semantics SHALL NOT be interpreted from the element other than
>
>that the originator of the instance document and did not include it. No

Try as I might, I can't make sense of the last bit about the originator. Is there a word missing or extra? Also, the notion of "semantics being interpreted" sounds a bit fuzzy to me. Would it be clearer to say something like "The absence of an optional element in an instance SHALL NOT be interpreted as a signal that the element, if present, would have had a null value" or something like that? A concrete example of how an absent element could be misinterpreted would be helpful.
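As one possible illustration of the distinction being drawn here, the following is a minimal Python sketch that tells apart an absent "Name" element, an empty one, one holding only spaces, and one holding real data. The wrapping "Party" element and the function name name_state are hypothetical, invented for this example; only the "Name" element comes from the message above.

```python
import xml.etree.ElementTree as ET

def name_state(party_xml):
    """Classify the optional <Name> child of a hypothetical <Party> element."""
    party = ET.fromstring(party_xml)
    name = party.find("Name")
    if name is None:
        # The element is not in the instance at all. Under rule 1, nothing
        # more may be inferred: no default value, no null value.
        return "absent"
    text = name.text or ""  # ElementTree reports None for <Name/> and <Name></Name>
    if text == "":
        return "empty"
    if text.strip() == "":
        # Present, but the value consists only of whitespace.
        return "spaces"
    return "value"

samples = [
    "<Party><Name>Mike</Name></Party>",
    "<Party><Name> </Name></Party>",
    "<Party><Name/></Party>",
    "<Party/>",
]
for sample in samples:
    print(sample, "->", name_state(sample))
```

A receiver that collapsed the "absent" case into "empty" (for example, by treating a failed lookup as an empty string) would be making exactly the kind of misinterpretation the rule is trying to rule out.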
Also, we should separate out the schema-design advice from the generic advice to those who have to interpret an instance that uses a schema designed according to our guidelines. In fact, the interpretation advice should perhaps offer brief documentation boilerplate that should be attached to all elements that are in this situation; after all, it's not advice about how to properly structure an instance (like our advice about processing instructions etc. will be), but rather how to properly understand it. I get the feeling that I'm not being clear, but I'll rely on you to tell me. :-)

>default values are to be assumed. Likewise, if an attribute is
>declared as OPTIONAL ("use" attribute value of OPTIONAL) and the
>attribute does not occur in an instance document, semantics SHALL NOT be
>
>interpreted from the attribute other than that the originator and did
>not include it; no default values are to be assumed.

Same problem with the originator wording here.

>NOTE: All string items defined with a minOccurs of one SHALL have a
>minimum length requirement of one character.
>
>2. Zeroes - Zeroes, when appearing in a numeric element in an instance
>document, SHALL be interpreted as a zero value.

Should we qualify "numeric" by listing the built-in datatypes this applies to and saying it also applies to any datatypes derived from them?

>3. Spaces - Spaces sent as values for elements or attributes (of type
>string) in instance documents SHALL be interpreted as spaces. It is
>RECOMMENDED that leading and trailing spaces be removed, but when they
>appear they SHALL have semantic significance. Sending an element with
>just spaces is not the same as sending a nulled element (see #4 below).
>
>4. Nullability - In certain cases, it MAY be desirable to convey that an

I don't believe that this is a legitimate use of the uppercase MAY (it's not being used in a normative sense).
I usually use "might" in these circumstances, or it could say "Where a schema is designed to be nillable, ..."

>element has no value (a null value) rather than indicating that it has a
>
>value of spaces or that it is not present in a document. In these
>cases, the originator of the instance document SHOULD convey explicitly
>that an element is null. An example is an address update for a
>previously transmitted address. The previous address had two address
>lines, whereas the current address has just one line. The originator of
>
>the document indicates that the second address line is removed by
>indicating that the element is nulled as follows:
><addressLineTwo xsi:nill="true"></addressLineTwo>
>
>To support this the addressLine element in the schema is defined as
>nullable via:
>
><xsd:element name="addressLine" type="xsd:string" nillable="true"/>
>
>When this type of nullable semantics are desired, the "nill" and
>"nillable" attributes SHALL be used (as opposed to spaces for strings or
>
>zeroes for numerics). The "nillable" attribute SHALL NOT be used in
>element declarations with a minOccurs of greater than zero. When there
>is a requirement that an element be OPTIONAL and not appear in an
>instance document, the minOccurs attribute with a value of zero SHALL be
>
>used in the element declaration. By default, any element defined in
>analysis as having a minimum occurrence of zero SHALL be represented in
>the schemas as nullable.

To be honest, the whole nillable thing is new to me, and I recall that the XSD working group was sharply divided on it (with only the relational DB people on the "for" side). Does it really buy us anything over application-specific token values like "none" or "no" or whatever?

        Eve
--
Eve Maler                                    +1 781 442 3190
Sun Microsystems XML Technology Center   eve.maler @ sun.com
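
To make the address-update scenario from the quoted extract concrete, here is a rough Python sketch of how a receiver might treat an absent element differently from a nilled one. Note that the attribute defined by XML Schema is spelled xsi:nil (one "l"), in the namespace http://www.w3.org/2001/XMLSchema-instance; the extract's "nill" spelling would not be recognized by a schema processor. The wrapper element "addressUpdate", the element "addressLineOne", the function name, and the "absent means leave the stored value alone" policy are all assumptions made for this sketch, not something the extract specifies.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced attribute as "{namespace-uri}localname".
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

def apply_address_update(stored, update_xml,
                         fields=("addressLineOne", "addressLineTwo")):
    """Return a copy of `stored` with a hypothetical <addressUpdate> applied."""
    update = ET.fromstring(update_xml)
    result = dict(stored)
    for field in fields:
        elem = update.find(field)
        if elem is None:
            # Absent: the sender said nothing about this field (assumed policy:
            # keep whatever is already stored).
            continue
        if elem.get(XSI_NIL) in ("true", "1"):
            # Explicitly nilled: the sender says the field has no value.
            result.pop(field, None)
        else:
            # Present with content, which may be empty or spaces.
            result[field] = elem.text or ""
    return result

stored = {"addressLineOne": "9 Old Rd", "addressLineTwo": "Suite 4"}

nilled = """<addressUpdate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <addressLineOne>1 Main St</addressLineOne>
  <addressLineTwo xsi:nil="true"/>
</addressUpdate>"""

omitted = """<addressUpdate>
  <addressLineOne>1 Main St</addressLineOne>
</addressUpdate>"""

print(apply_address_update(stored, nilled))
print(apply_address_update(stored, omitted))
```

The same effect could be had with an application-specific token value, as the question above suggests; the sketch only shows what the xsi:nil route looks like on the receiving side.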