OASIS Mailing List Archives

emergency-comment message



Subject: CAP and attribute-free encodings...


Art Botterell wrote on 19-Nov-2003: (see:
http://www.incident.com/pipermail/cap-list/2003-November/001379.html)
> "Since fairly early on we took a stylistic choice to 
> avoid the use of [XML] attributes if we could... different
> participants had different reasons and I won't attempt to 
> recap that whole discussion just now."
	This was in response to David Robison's suggestion that the
use of name/value pairs in elements like geocode was not in
conformance with normal XML usage and practice. Quite frankly, it is
likely that just about anyone with any substantive XML experience is
going to ask the same question... I know that I do! Having a ready
explanation at hand would be a good thing.
	While it may be unreasonable to "recap the whole discussion,"
I think it would be useful to at least provide some hints as to what
concerns were expressed about the use of attributes. It seems odd that
an XML based format would discard such an important part of XML. I've
dug extensively through the archives, minutes of OASIS meetings, etc.
and can't find the discussions that led to this decision...
	Could someone please provide some hints as to why this
decision was made?

	In any case, the method used in CAP (i.e. name/value pairs) is
*NOT* the correct way to avoid using XML attributes. If you wish to
avoid the use of attributes, the normal and accepted method of doing
so is to create an element for what would have been an attribute. It
is *not* correct to use a hack like name/value pairs... Thus, where
CAP calls for:

	<geocode>fips6=006109</geocode>

	David Robison proposes the obvious and normal XML
attribute-based solution:

	<geocode code="fips6">006109</geocode>

	However, the "correct" attribute-free encoding would be:

	<geocode>
	  <codetype>fips6</codetype>
	  <value>006109</value>
	</geocode>
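
	As a rough sketch of the difference (not part of the CAP
specification; the function names are hypothetical and Python's
standard xml.etree.ElementTree module is assumed), here is how a
consumer might pull the code type and value out of each of the three
encodings. Only the name/value form needs string-splitting logic that
the XML parser knows nothing about:

```python
import xml.etree.ElementTree as ET

# The three encodings discussed above, as literal XML fragments.
NAME_VALUE = "<geocode>fips6=006109</geocode>"
ATTRIBUTE = '<geocode code="fips6">006109</geocode>'
ELEMENTS = "<geocode><codetype>fips6</codetype><value>006109</value></geocode>"


def from_name_value(xml_text):
    # The parser sees one opaque string; splitting on "=" is custom
    # logic that the application has to supply and maintain itself.
    text = ET.fromstring(xml_text).text
    code_type, _, value = text.partition("=")
    return code_type, value


def from_attribute(xml_text):
    # The code type is an attribute, the value is the element content.
    elem = ET.fromstring(xml_text)
    return elem.get("code"), elem.text


def from_elements(xml_text):
    # Both parts are child elements; findtext returns None if one is absent.
    elem = ET.fromstring(xml_text)
    return elem.findtext("codetype"), elem.findtext("value")


if __name__ == "__main__":
    print(from_name_value(NAME_VALUE))
    print(from_attribute(ATTRIBUTE))
    print(from_elements(ELEMENTS))
```

	In the last two functions the structure comes straight from
the parser; in the first, the "=" convention lives only in
application code.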

	The use of name/value pairs in an element which is defined in
the XML Schema as a "string" should be considered as a violation of
the Schema... The reason is that a non-string item has been "hidden"
in a string field. The fact that the element is encoded as a string is
irrelevant. It is *not* of *type* string! Thus, processors of the data
cannot properly use "string" processing routines in handling,
formatting, or comparing values that are in the "string" element. This
means that much of the value of having the schema is lost... and, it
means that custom coding is required to accomplish what otherwise
could have been accomplished using standard schema-aware XML parsers.
	It should be noted that since there is no default "code_type"
for the geocode field, this implies that a "code_type" *MUST* be
provided for any value in the field. This requirement should be stated
in the specification (i.e. amend the spec please...) and developers of
CAP processors should remember to write code to enforce it. Of course,
if the proper attribute-free encoding illustrated above were being
used, the XML Schema could simply indicate that both <codetype> and
<value> were required and this sort of validation would be
automatically handled by any decent XML parser.
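
	To illustrate the kind of enforcement code a CAP processor
would have to carry for the name/value form, here is a minimal
sketch. The function and exception names are hypothetical, and what
counts as a valid pair is my assumption, since the specification does
not say:

```python
class GeocodeError(ValueError):
    """Raised when a geocode string lacks a code type or a value."""


def require_code_type(text):
    # Assumed rule: there must be an "=" with non-empty text on both
    # sides. The specification itself does not state this.
    code_type, sep, value = text.partition("=")
    if not sep:
        raise GeocodeError("no '=' separator, so no code type: %r" % text)
    if not code_type:
        raise GeocodeError("empty code type: %r" % text)
    if not value:
        raise GeocodeError("empty value: %r" % text)
    return code_type, value


if __name__ == "__main__":
    print(require_code_type("fips6=006109"))
    for bad in ("006109", "=006109", "fips6="):
        try:
            require_code_type(bad)
        except GeocodeError as exc:
            print("rejected:", exc)
```

	Every implementer has to write, test, and agree on checks
like these independently, which is exactly the work a schema with
required child elements would take off their hands.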
	Other elements in CAP which are specified to have "name/value
pair" values should also either be converted to using XML attributes
or use the attribute-free encoding that I illustrate above. (i.e.
eventCode, parameter, etc.)
	Even if the proposal that normal XML practice be followed in
this area is rejected, there are still problems of ambiguity and
under-specification with the existing name/value pair based system.
For instance, the specification provides no ABNF or other means for
stating clearly what the assumptions are concerning the encoding of
this field and parsing requirements. While it is clear that the intent
is that a code_type and value are separated by the equals character
"=", it is not specified whether or not white space preceding or
following the "=" is considered significant. Also, there is no mention
of the handling of entities. For instance, are the following all
equivalent?
	"foo=bar", "foo =bar", "foo = bar", 
	"foo= bar", "foo    = bar", "foo&nbsp;=bar"
	"foo=
bar" (note: there was a CRLF "whitespace" following the "=" in that
one...)
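
	The practical effect of leaving this unspecified can be seen
with a small experiment. This is a sketch under stated assumptions:
the three parsing policies are hypothetical, and the &nbsp; case is
represented by the no-break space character (U+00A0) that the entity
denotes in HTML, since XML itself does not predefine &nbsp;. Three
equally plausible implementations disagree about how many distinct
values the examples above contain:

```python
# The four characters that XML itself treats as white space.
XML_WHITESPACE = " \t\r\n"

VARIANTS = [
    "foo=bar",
    "foo =bar",
    "foo = bar",
    "foo= bar",
    "foo    = bar",
    "foo\u00a0=bar",  # assumed expansion of &nbsp; (no-break space)
    "foo=\r\nbar",
]


def parse_strict(text):
    # Every character is significant; nothing is trimmed.
    name, _, value = text.partition("=")
    return name, value


def parse_xml_trim(text):
    # Trim only the four XML white space characters around each part.
    name, _, value = text.partition("=")
    return name.strip(XML_WHITESPACE), value.strip(XML_WHITESPACE)


def parse_unicode_trim(text):
    # str.strip() with no argument also removes U+00A0.
    name, _, value = text.partition("=")
    return name.strip(), value.strip()


def distinct_results(parser):
    # Number of different (name, value) pairs the parser produces.
    return len({parser(v) for v in VARIANTS})


if __name__ == "__main__":
    for parser in (parse_strict, parse_xml_trim, parse_unicode_trim):
        print(parser.__name__, distinct_results(parser))
```

	The strict policy sees seven different pairs, the XML-only
trim sees two, and the Unicode-aware trim sees one. Without a stated
rule, all three are defensible readings of the specification.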
	Providing ABNF definitions would, of course, not be necessary
if the CAP specification stuck to normal XML conventions since all the
parsing issues, etc. that are relevant here would be covered by
well-developed and well-understood XML specifications... But, since
XML has been rejected, the burden of clearly defining parsing rules
explicitly falls on CAP -- if interoperability is a goal.
	Please consider improving conformance of these element
definitions to common patterns of XML usage. Also, please consider
clarifying the specification in at least the areas outlined above.

		bob wyman


