RE: [ubl-ndrsc] Elements vs. attributes: discussion kickoff

Subject: RE: [ubl-ndrsc] Elements vs. attributes: discussion kickoff

From: "Stuhec, Gunther" <gunther.stuhec@sap.com>

To: ubl-ndrsc@lists.oasis-open.org

Date: Tue, 05 Feb 2002 21:14:55 +0100

Hello all,

I think we can use attributes in many more cases than just ID, IDREF and xml:lang. I have created some examples, because they let me explain the advantages of attributes a little better.

I think attributes are especially well suited to supplementary components. My first example shows this:

<xs:complexType name="AmountType_0p1" id="000105">
  <xs:simpleContent>
   <xs:extension base="cct:AmountContentType">
    <xs:attribute name="amountCurrencyIdentificationCode" type="cct:AmountCurrencyIdentificationCodeType"/>
   </xs:extension>
  </xs:simpleContent>
</xs:complexType>

The first example represents the core component type "AmountType". The AmountType is derived from the content component type "AmountContentType", so the value can be carried as the element content of the AmountType. The "AmountType" also includes one supplementary component, "AmountCurrencyIdentificationCode". This supplementary component is of the type "AmountCurrencyIdentificationCodeType" and is represented as an attribute. An XML instance of this "AmountType" looks as follows:

<Amount_0p1 amountCurrencyIdentificationCode="EUR">33.34</Amount_0p1>

Maybe the name of the attribute is too long, but I think this XML instance is still readable for everyone.

The next example shows the "AmountType" without any attribute:

<xs:complexType name="AmountType_0p2">
  <xs:sequence>
   <xs:element name="AmountContent" type="cct:AmountContentType"/>
   <xs:element name="AmountCurrencyIdentificationCode" type="cct:AmountCurrencyIdentificationCodeType"/>
  </xs:sequence>
</xs:complexType>

I think it is more complicated for the parser as well as for the user if we use both content and a child element inside the complexType "AmountType". Therefore it is necessary to create two child elements inside "AmountType". The XML instance then looks like the following example:

<Amount_0p2>
<AmountContent>33.34</AmountContent>
<AmountCurrencyIdentificationCode>EUR</AmountCurrencyIdentificationCode>
</Amount_0p2>

Describing "Amount" this way requires much more additional markup. If you describe huge documents based on this example, you need much more data than with the first example. And I do not think this example is more readable than the first one. I have noticed that the newer versions of the DOM and SAX parsers parse attributes in a very fast and elegant way, faster than a lot of additional child elements.

The following example shows how you can create date-time elements in two different ways when you use an attribute to describe the format:

If you would like to describe the dateTime format based on one built-in simpleType, you have to create the following schema:
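
(The schema is not preserved at this point in the archived message. As a minimal sketch only, assuming the built-in type xs:dateTime is meant and using the hypothetical element name "DateTime", it could look like this:)

<!-- Sketch: the element name "DateTime" is an assumption, not from the original message -->
<xs:element name="DateTime" type="xs:dateTime"/>

(An instance of such an element would have to follow the fixed lexical form of xs:dateTime, for example <DateTime>2002-02-05T21:14:55+01:00</DateTime>.)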

Otherwise, if you create a Date Time in a special format, based on ISO 8601, you can use the following complexType:
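
(The complexType is likewise missing from the archived message. A minimal sketch, following the pattern of "AmountType_0p1" above: the type name "DateTimeType_0p1" and the use of xs:string for both the content and the format attribute are assumptions, chosen only to keep the fragment self-contained.)

<!-- Sketch: type name and base types are assumptions, not from the original message -->
<xs:complexType name="DateTimeType_0p1">
  <xs:simpleContent>
   <xs:extension base="xs:string">
    <!-- the format of the element content is described by this attribute -->
    <xs:attribute name="DateTimeFormat" type="xs:string"/>
   </xs:extension>
  </xs:simpleContent>
</xs:complexType>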

With the attribute, you can describe the specific format:

<DateTime_0p1 DateTimeFormat="YY-MM-DD">02-02-05</DateTime_0p1>

This XML instance carries the content directly in the element itself. There is one additional attribute for describing the special format. Therefore, the representation of the content does not change. That is not the case if you describe the format with an additional child element, as in the following example:

<DateTime_0p2>
<DateTimeContent>02-02-05</DateTimeContent>
<DateTimeFormat>YY-MM-DD</DateTimeFormat>
</DateTime_0p2>

You need two child elements: one for the content and the other for the format description. That makes much more data and is not as easy to understand as the previous example.

A problem arises if you have more than one supplementary component, for example in "CodeType", since the values of the supplementary components each represent processable data or codes.

I give some examples:

A.)

<Code_0p1 codeListIdentifier="1B" codeListAgencyIdentifier="28" codeListVersionIdentifier="1" codeName="Special Code" languageCode="en">ABCX</Code_0p1>

In the first example (A), all supplementary components are represented as attributes. The problem with this example is that there is no direct relationship between codeName and languageCode. Such a relationship is necessary, because the languageCode relates to the codeName.

B.)

<Code_0p2 CodeListAgencyIdentifier="1B" CodeListIdentifier="28" CodeListVersionIdentifier="1">
<CodeContent>ABCX</CodeContent>
<CodeName languageCode="en">Special Code</CodeName>
</Code_0p2>

In the second example (B), the supplementary components are split between attributes and child elements. Supplementary components are represented as attributes if their data is processable or coded information. Supplementary components that carry user-readable information are represented as child elements. The content component is represented as a child element, too. The one exception is the attribute "languageCode": because it relates to the readable name of the code, it is placed on the child element "CodeName".
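
A schema for example (B) could be sketched as follows. This is only an illustration: the type name "CodeType_0p2", the optional CodeName, and the use of xs:string and xs:language instead of the real core component types are assumptions.

<!-- Sketch: type name and simple types are assumptions -->
<xs:complexType name="CodeType_0p2">
  <xs:sequence>
   <!-- content component as a child element -->
   <xs:element name="CodeContent" type="xs:string"/>
   <!-- user-readable name; languageCode is attached directly to it -->
   <xs:element name="CodeName" minOccurs="0">
    <xs:complexType>
     <xs:simpleContent>
      <xs:extension base="xs:string">
       <xs:attribute name="languageCode" type="xs:language"/>
      </xs:extension>
     </xs:simpleContent>
    </xs:complexType>
   </xs:element>
  </xs:sequence>
  <!-- processable, coded supplementary components as attributes -->
  <xs:attribute name="CodeListAgencyIdentifier" type="xs:string"/>
  <xs:attribute name="CodeListIdentifier" type="xs:string"/>
  <xs:attribute name="CodeListVersionIdentifier" type="xs:string"/>
</xs:complexType>

Because languageCode is declared on CodeName itself, the relationship between the name and its language is expressed by the structure.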

C.)

<Code_0p3>
  <CodeContent>ABCX</CodeContent>
  <CodeListAgencyIdentifier>1B</CodeListAgencyIdentifier>
  <CodeListIdentifier>28</CodeListIdentifier>
  <CodeListVersionIdentifier>1</CodeListVersionIdentifier>
  <CodeName>Special Code</CodeName>
  <LanguageCode>en</LanguageCode>
</Code_0p3>

The last example (C) shows the CodeType without any attributes. There is much more data, and again there is no relationship between CodeName and LanguageCode.

Attributes do not make readability much more complicated. They help us to build relationships in a very short and easy manner. The XML instances are much shorter, and fewer levels of hierarchy are needed to represent the data. That makes parsing the structure much faster. It also helps to map elements into an internal workflow or database.

Regards,

Gunther

-----Original Message-----
From: John C Dumay [mailto:jcd@progressivemanagement.com.au]
Sent: Monday, 4 February 2002 12:38
To: 'Matthew Gertner'; 'CRAWFORD, Mark'; ubl-ndrsc@lists.oasis-open.org
Subject: RE: [ubl-ndrsc] Elements vs. attributes: discussion kickoff

Folks,

I tend to agree. The only real use I can see for attributes of any kind is for genuine metadata, as alluded to by Matt. I prefer the use of attributes in schema whenever possible as I find it much easier to map the elements to databases that need to store the data contained in the UBL instance documents. It certainly makes life easier and less confusing. What we should be specifying then are the particular instances when to use attributes, rather than creating a rule that satisfies an 80/20 situation, i.e. ID, IDREF and xml:lang as espoused by Mark.

John Dumay

-----Original Message-----
From: Matthew Gertner [mailto:matthew.gertner@schemantix.com]
Sent: Monday, 4 February 2002 1:43 AM
To: 'CRAWFORD, Mark'; ubl-ndrsc@lists.oasis-open.org
Subject: RE: [ubl-ndrsc] Elements vs. attributes: discussion kickoff

My position on this is to use attributes only for document-level information. This means that for the most part only built-in document-level attributes such as xml:lang, id and idref should be used, and elements should be used for all other transmitted data. My understanding is that there are parsing, ordering, and performance issues surrounding attributes. I also believe that by enforcing attributes at the document level, we will provide clarity, avoid confusion, and enable better structuring.

Hear, hear! My tendency would be to throw down the gauntlet and ask: why ever use attributes? Most of the reasons raised by Gunther are no longer relevant now that XSD lets elements do most of what attributes can (i.e. be simple types, hold enumerations, have default values, etc.). Using both elements and attributes is just confusing for the user, developer and others, poorly supported by authoring environments, etc. I would suggest that use of attributes be restricted to real metadata: identifiers, links, etc. as proposed by Mark.

Matt