Subject: Types produced from regular models are sets
Lattice, I don't know, but I have been thinking that RELAX NG types should be treated as sets. Perhaps this is obvious, but I haven't seen it stated, so I'll ramble on for a bit.

The motivation is to construct a type system that respects the regular model from which types must be derived and isn't ashamed of non-determinism. Since RELAX NG models are closed under union, which last time I looked was a commutative operator, ambiguity resolution by ordering (a la W3C XML Schema) isn't well-founded theoretically and requires the validator to maintain declaration order in inherently unordered patterns. While understandable within the confines of a single pattern expression, the choice of a single type out of a number of equally valid types is likely to seem "random" to users when patterns are ambiguous at the element level.

In this note, I'm going to stick to "atomic" types because they are simpler and RELAX NG doesn't have a concept of named types for non-text values.

To illustrate by example, given:

    element e { xsd:integer | xsd:boolean }

and the input <e>1</e>, which matches both xsd:integer and xsd:boolean, the type of e is the set {xsd:integer, xsd:boolean}, which, it should go without saying, is unordered and contains no duplicates.

It is also reasonable to view these types as patterns with (only) choice operators, as long as the set properties are honored by the implementation. (The pattern view may be easier to extend to complex/hedge types, but I'm not going to go there.)

Each member of a type set names a set of strings; the type set describes the union of the member sets. Neither uniquely names its set.

We can define the subtype (<=) relation in a conventional way: a type is a subtype of a base type if every valid value of the subtype is a valid value of the base. Equivalently, if the set of strings permitted by a type is a subset of the set allowed by a base type, the type is a subtype of the base.
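
A minimal Python sketch of the type-set idea for the element e example above. The LEXICAL table and the type_set function are hypothetical names introduced only for illustration, and the regular expressions are simplified stand-ins for the real XSD lexical rules (no whitespace normalization, for instance):

```python
import re

# Hypothetical, simplified lexical checks for two XSD datatypes.
# Real validators apply more rules (e.g. whitespace normalization).
LEXICAL = {
    "xsd:integer": re.compile(r"[+-]?[0-9]+"),
    "xsd:boolean": re.compile(r"true|false|1|0"),
}


def type_set(value, choices):
    """Return the unordered set of type names in `choices` that accept `value`."""
    return frozenset(
        name for name in choices if LEXICAL[name].fullmatch(value)
    )


# The order in which the alternatives are declared does not affect the result.
t1 = type_set("1", ["xsd:integer", "xsd:boolean"])
t2 = type_set("1", ["xsd:boolean", "xsd:integer"])
```

Because the result is a frozenset, duplicates collapse and declaration order is not observable, which is the property argued for above.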
Then, trivially, if two types are subtypes of each other, they are equivalent (identify the same set of strings), and each member of a type set is a subtype of the type set. Given two type sets, one is a subtype of the other if every member of the one is a subtype of the other. Every atomic type is a subtype of {text}.

This is not an object-oriented definition; subtypes may or may not have an "isa" relationship with their base types. For example, it is not true that an integer "isa" decimal even though every integer value may also be a valid decimal value. This is a semantic issue outside the scope of the type system (as it is in object-oriented languages, except for exhortation).

Note that in this interpretation, a list type is a sequence/array of sets.

Assuming there were an application API to receive it, the (atomic) type of each element and attribute in an input stream (that has an atomic type) can be constructed during validation by forming the union of the types of each successfully matched pattern. (This seems easy to build as a byproduct of the derivative with respect to text. I don't know about other implementations.)

Since the type decorations of nodes would be sets, an API like that of SAX2 for attributes seems inappropriate, and not just because single names are inappropriate. Instead, if an application is written to have different behaviors depending on the discovered type of the input, it would be more useful if the API allowed the application to ask if a particular type is a subtype of the node type. For example, given the pattern:

    element color { "red" | "green" | "blue" | list { part, part, part } }
    ...
    part = xsd:int { minInclusive="0" maxInclusive="255" }

the application could ask if the input is a token and, if so, do a table lookup; otherwise, if it is a list consisting of three xsd:int values, construct a color. The rarer application that needs to know the entire set of valid types could be provided a list of names.

Bob Foster
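
As a rough sketch of how an application might use such a query for the color pattern, assuming a validator that reports a type set per node: everything named here (the "token" and "list-of-3-int" labels, COLOR_TABLE, node_type_of, to_rgb) is hypothetical and not part of any existing RELAX NG API. A plain membership test stands in for the subtype query, relying on the observation that each member of a type set is a subtype of the set.

```python
import re

# Hypothetical lookup table for the three token alternatives.
COLOR_TABLE = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}


def is_part(s):
    """Simplified check for `part`: a non-negative integer no larger than 255."""
    return re.fullmatch(r"[0-9]+", s) is not None and int(s) <= 255


def node_type_of(text):
    """Union of the (hypothetical) type names of every alternative matching `text`."""
    matched = set()
    if text.strip() in COLOR_TABLE:
        matched.add("token")
    parts = text.split()
    if len(parts) == 3 and all(is_part(p) for p in parts):
        matched.add("list-of-3-int")
    return frozenset(matched)


def to_rgb(text):
    """Dispatch on the discovered type set rather than on a single type name."""
    node_type = node_type_of(text)
    if "token" in node_type:
        return COLOR_TABLE[text.strip()]
    if "list-of-3-int" in node_type:
        return tuple(int(p) for p in text.split())
    raise ValueError("input does not match the color pattern: %r" % text)
```

The application never needs to know which alternative the validator "chose"; it only asks whether the type it can handle is among those that matched.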