Subject: [relax-ng] Annotations in the non-XML syntax
The aspect of the non-XML syntax that has caused me by far the most difficulty is annotations. I've finally come up with a design that I feel reasonably happy with. Attached is a description of the syntax using this new design. I've also implemented this, but the code isn't yet ready for public consumption.

Some of the things I would commend to you about this new syntax are:

- The annotations applicable to a syntactic object appear in a consistent position (immediately preceding the object)
- There is a nice similarity with C#, where annotations also occur in square brackets before an object
- All annotations expressible in the XML syntax are expressible in the non-XML syntax (actually there's one case that isn't yet handled, which I describe below)
- Annotation attributes are written the same wherever they occur
- Annotation elements are written the same wherever they occur
- Square brackets are used in two contexts, but the uses are harmonious: in each case, the square brackets contain attributes followed by content
- Just as a sequence of definitions is allowed without any connector, so a sequence of annotation elements is allowed at the definition level without connectors
- Just as adjacent patterns or name classes in a group require a connector (e.g. |), so annotation elements that are siblings of patterns or name classes require a connector (>)
- The relationship between annotation attributes and annotation elements that occur at the top level outside square brackets is harmonious with the relationship between those that occur within square brackets: annotation attributes are followed by annotation elements without any intervening connector, and they will end up as attributes and initial children of the same parent element
- The implementation doesn't have to cope with parsing embedded XML

On the negative side, it's a bit harder to parse:

- In several cases, two tokens of lookahead are required: in several contexts, when you see a name you have to look ahead to see whether there's a following "[" in order to know how to proceed
- There's one case where an arbitrary amount of lookahead is required: in order to determine whether a file contains a sequence of definitions or a pattern, you may have to look ahead past an annotation in square brackets, which can consist of arbitrarily many tokens (however, this is easily implementable in JavaCC without hackery)

There's one kind of annotation that is possible in the XML syntax that still cannot be expressed in the non-XML syntax: annotations that attach to an <except> element, more specifically annotations that occur as attributes, initial child elements or following siblings of <except> elements (in <data>, <nsName> or <anyName>).

I'm not wedded to the choice of ">" as the connector for connecting patterns and name classes to following sibling annotation elements. If you think another character would be preferable, please say so.

James

Title: A Non-XML Syntax for RELAX NG
This document describes a non-XML syntax for RELAX NG (a schema language for XML). The design goals of this syntax are:
The syntax is similar to the type syntax in the XQuery 1.0 Formal Semantics W3C Working Draft.
The syntax is defined by the following BNF:
topLevel ::= decl* topLevelBody
topLevelBody ::= pattern
               | prefixedAnnotationAttribute* grammar
decl ::= "namespace" identifier "=" (literal | "inherit")
       | "default" "namespace" identifier? "=" (literal | "inherit")
       | "datatypes" identifier "=" literal
pattern ::= particle
          | particle ("|" particle)+
          | particle ("," particle)+
          | particle ("&" particle)+
          | exceptParticle
particle ::= annotations? primary followAnnotations occurrence?
exceptParticle ::= annotations? datatypeName params? "-" annotations? primary followAnnotations
primary ::= "(" pattern ")"
          | "element" nameClass "{" pattern "}"
          | "attribute" nameClass "{" pattern "}"
          | "mixed" "{" pattern "}"
          | "empty"
          | "notAllowed"
          | "text"
          | "list" "{" pattern "}"
          | datatypeName params?
          | datatypeName? datatypeValue
          | "grammar" "{" grammar "}"
          | ref
          | "parent" ref
          | "externalRef" literal inherit?
occurrence ::= ("*" | "+" | "?") followAnnotations
nameClass ::= basicNameClass followAnnotations
            | basicNameClass followAnnotations ("|" basicNameClass followAnnotations)+
            | openNameClass "-" basicNameClass followAnnotations
basicNameClass ::= annotations? QName
                 | openNameClass
                 | annotations? "(" nameClass ")"
openNameClass ::= annotations? (nsName | anyName)
ref ::= identifierNotKeyword
datatypeName ::= CName | "string" | "token"
datatypeValue ::= literal
params ::= "{" (annotations? identifier "=" literal)+ "}"
grammar ::= (definition | include | annotationElementNotKeyword)*
definition ::= annotations? subject ("=" | "|=" | "&=") pattern
subject ::= "start" | identifierNotKeyword
include ::= annotations? "include" literal inherit? includeBody?
includeBody ::= "{" (definition | annotationElementNotKeyword)* "}"
inherit ::= "inherit" "=" identifier
followAnnotations ::= (">" annotationElement)*
annotations ::= "[" prefixedAnnotationAttribute* annotationElement* "]"
annotationAttribute ::= (identifier | CName) "=" literal
prefixedAnnotationAttribute ::= CName "=" literal
annotationElement ::= (identifier | CName) annotationElementBody
annotationElementNotKeyword ::= (identifierNotKeyword | CName) annotationElementBody
annotationElementBody ::= "[" annotationAttribute* (annotationElement | literal)* "]"
identifierNotKeyword ::= identifier - keyword
identifier ::= NCName | escapedIdentifier
keyword ::= "attribute" | "default" | "datatypes" | "element" | "empty"
          | "externalRef" | "grammar" | "include" | "inherit" | "list"
          | "mixed" | "namespace" | "notAllowed" | "parent" | "start"
          | "string" | "text" | "token"
CName ::= NCName ":" NCName
escapedIdentifier ::= "\" NCName
literal ::= '"' ([^"] | '""')* '"'
          | "'" ([^'] | "''")* "'"
nsName ::= NCName ":*"
anyName ::= "*"
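The literal production above allows a delimiter to be included in the value by doubling it. As an illustration only, here is a minimal Python sketch of how a tool might decode such a token. The function name decode_literal is hypothetical and is not part of any RELAX NG implementation, and the sketch assumes the token has already been isolated by a tokenizer.

```python
def decode_literal(token: str) -> str:
    """Return the string value of a quoted literal token.

    Assumes `token` is one complete literal: it starts and ends with the
    same quote character (" or '), and that character is doubled wherever
    it appears inside the value.
    """
    if len(token) < 2 or token[0] not in "\"'" or token[-1] != token[0]:
        raise ValueError(f"not a literal: {token!r}")
    quote = token[0]
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote:
            # A doubled delimiter stands for one delimiter character.
            if i + 1 < len(body) and body[i + 1] == quote:
                out.append(quote)
                i += 2
                continue
            # A lone delimiter inside the body is not allowed by the grammar.
            raise ValueError(f"unescaped {quote} inside literal: {token!r}")
        out.append(ch)
        i += 1
    return "".join(out)
```

The other quote character needs no doubling, so "it's" and 'it''s' would decode to the same value under this sketch.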
Comments start with a # and continue to the end of the line.

element is defined in the XML 1.0 Recommendation; NCName is defined in the XML Namespaces Recommendation.
Note that keywords are case-sensitive. To use a keyword as the name of a definition, the keyword must be escaped with \. It is not necessary to escape a keyword that is used as the name of an element, attribute or datatype parameter.
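The escaping rule can be sketched in Python as follows. This is only an illustration for a tool that writes the non-XML syntax: the keyword set is copied from the keyword production in the grammar, while the function names definition_name and element_name are hypothetical.

```python
# Keywords as listed in the `keyword` production of the grammar.
KEYWORDS = frozenset({
    "attribute", "default", "datatypes", "element", "empty", "externalRef",
    "grammar", "include", "inherit", "list", "mixed", "namespace",
    "notAllowed", "parent", "start", "string", "text", "token",
})


def definition_name(name: str) -> str:
    """Spell `name` for use as the name of a definition or a reference."""
    # Keywords are case-sensitive, so only an exact match needs the backslash.
    return "\\" + name if name in KEYWORDS else name


def element_name(name: str) -> str:
    """Names of elements, attributes and datatype parameters are never escaped."""
    return name
```

Under this sketch a definition named "text" is written as \text, whereas "Text" and an element named "text" are written unchanged.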
The correspondence between the non-XML syntax and RELAX NG's XML syntax is shown by the following tables.
Non-XML Syntax | RELAX NG Syntax |
---|---|
p1 \| p2 | <choice> p1 p2 </choice> |
p1 , p2 | <group> p1 p2 </group> |
p1 & p2 | <interleave> p1 p2 </interleave> |
p* | <zeroOrMore> p </zeroOrMore> |
p+ | <oneOrMore> p </oneOrMore> |
p? | <optional> p </optional> |
(p) | p |
element QName { p } | <element name="QName"> p </element> |
element nameClass { p } | <element> nameClass p </element> |
attribute QName { p } | <attribute name="QName"> p </attribute> |
attribute nameClass { p } | <attribute> nameClass p </attribute> |
empty | <empty/> |
notAllowed | <notAllowed/> |
text | <text/> |
mixed { p } | <mixed> p </mixed> |
list { p } | <list> p </list> |
identifierNotKeyword | <ref name="identifierNotKeyword"/> |
\identifier | <ref name="identifier"/> |
externalRef "uri" | <externalRef href="uri"/> |
parent identifier | <parentRef name="identifier"/> |
grammar { defs } | <grammar> defs </grammar> |
"string" | <value>string</value> |
string | <data type="string"/> |
token | <data type="token"/> |
prefix:localName | <data type="localName" datatypeLibrary="uri"/> |
prefix:localName "string" | <value type="localName" datatypeLibrary="uri">string</value> |
prefix:localName - p | <data type="localName" datatypeLibrary="uri"><except> p </except></data> |
prefix:localName { params } | <data type="localName" datatypeLibrary="uri"> params </data> |
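The first rows of the pattern table amount to a fixed mapping from connectors and occurrence indicators to RELAX NG element names. A rough Python sketch of that mapping, using the standard xml.etree.ElementTree module, might look like this. The helper names combine and repeat are made up for illustration, and the RELAX NG namespace is left off the element names to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Connector and occurrence mappings taken from the table above.
CONNECTORS = {"|": "choice", ",": "group", "&": "interleave"}
OCCURRENCES = {"*": "zeroOrMore", "+": "oneOrMore", "?": "optional"}


def combine(connector: str, *patterns: ET.Element) -> ET.Element:
    """Wrap sibling patterns in the element that corresponds to the connector."""
    parent = ET.Element(CONNECTORS[connector])
    parent.extend(patterns)
    return parent


def repeat(occurrence: str, pattern: ET.Element) -> ET.Element:
    """Wrap a pattern in the element that corresponds to the occurrence indicator."""
    parent = ET.Element(OCCURRENCES[occurrence])
    parent.append(pattern)
    return parent


# Build the XML form of:  attribute bar { text }?, empty
attr = ET.Element("attribute", {"name": "bar"})
attr.append(ET.Element("text"))
body = combine(",", repeat("?", attr), ET.Element("empty"))
print(ET.tostring(body, encoding="unicode"))
```

A real translator would also have to parse the input and honour the grammar's rule that a single group uses only one kind of connector; the sketch covers only the final mapping step.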
Non-XML Syntax | RELAX NG Syntax |
---|---|
QName | <name>QName</name> |
prefix:* | <nsName ns="uri"/> |
prefix:* - nameClass | <nsName ns="uri"><except> nameClass </except></nsName> |
* | <anyName/> |
* - nameClass | <anyName><except> nameClass </except></anyName> |
nameClass1 \| nameClass2 | <choice> nameClass1 nameClass2 </choice> |
(nameClass) | nameClass |
Non-XML Syntax | RELAX NG Syntax |
---|---|
localName = "string" | <param name="localName">string</param> |
Non-XML Syntax | RELAX NG Syntax |
---|---|
identifierNotKeyword = p | <define name="identifierNotKeyword"> p </define> |
identifierNotKeyword \|= p | <define name="identifierNotKeyword" combine="choice"> p </define> |
identifierNotKeyword &= p | <define name="identifierNotKeyword" combine="interleave"> p </define> |
start = p | <start> p </start> |
\identifier = p | <define name="identifier"> p </define> |
include "uri" | <include href="uri"/> |
include "uri" { defs } | <include href="uri"> defs </include> |
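The definition rows can be read as: the assignment operator selects the combine attribute, and a leading backslash on the subject is dropped. A small Python sketch of that reading follows. The function name definition is hypothetical, the RELAX NG namespace is omitted, and applying combine to start as well is an assumption based on RELAX NG allowing a combine attribute on the start element (the table itself lists only start = p).

```python
import xml.etree.ElementTree as ET

# Assignment operator -> value of the combine attribute (None means absent).
COMBINE = {"=": None, "|=": "choice", "&=": "interleave"}


def definition(subject: str, op: str, pattern: ET.Element) -> ET.Element:
    """Build the XML element for `subject op pattern`."""
    if subject == "start":
        elem = ET.Element("start")
    else:
        # A leading backslash only marks an escaped keyword; it is not part
        # of the definition's name.
        name = subject[1:] if subject.startswith("\\") else subject
        elem = ET.Element("define", {"name": name})
    combine = COMBINE[op]
    if combine is not None:
        elem.set("combine", combine)
    elem.append(pattern)
    return elem
```

For example, definition("\\start", "|=", ET.Element("empty")) would give a define element named "start" with combine="choice", which is different from the start element produced for an unescaped start.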
A datatypes declaration declares a prefix used in a QName identifying a datatype. For example,

    datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
    element height { xsd:double }
A namespace declaration declares a prefix used in a QName specifying the name of an element or attribute. For example,

    namespace rng = "http://relaxng.org/ns/structure/1.0"
    element rng:text { empty }
A default namespace declaration declares the namespace used for unprefixed names specifying the name of an element (but not of an attribute). For example,

    default namespace = "http://example.com"
    element foo { attribute bar { string } }

is equivalent to

    namespace ex = "http://example.com"
    element ex:foo { attribute bar { string } }
A default namespace declaration may have a prefix as well. For example,

    default namespace ex = "http://example.com"

is equivalent to

    default namespace = "http://example.com"
    namespace ex = "http://example.com"
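The way these declarations affect names can be sketched in Python as below. This is a simplified model, not an implementation: the function names declare_default and resolve_name are hypothetical, the inherit keyword is not modelled, and placing an unprefixed attribute name in the absent namespace (represented here by the empty string) is an assumption that goes slightly beyond the statement that the default namespace does not apply to attributes.

```python
from typing import Dict, Optional, Tuple


def declare_default(prefixes: Dict[str, str], uri: str,
                    prefix: Optional[str] = None) -> Tuple[Dict[str, str], str]:
    """Model `default namespace [prefix] = uri`.

    Returns the updated prefix map and the default namespace URI. With a
    prefix, the declaration also behaves like `namespace prefix = uri`.
    """
    updated = dict(prefixes)
    if prefix is not None:
        updated[prefix] = uri
    return updated, uri


def resolve_name(qname: str, kind: str, prefixes: Dict[str, str],
                 default_ns: str) -> Tuple[str, str]:
    """Return (namespace URI, local name) for an element or attribute name."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        return prefixes[prefix], local
    if kind == "element":
        # Unprefixed element names take the default namespace.
        return default_ns, qname
    if kind == "attribute":
        # Assumption: unprefixed attribute names are in no namespace.
        return "", qname
    raise ValueError(f"unknown kind: {kind!r}")
```

With default namespace ex = "http://example.com" in effect, this model resolves the element names foo and ex:foo to the same pair, while the attribute name bar stays outside the namespace.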
The URI may be empty. This makes the prefix stand for the absent namespace URI. This is necessary for specifying a name class that matches any name with an absent namespace URI. For example:

    namespace local = ""
    element foo { attribute * - local:* { string }* }

is equivalent to

    <element xmlns="http://relaxng.org/ns/structure/1.0" name="foo" ns="http://example.com">
      <zeroOrMore>
        <attribute>
          <anyName>
            <except>
              <nsName ns=""/>
            </except>
          </anyName>
          <data type="string"/>
        </attribute>
      </zeroOrMore>
    </element>
RELAX NG has the feature that if a file does not specify an ns attribute then the ns attribute can be inherited from the including file. To support this feature, the keyword inherit can be specified in place of the namespace URI in a namespace declaration. For example,

    default namespace this = inherit
    element foo { element * - this:* { string }* }

is equivalent to

    <element xmlns="http://relaxng.org/ns/structure/1.0" name="foo">
      <zeroOrMore>
        <element>
          <anyName>
            <except>
              <nsName/>
            </except>
          </anyName>
          <data type="string"/>
        </element>
      </zeroOrMore>
    </element>
In addition, the include and externalRef patterns can specify inherit = prefix to specify the namespace to be inherited by the referenced file. For example,

    namespace x = "http://www.example.com"
    externalRef "foo.rng" inherit = x

is equivalent to

    <externalRef href="foo.rng" ns="http://www.example.com"
                 xmlns="http://relaxng.org/ns/structure/1.0"/>
In the absence of an inherit parameter on include or externalRef, the default namespace will be inherited by the referenced file.

In the absence of a default namespace declaration, a declaration of

    default namespace = inherit

is assumed.
RELAX NG supports two kinds of annotation: element annotations and attribute annotations. In this non-XML syntax, attribute annotations are written in a similar way to the XML syntax. For example, xml:lang = "en". Element annotations are written using the syntax

    elementName [ attributesAndContent ]

where elementName is the QName of the element and attributesAndContent is a list of attributes followed by a list of elements and literals.
Annotations are attached in one of the following ways:

- a pattern or name class may be followed by > and then an element annotation; this is equivalent to a following sibling element in the XML syntax

For example,
    namespace a = "http://relaxng.org/ns/compatibility/annotations/1.0"

    [ a:documentation [ "Represents a foo" ] ]
    element foo {
      [ a:defaultValue = "42" ]
      attribute bar { text }?,
      empty
    }

turns into

    <element name="foo"
             xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
      <a:documentation>Represents a foo</a:documentation>
      <optional>
        <attribute a:defaultValue="42" name="bar">
          <text/>
        </attribute>
      </optional>
      <empty/>
    </element>
Here's another example using the RelaxNGCC annotations:

    datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
    namespace c = "http://www.xml.gr.jp/xmlns/relaxngcc"

    [ c:class="sample1" ]
    start =
      element team {
        element player {
          attribute number {
            [ c:alias="number" ]
            xsd:positiveInteger
            > c:java [ "System.out.println(number);" ]
          },
          element name {
            [ c:alias="name" ]
            text
            > c:java [ "System.out.println(name);" ]
          }
        }+
      }

turns into

    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
             xmlns:c="http://www.xml.gr.jp/xmlns/relaxngcc">
      <start c:class="sample1">
        <element name="team">
          <oneOrMore>
            <element name="player">
              <attribute name="number">
                <data c:alias="number" type="positiveInteger"/>
                <c:java>System.out.println(number);</c:java>
              </attribute>
              <element name="name">
                <text c:alias="name"/>
                <c:java>System.out.println(name);</c:java>
              </element>
            </element>
          </oneOrMore>
        </element>
      </start>
    </grammar>
div element

The non-XML syntax cannot represent the div element.
value
There is a problem in translating a schema such as

    <element xmlns="http://relaxng.org/ns/structure/1.0"
             datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
             name="foo">
      <choice>
        <value type="QName" xmlns:bar="http://example.com/1">bar:baz</value>
        <value type="QName" xmlns:bar="http://example.com/2">bar:baz</value>
      </choice>
    </element>

into the non-XML syntax. Although this can be translated, for example, into

    namespace bar1 = "http://example.com/1"
    namespace bar2 = "http://example.com/2"
    datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
    element foo { xsd:QName "bar1:baz" | xsd:QName "bar2:baz" }

doing so requires that the translator have knowledge of the QName datatype.
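To make the nature of that knowledge concrete, here is a rough Python sketch of the rewriting step a translator would need for QName-typed values. Everything in it is illustrative: the function name translate_qname_value and both mapping arguments are hypothetical, and returning unprefixed values unchanged is a simplifying assumption.

```python
from typing import Dict


def translate_qname_value(value: str,
                          xml_bindings: Dict[str, str],
                          compact_prefixes: Dict[str, str]) -> str:
    """Rewrite a QName-typed value to use a prefix declared in the non-XML schema.

    xml_bindings maps each prefix in scope at the <value> element to its URI.
    compact_prefixes maps each URI to the prefix chosen for it in the
    translated schema's namespace declarations.
    """
    prefix, sep, local = value.partition(":")
    if not sep:
        # Assumption: unprefixed values are passed through unchanged.
        return value
    uri = xml_bindings[prefix]
    return compact_prefixes[uri] + ":" + local


compact = {"http://example.com/1": "bar1", "http://example.com/2": "bar2"}
first = translate_qname_value("bar:baz", {"bar": "http://example.com/1"}, compact)
second = translate_qname_value("bar:baz", {"bar": "http://example.com/2"}, compact)
print(first, second)
```

The two input values are textually identical and differ only in their namespace context, which is why a translator that treated the value as an opaque string could not tell them apart.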
James Clark