Subject: The relaxng-compact processing instruction
One possibility for a mechanism for associating a schema with a document is to have a processing instruction that directly contains a compact syntax schema. For the sake of discussion, I will use a target name of relaxng-compact for this.

If a user simply wants to reference an external schema, they can just do:

  <?relaxng-compact external "myschema.rnc"?>

But they can also easily do customizations, as is sometimes done in the internal subset. For a one-off schema, they could even put the entire schema in the processing instruction.

Lexically, this works rather well. Processing instructions can contain anything other than ?>, but that is unlikely to occur in a compact schema. The compact syntax provides its own character escaping mechanism, which works well with the fact that processing instructions don't have a character escaping mechanism. With the compact syntax inside a processing instruction:

- there's one and only one way to escape characters
- there's a way to allow any schema no matter what character sequences it contains

We also get to take advantage of the XML encoding declaration mechanism, which works well with the fact that the compact syntax doesn't provide an encoding declaration syntax.

One of the arguments against processing instructions is that they don't expose the internal structure of their contents as XML, but that doesn't really apply here, since the whole point of the compact syntax is to be an alternative non-XML syntax.

Another thing I like about this is that there's very little to specify. There are basically only two things:

i) what target name to use
ii) where the processing instruction can go

To keep things simple, I think the processing instruction should go before the document element; I don't want to get into specifying schemas for subtrees.
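To make the lexical side concrete, here is a rough Python sketch of how a processor might wrap a compact schema in such a processing instruction and then pick it back up from in front of the document element. This is only an illustration under assumptions: the target name relaxng-compact is the placeholder used above, the helper names make_pi and prolog_schemas are made up for the example, and it only looks at top-level nodes of the document entity, not inside the DOCTYPE or the external subset.

```python
from xml.dom import Node, minidom

# Placeholder target name used for discussion; nothing standardizes it.
PI_TARGET = "relaxng-compact"


def make_pi(schema_text):
    """Wrap compact-syntax text in a processing instruction (hypothetical helper).

    A processing instruction cannot contain "?>", so reject that sequence;
    the schema author would need to rewrite it using the compact syntax's
    own escaping mechanism.
    """
    if "?>" in schema_text:
        raise ValueError("schema text must not contain '?>'")
    return "<?%s %s?>" % (PI_TARGET, schema_text)


def prolog_schemas(xml_text):
    """Return the data of each matching PI that precedes the document element."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.childNodes:
        if node.nodeType == Node.ELEMENT_NODE:
            break  # stop at the document element; later PIs are ignored
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == PI_TARGET):
            found.append(node.data)
    return found


if __name__ == "__main__":
    external = '<?relaxng-compact external "myschema.rnc"?><doc/>'
    inline = make_pi("element doc { text }") + "<doc>hi</doc>"
    print(prolog_schemas(external))
    print(prolog_schemas(inline))
```

The sketch returns the processing instruction's contents as an uninterpreted string; parsing that string as a compact syntax schema would be a separate step.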
So I think the reasonable possibilities are:

a) anywhere in the prolog except after the DOCTYPE
b) anywhere in the prolog, but in the document entity
c) anywhere in the prolog, including the external subset and external parameter entities

(c) creates some intriguing possibilities. For example, the XHTML folks have, as far as I understand it, a couple of problems in moving from DTDs to schemas. One big problem is that they need to be able to use character entities. Another is that they have lots of different profiles with the same namespace URI: they have a tradition of using the DOCTYPE public id to identify what profile they are using. XHTML conformance requires a particular DOCTYPE external id; browsers tweak their rendering behaviour based on particular DOCTYPEs.

With (c), you can create a DTD that contains just internal general parsed ENTITY declarations and a processing instruction containing a compact syntax schema. Users can continue to create documents just as before, including character entities and a DOCTYPE declaration. The OASIS catalog mechanism could be used to switch between different versions of the DTD depending on the context: perhaps one without the processing instruction, one with the processing instruction, and another approximating the RELAX NG schema with a complete DTD.

James