OASIS Mailing List Archives

relax-ng message

Subject: Re: Datatype interface for RELAX NG

> > One more comment.  I am concerned about the efficiency when processing
> > base64Binary or hexBinary.  It seems quite possible that applications
> > have extremely large content with these types (many megabytes). It would
> > be desirable for the interface to allow validating these types without
> > keeping the entire instance of the datatype in memory.
> How important is this capability?

Hard to say.  It depends how frequently documents will contain instances of
simple types with very long lexical representations.

> I thought about this at the very beginning of the development (of Sun
> XML Datatypes Library). But I abandoned this goal.
> - Firstly, this change makes base64/hex validators stateful objects.
>   - All other grammar primitives (ChoicePattern, etc) are stateless and
>     therefore can be shared and reused; stateful objects can't. Sharing
>     a grammar among multiple threads seemed important.
>   - OTOH, changing validators into factory objects that create the actual
>     stateful validators seemed cumbersome to me.

This need not be cumbersome for clients of the library.  You add a method to
the datatype interface:

  ValueIterator createValueIterator();

Then create a ValueIterator interface roughly like this:

interface ValueIterator {
  boolean characters(char[] buf, int start, int len, boolean isLast);
}

Clients that need this capability can have it.  Those that don't can use the
existing, simpler interface.
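
To make the idea concrete, here is a minimal sketch of how a hexBinary type
might use such an iterator.  Only ValueIterator, characters() and
createValueIterator() come from the proposal above; HexBinaryType and
HexBinaryIterator are hypothetical names.  The meaning of the boolean return
value is my assumption (false means the value is already known to be
invalid), and whitespace handling is left out.

```java
// Sketch only. HexBinaryType and HexBinaryIterator are hypothetical names.
interface ValueIterator {
    // Assumed contract: returns false once the value is known to be invalid;
    // when isLast is true the return value is the final verdict.
    boolean characters(char[] buf, int start, int len, boolean isLast);
}

// Stateless, so a single instance can be shared by a grammar across threads.
final class HexBinaryType {
    ValueIterator createValueIterator() {
        return new HexBinaryIterator();
    }
}

// Stateful: one instance per value, holding only a counter and a flag.
final class HexBinaryIterator implements ValueIterator {
    private long digitCount = 0;
    private boolean failed = false;

    @Override
    public boolean characters(char[] buf, int start, int len, boolean isLast) {
        for (int i = start; i < start + len && !failed; i++) {
            if (isHexDigit(buf[i])) {
                digitCount++;
            } else {
                failed = true; // whitespace handling is omitted in this sketch
            }
        }
        if (failed) {
            return false;
        }
        // hexBinary encodes each octet as two digits, so the total must be even.
        return !isLast || digitCount % 2 == 0;
    }

    private static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }
}
```

The type object stays stateless and shareable; all per-value state lives in
the iterator, which needs constant memory however long the value is.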

As well as base64Binary and hexBinary, this would be useful for string.

> - Secondly, it also affects the core validation code, which is another
>   thing I don't want.

It certainly has a complexity cost for the code.

> - Thirdly, if the performance is really a problem, you can always skip
>   validating that specific part by modifying your grammar a little bit.

I don't think that's an option in many cases.  It is also useful for string
type; even with a pattern facet there is no need for the entire string to be
in memory.
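
As an illustration of the pattern case, here is a rough sketch that checks a
string against the pattern [A-Z]+-[0-9]+ chunk by chunk, using a hand-written
state machine.  The pattern, the class name PatternChunkMatcher and the state
numbering are all made up for the example; a real implementation would
compile the pattern facet into an automaton rather than hard-code one.  The
method has the same shape as characters() in the ValueIterator proposal, with
the same assumed meaning for the return value.

```java
// Sketch only: a hard-coded automaton for the example pattern [A-Z]+-[0-9]+.
final class PatternChunkMatcher {
    private static final int DEAD = -1;   // no continuation can match
    private static final int ACCEPT = 3;  // at least one digit after the '-'

    private int state = 0; // the only data kept between chunks

    // Returns false once a match is impossible; with isLast == true the
    // return value says whether the whole string matched.
    boolean characters(char[] buf, int start, int len, boolean isLast) {
        for (int i = start; i < start + len && state != DEAD; i++) {
            state = step(state, buf[i]);
        }
        if (state == DEAD) {
            return false;
        }
        return !isLast || state == ACCEPT;
    }

    private static int step(int s, char c) {
        boolean upper = c >= 'A' && c <= 'Z';
        boolean digit = c >= '0' && c <= '9';
        switch (s) {
            case 0:  return upper ? 1 : DEAD;                    // first letter
            case 1:  return upper ? 1 : (c == '-' ? 2 : DEAD);   // more letters or '-'
            case 2:  return digit ? 3 : DEAD;                    // first digit
            case 3:  return digit ? 3 : DEAD;                    // more digits
            default: return DEAD;
        }
    }
}
```

Only the current state is carried from one chunk to the next, so the string
itself never has to be held in memory.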

