xcbf message

Subject: An alternative proposal has been uploaded
From: John Larmouth <j.larmouth@salford.ac.uk>
To: j.larmouth@salford.ac.uk
Date: Tue, 20 May 2003 13:59:47 +0100
You will shortly get (or have already got) official notification from 
the OASIS Web software that I have uploaded another document.

This is an alternative proposal, based on NOT using BASE64.

I want to discuss here some points related to the use or non-use of 
BASE64, in order to allow an informed decision on which to progress to a 
CS ballot.

(BOTH documents are complete specifications that I personally am happy 
with, and would vote YES on either.  I do, however, as will become clear 
below, prefer this second proposal, as it is much simpler and easier to 
implement, loses virtually nothing, and does not require a revision when 
EXTENDED-XER is approved.)

It hung in the balance on our last telecon whether to use BASE64 or not, 
with most people saying they did not really care, with the decision to 
use it swinging almost entirely on the remarks from Bancroft and myself 
that Phil, who left the meeting early, would be strongly pushing for 
BASE64.  That may well still be true.  But a technical assessment follows.

First, let us dispose of the "BASE64 armoured" concept.  You will see 
that I deleted that term from the text in both proposals.  It is 
meaningless when applied to an octet string value.  A hex encoding of an 
octet string value is just as much "armoured" as a base64 encoding.  The 
**only** difference is that the number of characters needed by base64 is 
typically reduced by 30% from the hex character count.  I repeat, this 
is the ONLY difference for an octet string.
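
As a rough check of that character-count claim, here is a minimal Python sketch comparing the hex and base64 text lengths of the same octet string.  The 1000-octet value is an arbitrary stand-in chosen for illustration, not a real certificate, and the function name is my own.

```python
import base64
import os


def encoded_lengths(octets: bytes):
    """Return (hex character count, base64 character count) for an octet string."""
    hex_text = octets.hex()  # 2 characters per octet
    b64_text = base64.b64encode(octets).decode("ascii")  # 4 characters per 3 octets, padded
    return len(hex_text), len(b64_text)


# Arbitrary 1000-octet value standing in for a certificate (assumption for illustration).
value = os.urandom(1000)
hex_len, b64_len = encoded_lengths(value)
saving = 1 - b64_len / hex_len
print(hex_len, b64_len, f"{saving:.0%}")
```

Both forms use only characters that are legal in XML element content, so the shorter length (roughly a third fewer characters, in line with the figure above) is all that base64 buys for an octet string.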

The term "base64 armoured" CAN be legitimately applied to a character 
string.  This is actually its main value in EXTENDED-XER.

The point here is that the rules of XML FORBID some characters in an XML 
document, EVEN IF THEY ARE EXPRESSED USING THE XML-DEFINED ESCAPE 
MECHANISMS. So if you want your character string (at the abstract level) 
to be able to carry all ISO 10646 | Unicode characters, you cannot do it 
using either the UTF8 encoding of the characters or the 
XML-defined escape sequences without violating XML rules.  Applying 
BASE64 to the UTF8-encoding of the character string value and then 
UTF8-encoding the BASE64 characters allows that XML element to contain a 
representation of ANY character string, without violating XML rules. 
This is truly "base64 armouring".  Typically, the size of the encoding 
will be INCREASED by 30%.
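
The point about forbidden characters can be illustrated with a small Python sketch using the standard-library XML parser.  The element name `value` and the sample string are hypothetical; U+0001 is used as one example of a character that XML 1.0 does not allow even as a character reference.

```python
import base64
import xml.etree.ElementTree as ET

# A character string containing U+0001 (chosen for illustration).
text = "abc\u0001def"

# Attempt 1: use the XML-defined escape mechanism for the control character.
escaped = "<value>abc&#x1;def</value>"
try:
    ET.fromstring(escaped)
    escaped_ok = True
except ET.ParseError:
    # The parser rejects the reference to a forbidden character.
    escaped_ok = False

# Attempt 2: BASE64 the UTF-8 encoding of the string and carry that as element content.
armoured = base64.b64encode(text.encode("utf-8")).decode("ascii")
element = ET.fromstring(f"<value>{armoured}</value>")
recovered = base64.b64decode(element.text).decode("utf-8")

print(escaped_ok, recovered == text, len(text.encode("utf-8")), len(armoured))
```

The escaped form is rejected by the parser, while the base64 form round-trips the string intact at the cost of a longer encoding.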

But to repeat myself, when applied to an OCTET STRING, base64 does 
nothing that hex does not do other than reducing the verbosity by 30%.

Is the reduction worth having?  Here we have to examine where and when 
BASE64 is applied in the first proposal (which mirrors the last proposal 
from Phil, in this regard).  It is applied ONLY to the octet strings 
that contain X.509 certificates and X.509 certificate revocation lists, 
and ONLY if the outer-level encoding is BASIC-XER, not if BER is used.
So it will produce a 30% reduction in a couple of octet strings that 
will form a very small part of the total (verbose) XML document.  The 
gains are actually minuscule.

(There is the small(?) point that XCBF and ANSI X9.84 - currently out 
for public comment - are aligned in this area.  If BASE64 is NOT used in 
XCBF, it would be good to submit a comment to have it removed from X9.84 as 
well, so that the two stay aligned.)

Now, OK.  But why not avoid problems with non-compatibility with X9.84 
and stay with BASE64 for these two octet strings?  What are the 
disadvantages?  They certainly exist.

We have tried to say that the use of BASE64 is "in anticipation" of the 
X.693 Amendment 1 that defines EXTENDED-XER.  I think the text I have 
given you is about as good as can be got in this area (see the footnotes 
1, 2, and 3 in the first proposed revision and the text in 7.4.2 - use 
the "View Print Layout" to see the footnotes).

This "anticipation" is in itself a bit unsatisfactory, but the problems 
are more serious.

I want to draw attention to footnote 3, and to expand on it.  Here is a 
copy of that footnote:

 >>>>>>
This is in anticipation of the acceptance of Amendment 1 to X.693, which 
makes provision for the use of BASE64 encodings.  Formal use of this 
amendment will require the outer level encoding to be changed to 
EXTENDED-XER (see 7.4.3) and the addition of XER encoding instructions. 
  This will also imply that a decoder will be required to accept the 
presence of XML DTDs, Processing Instructions, Comment, and accept and 
ignore attributes such as xsi:type and xsi:SchemaLocation.
<<<<<<<

This footnote raises ambiguity on what is a conforming implementation to 
this actual spec:  is a conforming implementation required to conform to 
BASIC-XER until the Amendment is approved, and then to EXTENDED-XER? I 
think I have written the text in such a way that EXTENDED-XER is NEVER 
used unless or until we produce a new version of the spec referring to 
EXTENDED-XER rather than to BASIC-XER.  Remember, the only reason for 
wanting to do that is this minimal use of BASE64 in the current spec, 
and alignment with X9.84 (which in my view has also probably got it 
wrong!)  But X9.84 will not get finally approved until after the 
Amendment is in place, and the overheads of a full EXTENDED-XER encoding 
(see below) are likely to be more acceptable there than in an OASIS 
standard?????

What are the overheads of saying that the outer-level is EXTENDED-XER 
and not BASIC-XER?  The above copy of the footnote summarises it.  It is 
important here to realise that the primary raison d'etre for 
EXTENDED-XER was to provide support for the mapping from XSD, and the 
use of ASN.1 in conjunction with general XML/XSD tools.  A BASIC-XER 
encoding (in the absence of EXTENDED-XER encoding instructions) *is* a 
valid EXTENDED-XER encoding, so for encoders there is no problem.  The 
problem is for conforming decoders.  They are REQUIRED to accept DTDs in 
the XML document (for example that define character entities to reduce 
the verbosity of some XML documents), and they are REQUIRED to accept 
and ignore random xsi:type and xsi:SchemaLocation attributes, and they 
are REQUIRED to accept XML Processing Instructions and Comment wherever 
XML permits these to occur (more-or-less everywhere).  All this adds to 
the implementation cost of an EXTENDED-XER decoder.
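
To give a feel for what that extra tolerance means in practice, here is a rough Python sketch.  It is NOT an XER decoder: the element names (`Record`, `issuer`), the entity, and the `decode` helper are all hypothetical, and the standard-library parser is used only to show the kinds of input a decoder would have to accept and skip over.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical document carrying a DTD with a character entity, a comment,
# a processing instruction and an xsi:type attribute around one real field.
document = """<!DOCTYPE Record [ <!ENTITY org "Example Org"> ]>
<Record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Record">
  <!-- a comment the decoder must skip -->
  <?hint a processing instruction the decoder must skip?>
  <issuer>&org;</issuer>
</Record>"""


def decode(xml_text: str) -> dict:
    """Extract child element values, ignoring the extra XML constructs."""
    # The default parser expands entities declared in the internal DTD subset
    # and leaves comments and processing instructions out of the tree.
    root = ET.fromstring(xml_text)
    # Attributes in the xsi namespace are noted and otherwise ignored.
    ignored = [name for name in root.attrib if name.startswith("{" + XSI + "}")]
    fields = {child.tag: child.text for child in root}
    return {"fields": fields, "ignored_attributes": ignored}


result = decode(document)
print(result)
```

A general-purpose XML parser does this work for free, but a decoder written only for the restricted BASIC-XER syntax would need each of these cases added.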

Note that there is no option available in prospective ASN.1 
standardisation to be able to include an encoding instruction to say 
"BASE64" **without** the requirement for a decoder to accept these 
additional options in the encoding.

So the real issue is not so much which spec we decide to approve now, 
but rather where we intend to progress after that.  We can:

	a)	Use the alternative proposal (no use of BASE64), and be finished and 
simple, but not (currently) X9.84 compatible, and **very** marginally 
more verbose;  or

	b)	Use the first proposed revision and never move formally to 
EXTENDED-XER, accepting that we will be doing a "special" for our 
encodings,  albeit a "special" that tool vendors that have implemented 
EXTENDED-XER can easily support, because all it means is an EXTENDED-XER 
encoder (to recognise the [XER:BASE64] syntax) and a decoder with lots 
of functionality that should never get used.   (In this case, the first 
of my "proposed revision" documents could probably remove all text about 
"anticipation" and EXTENDED-XER, and just openly admit it is a 
non-standard encoding that we are requiring).

	c)	Use my first proposed revision, and then produce a new version that 
formally says that EXTENDED-XER is to be used.  This is what the current 
text of the first proposed revision was targeting ("anticipating"), and 
we should not need to change that text for this option, but XCBF 
decoders would have a harder job in the long-term.

The only disadvantage of a) is that it may not be X9.84 compatible, 
unless X9.84 is changed on public comment.  Does that matter? Can X9.84 
be changed to align with a)?  There are NO technical disadvantages with 
a), as explained above.

The only disadvantage of b) is a "special" encoding, but one that is 
probably fairly easy for tool vendors to support.  This may or may not 
be X9.84 incompatible, depending on whether X9.84 is clarified to say it 
really means EXTENDED-XER, or whether it is clarified to say it is this 
"special" encoding with BASIC-XER.  (Like the text I inherited, X9.84 is 
utterly ambiguous in this regard at present.)

The disadvantages with c) are:  The potential confusion in the 
"anticipating" concept, and in a second OASIS spec that says 
EXTENDED-XER when the first said BASIC-XER;  The extra complexity of 
requiring decoders to support the full range of additional encodings 
(DTDs, comment, etc) of EXTENDED-XER in the long-term.

I am sorry this has been such a long "essay".  I believe what I have 
said is factually correct, but there are clearly subjective judgments to 
be applied.

The bottom line is that I will personally go for any of a) to c); we 
just need a decision.

John L

-- 
PLEASE NOTE - As an anti-SPAM measure, e-mails will shortly
not be accepted by my machine from an unknown sender unless
the subject contains the phrase "Hi John".

If you pass my e-mail address to others (which I am very happy
for you to do) please tell them to include this phrase in the
subject line of their first mailing to me.  Thanks.

    Prof John Larmouth
    Larmouth T&PDS Ltd
    (Training and Protocol Development Services Ltd)
    1 Blueberry Road
    Bowdon                               j.larmouth@salford.ac.uk
    Cheshire WA14 3LS                    (put "Hi John" in subject)
    England			
    Tel: +44 161 928 1605		Fax: +44 161 928 8069