OASIS Mailing List Archives

office message


Subject: Re: [office] ISO 14977 EBNF grammar

Patrick Durusau:
> The ISO EBNF grammar: 
> http://standards.iso.org/ittf/PubliclyAvailableStandards/s026153_ISO_IEC_14977_1996(E).zip

Currently, in OpenFormula we use the BNF notation from W3C's XML spec instead of the ISO EBNF.

The ISO spec is not a _bad_ one, but it does have weaknesses for the formula purposes.

When comparing ISO's with the W3C XML spec:
* The ISO spec has no support for character ranges and negated ranges
  (part of regular expressions), while XML's does.
  We use this capability; for some productions (like SheetName) it's not clear
  how hard it would be to express them without ranges.
* The ISO spec requires the use of "," for concatenation, instead of that
  being the default.  It also requires ";" to terminate every production.
  As a result, ISO's format is much wordier to express the same thing.  For example:
   ...  SheetLocator "." Column Row (':' SheetLocator "." Column Row )? ..
  would become:
   ...  SheetLocator, ".", Column, Row, [ ':', SheetLocator, ".", Column, Row ] .. ;
  (ISO writes an optional group in square brackets; it has no postfix "?" operator.)
  Not a show-stopper, but I think that's unfortunate.  Concatenation is EXTREMELY
  common in BNF, so failing to have it as a default operator complicates the spec.
* The ISO spec's definition operator is "=", which is easily confused with a "="
  that appears as a quoted terminal inside the productions themselves.
  That's not as big a deal.
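
To illustrate the first point above: without range syntax, a W3C-style range such as [A-F] has to be spelled out as one alternative per character. Here is a minimal Python sketch of that expansion. The function names and the "UpperHex" production are made up for illustration and are not taken from OpenFormula.

```python
def quote_terminal(ch: str) -> str:
    """Quote one character as a terminal; use '...' when the character is a double quote."""
    return f"'{ch}'" if ch == '"' else f'"{ch}"'


def expand_range(first: str, last: str) -> str:
    """Spell out the W3C-style range [first-last] as "|"-separated quoted terminals."""
    if len(first) != 1 or len(last) != 1 or first > last:
        raise ValueError("expected two single characters in ascending order")
    return " | ".join(quote_terminal(chr(code))
                      for code in range(ord(first), ord(last) + 1))


# Hypothetical production: [A-F] written without range support.
print("UpperHex = " + expand_range("A", "F") + " ;")
```

A small range like this is only tedious. A range covering a large block of Unicode, or a negated range, is where the lack of range support starts to hurt.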

Historically, the ISO document was expensive, while the XML specification was
freely available.  I think it would have been unconscionable to have referred to the
ISO spec while it was expensive, but now that the ISO document is publicly available
without fee, I think it _could_ be used.  However, its lack of regular expression
support and its unnecessary wordiness give us no incentive to switch.

It would take a little time to change to the ISO BNF.  To change this in OpenFormula,
every production would have to be rewritten: changing "::=" to "=", inserting
commas everywhere, and figuring out how to replace the character ranges.
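
As a rough idea of the mechanical part of that rewrite, here is a Python sketch that converts a small subset of the W3C notation (names, quoted strings, parentheses, "|", and postfix "?", "*", "+") into ISO style. It is an assumption-laden illustration, not a tool anyone on the TC has written: the function names and the "Reference" production name are hypothetical, and character ranges are deliberately rejected because they have no direct ISO equivalent.

```python
import re

# One token: a quoted string, a name, or a single operator/paren character.
TOKEN = re.compile(r"""\s*(?:("[^"]*"|'[^']*')|([A-Za-z_]\w*)|([()|?*+]))""")
POSTFIX = ("?", "*", "+")


def tokenize(rhs):
    tokens, pos, rhs = [], 0, rhs.rstrip()
    while pos < len(rhs):
        m = TOKEN.match(rhs, pos)
        if not m:  # e.g. a character range such as [A-Z]
            raise ValueError(f"unsupported syntax at: {rhs[pos:]!r}")
        tokens.append(m.group(1) or m.group(2) or m.group(3))
        pos = m.end()
    return tokens


def convert_expr(tokens, i):
    """Alternatives: seq ('|' seq)*.  Returns (iso_text, next_index)."""
    seq, i = convert_seq(tokens, i)
    parts = [seq]
    while i < len(tokens) and tokens[i] == "|":
        seq, i = convert_seq(tokens, i + 1)
        parts.append(seq)
    return " | ".join(parts), i


def convert_seq(tokens, i):
    """Concatenation: implicit in W3C, an explicit ',' in ISO."""
    items = []
    while i < len(tokens) and tokens[i] not in ("|", ")"):
        item, i = convert_item(tokens, i)
        items.append(item)
    if not items:
        raise ValueError("empty sequence")
    return ", ".join(items), i


def convert_item(tokens, i):
    """An atom with an optional postfix operator."""
    tok = tokens[i]
    if tok in POSTFIX:
        raise ValueError(f"operator {tok!r} has nothing to apply to")
    if tok == "(":
        inner, i = convert_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("missing ')'")
        i += 1
        plain = f"( {inner} )"
    else:
        inner = plain = tok
        i += 1
    if i < len(tokens) and tokens[i] in POSTFIX:
        op, i = tokens[i], i + 1
        if op == "?":
            return f"[ {inner} ]", i        # optional group
        if op == "*":
            return f"{{ {inner} }}", i      # zero or more
        return f"{plain}, {{ {inner} }}", i  # one or more
    return plain, i


def w3c_to_iso(production):
    name, sep, rhs = production.partition("::=")
    if not sep:
        raise ValueError("expected 'Name ::= ...'")
    tokens = tokenize(rhs)
    body, i = convert_expr(tokens, 0)
    if i != len(tokens):
        raise ValueError("unbalanced ')'")
    return f"{name.strip()} = {body} ;"


# "Reference" is a made-up production name wrapped around the fragment quoted earlier.
print(w3c_to_iso("""Reference ::= SheetLocator "." Column Row (':' SheetLocator "." Column Row)?"""))
```

The part this sketch refuses to handle, the character ranges, is exactly the part that would need human judgment.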

Another BNF format is IETF RFC 4234 (ABNF).  It's kind of ugly; alternatives use "/" instead
of the more common "|", and you HAVE to group them (which is a pain).  Even weirder,
to indicate repetition you PRECEDE the item with "*" instead of following it.
Every book I've seen has "*" FOLLOW the item to be repeated.
In my mind, IETF's is the worst of the three in terms of clarity.
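
To make the differences concrete, here is a small Python sketch that prints one made-up rule in all three notations. The tuple-based tree, the function names, and the "Name"/"Letter"/"Digit" rule are assumptions for illustration only, and the renderer covers just concatenation, alternation, and zero-or-more repetition.

```python
def render(node, style):
    """Render a tiny grammar tree in 'w3c', 'iso', or 'abnf' style."""
    kind, value = node
    if kind == "ref":                      # reference to another rule
        return value
    if kind == "lit":                      # quoted terminal
        return f'"{value}"'
    if kind == "alt":                      # alternatives
        joiner = " / " if style == "abnf" else " | "
        return joiner.join(render(child, style) for child in value)
    if kind == "seq":                      # concatenation
        joiner = ", " if style == "iso" else " "
        parts = []
        for child in value:
            text = render(child, style)
            # Alternatives nested in a sequence need explicit grouping.
            parts.append(f"( {text} )" if child[0] == "alt" else text)
        return joiner.join(parts)
    if kind == "star":                     # zero or more repetitions
        inner = render(value, style)
        if style == "iso":
            return "{ " + inner + " }"     # braces mean repetition
        if value[0] in ("seq", "alt"):
            inner = "( " + inner + " )"
        return "*" + inner if style == "abnf" else inner + "*"
    raise ValueError(f"unknown node kind: {kind!r}")


def rule(name, node, style):
    body = render(node, style)
    if style == "w3c":
        return f"{name} ::= {body}"
    if style == "iso":
        return f"{name} = {body} ;"
    if style == "abnf":
        return f"{name} = {body}"
    raise ValueError(f"unknown style: {style!r}")


# Hypothetical rule: a letter followed by any number of letters or digits.
tree = ("seq", [("ref", "Letter"),
                ("star", ("alt", [("ref", "Letter"), ("ref", "Digit")]))])

for style in ("w3c", "iso", "abnf"):
    print(f"{style:5}", rule("Name", tree, style))
```

For this rule the three lines come out as "Name ::= Letter ( Letter | Digit )*", "Name = Letter, { Letter | Digit } ;", and "Name = Letter *( Letter / Digit )".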

After looking at the W3C (XML), ISO, and IETF formats for BNFs, we chose the
W3C's XML format. Reasons:
* W3C's format produces the clearest, simplest specs with the same meaning.
  Concatenation is the default, alternatives are "|", the "*" is AFTER the repeated item.
  The resulting spec, with the same meaning, is simpler than with ISO or IETF.
* W3C's format includes character range support.  ISO's does not.
* OpenDocument is itself based on XML, so it made sense to use the same format
   used to spec XML.
* XML's format is publicly available at no charge. At the time I believe that was
   not true of ISO's format.  It appears this point, at least, is moot (hooray!).

XML is a standard, in every reasonable sense of the word, so using its BNF format
is (in my opinion) very defensible.

--- David A. Wheeler
