Subject: Re: [office] Please review syntax of formula work

David,

I took a look at chapter 5 of the recent draft. My feedback is below. Please note that I have written down everything I noticed while reading the chapter, and the questions I have added below are in most cases not really questions to the TC/SC, but are meant to indicate that something was not clear to me. Please note further that I'm not a spreadsheet/formula expert, so some questions may actually be caused by that fact and may not be justified. Please note further that although the list below seems long, I in general did like what I read.

Best regards

Michael

General

- There are multiple instances of "  " (two blank characters). One should be sufficient.
- References to external work should make use of the bibliography (see for instance the reference to http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-notation in the introduction to chapter 5).
- The OpenDocument specification currently refers to XML 1.0 (Third edition). Is there a reason why OpenFormula refers to XML 1.1?
- The OpenDocument specification uses a fixed font for everything that may appear literally in a document and for inline examples. Maybe the formula specification should do the same. I think this may in particular be useful for the symbols of the BNF grammar (a few examples are below).
- The draft contains some explanations that help to understand why certain decisions have been made (and that do not have a background), but that are not required to implement the specification. An example is in Section 5.7, the paragraph starting with "This naming scheme enables different applications to innovate without interfering with each other or with standard functions. ...". My suggestion would be to remove this kind of information from the normative specification document, but to have an annotated non-normative version that contains it.
- Is it an option to move the test cases into a separate chapter? I think this may be advantageous for two reasons.
  First, I believe most readers of the specification are not interested in the test cases, but in a compact representation of the topic. Second, implementors probably want to create test suites that cover all test cases, and therefore benefit from having them in one place, too.

Chapter 5

"When this occurs, various characters (such as "<", ">", '"', and "&") must be escaped, as described in the XML specification. In particular, the less-than symbol "<" is represented as &lt;, the double-quote symbol is represented as &quot;, and the ampersand symbol is represented as &amp;."

Suggestion: "When this occurs, various characters (such as "<", ">", '"', and "&") must be escaped, as described in section 2.4 of the XML specification ([XML1.0]). In particular, the less-than symbol "<" must be represented as &lt; (or as a numeric character reference), the double-quote symbol as &quot;, and the ampersand symbol as &amp;."

Section 5.1

"The optional namespace tells ..."

Suggestion: "An optional namespace prefix tells ..."

In general, I think most occurrences of "Namespace" in this section should actually be "namespace prefix".

Did the SC consider removing the namespace prefix from the grammar, and describing it instead in a/the section that describes how formulas are embedded into OpenDocument documents?

"When used in OpenDocument attributes table:formula and text:formula, applications should not include this namespace, as it is unnecessary. ..."

I think this conflicts with the current OpenDocument specification, which says that a namespace prefix should be used. My suggestion would be that we state in the ODF 1.2 spec that a formula that has no prefix is an OpenFormula formula, and that we state the same in the OpenFormula specification in a non-normative way. I'm not sure if we can do anything for ODF 1.0/1.1. In general, I think the normative description of how formulas are used within OpenDocument should be in the OpenDocument specification itself.
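
To illustrate the escaping rule quoted under Chapter 5 above, here is a minimal Python sketch. It is not taken from the draft: the helper names and the sample formula are made up for illustration, and it assumes the formula is written into a double-quoted XML attribute value. In practice an XML library would normally do this escaping on serialization and undo it on parsing.

```python
from xml.sax.saxutils import escape, unescape


def escape_formula_for_attribute(formula: str) -> str:
    """Escape &, <, > and the double quote for a double-quoted XML attribute.

    Hypothetical helper for illustration only.
    """
    return escape(formula, {'"': "&quot;"})


def unescape_formula_from_attribute(value: str) -> str:
    """Reverse the escaping above (an XML parser would usually do this)."""
    return unescape(value, {"&quot;": '"'})


# Sample formula chosen for illustration; it contains "<", "&" and '"'.
formula = '=IF([.A1]<[.B1];"a&b";"")'
stored = escape_formula_for_attribute(formula)
print(stored)  # =IF([.A1]&lt;[.B1];&quot;a&amp;b&quot;;&quot;&quot;)
```

Round-tripping `stored` through `unescape_formula_from_attribute` gives back the original formula text.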

"Namespace_in_XML ::= http://www.w3.org/TR/REC-xml-names"

Does a URI used in the BNF here have a special meaning? If not, what about

"Namespace_in_XML ::= Prefix /* A prefix as declared in section 4 of [xml-names] */"

I would further rename "Namespace_in_XML" to "Namespace_prefix".

Section 5.2

"simple" -> "simply"

"The primary component of a formula is an expression"

Suggestion: "The primary component of a formula is an Expression"

This makes clear that you are not talking about expressions in general, but about the BNF production Expression. You may also use a fixed font in this case to make it look like the BNF.

"SingleQuoted ::= "

Why does "SingleQuoted" appear here? It's not used in this section.

"There is no special syntax for the logical constants for truth and falsity, since this is unnecessary; simply use the ..."

I was at first confused to read this in that section, because "constants" have not been introduced before, but noticed later that "Number" and "String" are of course constants. So my suggestion would be to rephrase this along the lines of "While the formula syntax defines literal numbers and strings, it does not define literal logical constants. Instead, the standard functions TRUE() and FALSE() [add a reference here] can be used."

Section 5.3

Is there a better name for "WrittenNumber" and for the term "written" in the description?

Is there a formal reference for the "C" or en-US locale? If not, I suggest removing the reference to locales.

Section 5.4

"Note that when a formula is stored as an XML attribute (the usual case), XML quoting rules apply: thus double-quote characters are recorded in the XML such as &quot;, and carriage return characters in the String are recorded as .A constant string as defined by this syntax, shall be considered to be type Text."

Suggestion: "Note that when a formula is stored as an XML attribute, XML escaping rules apply: thus double-quote characters must be escaped as &quot;, and carriage return characters in string constants as . A constant string as defined by this syntax, shall be considered to be type Text."

(It's not essential to change that, but I noticed that the introduction uses the term "escape", and this section the term "quote". Please note the missing space before the last sentence.)

Section 5.5

"There two predefined" -> "There are two predefined"

"written as &amp;" -> "must be escaped using &amp;"

Sometimes the operator symbols (+, -, etc.) are mentioned in the description and sometimes they are not.

"Also note that while prefix "+" and "-" are right-associative, because "+" is a no-operation, applications which implement at most these operators, using only the semantics defined by this specification, may implement them as left-associative since the results will be identical."

I have to admit that I don't understand that sentence.

"Implementations' user interfaces may display these operators differently or with a different precedence, but when exchanging formulas they shall use the precedence rules here."

Suggestion: "Implementations' user interfaces may display these operators differently or with a different precedence, but conforming applications must store formulas using the operator names and precedences defined in this specification."

In general, I would not use the term "exchange", because this sounds like OpenFormula is only an exchange format, which it clearly is not.

Section 5.6

"Functions are called by giving their named" -> "Functions are called by giving their name"

"LetterXML ::= ", "DigitXML ::= ", "CombiningCharXML ::=": See my comment on URLs above.

"User-defined function names may use an arbitrary identifier,"

Does that mean that user-defined function names have to follow the production "Identifier"? Or does it mean that they can really use an arbitrary identifier? I assume the first is the case.
To make this clearer, I suggest either formatting BNF grammar names (for instance by using a fixed font), or adding numbers to the productions and referencing them. The first option is probably much easier to implement, and also easier to maintain.

"Applications may (and often do) display a different function name in their user interface than ..."

Suggestion: "Implementations' user interfaces may translate function names, may omit application prefixes, or may replace the function names defined in this specification with arbitrary other function names, but conforming applications shall not use these function names when storing formulas."

I further suggest defining what the application prefixes are, or referencing section 5.7 and using the terminology from that section.

Section 5.7

Is there a definition somewhere of what the "standard" names are?

"Applications that do not support a function should compute its result as some Error value other than NA() when calculating its result." -> "Applications that do not support a function should compute its result as some Error value other than NA()."

Section 5.8

"in the storage format": Are there different formats? If not, remove this.

"Where possible, applications should use references with embedded ":" separators, instead of the general-purpose ":" operator, when saving files, and where there is a choice of cells to join, and application should choose the leftmost pair. "

This sentence only becomes clear when reading the next one. A reference to the definitions of the two ":" may help.

Are the explanations regarding URIs and relative IRIs/URIs required, or would it be sufficient to state that they could be absolute or relative IRIs?

Section 5.9

"Automatic lookup of labels can be enabled or disabled in the document settings."

Are these the application-specific settings defined in OpenDocument, or something else? A reference may be helpful here.

Section 5.10

"Applications supporting named expressions must support named expressions that are global to all the sheets in a (spreadsheet) document in the current document (this is a named expression without a Source, QuotedSheetName, or SubtableCell)."

- "must" is not in our control language any longer.

Would it be sufficient to say: "Applications supporting named expressions shall support named expressions that are global to all sheets in the current (spreadsheet) document"?

Is there a definition of "portable documents"?

Would it be possible to distribute the descriptions of the various named expression types and the grammar to the subsections?

Section 5.11

"The inline error value NA shall be represented as "#N/A" when represented as an inline error, and this is portable across implementations."

My understanding is that "#N/A" is the only portable error value, so my suggestion is to write: "The only portable error value is "#N/A", which represents the NA inline error value."

The sentence "Portable documents should not use inline error values other than #N/A, as error values are not necessarily portable between applications." may then be omitted.

Section 5.12

"Applications that support inline arrays must accept" -> "Applications that support inline arrays shall accept"

Section 5.13

"Whitespace (space, tab, newline, and carriage return) is ignored in the default formulas syntax, except inside the contents of string constants and text surrounded by single quotes"

Is there a formula syntax other than the default one?

Suggestion: "Whitespace (space, tab, newline, and carriage return) shall be ignored, except when it occurs inside the contents of string constants and text surrounded by single quotes"

Would it be an option to extend the grammar with whitespace characters, as is the case in XML 1.0?
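
The whitespace rule discussed under Section 5.13 can be sketched as a small Python function. This is only an illustration under simplifying assumptions, not something taken from the draft: the function name and the sample formula are hypothetical, "ignored" is read as "dropped before parsing", and quoted text is assumed to end at the next matching quote character (a doubled quote inside quoted text simply closes and reopens it, so its whitespace is still preserved).

```python
# The four whitespace characters named in the quoted sentence.
WHITESPACE = {" ", "\t", "\n", "\r"}


def strip_insignificant_whitespace(formula: str) -> str:
    """Drop whitespace except inside "..." string constants and '...' text.

    Hypothetical helper for illustration only.
    """
    out = []
    quote = None  # the quote character currently open, or None
    for ch in formula:
        if quote is not None:
            # Inside quoted text: keep every character, including whitespace.
            out.append(ch)
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            # Opening quote of a string constant or single-quoted text.
            quote = ch
            out.append(ch)
        elif ch not in WHITESPACE:
            out.append(ch)
    return "".join(out)


sample = '= CONCATENATE( "a  b" ;\n [\'My Sheet\'.A1] )'
print(strip_insignificant_whitespace(sample))
# =CONCATENATE("a  b";['My Sheet'.A1])
```

Extending the grammar with explicit whitespace productions, as asked above, would make a pre-pass like this unnecessary, because the positions where whitespace may occur would then be part of the syntax itself.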