Subject: Re: [office-formula] Syntax - recent changes, please look over..
Hi David,

On Sun, Apr 23, 2006 at 14:20:11 -0400, David A. Wheeler wrote:

> The syntax has stayed very stable; I think the primary
> reason is that the current syntax handles the "normal" cases
> quite nicely. However, we need to handle ALL cases, even the
> truly weird ones. Please take a look and kibitz/improve what's there!
> I've noted repeatedly that I want this DONE by May 2.

As already said, I doubt we'll have the entire syntax done by that date. I have only limited time to spend right now, and with only you and me working on it I don't think we'll have everything clarified by then. Maybe we should first agree on the "normal" cases until then and leave the ALL to be done later?

> I've made a few changes to the proposed syntax on the Wiki;
> please take a look, and see if you agree with them.

Regarding the #REF vs. #REF! change: should we really take care of lex/flex-based lexers there? What about #REF.#REF#REF then? And how does that fit with the Error values below?

Error value definitions: IMHO #DIV/0 is not necessary as input, it can always be generated by =1/0. Furthermore, as a formula =#DIV/0 could be ambiguous.

> Here are the changes, in a nutshell:
> * Restored the ability to use '$' in front of the column or row,
> and added it to the sheetname too. Somehow that got removed among
> the other changes, it was there before.

Seems I removed the '$' from the SheetName when I cleaned out the ASCII-only definition there.

> * Bare sheetnames can't include # or $ -- you have to enclose them with '..'.
> At the very least, it's a problem if they can START with those characters,
> because then you can't tell if you have an error or non-relative sheet,
> or a funny sheetname. I'm thinking that perhaps we should be more
> restrictive about bare sheetnames anyway, maybe just limit them to
> Identifier characters. In fact, that may be important for lexing (I'll see
> about that soon). Thoughts, anyone?
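To make the ambiguity in the quoted point concrete, here is a minimal sketch, not part of the proposed grammar. It assumes that "Identifier characters" means a letter or underscore followed by letters, digits, or underscores (the draft's real Identifier rule may differ), and it compares that with a permissive rule that excludes only '.', space, and the single quote. The names IDENTIFIER, PERMISSIVE, and readings are hypothetical.

```python
import re

# Assumption: a simplified stand-in for the draft's Identifier rule.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Permissive bare-name rule used only for comparison:
# any run of characters except '.', space, and single quote.
PERMISSIVE = re.compile(r"[^. ']+")


def readings(token, bare_name):
    """List every way `token` (the text before the '.') reads as a sheet prefix."""
    found = []
    # Reading 1: the whole token is a bare (relative) sheet name.
    if bare_name.fullmatch(token):
        found.append(("relative", token))
    # Reading 2: a leading '$' marks an absolute sheet, the rest is the name.
    if token.startswith("$") and bare_name.fullmatch(token[1:]):
        found.append(("absolute", token[1:]))
    return found


for rule_name, rule in (("permissive", PERMISSIVE), ("identifier", IDENTIFIER)):
    for token in ("$Sheet1", "#REF!"):
        print(rule_name, token, readings(token, rule))
```

Under the permissive rule, $Sheet1 has two readings and #REF! is accepted as a sheet name; under the identifier-only rule, $Sheet1 has exactly one reading and #REF! has none, so it stays available as an error token.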

Restricting bare unquoted sheet names to Identifier characters (as defined by us) in general probably goes in the right direction. However, this may clash with the definition of ODF 8.3.1, in "Absolute and relative cell addressing", which does not restrict bare names in any other way than [^\. '], which IMHO is not sufficient.

> * ":" (when outside [..]) is now a top-precedence operator,
> to handle stuff like [.A1:B3]:[.X6]. This is required when the
> cell ranges are NOT constants but instead are named expressions
> or function results (e.g., MYFUNC1():MYFUNC2()).

Ok. That got me thinking about the precedence of Range, Union and Intersection, which we currently define in that order (btw, I think we should write the table in the opposite order for clarity on 'highest' and 'lowest' priority). Several sources about Excel (e.g. http://support.microsoft.com/kb/25189/EN-US/ ) claim it to be Range, Intersection, Union instead. Which, given that their Union operator is the ',' comma operator, same as the function parameter separator, somehow makes sense. We need to try out some combinations to verify that, which I haven't done yet.

> * I hooked in "AutomaticIntersection" so it could actually be used.
> It had been defined earlier by someone else, but was never used.

Erm, pardon? I didn't get that.. hooked in where? I don't see any related change in the diffs. On the other hand, that reminds me that I haven't finished that section yet and the implicit intersection is yet to come..

> * I inserted an Array syntax. Think of this more as a stub... we need one.
> Comments there, or improvements, would be ESPECIALLY appreciated.

First: is there any application that allows Expressions instead of constants in inline-arrays? I think supporting Expressions there would overcomplicate things.

> The current syntax for AutomaticIntersection is very peculiar.
> It means that this is a string:
> "Hello"
> But single quotes means it's going to be specially interpreted as
> a reference to a value identifying a row or column:
> 'Hamburg' !! 'Sales'
> So there's a SERIOUS difference between ' and ".

Sure, a string is a literal string, and a quoted identifier is not ;-)

> Eike, can you explain this? I'd _prefer_ having just String representation,
> and then using !! as yet another operator, so it'd look like this:
> "Hamburg" !! "Sales"
> Can anyone help me understand why that is a bad idea?

The easiest explanation is an example without the intersection operator, using an implicit intersection: given a column of values labeled XXX on top of all values, placing the formula ='xxx' somewhere beneath that column displays the value of the very same row. Writing that as a string ="xxx" would of course display the literal string xxx instead.

A literal string, after having possibly eliminated one of two duplicated quote characters for literal quotes, should never need any further processing before the formula is interpreted and calculated. "Hamburg"!!"Sales" would violate that, whereas 'Hamburg'!!'Sales' fits perfectly well into other addressing schemes like sheet names and external sources. This is the difference between quoted literal strings and quoted identifiers.

> I plan to try to convert this into a flex/bison implementation.
> I think it's important that the syntax be easy to implement without
> a lot of complex state manipulations using typical tools.

IMHO we should not base the syntax on available tools. Being tool-friendly is of course nice and tempting, but "sophisticated" features may require solutions that simple tools are not suited for..

Eike

--
Automatic string conversions considered dangerous.
They are the GOTO statements of spreadsheets.
--Robert Weir on the OpenDocument formula subcommittee's list.
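
The distinction between a quoted literal string and a quoted identifier, as described in the implicit-intersection example above, can be sketched roughly as follows. This is an illustration under stated assumptions, not how any application implements it: the function names, the labels/cells layout, and the case-insensitive label lookup (suggested by the XXX label being referenced as 'xxx') are all hypothetical.

```python
def unescape_string(body):
    """Collapse doubled quote characters inside a string literal body."""
    return body.replace('""', '"')


def evaluate(token, formula_row, labels, cells):
    """Evaluate a single-token formula placed in row `formula_row`.

    labels: maps a lower-cased column label to a column index (assumed layout).
    cells:  maps (row, column) to a value.
    """
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # Literal string: nothing beyond quote unescaping is ever needed.
        return unescape_string(token[1:-1])
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        # Quoted identifier: resolve the label to a column, then take the
        # value in the formula's own row (the implicit intersection).
        column = labels[token[1:-1].lower()]
        return cells[(formula_row, column)]
    raise ValueError(f"unsupported token: {token!r}")


# Column 0 is labeled XXX in row 0, with values in rows 1 to 3.
labels = {"xxx": 0}
cells = {(1, 0): 10, (2, 0): 20, (3, 0): 30}

print(evaluate("'xxx'", 2, labels, cells))         # value of column XXX in row 2
print(evaluate('"xxx"', 2, labels, cells))         # the literal string xxx
print(evaluate('"say ""hi"""', 2, labels, cells))  # doubled quotes collapsed
```

The string branch never consults the sheet, while the identifier branch cannot be evaluated without it, which is the extra processing that "Hamburg"!!"Sales" would force onto string literals.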