[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Storing sheet-local named expressions in ODF
Some spreadsheet applications (Excel and Gnumeric at least) allow named expressions (which are basically spreadsheet variables) to be locally scoped to a specific sheet instead of being globally scoped. While a vast number of spreadsheet documents don't require local sheet names, there are documents that depend on this. There is a workaround solution in ODF as-is, but it has several weaknesses, so I'd like to standardize a better way to address sheet-local names. In the current draft formula syntax, the formula SC has a syntax for directly representing references to sheet-local names. But for this to be useful, there needs to be a standard way to _store_ sheet-local named expressions in ODF documents. Can we discuss how to handle them? As a starting point, I propose that in the rule <element name="table:table">, between the last (sub)rule <ref name="table-rows-and-groups"/>, and the closing </element>, we add this: <optional> <ref name="table-named-expressions"/> </optional> This would allow a table:table to directly contain table:named-expressions, which would then be interpreted as sheet-local names. Currently, element "table:named-expressions" is only defined as being part of the epilogue of the entire spreadsheet document's "office:spreadsheet", so spreadsheets look like this: <office:spreadsheet> <table:table table:name="Sheet1" table:style-name="ta1" .... (table contents, including its data) </table:table> ... <table:named-expressions> <table:named-range table:name="GETB2" table:base-cell-address="$Sheet1.$B$2" table:cell-range-address="$Sheet1.$B$2"/> <table:named-expression table:name="Makefour" table:base-cell-address="$Sheet1.$A$4" table:expression="2+2"/> </table:named-expressions> </office:spreadsheet> This would still be fine; table:named-expressions that are children of office:spreadsheet would continue to be considered global names. The table:named-expressions that are children of table:table would be sheet-local names. Currently, since named expressions are only allowed in the spreadsheet epilogue, the "usual" way to represent quasi-sheet-local names in ODF is by creating names that are a concatenation of the sheetname and the user's original name. That works in terms of getting calculations correct, so no one needs to worry that ODF "isn't ready for use". ODF _does_ work, as-is. But this IS a little awkward. Users have to use long names instead of the usual short names in a Sheet, e.g., in Sheet1 I have to refer to Sheet1_myname instead of simply "myname". Worse, sheet copying does NOT work as expected with local names; if I copy the sheet and call it "Sheet5", all the named references will still refer to the variables named "Sheet1..." and will NOT automatically switch to Sheet5. So copying such sheets is error-prone, and requires excess work (lots of renaming). A similar problem happens when you use XML tools to extract a particular sheet - you'd like the sheet-local named expressions to be directly accessible. The problem is worse if you allow subtables, something that Excel doesn't permit but IS specifically mentioned in the OpenDocument spec. The workaround is fine for now, and future applications will be able to perpetually read backwards to documents that use the workaround, but there's no reason we can't add improved support for the future. A separate issue that MIGHT be relevant is a technique for improving the performance of loading external values. A spreadsheet formula can refer to external spreadsheet document's names (and in ODF, cells). An application _CAN_ load the entire external document to get those values, but this loading process could become quite lengthy. It would be very nice if the packaged format defined an optional set of files that directly stored the values of named expressions. E.G., perhaps there's a file named "global-named-expressions.xml", and a subdirectory "named-expressions" that contain files for each sheet. Each of these would list the names of named expressions and their values. If the files don't exist, and a named expression is requested, the application just accesses them the usual way; if an application wants such external access to be faster, it just puts them in the zip archive. Now reading them involves unzipping only those files. Since in my conception there is one file per sheet, and a global file, I don't think that these issues conflict with each other. Allowing table:named-expressions as an optional child of table:table seems like a reasonable way to handle sheet-local variables to me. But I'd like to hear others' comments. This is an interface issue between the formula subcommittee and the overall OpenDocument TC, so I'm cross-posting to both lists. Below are relevant extracts from the ODF spec, including the "Spreadsheet Documents" and "Named Expressions" sections. --- David A. Wheeler ==================================== (From section "Spreadsheet Documents":) The content of spreadsheet documents mainly consists of a sequence of tables. Additionally, a spreadsheet document may contain forms, change tracking information and various kinds of declarations that simplify the usage of spreadsheet tables and their analysis. Each of these are contained in either the document prelude, or the document epilogue. <define name="office-body-content" combine="choice"> <element name="office:spreadsheet"> <ref name="office-spreadsheet-attlist"/> <ref name="office-spreadsheet-content-prelude"/> <ref name="office-spreadsheet-content-main"/> <ref name="office-spreadsheet-content-epilogue"/> </element> </define> Spreadsheet Document Content Model The spreadsheet document prelude contains the document's form data, change tracking information, calculation setting for formulas, validation rules for cell content and declarations for label ranges. <define name="office-spreadsheet-content-prelude"> <optional> <ref name="table-tracked-changes"/> </optional> <ref name="text-decls"/> <ref name="table-decls"/> </define> <define name="table-decls"> <optional> <ref name="table-calculation-settings"/> </optional> <optional> <ref name="table-content-validations"/> </optional> <optional> <ref name="table-label-ranges"/> </optional> </define> The main document is a list of tables. <define name="office-spreadsheet-content-main"> <zeroOrMore> <ref name="table-table"/> </zeroOrMore> </define> The epilogue of spreadsheet documents contains declarations for named expressions, database ranges, data pilot tables, consolidation operations and DDE links. <define name="office-spreadsheet-content-epilogue"> <ref name="table-functions"/> </define> <define name="table-functions"> <optional> <ref name="table-named-expressions"/> </optional> <optional> <ref name="table-database-ranges"/> </optional> <optional> <ref name="table-data-pilot-tables"/> </optional> <optional> <ref name="table-consolidation"/> </optional> <optional> <ref name="table-dde-links"/> </optional> </define> (section "Table Element"): The table element is the root element for tables. <define name="table-table"> <element name="table:table"> <ref name="table-table-attlist"/> <optional> <ref name="table-table-source"/> </optional> <optional> <ref name="office-dde-source"/> </optional> <optional> <ref name="table-scenario"/> </optional> <optional> <ref name="office-forms"/> </optional> <optional> <ref name="table-shapes"/> </optional> <ref name="table-columns-and-groups"/> <ref name="table-rows-and-groups"/> </element> </define> (section "Named Expressions"): The named expressions element <table:named-expressions> contains a collection of assignments of names to expressions, so that the names can be use to refer to the expression. The following expression can get names: cell ranges. Other expressions, for example, parts of a formula. <define name="table-named-expressions"> <element name="table:named-expressions"> <zeroOrMore> <choice> <ref name="table-named-range"/> <ref name="table-named-expression"/> </choice> </zeroOrMore> </element> </define> Named Range The named range element <table:named-range> specifies a cell range that has a name assigned. For information on defining a cell range, see section 8.3.1. The table:name attribute specifies the name of the range, and the table:cell-range-address attribute its address. The address can be either absolute or relative. If the cell range address is relative, the table:base-cell-address attribute must exist additionally. It specifies the base cell address for the cell range. This address must be absolute. Therefore a table name in the address is required, but the dollar signs that indicate an absolute address can be omitted. An additional table:range-usable-as attribute specifies whether the name of the range can be used within the specification of a print range, a filter, a repeating row, or a repeat column. The value of this attribute can be either: none, or a space-separated list that consists of any of the values print-range, filter, repeat-row or repeat-column. <define name="table-named-range"> <element name="table:named-range"> <ref name="table-named-range-attlist"/> <empty/> </element> </define> <define name="table-named-range-attlist" combine="interleave"> <attribute name="table:name"> <ref name="string"/> </attribute> <attribute name="table:cell-range-address"> <ref name="cellRangeAddress"/> </attribute> <optional> <attribute name="table:base-cell-address"> <ref name="cellAddress"/> </attribute> </optional> <optional> <attribute name="table:range-usable-as" a:defaultValue="none"> <choice> <value>none</value> <list> <oneOrMore> <choice> <value>print-range</value> <value>filter</value> <value>repeat-row</value> <value>repeat-column</value> </choice> </oneOrMore> </list> </choice> </attribute> </optional> </define> Named Expression The named expression element <table:named-expression> contains an expression with a name, for example, a part of a formula. The table:name attribute specifies the name of the expression, and the table:expression attribute the expression itself. The expressions do not support the equal (=) sign as the first character. If the expression contains a named range or another named expression, the named range or named expression must be specified first, before the containing expression. If the expression contains a relative cell range address, the table:base-cell-address attribute must exist additionally. It specifies the base cell address for the cell range. This address must be absolute. Therefore a table name in the address is required, but the dollar signs that indicate an absolute address can be omitted. <define name="table-named-expression"> <element name="table:named-expression"> <ref name="table-named-expression-attlist"/> <empty/> </element> </define> <define name="table-named-expression-attlist" combine="interleave"> <attribute name="table:name"> <ref name="string"/> </attribute> <attribute name="table:expression"> <ref name="string"/> </attribute> <optional> <attribute name="table:base-cell-address"> <ref name="cellAddress"/> </attribute> </optional> </define> Example: Named expressions element with a named range and a named expression <table:named-expressions> <table:named-range table:name="sample1" table:cell-range-address=".C4" table:base-cell-address="sampletable.F1" table:area-type="none"/> <table:named-range table:name="sample2" table:cell-range-address=".$D$3:.$K$8" table:area-type="print-range filter"/> <table:named-expression table:name="sample3" table:expression="sum([.A1:.B3])"/> </table:named-expressions> (And here's how table:table is defined:) <element name="table:table"> <ref name="table-table-attlist"/> <optional> <ref name="table-table-source"/> </optional> <optional> <ref name="office-dde-source"/> </optional> <optional> <ref name="table-scenario"/> </optional> <optional> <ref name="office-forms"/> </optional> <optional> <ref name="table-shapes"/> </optional> <ref name="table-columns-and-groups"/> <ref name="table-rows-and-groups"/> </element>
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]