OASIS Mailing List Archives

office message


Subject: [OASIS Issue Tracker] Commented: (OFFICE-1895) RIGHTB and friends is incompletely specified

    [ http://tools.oasis-open.org/issues/browse/OFFICE-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12141#action_12141 ] 

Dennis Hamilton commented on OFFICE-1895:

I did some mining in IS 29500-1:2008 to see if there was any clarity to be gained from inspection of the same-named functions there.  It is a little better and perhaps a little worse, all at the same time.  

There, MIDB, RIGHTB, and LEFTB are described as specifying a byte position and a number of bytes (not characters).  However, the vagueness persists in problematic phrases such as this one for MIDB: "Extracts number-bytes-worth of characters from string, starting at character position start-pos."

There is no indication of what happens when start-pos does not fall on the first byte of a double-byte character, or when the ending position falls on the first byte of one.

There are other peculiarities.  There is no help in the IS 29500-1 examples, which are all for trivial cases (except for ASC and JIS, where the examples seem to be a tad more helpful).

Finally, IS 29500-1 SpreadsheetML string constants in formulas are defined to support the same Unicode character set as that of XML 1.0 section 2.2, with a pair of double-quote (") characters used as an escape for including a double-quote code in the constant.  The value part of a cell having formula-specified content, the <v> element, carries a text string that also supports the same Unicode character set as string constants (except there is no special double-quote escaping), with yet another special escape provision for Unicode code points that are excluded from Char in XML 1.0 section 2.2.  It is not clear whether this can be used to specify code values for every code point in the Unicode code space.
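As an illustration of the doubled double-quote escape for formula string constants, the following Python sketch unescapes one literal. The function name `unescape_formula_string` is hypothetical, and the sketch covers only the quote-doubling rule described above, not the separate escape used inside the <v> element.

```python
def unescape_formula_string(literal):
    """Turn a quoted formula string constant into its text value.

    Assumes the only escape is a doubled double-quote standing for one
    double-quote character, as described for formula string constants.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("constant must be enclosed in double quotes")
    body = literal[1:-1]
    # Any quote left over after removing escaped pairs is unescaped.
    if '"' in body.replace('""', ''):
        raise ValueError("lone double quote inside constant")
    return body.replace('""', '"')


print(unescape_formula_string('"say ""hi"""'))  # say "hi"
```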

How the definition of string constants and direct table-cell content in SpreadsheetML is squared with the definitions of these functions is not addressed anywhere that I could see.  The opportunity for OpenFormula to address this also remains open, I believe.  

There are other questions that I have over the IS 29500-1 treatment, but I think I've gotten all there is to get without fact-checking these definitions, unchanged from ECMA-376, against what Excel 2007 actually does.

I think further exploration might suggest a clean definition of text-manipulation functions in OpenFormula that are consistent with the IS 29500 ones without being so implementation-bound and ill-defined.

> RIGHTB and friends is incompletely specified
> --------------------------------------------
>                 Key: OFFICE-1895
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-1895
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: OpenFormula
>    Affects Versions: ODF 1.2
>            Reporter: Andreas Guelzow 
> 6.6.6 RIGHTB
> Summary: Return a selected number of text characters from the right, using byte position.
> This description fails to indicate what happens in the likely situation that the selected bytes do not form a complete character sequence. Should that be an error or move to a correct position?
> Similarly for the other ...B functions.


