
Subject: [OASIS Issue Tracker] Commented: (OFFICE-1898) CHAR and CODE are inconsistent
From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
To: office@lists.oasis-open.org
Date: Tue, 1 Sep 2009 13:17:15 -0400 (EDT)

    [ http://tools.oasis-open.org/issues/browse/OFFICE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629#action_15629 ] 

Eike Rathke commented on OFFICE-1898:
-------------------------------------

Historically, the CODE and CHAR functions originated from systems and implementations that used 8-bit character code pages. The constraint therefore should really be 0 <= n <= 255.

In fact, for values > 127, and also for values <= 127 if the encoding doesn't represent ASCII in that range, the functions give meaningful results only if they operate not on the implementation's text encoding (most use Unicode nowadays) but on the text encoding of the system the implementation runs on, and only if the system a document is loaded on uses the same encoding as the system the document was generated on. These functions are mainly useful only when a foreign non-ODF document that relied on that behavior is read on such a system.

In my opinion we should document these functions and their non-deterministic behavior, but strongly deprecate them for ODF interchange and recommend UNICODE and UNICHAR instead.
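
To illustrate the point, here is a minimal sketch in Python. It is not taken from any implementation: the helper names legacy_char and legacy_code are hypothetical, and the system encoding is passed in explicitly as an assumption standing in for whatever code page the host system uses. Windows-1252, code page 437 and EBCDIC code page 037 are picked only as examples of 8-bit encodings.

```python
# Hypothetical helpers that mimic code-page based CHAR/CODE.
# The encoding argument stands in for the host system's text encoding.

def legacy_char(n: int, system_encoding: str) -> str:
    """CHAR(n) interpreted as one byte in the given 8-bit encoding."""
    if not 0 <= n <= 255:
        raise ValueError("n must satisfy 0 <= n <= 255")
    return bytes([n]).decode(system_encoding)


def legacy_code(c: str, system_encoding: str) -> int:
    """CODE(c): first byte of the first character of c in the given encoding."""
    return c[:1].encode(system_encoding)[0]


if __name__ == "__main__":
    # The same n yields different characters on different systems.
    print(repr(legacy_char(128, "cp1252")))  # U+20AC EURO SIGN
    print(repr(legacy_char(128, "cp437")))   # U+00C7 LATIN CAPITAL LETTER C WITH CEDILLA

    # A document written on one system and read on another does not
    # round-trip: CHAR(233) on cp1252 is U+00E9, whose CODE on cp437 is 130.
    print(legacy_code(legacy_char(233, "cp1252"), "cp437"))

    # Values <= 127 are not safe either if the code page is not ASCII based.
    print(legacy_code("A", "cp037"))  # 193 on this EBCDIC code page, not 65

    # The Unicode based pair does not depend on the system encoding;
    # Python's chr/ord play the role of UNICHAR/UNICODE here.
    print(ord(chr(0x20AC)) == 0x20AC)
```

Note that some byte values are unassigned in some code pages, in which case the decode call in this sketch raises an exception; a real implementation would have to decide what to return there.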


> CHAR and CODE are inconsistent
> ------------------------------
>
>                 Key: OFFICE-1898
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-1898
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: OpenFormula
>    Affects Versions: ODF 1.2
>            Reporter: Dennis Hamilton
>            Assignee: Eike Rathke
>            Priority: Blocker
>             Fix For: ODF 1.2
>
>
> In 6.19.2 CHAR in OpenFormula working-draft OpenDocument-formula-20090508.odt, it is recommended that CHAR(n) for n > 255 have an error-return result, while also recommending that the identity transformation n = CODE(CHAR(n)) be preserved.   
> 6.19.4 CODE has an implicit requirement that c = CHAR(CODE(c)) be preserved for c a single-character string although CODE behavior is implementation-defined when c has a code-point greater than 127 in the internal representation of strings (which section 4.1 is not exactly definitive about).   
> There seems to be quiet neglect for the possibility of "code pages" where the code points from 0 to 127 are not those of the ASCII code.  There are other difficulties around strings being used to carry arrays of bytes that are not safe as strings in the implicit character-set encoding.
> I think there needs to be work on clear-cut abstractions for distinguishing internal representations, disguise of a different representation in the stored form of the internal representation, and in the interpretation of string constants and string values as text.
> These proposals are intended to help in movement in that direction.
