Subject: [OASIS Issue Tracker] Commented: (OFFICE-2681) Should openformula evaluators be *required* to support BMP or all Unicode/10646 characters?
[ http://tools.oasis-open.org/issues/browse/OFFICE-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=19192#action_19192 ]

David Wheeler commented on OFFICE-2681:
----------------------------------------

Sorry, I didn't notice that this had already been opened. I'm going to close this, as it is a duplicate (DUP) of the already-opened:
http://tools.oasis-open.org/issues/browse/OFFICE-2672

> Should openformula evaluators be *required* to support BMP or all Unicode/10646 characters?
> -------------------------------------------------------------------------------------------
>
>                 Key: OFFICE-2681
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-2681
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: OpenFormula
>            Reporter: David Wheeler
>            Assignee: David Wheeler
>
> Part 2 (OpenFormula) section 3.2 "Text:" says:
> "A text value (also called a string value) is a sequence of zero or more characters.
> Evaluators should accept [UNICODE] strings, but shall accept strings of ASCII (Unicode U+0020 through U+007F, inclusive) characters."
> Some commenters on the open comment list believe there should be a stronger requirement:
> http://lists.oasis-open.org/archives/office-comment/201005/msg00002.html
> Certainly from a *user* point of view a stronger requirement would be nice.
> Two basic questions:
> 1. Should the required character set supported by the evaluator at run-time be increased,
>    and if so, to what (BMP or all Unicode/10646)?
> 2. Under what conditions should it be increased?
>    (All implementations? Only those of medium group or up?
>    Maybe require BMP in medium group, and all characters in large group?)
> This is related to:
> http://tools.oasis-open.org/issues/browse/OFFICE-2663
> I'd like implementors to briefly respond with comments to *THIS* JIRA comment,
> noting what they can support and whether there are major "gotchas".
> For example, can everyone support evaluating BMP or all Unicode characters
> at formula runtime *regardless* of the user's locale setting
> (I'm concerned this may be an issue for Excel)?
> Can everyone handle arbitrary characters, or is anyone limited to BMP
> (our 16-bit-char friends can end up with this problem)?
> If anyone is limited, is this a stumbling block?
> I have a particular concern for the implementations that use 16-bit chars internally.
> If you're given a character that is not in the BMP, what do FIND, LEFT, etc. do?
> Do they simply presume (incorrectly) that all chars are in the BMP, and thus you can
> cut out half a character? Or do they count "correctly" to the right character?
> Systems that use UTF-8 internally presumably do this correctly, since they
> have to "count" to get to the right characters anyway, but I'd like to know if that's
> NOT true for anyone.
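
To make the 16-bit-char concern concrete, here is a minimal Python sketch of the two behaviours the issue asks about. It is an illustration only, not how any particular evaluator implements LEFT; the helper names (utf16_units, left_naive, left_by_characters, has_unpaired_surrogate) are hypothetical and chosen just for this example. Strings are modelled as lists of UTF-16 code units, the way a 16-bit-char implementation would presumably store them.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_high(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low(unit):
    return 0xDC00 <= unit <= 0xDFFF


def left_naive(units, n):
    """Hypothetical LEFT that presumes every character is in the BMP:
    it simply takes the first n 16-bit units."""
    return units[:n]


def left_by_characters(units, n):
    """Hypothetical LEFT that counts characters, keeping surrogate pairs whole."""
    out, i, taken = [], 0, 0
    while i < len(units) and taken < n:
        paired = is_high(units[i]) and i + 1 < len(units) and is_low(units[i + 1])
        width = 2 if paired else 1
        out.extend(units[i:i + width])
        i += width
        taken += 1
    return out


def has_unpaired_surrogate(units):
    """True if the unit sequence contains half of a surrogate pair."""
    i = 0
    while i < len(units):
        if is_high(units[i]) and i + 1 < len(units) and is_low(units[i + 1]):
            i += 2
        elif is_high(units[i]) or is_low(units[i]):
            return True
        else:
            i += 1
    return False


# "a", U+1D11E (outside the BMP), "b": three characters, four 16-bit units.
sample = utf16_units("a\U0001D11Eb")
broken = left_naive(sample, 2)          # ends with a lone high surrogate
whole = left_by_characters(sample, 2)   # keeps the surrogate pair together
```

With this sample, the naive version returns "a" followed by half of the non-BMP character, while the character-counting version returns both units of the pair. Which of the two a given implementation does is exactly what implementors are being asked to report.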