Subject: RE: [office-formula] Summary 2010-09-07 of OpenFormula meeting
I don't understand the statement about surrogate pairs.

My understanding is that the surrogate code values are not valid Unicode code points. Surrogate pairs are, in my understanding, only meaningful in UTF-16 encoding. The surrogate codes are blocked out from Unicode code points as a convenience and for historical reasons. Use of surrogate code values as code points is ill-formed. (You should never see a surrogate value in a UTF-8-encoded Unicode text, for example, and not in UTF-32 either.)

My reading of XPATH section 3.6 is that they are emphatically not assuming UTF-16 and there is a warning that when handling UTF-16 encodings, implementers should be careful to treat surrogate pairs as the single Unicode code point that is represented by the pair. (The XPATH specification also refers to Unicode code points as Unicode abstract character scalar values.)

Although XPATH warns that if two strings do not happen to be normalized the same way, unexpected results may occur when those strings are compared, I don't believe that XPATH requires or provides any normalizing. (Nor should we, IMHO.)

 - Dennis

-----Original Message-----
From: David A. Wheeler [mailto:dwheeler@dwheeler.com]
Sent: Tuesday, September 07, 2010 09:01
To: office-formula@lists.oasis-open.org
Subject: [office-formula] Summary 2010-09-07 of OpenFormula meeting

Summary 2010-09-07 of OpenFormula meeting

[ ... ]

* OFFICE-2663 http://tools.oasis-open.org/issues/browse/OFFICE-2663
Wheeler: MacOS imposes a different normalization than everyone else.
Eike: CODE is a bad example, it depends on the code page.
Weir: For Unicode, can we just say implementation-defined, but must be first Unicode character or "logical" value? What's that?
Wheeler: Could we just say Unicode codepoints?
Patrick: No. If you look at XPATH language, it says surrogate pairs should be treated specially.
Wheeler: XPATH doesn't require normalization, it just warns about "unexpected results"; can we do the same?

[ ... ]
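
For readers following the surrogate-pair and normalization points in this thread, here is a minimal sketch in Python. It is illustrative only and is not taken from the XPATH or OpenFormula texts. The helper names (utf16_code_units, combine_surrogates, encodes_as_utf8) are hypothetical, chosen for this example. It assumes Python 3, where a str is a sequence of code points rather than UTF-16 code units.

```python
import unicodedata


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers.

    Hypothetical helper for illustration; big-endian is used so that no
    byte order mark is emitted.
    """
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encodes_as_utf8(text):
    """Return True if text can be encoded as well-formed UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"

# One code point, but two code units once encoded as UTF-16.
assert len(clef) == 1
assert utf16_code_units(clef) == [0xD834, 0xDD1E]

# The pair maps back to the single code point it represents.
assert combine_surrogates(0xD834, 0xDD1E) == 0x1D11E

# A lone surrogate value is rejected by the UTF-8 encoder.
assert not encodes_as_utf8("\ud834")

# Normalization: the same visible text can have different code point
# sequences, so a plain comparison reports them as unequal.
composed = "\u00e9"      # e with acute, one code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT
assert composed != decomposed
assert unicodedata.normalize("NFC", decomposed) == composed
```

A function such as CODE that is defined over code points would, under this reading, return 0x1D11E for the clef character regardless of whether the implementation stores text as UTF-8, UTF-16, or UTF-32. The last two assertions show the kind of "unexpected results" the comparison warning refers to when strings are not normalized the same way.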