
Subject: [OASIS Issue Tracker] Updated: (OFFICE-1935) Review 1.2 specification with respect to Unicode usage



     [ http://tools.oasis-open.org/issues/browse/OFFICE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Weir  updated OFFICE-1935:
---------------------------------

    Issue Type: Bug  (was: Task)

I scanned through the text for problems and generally found that we're doing well.  I've noted a few things that seemed odd to me, but since I am not the text expert, we would benefit from a 2nd set of eyes on the following list:

Part 1, section 6.1.1 -- We say "Text character data contained...".  Does this mean anything different than "character data", which is the term we use elsewhere?

In general, I'm not getting the distinction between "character data", "character content", "text content" and "text character data".

The XML term is "character data" defined as: "All text that is not markup constitutes the character data of the document".

Part 1, section 6.1.2 -- Are we saying that we normalize SPACE to itself?  Is that necessary?  In any case, we should say SPACE (U+0020) at the first occurrence of SPACE in that paragraph.

Section 6.1.6 -- Can we lay out that table so it doesn't split across pages?  The split messes up the 2nd row.

Section 19.135.1 says "this name may contain arbitrary characters".  This is not true.  It must be a well-formed attribute value, so it may not contain the NULL character (U+0000), control characters, etc.  Ditto for 19.474.  We should probably search for "arbitrary characters" and correct it everywhere.
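
To make the "arbitrary characters" point concrete, here is a minimal sketch that checks a candidate name against the Char production of XML 1.0. The function names are hypothetical and nothing here comes from the ODF text; it only illustrates why NULL and most control characters cannot appear in an attribute value.

```python
def is_xml_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # TAB, LF, CR are the only C0 controls allowed
        or 0x20 <= cp <= 0xD7FF        # excludes the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # excludes U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_illegal_char(name: str):
    """Return (index, 'U+XXXX') of the first non-XML character, or None."""
    for index, ch in enumerate(name):
        if not is_xml_char(ch):
            return index, f"U+{ord(ch):04X}"
    return None
```

Note that this covers only the character repertoire. Characters such as "<" and "&" are legal XML characters but still have to be escaped inside an attribute value, so the check above is necessary rather than sufficient.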

Section 19.364 -- I think we need a reference for the Unicode database text file.

Section 19.598 -- "string comprises one or more characters surrounded by quotation marks."  Double quotes?  single quotes?  angle quotes?  Should be explicit.
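
As an illustration of why "quotation marks" is ambiguous, the following sketch lists a few characters a reader might reasonably take the phrase to mean, with their Unicode names and code points. The candidate list is my own choice for illustration, not something taken from the specification.

```python
import unicodedata

# Hypothetical list of characters that could all be read as "quotation marks".
CANDIDATES = ['"', "'", "\u201C", "\u201D", "\u00AB", "\u00BB"]


def describe(ch: str) -> str:
    """Format a character as 'UNICODE NAME (U+XXXX)'."""
    return f"{unicodedata.name(ch)} (U+{ord(ch):04X})"


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))
```

Stating the intended character in this name-plus-code-point form in the text would remove the ambiguity.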

Section 19.762 -- I would delete the "reference" column in that table.

I don't see a definition of CJK or CTL anywhere.

> Review 1.2 specification with respect to Unicode usage
> ------------------------------------------------------
>
>                 Key: OFFICE-1935
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-1935
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: Locale, Text
>    Affects Versions: ODF 1.2 CD 05
>            Reporter: Robert Weir 
>            Assignee: Robert Weir 
>             Fix For: ODF 1.2 CD 06
>
>
> We should review the ODF 1.2 specification, in particular for the following:
> 1) Do all character literals specify their code points, e.g., '1' (U+0031)?  Remember, not every reader of the standard will be a native English speaker or even a native user of Latin-1 characters.  Since Unicode defines several characters that may look like a plus sign, or a dash, we need to be explicit.
> 2) Are we crystal clear on whitespace treatment?
> 3) Bidi?
> 4) Whenever we talk about sorting, are we clear on whether this is lexical or a locale-dependent collation order?
> 5) What Unicode version? 
> 6) For most of ODF we can deal with Unicode characters and strings of Unicode characters without discussing encodings.  For serialization we permit whatever XML permits and we don't need to deal with encoded characters.  However there are some exceptions that we need to be more explicit with.  One is passwords entered during encryption.  Since the encryption algorithms work at the bit level, both encoding and byte ordering need to be specified.
> 7) Any functions that deal with upper case/lower case conversions, such as in OpenFormula, need to make sure they are specified correctly with respect to Unicode.  
> 8) Anything else?
> Suggested search phrases are: character*, sort, search, collation, unicode, encod*, encrypt*, string (unless it is xsd:string), *space, dash, hyphen
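
To illustrate item 6 above, here is a small sketch showing that the same password string yields different byte sequences, and therefore different digests, depending on encoding and byte order. The sample password and the use of SHA-1 are assumptions chosen only for illustration; this is not a statement of what ODF encryption actually specifies.

```python
import hashlib

# Hypothetical password containing one non-ASCII character.
password = "p\u00E4ssword"

# The same abstract string serialized three different ways.
utf8 = password.encode("utf-8")
utf16le = password.encode("utf-16-le")   # little-endian, no byte order mark
utf16be = password.encode("utf-16-be")   # big-endian, no byte order mark


def digest(data: bytes) -> str:
    """Hash the raw bytes; SHA-1 is used here purely as an example."""
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    for label, data in [("UTF-8", utf8), ("UTF-16LE", utf16le), ("UTF-16BE", utf16be)]:
        print(f"{label:9} {data.hex()} -> {digest(data)}")
```

Two implementations that disagree on any of these choices would derive different keys from the same typed password, which is why the text needs to pin down both the encoding and the byte order.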


        

