

Subject: [OASIS Issue Tracker] Commented: (OFFICE-2207) Whitespace processing [N 1309]



    [ http://tools.oasis-open.org/issues/browse/OFFICE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=19365#action_19365 ] 

Dennis Hamilton commented on OFFICE-2207:
-----------------------------------------

In reviewing the Errata CD04-rev02 resolution of this, and looking at the original defect text, I think we may be confusing the issue by saying too much about what [XML 1.0] is thought to say, rather than simply referencing what it is that [XML 1.0] says precisely.  

I'm not sure where this should be handled, but I want to record these observations here so they are not lost in the context of Errata discussion:

 1. I think the distinction is between when white-space characters appear in element-content and when white-space characters appear in the PCDATA of mixed content.  (In [XML 1.0], element-content is element-only content.)

 2. In [XML 1.0], it is presumed that all character data that occurs in the content of the root element, directly or indirectly, is character data of the XML document.  

 3. How white-space characters are handled in element-content is determined by the application of [XML 1.0].  The xml:space="preserve" attribute applies, but the attribute value is a recommendation here.  

  3.1 We can simply say that white-space characters encountered as character data in the immediate content of elements having element-content shall be ignored and any setting of xml:space has no effect for those particular occurrences.  

  3.2 I do not see anywhere that [XML 1.0] says such white-space characters are to be ignored.  Instead it is specified that the application be informed which white-space characters are of this kind.  If we want them ignored, we need to make it our rule.

4. With regard to how mixed-content white-space characters are handled by default, it may be better simply to refer to section 5.1 (or 5.1.1).  It is also valuable to assert that xml:space does not override the specified behavior in any case.

5. Since there is no change being made in the [XML 1.0] rules for treatment of white-space in attribute values, and for elimination of carriage-return characters, it is not necessary to say anything about that.  

   5.1 The statement that "conforming to the XML specification, all possible line-ending cases are ..." can be misleading.  It would be better simply to say that the only way that a carriage-return character, #xD, can occur as character data is via a character reference.  This should come before the statement that makes reference to section 5.1 of ODF 1.0.

  5.2 Since the treatment of attribute values is mandatory, section 3.3.3 of [XML 1.0] might be referenced, but probably in a note that serves as a reminder, rather than appearing as normative content.

6. We should make sure this is cleaned up in ODF 1.2 regardless of handling this in ODF 1.0 Errata CD05 or later.
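
As a reminder of the [XML 1.0] behavior referred to in item 5, here is a minimal sketch of what a parser reports for line ends and attribute values. It assumes Python's standard xml.etree.ElementTree module (built on expat) as the parser; the element name "a" and attribute name "b" are made up for illustration and do not come from ODF.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a literal CR LF pair, a literal lone CR, and a
# character reference &#xD; in character data, plus literal and referenced
# white space in an attribute value.
doc = b'<a b="x\ty\nz&#xD;w">one\r\ntwo\rthree&#xD;four</a>'

root = ET.fromstring(doc)

# Line-end handling ([XML 1.0] section 2.11): the literal CR LF pair and the
# literal lone CR are each reported as a single LF; only the character
# reference produces a #xD in the character data.
text = root.text

# Attribute-value normalization ([XML 1.0] section 3.3.3): the literal tab
# and line feed are each replaced by a space, while the #xD supplied through
# a character reference is kept.
attr = root.get("b")

print(repr(text))
print(repr(attr))
```

Both behaviors happen inside the parser before the application sees any data, which is why they probably need no normative restatement in ODF.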

> Whitespace processing [N 1309]
> ------------------------------
>
>                 Key: OFFICE-2207
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-2207
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: General
>    Affects Versions: ODF 1.0, ODF 1.0 (second edition)
>            Reporter: Robert Weir 
>            Assignee: Svante Schubert 
>             Fix For: ODF 1.0 Errata CD 5
>
>
> Submitter ID
>     GB-26300-34
> Nature of defect
>     Technical
> Document
>     ISO/IEC 26300:2006
> Clause
>     1.6
> Page
>     34
> Description of issue
> It is stated that "In conformance with the W3C XML specification [XML1.0], optional white-space characters that are contained in elements that have element content (in other words that must contain elements only but not text) are ignored".
>     * It is not clear what "optional white-space characters" are (the term is not defined in XML 1.0), or how the described behaviour conforms to XML 1.0.
>     * Does the phrase "elements that have element content" mean elements that have only element content? This cannot make sense, as whitespace is itself text content.
>     * Consider the markup <text:p><text:span>Hello</text:span> <text:span>world</text:span></text:p>. If processed according to the text above, the space between the words here would be ignored, yet no known ODF processor actually respects this provision.
> Proposal
> Reform the text to answer the above queries and modify the stated processing behaviour to accord with the existing corpus of documents and processors.
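
The markup quoted in the description can be used to see what is at stake. The following is only a sketch, assuming Python's standard xml.etree.ElementTree module as the parser; the helper paragraph_text and its drop_whitespace_only flag are hypothetical names introduced here, and the namespace URI is supplied by assumption because the quoted fragment does not declare the "text" prefix. The parser reports the space between the two spans as character data either way; whether it is then ignored is a decision made by the application.

```python
import xml.etree.ElementTree as ET

# Assumed namespace binding for the "text" prefix; the quoted fragment
# itself carries no declaration.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

doc = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    "<text:span>Hello</text:span> <text:span>world</text:span>"
    "</text:p>"
)


def paragraph_text(p, drop_whitespace_only):
    """Concatenate the character data of p (hypothetical helper).

    When drop_whitespace_only is true, whitespace-only runs that follow a
    child element are discarded, mimicking a reading in which such white
    space is ignored.
    """
    parts = [p.text or ""]
    for child in p:
        parts.append("".join(child.itertext()))
        tail = child.tail or ""
        if drop_whitespace_only and tail.strip() == "":
            tail = ""
        parts.append(tail)
    return "".join(parts)


p = ET.fromstring(doc)

# The parser hands the single space to the application as character data.
space_between_spans = p[0].tail

preserved = paragraph_text(p, drop_whitespace_only=False)
ignored = paragraph_text(p, drop_whitespace_only=True)

print(repr(space_between_spans), repr(preserved), repr(ignored))
```

Comparing the two results shows how the words run together if the space is discarded, which is the behavior the defect report says no known ODF processor exhibits.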

