


Subject: [OASIS Issue Tracker] Updated: (OFFICE-2207) Whitespace processing [N 1309]



     [ http://tools.oasis-open.org/issues/browse/OFFICE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Svante Schubert  updated OFFICE-2207:
-------------------------------------

    Resolution: 
1.6 White-Space Processing and EOL Handling
Replace:
"In conformance with the W3C XML specification [XML1.0], optional white-space characters that are contained in elements that have element content (in other words that must contain elements only but not text) are ignored. This applies to the following white-space and end-of-line (EOL) [UNICODE] characters: 
HORIZONTAL TABULATION (0x0009)
LINE FEED (0x000A)
CARRIAGE RETURN (0x000D)
SPACE (0x0020)
For any other element, white-spaces are preserved by default. Unless otherwise stated, there is no special processing for any of the four white-space characters. For some elements, different white-space processing may take place, for example the paragraph element."
with
"An ODF processor shall ignore white-space in element content for elements not declared to contain text content in the ODF schema. This applies to the following white-space and end-of-line (EOL) [UNICODE] characters: 
HORIZONTAL TABULATION (0x0009)
LINE FEED (0x000A)
CARRIAGE RETURN (0x000D)
SPACE (0x0020)
For any other element, white-spaces are preserved by default. Unless otherwise stated, there is no special processing for any of the four white-space characters. For some elements, different white-space processing may take place, for example the paragraph element." 

Remove:
The XML specification also requires that any of the four white-space characters that is contained in an attribute value is normalized to a SPACE character.
One of the following characters may be used to represent line ends: 
LINE FEED
CARRIAGE RETURN 
The sequence of the characters CARRIAGE RETURN and LINE FEED 
Conforming to the XML specification, all the possible line ends are normalized to a single LINE FEED character.
As a consequence of the white-space and EOL processing rules, any CARRIAGE RETURN characters that are contained either in the text content of an element or in an attribute value must be encoded by the character entity &#x0D;. The same applies to the HORIZONTAL TABULATION and LINE FEED characters if they are contained in an attribute value.

  was:
1.6 White-Space Processing and EOL Handling
Replace:
"In conformance with the W3C XML specification [XML1.0], optional white-space characters that are contained in elements that have element content (in other words that must contain elements only but not text) are ignored. This applies to the following white-space and end-of-line (EOL) [UNICODE] characters: 
HORIZONTAL TABULATION (0x0009)
LINE FEED (0x000A)
CARRIAGE RETURN (0x000D)
SPACE (0x0020)
For any other element, white-spaces are preserved by default. Unless otherwise stated, there is no special processing for any of the four white-space characters. For some elements, different white-space processing may take place, for example the paragraph element."
with
"In conformance with the W3C XML specification [XML1.0], white-space characters that are contained in elements that have element content (in other words that must contain elements only but not text) are ignored. This applies to the following white-space and end-of-line (EOL) [UNICODE] characters: 
HORIZONTAL TABULATION (0x0009)
LINE FEED (0x000A)
CARRIAGE RETURN (0x000D)
SPACE (0x0020)
For any other element, white-spaces are preserved by default. Unless otherwise stated, there is no special processing for any of the four white-space characters. For some elements, different white-space processing may take place, for example the paragraph element." 

Remove:
The XML specification also requires that any of the four white-space characters that is contained in an attribute value is normalized to a SPACE character.
One of the following characters may be used to represent line ends: 
LINE FEED
CARRIAGE RETURN 
The sequence of the characters CARRIAGE RETURN and LINE FEED 
Conforming to the XML specification, all the possible line ends are normalized to a single LINE FEED character.
As a consequence of the white-space and EOL processing rules, any CARRIAGE RETURN characters that are contained either in the text content of an element or in an attribute value must be encoded by the character entity &#x0D;. The same applies to the HORIZONTAL TABULATION and LINE FEED characters if they are contained in an attribute value.
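
For reference, the XML 1.0 behaviour restated in the paragraph marked for removal can be observed with a short Python sketch. This is only an illustration using the standard library's expat-based parser, not part of the errata; the function name and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_sample():
    """Parse a sample document mixing literal and escaped white-space."""
    # Literal TAB and LF inside the attribute value, followed by the same
    # characters written as character references. The element text contains
    # a CR LF pair, a lone CR, and an escaped CR.
    doc = (
        b'<root attr="a\tb\nc&#x9;d&#xA;e&#xD;f">'
        b'line1\r\nline2\rline3&#xD;end</root>'
    )
    root = ET.fromstring(doc)
    return root.get("attr"), root.text


attr_value, text_value = parse_sample()

# Literal white-space in the attribute value is normalized to SPACE,
# while characters written as character references survive unchanged.
print(repr(attr_value))

# CR LF and lone CR are normalized to a single LF; the escaped CR is kept.
print(repr(text_value))
```

The sketch shows why a CARRIAGE RETURN that must survive parsing has to be written as a character reference, and why the same holds for TAB and LINE FEED inside attribute values.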


Regarding comment from http://tools.oasis-open.org/issues/browse/OFFICE-2733

"Get rid of 1.6 entirely.  It is merely useless and confusing, since 
it is merely repeats what is normatively specified in XML 1.0.

Moreover, what do you mean by "element contents" here?  
In XML 1.0, this term makes sense only when you have DTDs.
Since no ODF documents have DTDs, this term means nothing 
here."

The XML specification defines element content without requiring a DTD, see http://www.w3.org/TR/REC-xml/#dt-elemcontent.
White-space handling is indeed already specified in the XML specification, but as ODF does not use the xml:space attribute, this section still seems helpful.

The resolution was adapted based on feedback from Alex Brown on the SC34 WG6 list.
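
As a rough illustration of the adapted rule, the following Python sketch drops white-space-only character data from elements that are not declared to contain text, and leaves it alone elsewhere. It is a sketch under assumptions, not a description of any existing ODF processor: the set TEXT_CONTENT_ELEMENTS is a hand-picked stand-in for what a real processor would derive from the ODF schema, and the helper names are hypothetical. The sample markup is the one from the issue description.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Assumption: a tiny illustrative subset of the elements that the ODF schema
# declares to contain text. A real processor would take this from the schema.
TEXT_CONTENT_ELEMENTS = {
    f"{{{TEXT_NS}}}p",
    f"{{{TEXT_NS}}}h",
    f"{{{TEXT_NS}}}span",
}

# The four white-space characters listed in section 1.6.
WHITESPACE = "\t\n\r "


def is_ignorable(chunk):
    """True if chunk is character data made up only of the four characters."""
    return chunk is not None and chunk.strip(WHITESPACE) == ""


def strip_ignorable_whitespace(elem):
    """Drop white-space-only content of elements not declared to hold text."""
    if elem.tag not in TEXT_CONTENT_ELEMENTS:
        if is_ignorable(elem.text):
            elem.text = None
        # A child's tail is character data belonging to the parent's content.
        for child in elem:
            if is_ignorable(child.tail):
                child.tail = None
    for child in elem:
        strip_ignorable_whitespace(child)


def demo():
    doc = (
        f'<office:text xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">\n'
        "  <text:p><text:span>Hello</text:span> "
        "<text:span>world</text:span></text:p>\n"
        "</office:text>"
    )
    root = ET.fromstring(doc)
    strip_ignorable_whitespace(root)
    return root


root = demo()
paragraph = root[0]
print(repr("".join(paragraph.itertext())))
```

Under these assumptions the indentation inside office:text is discarded, while the single space between the two text:span elements is kept because text:p is declared to contain text.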



> Whitespace processing [N 1309]
> ------------------------------
>
>                 Key: OFFICE-2207
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-2207
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: General
>    Affects Versions: ODF 1.0, ODF 1.0 (second edition)
>            Reporter: Robert Weir 
>            Assignee: Svante Schubert 
>             Fix For: ODF 1.0 Errata CD 5
>
>
> Submitter ID
>     GB-26300-34
> Nature of defect
>     Technical
> Document
>     ISO/IEC 26300:2006
> Clause
>     1.6
> Page
>     34
> Description of issue
> It is stated that "In conformance with the W3C XML specification [XML1.0], optional white-space characters that are contained in elements that have element content (in other words that must contain elements only but not text) are ignored".
>     * It is not clear what "optional white-space characters" are (the term is not defined in XML 1.0), or how the described behaviour conforms to XML 1.0.
>     * Does the phrase "elements that have element content" mean elements that have only element content? This cannot make sense, as whitespace is itself text content.
>     * Consider the markup <text:p><text:span>Hello</text:span> <text:span>world</text:span></text:p>. If processed according to the text above, the space between the words here would be ignored, yet no known ODF processor actually respects this provision.
> Proposal
> Reform the text to answer the above queries and modify the stated processing behaviour to accord with the existing corpus of documents and processors.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://tools.oasis-open.org/issues/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

