office message

Subject: [OASIS Issue Tracker] Commented: (OFFICE-3706) Possible clarification in office/v1.2/cos01/part1/6.1.2 about whitespace...

From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
To: office@lists.oasis-open.org
Date: Fri, 22 Jul 2011 21:16:39 -0400 (EDT)


    [ http://tools.oasis-open.org/issues/browse/OFFICE-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=26551#action_26551 ] 

Ben Martin  commented on OFFICE-3706:
-------------------------------------

Dennis: To clarify the last paragraph of your first comment, I assume the following, using "^" to make the space (U+0020) character visible:

bar^<text:s/>^^^<text:s/>^^foo

When loaded, this would logically become:
bar^^^^^foo

Here each text:s is converted to a "^" space (U+0020) character, and the two runs of consecutive original SPACE (U+0020) characters are each collapsed to a single space.
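
As a rough sketch of this reading (not a definitive implementation), the loading step could look like the following in Python. The helper name load_flat is hypothetical, the wrapping text:p element is added only so the parser knows the "text:" prefix, and the sketch assumes that literal spaces are collapsed within each text node while every text:s contributes its own space:

```python
import re
import xml.etree.ElementTree as ET

# Namespace bound to the "text:" prefix in ODF documents.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def load_flat(fragment: str) -> str:
    """Logical text of a paragraph whose only child elements are text:s."""
    # Wrap the fragment so the "text:" prefix is declared for the parser.
    root = ET.fromstring(f'<text:p xmlns:text="{TEXT_NS}">{fragment}</text:p>')

    def collapse(text):
        # Assumption: literal spaces are collapsed within each text node only.
        return re.sub(" +", " ", text or "")

    parts = [collapse(root.text)]
    for child in root:
        if child.tag == f"{{{TEXT_NS}}}s":
            # text:c is the optional repeat count; one space by default.
            parts.append(" " * int(child.get(f"{{{TEXT_NS}}}c", "1")))
        parts.append(collapse(child.tail))
    return "".join(parts)


# "bar", five spaces, "foo"
print(repr(load_flat("bar <text:s/>   <text:s/>  foo")))
```

The sketch deliberately ignores nested elements such as text:span and the other steps of 6.1.2 (for example, leading white space at the start of the paragraph).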

This might be written to ODF again as:
bar^<text:s/><text:s/><text:s/><text:s/>foo
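
A minimal sketch of such a writer, assuming it keeps the first space of each run as a literal character and emits one text:s per additional space (the function name encode_spaces is hypothetical; a real writer could equally use a single text:s with a text:c count):

```python
import re
from xml.sax.saxutils import escape


def encode_spaces(text: str) -> str:
    """Serialize logical paragraph text so that runs of spaces survive loading."""

    def encode_run(match):
        run = match.group(0)
        # First space stays literal; each further space becomes a text:s element.
        return " " + "<text:s/>" * (len(run) - 1)

    # Escape XML special characters first; escaping never introduces spaces.
    return re.sub(" +", encode_run, escape(text))


print(encode_spaces("bar     foo"))
# bar <text:s/><text:s/><text:s/><text:s/>foo
```

Writing the loaded form this way gives a stable result: loading and saving the output again should reproduce the same markup, even though it differs from the original input.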

It is probably unlikely that an application would write out the original content again.

I like to think of how ODF fragments will converge when read and written multiple times. In this case, I am thinking that a reader of the original content might collapse the "^" space (U+0020) characters as XML text nodes are seen, keeping explicit track of whether the last XML element was a text:s or itself contained text content ending in an explicit space (U+0020) character.
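
The reader described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the name collapse_runs and the (owner, text) run format are hypothetical, a single flag carries the "previous character was a literal space" state across element boundaries, and a text:s resets that flag so that a literal space following it is kept.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
S_TAG = f"{{{TEXT_NS}}}s"


def collapse_runs(paragraph_xml):
    """Return (owner, text) runs; owner is the local name of the enclosing element."""
    root = ET.fromstring(paragraph_xml)
    runs = []
    prev_literal_space = False  # state carried across text nodes and elements

    def emit(owner, text):
        nonlocal prev_literal_space
        kept = []
        for ch in text or "":
            if ch == " " and prev_literal_space:
                continue  # drop a space that directly follows a literal space
            prev_literal_space = ch == " "
            kept.append(ch)
        if kept:
            runs.append((owner, "".join(kept)))

    def walk(elem):
        nonlocal prev_literal_space
        owner = elem.tag.split("}")[-1]
        emit(owner, elem.text)
        for child in elem:
            if child.tag == S_TAG:
                # text:s always yields its spaces and resets the collapse state.
                runs.append((owner, " " * int(child.get(f"{{{TEXT_NS}}}c", "1"))))
                prev_literal_space = False
            else:
                walk(child)
            emit(owner, child.tail)

    walk(root)
    return runs


example = (f'<text:p xmlns:text="{TEXT_NS}">'
           'Hi there <text:span>foo </text:span> bar</text:p>')
print(collapse_runs(example))
# [('p', 'Hi there '), ('span', 'foo '), ('p', 'bar')]
```

Applied to the example from the original report, the surviving space belongs to the text:span and the space before "bar" is dropped, which matches the OpenOffice 3.3.0 behaviour described there.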


> Possible clarification in office/v1.2/cos01/part1/6.1.2 about whitespace...
> ---------------------------------------------------------------------------
>
>                 Key: OFFICE-3706
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-3706
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: Text
>            Reporter: Ben Martin 
>            Priority: Minor
>
>   I was recently hacking on some ODT import code and was clarifying white
> space handling with respect to text:p in the spec. 
> Looking at the steps shown in 6.1.2:
> 2) The character data of the paragraph element and of all descendant
> elements for which the OpenDocument schema permits the inclusion of
> character data for the element itself and all its ancestor elements up
> to the paragraph element, is concatenated in document order.
> 4) Sequences of " " (U+0020, SPACE) characters are replaced by a single
> " " (U+0020, SPACE) character.
> Consider the following contrived example:
> <text:p>Hi there <text:span>foo </text:span> bar</text:p>
> This would seem to mean that (2) would give
> "Hi there foo  bar"
> and the application of (4) would then make
> "Hi there foo bar"
> If so, logically is the space in the text:span to be removed or the one
> before the "bar"? It seems OpenOffice 3.3.0 removes the second of the
> two spaces. That is, if the span containing foo is bold, then the single
> remaining space is bold too in an ODT file saved out of OO again.
> I assume this is the desired behaviour?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://tools.oasis-open.org/issues/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

References:
- [OASIS Issue Tracker] Created: (OFFICE-3706) Possible clarification in office/v1.2/cos01/part1/6.1.2 about whitespace...
  - From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>