Subject: [OASIS Issue Tracker] Commented: (OFFICE-3706) Possible clarification in office/v1.2/cos01/part1/6.1.2 about whitespace...
[ http://tools.oasis-open.org/issues/browse/OFFICE-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=26551#action_26551 ]

Ben Martin commented on OFFICE-3706:
-------------------------------------

Dennis: To clarify the last paragraph of your first comment, I assume that the following, using "^" to show the space (U+0020) character visibly:

    bar^<text:s/>^^^<text:s/>^^foo

when loaded would logically become:

    bar^^^^^foo

where each text:s is converted to a "^" space (U+0020) character and the two runs of original (U+0020) SPACE characters are each collapsed to a single space. This might be written to ODF again as:

    bar^<text:s/><text:s/><text:s/><text:s/>foo

It is fairly unlikely that an application would reproduce the original content on output.

I like to think about how ODF fragments will converge when read and written multiple times. In this case, I am thinking that a reader of the original content might collapse the "^" space (U+0020) characters as XML text nodes are seen, keeping track explicitly of whether the last XML element was a text:s or itself contained text content that ended in an explicit space (U+0020) character.

> Possible clarification in office/v1.2/cos01/part1/6.1.2 about whitespace...
> ---------------------------------------------------------------------------
>
>                 Key: OFFICE-3706
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-3706
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: Text
>            Reporter: Ben Martin
>            Priority: Minor
>
> I was recently hacking on some ODT import code and was clarifying white
> space handling with respect to text:p in the spec.
> Looking at the steps shown in 6.1.2:
> 2) The character data of the paragraph element and of all descendant
> elements for which the OpenDocument schema permits the inclusion of
> character data for the element itself and all its ancestor elements up
> to the paragraph element, is concatenated in document order.
> 4) Sequences of " " (U+0020, SPACE) characters are replaced by a single
> " " (U+0020, SPACE) character.
>
> Consider the following contrived example:
> <text:p>Hi there <text:span>foo </text:span> bar</text:p>
> This would seem to mean that (2) would give
> "Hi there foo  bar"
> and the application of (4) would then make
> "Hi there foo bar"
> If so, logically, is the space in the text:span to be removed, or the one
> before the "bar"? It seems OpenOffice 3.3.0 removes the second of the
> two spaces. That is, if the span containing foo is bold, then the single
> remaining space is bold too in an ODT file saved out of OO again.
> I assume this is the desired behaviour?

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://tools.oasis-open.org/issues/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
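
As a rough illustration of the reading strategy discussed in this thread, here is a minimal Python sketch of steps (2) and (4) as quoted from 6.1.2, plus a writer that emits every space after the first in a run as text:s. It is only one possible interpretation, not what the spec or any application mandates. The function names (para, load_runs, load_text, save_text) are hypothetical. It assumes that a text:s always yields explicit spaces (honouring a text:c count if present) and resets the collapsing state, and that when two literal spaces meet across an element boundary the first one survives, which is the OpenOffice 3.3.0 behaviour reported in the issue. The other steps of 6.1.2, which are not quoted here, are not modelled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
S_TAG = "{%s}s" % TEXT_NS   # text:s element
C_ATTR = "{%s}c" % TEXT_NS  # text:c attribute (space count on text:s)


def para(inner):
    """Wrap a fragment in a text:p that declares the text: namespace."""
    return '<text:p xmlns:text="' + TEXT_NS + '">' + inner + "</text:p>"


def load_runs(xml):
    """Return [(owner_local_name, text), ...] in document order.

    Literal U+0020 runs are collapsed as text nodes are seen, even across
    element boundaries; text:s is never collapsed (assumption, see lead-in).
    """
    runs = []
    after_space = False  # True if the last character kept was a literal space

    def local(el):
        return el.tag.rsplit("}", 1)[-1]

    def emit(owner, text):
        nonlocal after_space
        kept = []
        for ch in text:
            if ch == " " and after_space:
                continue  # step (4): drop repeated literal space
            after_space = ch == " "
            kept.append(ch)
        if kept:
            runs.append((owner, "".join(kept)))

    def walk(el):
        nonlocal after_space
        if el.tag == S_TAG:
            runs.append((local(el), " " * int(el.get(C_ATTR, "1"))))
            after_space = False  # a following literal space is kept
            return
        emit(local(el), el.text or "")
        for child in el:
            walk(child)
            emit(local(el), child.tail or "")  # tail text belongs to the parent

    walk(ET.fromstring(xml))
    return runs


def load_text(xml):
    """Logical text of the paragraph after loading."""
    return "".join(text for _, text in load_runs(xml))


def save_text(text):
    """Serialise text, writing each space that follows a space as text:s."""
    out = []
    prev_space = False
    for ch in text:
        out.append("<text:s/>" if ch == " " and prev_space else escape(ch))
        prev_space = ch == " "
    return "".join(out)


if __name__ == "__main__":
    loaded = load_text(para("bar <text:s/>   <text:s/>  foo"))
    print(repr(loaded))
    print(save_text(loaded))
    print(load_runs(para("Hi there <text:span>foo </text:span> bar")))
```

Under these assumptions, the fragment from the comment loads as "bar", five spaces, "foo", is written back as one literal space followed by four text:s elements, and loads to the same text again, so a read/write cycle converges after one pass. For the contrived example from the issue, the surviving space stays inside the text:span run and the space before "bar" is dropped.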