office message

Subject: [OASIS Issue Tracker] Commented: (OFFICE-2707) ODF 1.2 Section 5.1 No Longer Differentiates In-Line Text

    [ http://tools.oasis-open.org/issues/browse/OFFICE-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=21676#action_21676 ] 

Dennis Hamilton commented on OFFICE-2707:

Here is what I am struggling with:

In ODF 1.2 CD05 Part 1 5.1.1 it says

"The <text:h> and <text:p> elements represent headings and paragraphs, respectively.  Headings and paragraphs are collectively referred to as *paragraph elements*." (The asterisks indicate italics.)

In ODF 1.2 CD05 Part 1 6.1.1 it says

The paragraph elements <text:p> and <text:h> and their descendant elements contain the text content of any document. The character content of a paragraph consists of the character data of the paragraph element and the character data of its descendant elements concatenated in document order, with the following exceptions:

where the exceptions are descendant elements that, although they have text children and descendants, do not count.

The problem is that, in effect, *any* <text:p> or <text:h> carries the text content of the document, except for the text in the named descendants.

There are places besides <text:p> and <text:h> that have paragraph-content (according to the schema).  These include <text:a>, <text:meta>, <text:meta-field>, and <text:span>, but it appears these always occur within <text:p> or <text:h> elements, so they are covered by the definition.

There are places that have <text:p> and <text:h> that do not provide text content of a document (at least not in the sense of the in-line content of the document as accepted).  One place is <office:change-info> (see OFFICE-3383 about that as well).  Another is <text:delete>, and also header-footer-content in <style:header*> and <style:footer*> elements.  Some qualification seems to be needed about these.  The same is true of <text:section> when it occurs in <text:delete>, in header and footer styles, and perhaps in places like <text:index-title>.
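To make the kind of qualification concrete, here is a minimal sketch in Python that collects <text:p> and <text:h> elements while refusing to descend into the containers named above. The function names `is_excluded` and `body_paragraphs` and the sample document are hypothetical, and the container names are taken as written in this comment rather than checked against the schema.

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

PARAGRAPH_TAGS = {f"{{{TEXT}}}p", f"{{{TEXT}}}h"}

def is_excluded(tag):
    """Assumed exclusion rule: the containers named in the comment above."""
    return (
        tag == f"{{{OFFICE}}}change-info"
        or tag == f"{{{TEXT}}}delete"  # name as written in the comment, unverified
        or tag.startswith(f"{{{STYLE}}}header")
        or tag.startswith(f"{{{STYLE}}}footer")
    )

def body_paragraphs(root):
    """Return paragraph elements that are not inside an excluded container."""
    found = []

    def walk(elem):
        if is_excluded(elem.tag):
            return  # skip the whole subtree
        if elem.tag in PARAGRAPH_TAGS:
            found.append(elem)
            return  # simplification: nested paragraphs are not examined
        for child in elem:
            walk(child)

    walk(root)
    return found

# Hypothetical sample document, for illustration only.
SAMPLE = f"""<office:text xmlns:office="{OFFICE}" xmlns:text="{TEXT}" xmlns:style="{STYLE}">
  <text:h>Title</text:h>
  <office:change-info><text:p>Change comment</text:p></office:change-info>
  <style:header><text:p>Page header</text:p></style:header>
  <text:section><text:p>Body</text:p></text:section>
</office:text>"""

doc = ET.fromstring(SAMPLE)
print([p.text for p in body_paragraphs(doc)])
```

A <text:section> is still traversed here when it is not itself inside an excluded container, which matches the qualification suggested above.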


I originally noticed this as a breaking difference from the text of ODF 1.1.  Here is the relevant text of ODF 1.1 Section 5.1 Basic Text Content, which attempts to define what is the in-line text.  Compare with CD05-1 6.1.1.  (The emphasis is mine).

Paragraph element's children make up the text content of any document. All text contained in a paragraph element or their children is text content, WITH FEW EXCEPTIONS DETAILED LATER. This should significantly ease transformations into other formats, since transformations may ignore any child elements of paragraph elements and only process their text content, and still obtain a faithful representation of text content.
Text content elements that do not contain IN-LINE TEXT CHILDREN are:
• (foot- and end-)notes (see section 5.3)
Foot- and endnotes contain text content, but are typically displayed outside the main text content, e.g., at the end of a page or document.
• rubies (see section 5.4)
Ruby texts are usually displayed above or below the main text.
• annotations (see section 5.5)
Annotations are typically not displayed.
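
The extraction that the ODF 1.1 text describes can be sketched in Python as follows: concatenate character data in document order and skip the subtrees of the listed exceptions. The element names chosen for the three exceptions (<text:note>, <text:ruby-text>, <office:annotation>), the function name `inline_text`, and the sample paragraph are assumptions for illustration, not text from either specification.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Assumed element names for notes, ruby texts, and annotations.
EXCLUDED = {
    f"{{{TEXT_NS}}}note",
    f"{{{TEXT_NS}}}ruby-text",
    f"{{{OFFICE_NS}}}annotation",
}

def inline_text(elem):
    """Concatenate character data in document order, skipping excluded subtrees."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag not in EXCLUDED:
            parts.append(inline_text(child))
        # The tail follows the child's end tag and belongs to the parent,
        # so it is kept even when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)

# Hypothetical sample paragraph, for illustration only.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}" xmlns:office="{OFFICE_NS}">Main'
    "<text:note><text:note-body><text:p>footnote</text:p></text:note-body></text:note>"
    " text <text:span>here</text:span>"
    "<office:annotation><text:p>remark</text:p></office:annotation>.</text:p>"
)

paragraph = ET.fromstring(SAMPLE)
print(inline_text(paragraph))  # prints "Main text here."
```

Note that the sketch treats whatever paragraph element it is handed as in-line text; it does not ask where that paragraph sits in the document.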

The problem with the statement in ODF 1.1 is its assumption that all paragraph elements contribute to in-line text.

ODF 1.2 CD05 removes discussion of in-line text (and that should be noted in the Appendix on Changes) but I think it still needs to identify when <text:p> and <text:h> elements should not be considered to be providing "[in-line] text of the document."

> ODF 1.2 Section 5.1 No Longer Differentiates In-Line Text
> ---------------------------------------------------------
>                 Key: OFFICE-2707
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-2707
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Issue Type: Bug
>          Components: Part 1 (Schema), Text
>    Affects Versions: ODF 1.2 CD 05
>         Environment: This change applies to ODF 1.2 Part 1 CD04 through ODF 1.2 Part 1 CD04-rev09.  Previous versions distinguished in-line text in some manner and would have been subject to OFFICE-2706
>            Reporter: Dennis Hamilton
>            Assignee: Dennis Hamilton
>             Fix For: ODF 1.2 CD 06
> There is no longer any differentiation between in-line text and text elsewhere (that is, character data content of <text:h> and <text:p> elements), as is done in ODF 1.0/1.1/IS 26300 and, in modified form, ODF 1.2 Part 1 CD03 and earlier.  The informational indication of text which might be extracted as in-line text content is removed.
> There is no argument that the differentiation between in-line text and text of other kinds is not that simple.  However, this change is not noted as something of material importance that differs between ODF 1.2 and earlier versions.

