Hi,
As the next feature set, I would like to provide support for the
insertion/deletion of the following components:
- Paragraphs and their text and image descendants (at first only
images with an external reference and positioned as a character)
- Style properties for the above components via
<text:span/>, and ECT support for image properties
Below I have arranged some notes into chapters. They do not yet
include all previous mails, nor have I mapped the content back to the
Wiki. In particular, improving the general definition of operations
is a priority for the next iteration.
---
Purpose
The purpose of change-tracking is to identify changes made to the
document and to make it possible to undo them.
Earlier ODF versions inserted ODF snippets into the document to
represent the state before and after the change. An even worse
approach would have been to save a complete document to represent
the previous state.
Operations
The most efficient way to represent a change is to apply the
convention-over-configuration paradigm: identify repeating patterns
of XML change and define each such pattern as an operation in the
ODF specification (e.g. defining the XML change pattern for moving a
table row as a parametrized move-row operation on a table). With
this approach only the parametrized operation is used, as a
placeholder for the ODF XML change.
Defining changes explicitly is especially helpful because current
ODF applications do not accept arbitrary changes. The subset of
possible ODF changes for change-tracking can now be precisely named.
Components
To ease the identification of ODF XML changes, we map the ODF
document to a tree of components. A component is a theoretical
construct that in general consists of one or more XML elements and
represents a basic logical piece of the document to the user. The
XML might come from different files within the ZIP. Apart from XML,
every single character represents a component. Every component
consisting of XML has a "head element" marking the beginning of the
component, such as <table:table>. Constructing the component tree
helps us to identify the location of a change.
Referencing Operations
Earlier, a document had to be parsed in full to identify all
potential changes, as changes were represented by ODF XML snippets.
Now the location of a change is the location of the affected
component, given by an operation parameter that is a list of
numbers.
For instance: "/1/3" is the third child of the first component of
the document. The type is not of importance; the addressed component
might be anything, for instance a character or an image.
By referencing the change, the change can be dispatched without the
content, which is a mandatory requirement for efficient real-time
collaboration.
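The referencing scheme above can be sketched in a few lines of
Python. This is only an illustration under my own assumptions: the
dictionary layout with "type" and "children" and the function names
parse_path and resolve are hypothetical and not part of the proposal.

```python
def parse_path(path):
    """Turn a reference such as "/1/3" into the list [1, 3]."""
    if not path.startswith("/"):
        raise ValueError("a component reference starts with '/': %r" % path)
    positions = [int(step) for step in path[1:].split("/")]
    if any(position < 1 for position in positions):
        raise ValueError("positions are 1-based: %r" % path)
    return positions


def resolve(document, path):
    """Return the component addressed by path, whatever its type is."""
    component = document
    for position in parse_path(path):
        component = component["children"][position - 1]
    return component


# Hypothetical document: one paragraph holding "Hi" followed by an image.
# Single characters are components too, modelled here as 1-char strings.
document = {
    "type": "document",
    "children": [
        {
            "type": "paragraph",
            "children": ["H", "i", {"type": "image", "children": []}],
        },
    ],
}

third = resolve(document, "/1/3")  # the image component
```

Note that resolve never looks at the component type on the way down,
which matches the statement that the type is not of importance.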
(Text) Operations
Addition
In general, the most important part of a text document is the text.
In ODF, text can only exist within a text container such as a
paragraph.
(Note for later: To simplify operations, a list and a heading are
treated as paragraphs as well, only with additional attributes such
as list and outline level.)
The default paragraph component is represented by an empty
<text:p/> element.
Operations do not redefine the schema. Therefore, inserting a
paragraph at the first position of a document, "/1", adds the
<text:p/> element in content.xml as a child of
"/office:document/office:body/office:text/", after any
elements declared as "office-text-content-prelude" in the RelaxNG schema (http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng).
NOTE: As the ECT proposal only covers paragraphs within the text
flow, nested paragraphs (a paragraph as descendant of another), e.g.
in annotations, will not be recognized as components, as they are
not within the text flow.
If there is already a component at the given position, the new one
is inserted in front of the existing one.
As a basic convention to avoid unnecessary text style information,
the behavior of most ODF applications is taken as the default: the
character style is inherited from the preceding character. For
instance, if a character is inserted between two characters, it
takes the style of the previous one. In XML terms, if there are two
characters in a paragraph, each in its own span, the character
inserted in the middle is appended at the end of the first span.
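The style convention can be illustrated with a small Python sketch.
It assumes a paragraph modelled as a list of (style name, text)
pairs, one pair per span; this model and the name insert_text are my
own assumptions for illustration. Using the first span when there is
no preceding character is likewise an assumption, as the notes above
do not cover that case.

```python
def insert_text(spans, position, text):
    """Insert text in front of the character at the 1-based position.

    The new text joins the span of the preceding character, so it
    inherits that character's style.
    """
    preceding = position - 1  # number of characters before the insert
    if preceding < 0:
        raise ValueError("positions are 1-based")
    result = list(spans)
    consumed = 0
    for index, (style, content) in enumerate(result):
        end = consumed + len(content)
        # Assumption: with no preceding character, use the first span.
        if preceding <= end and (preceding > consumed or index == 0):
            cut = preceding - consumed
            result[index] = (style, content[:cut] + text + content[cut:])
            return result
        consumed = end
    raise ValueError("position is beyond the end of the paragraph")


# Two characters, each in its own span; "X" goes in between.
paragraph = [("bold", "A"), ("italic", "B")]
updated = insert_text(paragraph, 2, "X")
# updated == [("bold", "AX"), ("italic", "B")]
```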
The serialized form of the operation to insert a paragraph at the
beginning of the document would be
<add type="paragraph" s="/1"/>
Adding text content would be
<add type="text" s="/1/1">Hello World!</add>
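To show how the two serialized operations above could be consumed,
here is a hedged Python sketch that parses an <add> element and
applies it to an in-memory component tree. The tree layout (dicts
with "type" and "children", characters as one-character strings) and
the name apply_add are hypothetical; only the "paragraph" and "text"
types from the examples above are handled.

```python
import xml.etree.ElementTree as ET


def apply_add(document, operation_xml):
    """Apply a serialized <add> operation to the component tree."""
    operation = ET.fromstring(operation_xml)
    if operation.tag != "add":
        raise ValueError("not an add operation: %r" % operation.tag)
    positions = [int(step) for step in operation.get("s").strip("/").split("/")]

    # Walk down to the parent of the addressed position.
    parent = document
    for position in positions[:-1]:
        parent = parent["children"][position - 1]

    kind = operation.get("type")
    if kind == "paragraph":
        new_components = [{"type": "paragraph", "children": []}]
    elif kind == "text":
        # Every single character is a component of its own.
        new_components = list(operation.text or "")
    else:
        raise ValueError("unsupported component type: %r" % kind)

    # Insert in front of whatever is already at that position.
    index = positions[-1] - 1
    parent["children"][index:index] = new_components


document = {"type": "document", "children": []}
apply_add(document, '<add type="paragraph" s="/1"/>')
apply_add(document, '<add type="text" s="/1/1">Hello World!</add>')
```

After both calls the first paragraph holds the twelve character
components of "Hello World!".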
NOTE: When transforming an ODF XML document into operations
(producing operations), whitespace handling has to be applied to
the (text) content of a paragraph - http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#White-space_Characters
Styles are properties of components and are applied to a selection
that names the first and the last selected component.
<add type="property" style:name="important" s="/1/1"
e="/1/5" />
<add type="style" family="text" style:name="important">
<properties type="text" background-color="yellow"/>
</add>
(Note: When creating a style in a document, nested elements will
likely become necessary because the same style property can occur
more than once. For instance, a table-cell style might contain three
background colors: one for the cell, one as default for all
contained paragraphs, and one as default for all contained text:
<add type="style" family="cell" style:name="important">
<properties type="cell" background-color="black"/>
<properties type="paragraph" background-color="red"/>
<properties type="text" background-color="yellow"/>
</add>
In addition to text, our first set of operations should allow
creating an image within a paragraph that is positioned as a
character and refers to an external graphic.
<add type="image" s="/1/6" url="http://some-domain/fancy.png"
width="10cm" height="112cm" description="Group picture of TC
members" />
(To Evaluate: To avoid complexity (or a Babel of different length
units) and continuous unit checks in the sources, it has proved
worthwhile to normalize length units to an integer representing
hundredths of a millimeter when creating an operation from XML,
which would result in:
<add type="image" s="/1/6" url="http://some-domain/fancy.png"
width="10000" height="112000" description="Group picture of TC
members" /> )
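The length normalization could look roughly like the following
Python sketch. The function name to_hundredths_mm and the choice of
supported units (mm, cm, in, pt, pc) are my assumptions; relative or
device-dependent units are deliberately left out.

```python
import re
from fractions import Fraction

# Hundredths of a millimeter per unit (1in = 25.4mm, 1pt = 1/72in,
# 1pc = 12pt). Fractions keep the conversion exact until rounding.
_HUNDREDTHS_MM_PER_UNIT = {
    "mm": Fraction(100),
    "cm": Fraction(1000),
    "in": Fraction(2540),
    "pt": Fraction(2540, 72),
    "pc": Fraction(2540, 6),
}

_LENGTH = re.compile(r"^(-?[0-9]+(?:\.[0-9]+)?)(mm|cm|in|pt|pc)$")


def to_hundredths_mm(length):
    """Convert a length such as "10cm" into integer 1/100 mm."""
    match = _LENGTH.match(length.strip())
    if match is None:
        raise ValueError("unsupported length: %r" % length)
    value, unit = match.groups()
    return round(Fraction(value) * _HUNDREDTHS_MM_PER_UNIT[unit])


width = to_hundredths_mm("10cm")    # 10000
height = to_hundredths_mm("112cm")  # 112000
```

The two results match the width and height used in the normalized
image operation above.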
(To Evaluate: The XML design is still a draft. Validation question:
Is it possible to validate the existence and content of attributes
and child elements depending on the value of one of the attributes?
In our case the different values of "type" would require different
attributes.)
<add type="image" s="/1/6">
<properties type="image" url="http://some-domain/fancy.png"
width="10000" height="112000" description="Group picture of TC
members"/>
</add>
Interesting XML design question: When is a property part of the
style and when is it part of the element?
)
An image is represented by a <draw:frame> and a
<draw:image> child.
(ToDo: The mapping of the attributes and a precise definition will
follow. Shall the mechanism of replacement images be covered?)
Deletion
Deletion is far easier.
The deletion of the image would be
<del s="/1/6"/>
The deletion of the word "Hello" would be done by:
<del s="/1/1" e="/1/5"/>
(NOTE: When the last formatted character is deleted, the
<text:span> is deleted as well.)
The deletion of the complete paragraph is accomplished by:
<del s="/1"/>
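A minimal Python sketch of the deletion semantics, again on a
hypothetical component tree of dicts with "type" and "children" and
with characters as one-character strings. The function names are
illustrative only, and the sketch assumes that the start and end
references share the same parent.

```python
def parse_path(path):
    """Turn a reference such as "/1/5" into the list [1, 5]."""
    if not path.startswith("/"):
        raise ValueError("a component reference starts with '/': %r" % path)
    return [int(step) for step in path[1:].split("/")]


def delete(document, start, end=None):
    """Delete the component at start, or the range from start to end."""
    first = parse_path(start)
    last = parse_path(end) if end is not None else first
    # Assumption: a range stays within one parent component.
    if first[:-1] != last[:-1] or last[-1] < first[-1]:
        raise ValueError("start and end must be ordered siblings")
    parent = document
    for position in first[:-1]:
        parent = parent["children"][position - 1]
    # Positions are 1-based and the end position is inclusive.
    del parent["children"][first[-1] - 1:last[-1]]


document = {
    "type": "document",
    "children": [{"type": "paragraph", "children": list("Hello World!")}],
}
delete(document, "/1/1", "/1/5")  # removes the word "Hello"
```

Calling delete(document, "/1") afterwards would remove the complete
paragraph, as in the last operation above.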
So much for tomorrow..
Svante