office-collab message



Subject: Re: [office-collab] Immutable Change Tracking


Svante,

I'm preparing for the ODF teleconference but I can quickly answer your
question about XPath:

On 01/22/2016 02:06 PM, Svante Schubert wrote:
> I created a simple example document for an offline discussion with 
> Patrick about XPath I would like to share. While Patrick is quite
> in favor of XPath, I am opposing XPath for the use of referencing
> into ODF XML for change-tracking.
> 
> Do not get me wrong, XPath is a great technique, but it was built
> for general-purpose, arbitrary XML. The proposed referencing model
> for change-tracking is focused only on office documents,
> abstracting from XML, and is therefore of advantage.
> 
> One problem of XPath is that its references are not normalized. In
> other words, there are multiple ways to reference the same XML
> element. Implementations have to implement all possible cases and
> normalize XPath references to become aware that the same XML
> element is being pointed to. The same problem applies to ODF XML,
> which is not normalized either. A semantically identical document
> can be expressed in different ways in XML.
> 

Err, no.

It is true that XPath defines a number of ways to access an element,
but for change tracking purposes, I would suggest we use only the
descendant axis, which is much the same as your "components."

Except that the XPath descendant axis doesn't make presumptions about:

> AFAIK, in the application models of MS Office and LibreOffice the
> list level is only a property of the paragraph, and so it is for
> the component. Written as references of the new change-tracking
> operations
> 

You are incorrect in saying:

> Most importantly, we can apply OT easily. By looking only at the
> integers, we can tell whether operations influence each other,
> namely whenever an insertion/deletion occurs before or above.

That holds if and only if the "application models" are in fact the
same for all ODF applications and are applied the same way, something
that has yet to be shown or defined.

Any deviation between an application's model and your change tracking
model would break both OT and change tracking itself.

BTW, no marks for failing to realize that OT is meant to minimize the
impact of changes so that the greatest number of changes can be
applied without user intervention. Think of it as auto-coordination.

If the tree/model doesn't change at all, which is my immutable
suggestion, then we lose the overhead of OT and, as now, users choose
the changes they wish to accept, or not.
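
A minimal sketch of that idea, for illustration only. The `Change`
record, the `render` function and the paragraph-level granularity are
assumptions made for this example and are not part of any proposal in
this thread. The base document is never modified; each rendering is
built from the base plus whichever changes the user accepts.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Change:
    """A change expressed against the immutable starting state (hypothetical)."""
    change_id: str
    paragraph: int            # 1-based index into the ORIGINAL paragraphs
    new_text: Optional[str]   # None means "delete this paragraph"


def render(base: Tuple[str, ...], changes: Iterable[Change],
           accepted_ids: Set[str]) -> Tuple[str, ...]:
    """Build a version from the base plus the accepted changes only."""
    # If two accepted changes target the same paragraph, the later one wins
    # here; a real implementation would ask the author to choose.
    chosen = {c.paragraph: c for c in changes if c.change_id in accepted_ids}
    result = []
    for number, text in enumerate(base, start=1):
        change = chosen.get(number)
        if change is None:
            result.append(text)              # paragraph left untouched
        elif change.new_text is not None:
            result.append(change.new_text)   # accepted replacement
        # accepted deletion: nothing is appended
    return tuple(result)


base = ("First paragraph.", "Second paragraph.", "Third paragraph.")
changes = [
    Change("c1", 2, None),                         # delete paragraph 2
    Change("c2", 3, "Third paragraph, revised."),  # replace paragraph 3
]

print(render(base, changes, {"c1"}))
print(render(base, changes, {"c1", "c2"}))
print(base)  # the starting state itself is unchanged
```

Because every change refers to the original paragraph numbers, no
change has to be rewritten when another one is accepted or rejected.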

I am about to extract the OOXML change tracking part of ISO/IEC 29500
and will be posting it as a separate document later this week. It runs
about 100 pages or so.

Hope you are at the start of a great week!

Patrick


> Our proposed component model for change-tracking abstracts from
> those variations.
> 
> Let me give the ODT document example.
> 
> In the ODF example document there are two letters "A" and "B". How
> easy is it for one of you to provide the XPath references to A and
> B?
> 
> Let me give a general description of the example document: The
> letter "A" has been inserted as part of a paragraph within a 
> list-item of list level 9. AFAIK, in the application models of MS
> Office and LibreOffice the list level is only a property of the
> paragraph, and so it is for the component. Written as references of
> the new change-tracking operations:
> 
> A is located at /9/12
> 
> B is located at /12/2/2/2/2
> 
> In detail for A: The 9th top-level component (paragraph) and the
> 12th character. NOTE: List items are just paragraph properties.
> Similarly, spans are only boilerplate to attach properties to
> characters.
> 
> In detail for B: The 12th top level component (table), 2nd row, 2nd
> cell, 2nd paragraph and 2nd character
> 
> Most importantly, we can apply OT easily. By looking only at the
> integers, we can tell whether operations influence each other,
> namely whenever an insertion/deletion occurs before or above. This
> would not automatically be the case for XPath
> references.
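
The integer comparison described above can be sketched as follows.
This is an illustration only: the function name
`transform_after_insert` and the 1-based list representation are
assumptions, it handles a single insertion rather than deletions or a
full OT algorithm, and it presumes both sides share the same component
model.

```python
from typing import List


def transform_after_insert(target: List[int], inserted: List[int]) -> List[int]:
    """Shift `target` to account for one component inserted at `inserted`.

    Both paths are 1-based integer lists such as [12, 2, 2, 2, 2].
    """
    depth = len(inserted) - 1
    # The insertion matters only if it lands among the siblings of one of
    # target's own steps: same parent prefix, and target at least as deep.
    if len(target) <= depth or target[:depth] != inserted[:depth]:
        return list(target)
    shifted = list(target)
    if inserted[depth] <= target[depth]:
        shifted[depth] += 1   # inserted before or at that step: move one later
    return shifted


a = [9, 12]            # "A is located at /9/12"
b = [12, 2, 2, 2, 2]   # "B is located at /12/2/2/2/2"

print(transform_after_insert(a, [3]))            # new top-level component above A
print(transform_after_insert(a, [9, 5]))         # new character before A
print(transform_after_insert(a, [9, 20]))        # new character after A
print(transform_after_insert(b, [12, 2, 2, 1]))  # new paragraph above B's paragraph
```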
> 
> Best thing of all, the same references apply for OOXML, they are
> again:
> 
> A is located at /9/12
> 
> B is located at /12/2/2/2/2
> 
> Take a look at the DOCX and document.xml created by MS Office 2016.
> The mapping from OOXML to components is surely different (and so is
> the XPath), but the high-level structure is identical.
> 
> From this example, it seems to me that components impose far
> less of a model for change tracking on implementers than
> XPath would do.
> 
> What do you think?
> 
> Have a nice week-end, Svante
> 
> 
> 
> On Fri, Jan 15, 2016 at 10:00 PM, Camilla Boemann <cbo@boemann.dk>
> wrote:
> 
> Hi Patrick
> 
> Because in order to "translate" what such XPath statements would 
> mean, we would have to implement the editing in XML too (if only to 
> understand the XPath)
> 
> Let me try and illustrate
> 
> We have a basic document:
> 
> <a><c/></a>
> 
> And two pseudo-XPath operations describing accepted changes:
> Second-to-last operation) delete <d/> placed left of <e/>
> Last operation) delete <e/> placed left of <c/>
> 
> Now in order for me to translate the last operation) into my 
> internal representation I would also have to implement how the 
> second to last operation would produce an xml document (and not
> just what it does to my internal representation) If those
> operations were expressed in a more generic way not tied to a
> specific syntax I would be able to translate it directly into my 
> internal representation.
> 
> Bottom line is if we go through xpath I would have to implement
> all operations on xml format as well as on my internal
> representation
> 
> Not quite sure if it makes sense to you?
> 
> 
> -----Original Message-----
> From: office-collab@lists.oasis-open.org
> [mailto:office-collab@lists.oasis-open.org] On Behalf Of Patrick Durusau
> Sent: 15 January 2016 21:22
> To: office-collab@lists.oasis-open.org
> Subject: Re: [office-collab] Immutable Change Tracking
> 
> Camilla,
> 
> What I fail to understand is the revulsion against XPath?
> 
> Doesn't Calligra in fact import the ODF XML format?
> 
> How is that any different from reading an XPath statement for the 
> location where insertion of a paragraph should occur?
> 
> That is to say that Calligra is already converting XML into its 
> internal representation, including in-line change tracking if you 
> support that now.
> 
> Even if implementers prefer a "component" approach, that is
> nothing more than translating XPath statements into a different
> vocabulary.
> 
> If I am correct that components are the equivalents of steps in
> an XPath statement, why the opposition to XPath? It may not be as 
> compact as components, but if we derive a compact representation 
> that is called components but is in fact a simplified version of 
> XPath, where's the beef?
> 
> Truly I am failing to recognize why serializing changes in the
> same format as we use for documents is such an issue? I'm not
> expecting anyone to process ODF documents as XML, but to use the
> XML representation as an interchange format. Personally I think
> that would work for changes as well, whatever applications choose
> to call them.
> 
> So, can you help me here? What is the difficulty with XPaths,
> whether they are called XPath and written in XPath syntax or they
> are called components and we have to invent a syntax?
> 
> Thanks!
> 
> Hope you are looking forward to a great weekend!
> 
> Patrick
> 
> PS: To be fair, my misgivings about components are because they 
> impose a model for change tracking (as does character offset) to
> be used by implementers. I would deeply prefer that implementers
> have the freedom to choose whatever internal models they choose.
> From trees, to tables, to graphs, it makes no difference to me so
> long as they serialize to ODF as specified by the standard.
> 
> 
> 
> 
> 
> On 01/15/2016 02:52 PM, Camilla Boemann wrote:
>> Actually I can safely say that if we go with XPath Calligra will
>> NOT implement changetracking
> 
> 
> 
>> From: office-collab@lists.oasis-open.org
>> [mailto:office-collab@lists.oasis-open.org] On Behalf Of Svante Schubert
>> Sent: 15 January 2016 20:47
>> To: Patrick Durusau <patrick@durusau.net>
>> Cc: office-collab@lists.oasis-open.org
>> Subject: Re: [office-collab] Immutable Change Tracking
> 
> 
> 
>> Patrick,
> 
> 
> 
>> On Wed, Jan 13, 2016 at 4:27 PM, Patrick Durusau
>> <patrick@durusau.net> wrote:
> 
>> Svante,
> 
>> I'm sorry you are missing the call today because I think the
>> XPath / no XML model at run time is an important issue to
>> discuss. Perhaps we can get started today and over time iron it
>> out.
> 
>> I say that because you're right, no ODF application is required to
>> have an XML model.
> 
>> ODF applications can have any internal model they care to have,
>> but, they are required to *read* ODF XML and to *write* ODF XML
>> from their purely internal representations.
> 
>> That is to say that the XML file format of ODF *is* the
>> abstraction layer that enables many ODF applications to have
>> varying internal models.
> 
>>> ODF is indeed a very good abstraction to exchange serialized
>>> XML models.
> 
>>> Still there is room for further simplification by doing some
>>> *further *abstraction upon the ODF XML to logical objects
>>> (components).
> 
> 
> 
>>> This has advantages, for instance:
> 
>>> If I say I will change the 2nd character of the 3rd paragraph,
>>> I might apply the change to ODF or OOXML, as they can be
>>> addressed similarly for this subset.
> 
>>> In the XPath view we are quite lost to see that we are talking
>>> of the same logical change and/or the same logical document but
>>> in a different XML representation.
> 
> 
> 
>>> Not saying that we can map and abstract all ODF & OOXML on
>>> this level, but comparison gets easier and, as Einstein
>>> said, we should take the easiest solution that works, but
>>> nothing easier.
> 
> 
> 
> 
> 
>> So, my choice of XPath was very intentional because it too is an 
>> abstract representation of the change, against the abstraction 
>> that the ODF application has already read into its internal 
>> structure.
> 
>> The same process of reading the ODF document would be applied to 
>> reading the change into the internal representation of the ODF 
>> application.
> 
>>> XPath is very powerful, but it has many features we do not
>>> need. It also depends on an XML that does not exist at
>>> run-time, whereas the positions have to exist and be handled
>>> during run-time: whenever something is inserted ahead (or
>>> above), the position is increased (or decreased when something
>>> is deleted), so positions have to be updated. These are much
>>> harder to evaluate and adapt with XPath.
> 
>>> XML and XPath are wonderful technologies, but these are not
>>> the hammers we are looking for...
> 
> 
> 
> 
> 
>> Likewise, when an ODF application serializes changes, however it 
>> stores them internally, it serializes them against the immutable 
>> XML ODF file which it read when it loaded the file.
> 
>> The abstraction of XPath to "logical identities" may be how many 
>> ODF applications choose to go from the XPath representation to 
>> their internal model, but that's an application's choice and I
>> would prefer that we not dictate to applications their
>> abstractions.
> 
>> Using XML for both the file format and changes allows ODF to 
>> remain above the choices made by ODF applications.
> 
>>> Again, I am curious whether any ODF application developer would
>>> like to use XPath. At least nobody at Open-XChange was interested
>>> in it, and they were happy with doing an abstraction. Perhaps
>>> because they imported not only ODF but OOXML as well, which is
>>> quite common for office applications, and they wanted to use the
>>> same mechanism for both formats.
> 
>>> Because when the office XML is abstracted to logical 
>>> objects - I usually call them components - the referencing is
>>> very similar for both formats and a lot can be reused.
> 
>>> Try to map the position of a character within a paragraph of
>>> ODF to OOXML..  To me it was a nightmare..
> 
> 
> 
>>> Aside from my love of XPath, I do not see any use here...
> 
>>> I suggest we not use this hammer this time, but
>>> wait for a nail instead.
> 
> 
> 
>>> Regards,
> 
>>> Svante
> 
> 
> 
>> Hope everyone is having a great day!
> 
>> Patrick
> 
> 
> 
> 
> 
>> On 01/13/2016 06:55 AM, Svante Schubert wrote:
>>> The immutable change-tracking is indeed very useful for the 
>>> scenario of commenting and editing a signed document. In this 
>>> scenario the XML cannot be changed, as otherwise the signature
>>> would be broken. Every comment/edit would be saved alongside the
>>> signed content XML and might be signed again for each author,
>>> ensuring the validity of the complete content.
> 
>>> We agree that in the future changes will refer to the position
>>> of the change in the content instead of being embedded in it as
>>> before. XPath is just one possible choice of implementation
>>> for referencing. From my observation Patrick's ideas are not
>>> based on XPath, he just took it as an example. I would rather
>>> avoid XPath, as ODF applications are not required to have an XML
>>> model representation at run time, in contrast to the file-model
>>> related DOM run-time API of browsers. In addition, ODF XML has
>>> no normalized representation, which makes XML references more
>>> difficult. Therefore the abstraction from XML to logical
>>> identities, which are known to users, and referencing those
>>> will be easier to handle by a general run-time model related to
>>> ODF, and works well for applications without ODF XML awareness
>>> even at run-time.
> 
>>> I have experienced this in my work on a browser based office 
>>> with Open-XChange in the past years. For example, the
>>> reference of the 3rd character within the 2nd paragraph might
>>> be written as /2/3 which can be seen as a simplification of
>>> XPath and was handled by the browser office I have been working
>>> with as simple integer array, making things easy for the office
>>> at run-time.
> 
>>> Kind regards, Svante
> 
>>> On Jan 8, 2016 9:58 PM, "Patrick Durusau" <patrick@durusau.net>
>>> wrote:
> 
>>> Greetings!
> 
>>> I have been following discussions of immutable data
>>> structures, mostly in Clojure for several years and it recently
>>> occurred to me that if the starting state of an ODF document
>>> were immutable and changes are expressed against that immutable
>>> state, then many of the problems and issues that have bedeviled
>>> the change tracking TC simply disappear.
> 
>>> First, since we have an immutable starting state, then changes 
>>> expressed against that state, for example in XPath (there are 
>>> ways to default large parts of path statements), represent 
>>> changes that can be accepted or rejected when producing either
>>> a visual, print and/or new version of the document.
> 
>>> A "new" version of the document has a new starting state for 
>>> change tracking and therefore does not reflect the change 
>>> history of the previous version of the document.
> 
>>> A visual or print version of the document would have, also
>>> expressed as an XPath, a list of the changes that were
>>> accepted for that particular visual or print version. That
>>> would mean you could create another visual or print version
>>> with different changes reflected, which would be a separate
>>> XPath statement, enabling you to go back through versions
>>> and/or any changes.
> 
>>> Second, an immutable starting state and expressions of changes 
>>> as XPath statements means we can detect when there are 
>>> conflicting changes, without those changes ever stepping on
>>> other changes.
> 
>>> For example, assume that we have three paragraphs in the 
>>> starting state of the document and I delete text:paragraph #2. 
>>> Since that is recorded as an XPath statement and the original 
>>> state of the document does not change, you can record changes
>>> to text:paragraph #2 without fear of your changes being lost.
>>> And you can continue to edit the rest of the paragraphs in the 
>>> document because to you they have (and do have) the original 
>>> paragraph numbering.
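
The three-paragraph example can be sketched as below, under stated
assumptions: changes are simplified to (author, original paragraph
number, action) tuples, and `find_conflicts` is a hypothetical helper,
not anything defined by ODF or XPath. Because every change refers to
the unchanged starting state, conflicts show up as two changes naming
the same original paragraph.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (author, original paragraph number, action): an assumed, simplified shape
ChangeRecord = Tuple[str, int, str]


def find_conflicts(changes: List[ChangeRecord]) -> Dict[int, List[str]]:
    """Return {paragraph: [authors]} for paragraphs touched by several changes."""
    by_target = defaultdict(list)
    for author, paragraph, _action in changes:
        by_target[paragraph].append(author)
    return {p: authors for p, authors in by_target.items() if len(authors) > 1}


recorded = [
    ("patrick", 2, "delete"),   # I delete paragraph #2
    ("svante", 2, "edit"),      # you still edit paragraph #2 of the original
    ("svante", 3, "edit"),      # paragraph #3 keeps its original number
]

print(find_conflicts(recorded))
```

Neither change overwrites the other; the conflict on paragraph #2 is
simply reported so that an author can choose.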
> 
>>> Moreover, if you want to express changes on changes, which are 
>>> themselves stored in an XML document structure, unlike present 
>>> applications you can make changes to changes, which while 
>>> immutable, can have changes specified that point into those 
>>> changes.
> 
>>> Third, and this reaches into the future collaboration sphere
>>> of activity, having immutable documents and changes expressed
>>> as XPaths will enable the detection of branches that impact
>>> the visual, print or new version, enabling the
>>> author to make choices about which branch in the document to
>>> accept for that particular version.
> 
>>> Moreover, immutable change tracking will enable classic 
>>> collaboration around a server but also enable collaboration
>>> with specified others or within specified groups, such as an 
>>> authoring group in a highly secure environment.
> 
>>> Permissions could also determine what changes could be seen by 
>>> particular users and where they could suggest changes.
> 
>>> I realize this is in stark contrast to the minimal document by 
>>> default architecture of present change tracking in ODF. That
>>> was a good design decision some twenty years ago, facing
>>> unreliable networks and a stand alone orientation to document
>>> authoring.
> 
>>> But twenty years ago isn't where we are in 2016. There are 
>>> "collaborative" environments already, although I'm not
>>> impressed with their capabilities when compared to applications
>>> based on ODF.
> 
>>> What I am proposing isn't that different from Svante's
>>> original proposal except that I propose to solve the problem
>>> of coordination between systems by making documents and the
>>> changes to be applied to them immutable. Ultimately, serious
>>> conflicts must be solved by an author's choice and what I have
>>> proposed here will give every author exactly that choice.
> 
>>> On the up side, having immutable change tracking enables 
>>> applications to have traditional collaboration hubs (think of 
>>> servers with big targets painted on them), to have
>>> collaboration between individual clients at no extra effort,
>>> save for receiving the changes, and to have group change
>>> tracking for highly secure environments.
> 
>>> Oh, I know Svante hasn't pushed this very hard but having 
>>> immutable change tracking will also enable a variety of 
>>> platforms to all work on the same ODF document. I may be
>>> editing in a desktop application while Svante is editing on a
>>> smartphone, which doesn't support styles or svg graphics. All
>>> that means is that Svante won't be submitting changes for what
>>> his platform doesn't support. He can submit changes for text
>>> without any difficulty.
> 
>>> Lest that get lost in all my verbiage, the "text" is what we
>>> say it is when we "accept" changes for the production of a
>>> visual, print or new edition. Others may choose differently, as
>>> may we at some later point in time. To capture a particular
>>> version, create a new edition with no change history. Then it
>>> becomes a frozen artifact in time.
> 
>>> I suspect this will be of interest to a number of security 
>>> conscious entities, just for the varieties of collaboration 
>>> alone. Add in the other capabilities and I think it could be
>>> the next jump in collaborative word processing.
> 
>>> Hope everyone is at the start of a great weekend!
> 
>>> Patrick
> 
> 
> 
>>> --------------------------------------------------------------------
>>> To unsubscribe from this mail list, you must leave the OASIS TC that
>>> generates this mail.  Follow this link to all your TCs in OASIS at:
>>> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
> 
> 
> 
> 
> 
> 

-- 
Patrick Durusau
patrick@durusau.net
Technical Advisory Board, OASIS (TAB)
OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor 13250-5 (Topic Maps)

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau


