office-collab message


Subject: Re: GCT delta:merge and paragraphs


On Wed, 2011-06-22 at 13:58 +1000, monkeyiq wrote:
Hi,
  I've been considering paragraph deletion and merging in the context of the GCT, in particular the abiword implementation and how to track document changes in the data model to allow simpler IO paths. I imagine the abstract details will be of interest to others, especially any corner cases which I've not considered yet, hence this email.

  I'm replying to my previous email regarding the markup for a delta:merge when a selection exists and delete/backspace is pressed. This is mainly a brain dump to help others who might take the same implementation route in the future.

  One might also like to read the method documentation for pt_PieceTable::deleteSpanChangeTrackingAreWeMarkingDeltaMerge() in the future, in case new cases are discovered and added to the code. My GitHub repository has the current state of things, but in the somewhat near future one should also check git master for abiword.
https://github.com/monkeyiq/odf-2011-track-changes-git-svn/blob/2fbefdb33b9b957302b7e23947ae0362a65bc8c7/src/text/ptbl/xp/pt_PT_DeleteSpan.cpp#L255

Also perhaps of interest, the abiword gct test suite contains a bunch of documents which can be used to verify we are both doing what we should. The repository contains abw format documents which can be turned into ODF using a build of abiword with GCT enabled (the above git repo or git master when merged). The command to use is:
$ abiword -t odt -o /tmp/output.odt  input-abiword-document.abw
Documents of interest:
https://github.com/monkeyiq/odf-2011-track-changes-tests/tree/master/para-split-and-merge
There are other directories to test other functionality such as ac:change etc.
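For anyone wanting to run the whole directory through the converter, a minimal sketch of a batch wrapper around the command above follows. It assumes a GCT-enabled abiword is on the PATH and that the .abw files sit directly in the test directory; the function names conversion_command and convert_all are made up for this illustration.

```python
import subprocess
from pathlib import Path


def conversion_command(abw_path, out_dir):
    """Build the argv for converting one .abw file to .odt.

    Mirrors the command line quoted above: abiword -t odt -o OUTPUT INPUT
    """
    abw_path = Path(abw_path)
    out_path = Path(out_dir) / (abw_path.stem + ".odt")
    return ["abiword", "-t", "odt", "-o", str(out_path), str(abw_path)]


def convert_all(test_dir, out_dir, run=subprocess.run):
    """Convert every .abw file found directly in test_dir.

    `run` is injectable so the loop can be exercised without abiword installed.
    Returns the list of output paths that were requested.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for abw in sorted(Path(test_dir).glob("*.abw")):
        cmd = conversion_command(abw, out_dir)
        run(cmd, check=True)  # raises CalledProcessError if abiword fails
        outputs.append(cmd[4])
    return outputs
```

Something like convert_all("para-split-and-merge", "/tmp/gct-out") would then leave one .odt per test document in the output directory for inspection.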

  To give a more abstract description than the header comments of the method cited above offer: I consider the selection's range (startpos,endpos) in the document to see whether deleting this range would call for a delta:merge if the document were serialized as ODF. If it would, then suitable markup is added during the deletion.

The basic rules:

(1) It is not a delta merge if startpos and endpos are both fully contained in the same paragraph.

(2) It is not a delta merge if paragraph(startpos) is not in the same table cell as paragraph(endpos). If neither of these paragraphs is in a cell then it might yet be a delta merge. See point (6) if you think this is strange.

(3) If the deletion runs from the very start or right to the very end of a paragraph X, and the process results in deleting paragraph X entirely, I prefer to serialize using removed-content rather than a delta:merge. These two positions thus form a special case, and their appearance means it is not a delta:merge.

(4) Some of the office apps I've worked on use a special in-document marker to delimit paragraphs and other content. Abiword uses what is logically a 1-byte marker for the start of a paragraph, which lives at the end of the last line of the previous paragraph. In the example below, the ($) is not shown visually; it is the marker indicating that the second paragraph begins.

This is para1$
This is the second paragraph, with only one sentence too.

In this case, if one is at the start of the second line or the end of the first line and presses backspace or delete respectively then it is a paragraph merge and should use a delta:merge. Likewise if a selection starts at the $ and extends into (but not to the end of) the paragraph then this is also a delta:merge.

For example, if the selection (bold and red in the original message) runs from the $ through "This is the second paragraph,",

This is para1$
This is the second paragraph, with only one sentence too.

One gets the result:
This is para1 with only one sentence too.

And the ODF might be (without newlines and added whitespace):
<text:p delta:insertion-type="insert-with-content" delta:insertion-change-idref="1">
  This is para1
  <delta:merge delta:removal-change-idref="2">
       
    <delta:leading-partial-content></delta:leading-partial-content>
    <delta:intermediate-content></delta:intermediate-content>
    <delta:trailing-partial-content>
      <text:p delta:insertion-type="split" delta:insertion-change-idref="1">
	This is the second paragraph,
      </text:p>
    </delta:trailing-partial-content>
       
  </delta:merge>
  with only one sentence too.
</text:p>
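The way the deleted range maps onto the three delta:merge children can be sketched on a toy string model. This is only an illustration: the document is a flat string with "$" standing in for the 1-byte paragraph marker, and the function name split_for_merge is invented here, not taken from abiword.

```python
PARA_MARK = "$"  # stand-in for the 1-byte paragraph marker (illustrative only)


def split_for_merge(text, start, end):
    """Delete text[start:end] and report the pieces a delta:merge would carry.

    leading      -> removed tail of the first paragraph
    intermediate -> paragraphs removed in their entirety
    trailing     -> removed head of the last paragraph
    result       -> what the user sees after the deletion
    """
    removed = text[start:end]
    if PARA_MARK not in removed:
        raise ValueError("range does not cross a paragraph marker")
    pieces = removed.split(PARA_MARK)
    return {
        "leading": pieces[0],
        "intermediate": pieces[1:-1],
        "trailing": pieces[-1],
        "result": text[:start] + text[end:],
    }


# The example from rule (4): select from the $ through "This is the second paragraph,"
doc = "This is para1$This is the second paragraph, with only one sentence too."
start = doc.index(PARA_MARK)
end = doc.index(" with only")
parts = split_for_merge(doc, start, end)
```

Here the leading and intermediate parts come out empty and the trailing part is "This is the second paragraph,", which lines up with the empty delta:leading-partial-content and delta:intermediate-content elements in the fragment above.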



(5) One might consider the case where the selection extends between two (or more) cells in a table as seen below. The selection (bold and red in the original message) starts somewhere in the first cell and ends somewhere in the second. This can be treated as two cases of (3), because to get out of the first cell we have selected "right to the end" of the last paragraph, and to get into the second cell we have selected "right from the start" of the first paragraph in cell2.
cell1 para1  | cell2 para1
c1para2      | cell2p2
c1para3      | c2endpara

(6) If the two ODF XML elements can't legally coalesce then it is not a delta merge. For example, somehow selecting from a paragraph into the caption of a subsequent image.
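Rules (1) to (6) can be condensed into a small predicate. What follows is a sketch of my reading of the rules on a simplified model, not the abiword code: Para, its cell and kind fields, and is_delta_merge are all hypothetical names, and a position is a (paragraph index, offset) pair where offset len(text) means "at the paragraph marker".

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Para:
    text: str
    cell: Optional[str] = None  # id of the enclosing table cell, None if not in a table
    kind: str = "text:p"        # element type, used for the "can coalesce" check


Pos = Tuple[int, int]  # (paragraph index, character offset inside that paragraph)


def is_delta_merge(paras: Sequence[Para], start: Pos, end: Pos) -> bool:
    (sp, so), (ep, eo) = start, end
    first, last = paras[sp], paras[ep]
    # (1) both ends inside the same paragraph
    if sp == ep:
        return False
    # (2)/(5) the two paragraphs live in different table cells
    if first.cell != last.cell:
        return False
    # (3) a whole paragraph goes away at either end: use removed-content instead
    if so == 0 or eo == len(last.text):
        return False
    # (6) the two elements cannot legally coalesce (assumed: kinds must match)
    if first.kind != last.kind:
        return False
    # (4) otherwise the range crosses a paragraph marker and the remains are joined
    return True
```

With the two paragraphs from rule (4), is_delta_merge(doc, (0, len(doc[0].text)), (1, 29)) is the selection that starts at the $ and stops after "paragraph,", and it comes out as a merge. The real method handles more detail than this; the point is only the order of the checks.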

As always, the code is the true expression of things and apologies if I missed something in this overview. FWIW I find attaching start, end, and whole deleted markers and the change-id that these occurred in to the 1-byte paragraph markers quite effective for serialization to/from ODF.
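One possible shape for those per-marker annotations is sketched below. This is a guess at the idea rather than the abiword data model: Deleted, ParaMarker and record_deletion are invented names, and the rule that the first paragraph gets an "end" note, the last a "start" note and everything in between a "whole" note is my assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Deleted(Enum):
    START = "start"  # the leading part of the paragraph was deleted
    END = "end"      # the trailing part of the paragraph was deleted
    WHOLE = "whole"  # the entire paragraph was deleted


@dataclass
class ParaMarker:
    # Maps each kind of deletion to the change-id it occurred in.
    deletions: Dict[Deleted, int] = field(default_factory=dict)


def record_deletion(markers: List[ParaMarker], first: int, last: int, change_id: int) -> None:
    """Annotate markers for a deletion running from paragraph `first` into `last`."""
    if first >= last:
        raise ValueError("a merge spans at least two paragraphs")
    markers[first].deletions[Deleted.END] = change_id
    for i in range(first + 1, last):
        markers[i].deletions[Deleted.WHOLE] = change_id
    markers[last].deletions[Deleted.START] = change_id
```

A writer walking the paragraphs in order could then open a delta:merge when it meets an END note and close it at the matching START note with the same change-id.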

Maybe some of this can be rolled into the GCT document to help other implementers. Of course I'd have to clean it up for readability and drop in some more ODF fragments. At least it's on a public mailing list already ;)


