
docbook message


Subject: Re: [docbook] XML/XSL Revision Control/ Source Code Versioning: Ideas, Methods, Tools for Specific scenario as a Content Writer?

Hi Alex,

I've just been catching up on my mailing list reading and noticed that you've raised this issue here and also on opendocument-users and xsl-list. I'll answer here with a DocBook-specific response, although the same arguments apply to other documentation formats. Perhaps this discussion belongs on docbook-apps and should be continued there?

We (DeltaXML) currently provide a general-purpose compare product for well-formed XML and a DocBook-specific compare product. We are also working on 3-way, or more generally n-way, 'Merge' products that will have more relevance when used with revision control systems. However, Thomas has already given you some good advice: an existing VCS, perhaps with additional normalization steps, may meet your requirements.

You asked whether people use software revision control for documents: the answer is certainly yes. As a company, we do this; we version our product documentation with our source code. Until around 2 years ago we used Subversion (svn) and it's now mainly Mercurial (hg); git would have been an appropriate alternative. Other approaches are to use a content management system (CMS) or a filesystem with WebDAV/DeltaV support. I was recently at the DITA Europe 2012 conference (yes, I know! But I suspect the answer is equally applicable to DocBook) and tried to assess the use of software version control. When talking to people I often asked "do you use a CMS or a software version control system?" and the result was around 50/50. I suspect that software/IT companies have the expertise readily available and, like us, find it easier to adapt to using software version control.

Your follow-on question/email asked about line-based vs xml-aware comparison:

On 08/12/2012 16:25, Alex S wrote:
Thank you for your time & responses. I remember reading somewhere that a pure text/ linear comparison based tool/ system may not be ideal to compare & merge XML tree structure based documents.

When an XML-aware algorithm is used as part of the merge/update process, better results are possible. For example, consider a user on one branch whose editing or authoring tool rearranges attributes, reordering them or re-indenting them over multiple lines. When the branches are merged you are likely to get a "false conflict" with a line-based algorithm, whereas an XML-aware algorithm shouldn't identify a change.
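The attribute-reordering case can be illustrated with a minimal Python sketch. It is not DeltaXML's algorithm: it simply uses the standard library's `xml.etree.ElementTree.canonicalize` (Python 3.8 or later) as a stand-in normalization step, and the `para` element with its `role` and `condition` attribute values is an assumed example, not taken from the original thread. A line-based diff reports a change, while a comparison of the canonical forms does not.

```python
import difflib
import xml.etree.ElementTree as ET

# Ancestor revision: attributes on one line.
ANCESTOR = '<para role="intro" condition="draft">In this example...</para>'

# Branch revision: an editing tool has reordered the attributes and
# wrapped them over two lines. The XML content is unchanged.
BRANCH = '<para condition="draft"\n      role="intro">In this example...</para>'


def line_diff(a, b):
    """Return the unified diff lines a line-based tool would report."""
    return list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm=""))


def xml_equal(a, b):
    """Compare two XML strings after C14N 2.0 canonicalization,
    which sorts attributes and normalizes whitespace inside tags."""
    return ET.canonicalize(a) == ET.canonicalize(b)


if __name__ == "__main__":
    print("line-based diff reports a change:", bool(line_diff(ANCESTOR, BRANCH)))
    print("canonical forms are identical:", xml_equal(ANCESTOR, BRANCH))
```

Running the same canonicalization on files before they are committed is one way to implement the "additional normalization steps" mentioned earlier, although it only removes lexical noise and does not make the merge itself structure-aware.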

Taking this a step further, you can do more if the tools/algorithms understand the grammar or XML format being processed. Here's a DocBook 5 example. In the ancestor revision there is a section with a title and an itemized list:

        <title>Merge example</title>
        <para>In this example...</para>...

In one branch a user adds an indexterm, in another branch a revhistory is added.

The ancestor revision used in 3-way diff or merge algorithms allows them to work out that different sets of lines have been inserted at the same point (relative to the ancestor) and that they are not identical, and hence they report a conflict. This is the conflict from Mercurial:

        <title>Merge example</title>
<<<<<<< mine
        <indexterm><primary>Revision Control</primary></indexterm>
=======
        <revhistory><revision><date>2012-12-12</date><revdescription><simpara>Testing hg</simpara></revdescription></revision></revhistory>
>>>>>>> theirs
        <para>In this example...</para>

However, the DocBook 5 grammar allows "one or more of" revhistory/indexterm at this point, so you could argue that this isn't really a conflict. Conversely, there are places in DocBook where you have a choice of elements without a one-or-more (+) or zero-or-more (*) repetition qualifier, and adding both of the choices from different branches is definitely a conflict irrespective of how they are represented as lines. We propose that in an XML 'grammar-aware' system, conflicts can and should be related to the grammar rules.
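The proposal above can be sketched in a few lines of Python. This is only an illustration of the idea, not an implementation of any DeltaXML product: the `Slot` class, the `classify` function, and both content models are hypothetical simplifications, and `EXCLUSIVE_CHOICE` uses made-up element names rather than a real DocBook content model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A simplified position in a content model (hypothetical)."""
    choices: frozenset   # element names allowed at this position
    repeatable: bool     # True for one-or-more (+) or zero-or-more (*)


def classify(slot, mine, theirs):
    """Classify two elements inserted at the same point by different branches.

    Returns "invalid" if either element is not allowed here, "mergeable" if
    the grammar lets both insertions coexist, and "conflict" otherwise.
    """
    if mine not in slot.choices or theirs not in slot.choices:
        return "invalid"
    return "mergeable" if slot.repeatable else "conflict"


# Assumed, simplified stand-in for the repeating block choice in a section.
SECTION_BLOCKS = Slot(frozenset({"indexterm", "revhistory", "para"}), repeatable=True)

# Made-up non-repeating choice: exactly one of the alternatives may appear.
EXCLUSIVE_CHOICE = Slot(frozenset({"optionA", "optionB"}), repeatable=False)

if __name__ == "__main__":
    print(classify(SECTION_BLOCKS, "indexterm", "revhistory"))
    print(classify(EXCLUSIVE_CHOICE, "optionA", "optionB"))
```

A real grammar-aware merge would derive these rules from the schema and would also have to decide the order of the merged insertions; the sketch only shows how the repetition qualifier can drive the conflict decision.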

We are addressing the software version control use case with these enhanced types of conflict detection in our upcoming 'merge' products. Integrations with hg, git and svn (probably in that order) are planned. One of the problems we found in the past was that software version control usually distinguishes only between binary and text files. svn allowed you to plug in alternative merge or diff tools, but only for all types of text file at once. We are planning to take another look at the interfaces to see if there are any ways in which we can plug in our algorithms only for specific types of file.



Nigel Whitaker, Software Architect, DeltaXML Ltd. "Experts in information change"
nigel.whitaker@deltaxml.com   http://www.deltaxml.com   +44 1684 869035
Registered in England: 02528681 Reg. Office: Monsell House, WR8 0QN, UK
