
Subject: Re: [docbook-apps] Automatically detecting which PDFs are affected by changes in a DocBook Git commit

We have implemented in the Calenco CMS a method based on the Apache FOP Intermediate Format (IF), which is an XML description of each rendered page. It is based on the following simple algorithm:

1. Generate the IF XML for the original document
2. Generate the IF XML from the new sources
3. Compare the two IF documents with an XSLT stylesheet to extract whatever information we need

We have been using this successfully in production to generate PDFs containing only the pages modified between two versions of a document.
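
For readers who want to try the idea, here is a minimal sketch of the compare step. It is not the Calenco implementation: that one uses XSLT, while this sketch uses Python. It assumes that each page in the IF XML is an element whose local name is "page", and it only compares the text on each page, position by position, so pure layout shifts are ignored and an inserted page marks every following page as changed. The function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def page_fingerprints(if_xml: str) -> list[str]:
    """Return one hash per page element, in document order.

    Assumption: pages are elements with the local name 'page'.
    Only whitespace-normalised text content is hashed.
    """
    root = ET.fromstring(if_xml)
    fingerprints = []
    for elem in root.iter():
        if local_name(elem.tag) == "page":
            text = " ".join("".join(elem.itertext()).split())
            fingerprints.append(hashlib.sha256(text.encode("utf-8")).hexdigest())
    return fingerprints


def changed_pages(old_if: str, new_if: str) -> list[int]:
    """Return 1-based numbers of pages in the new document that differ
    from the page at the same position in the old one, or are new."""
    old = page_fingerprints(old_if)
    new = page_fingerprints(new_if)
    changed = []
    for i, fingerprint in enumerate(new):
        if i >= len(old) or old[i] != fingerprint:
            changed.append(i + 1)
    return changed
```

The resulting page numbers could then drive whatever step extracts the modified pages into a separate PDF.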


Camille Bégnis
789, rue de la gare
F-13770 Venelles
On 01/11/2016 at 15:42, Bergfrid Skaara wrote:
I'd like to know the overall difference between versions X and Y of a single PDF, automatically, so we don't need to inspect and compare page by page manually (simplifying the release process).

I need to know all PDFs (and what content within them) that have changed as a result of commits X, Y, Z (simplifying the review and release process).

I'd like notifications if a commit targeted only at feature A ends up changing PDFs unrelated to feature A. This would typically indicate profiling errors or misplaced includes that need to be fixed.
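
A minimal sketch of such a check, assuming the build already produces the set of PDFs that changed for a commit and that someone maintains a mapping from each feature to the PDFs it may legitimately touch. The mapping, the file names and the function name are all hypothetical.

```python
# Hypothetical mapping from a feature to the PDFs it is expected to affect.
FEATURE_PDFS = {
    "feature-a": {"install-guide.pdf", "admin-guide.pdf"},
    "feature-b": {"user-guide.pdf"},
}


def unexpected_changes(feature: str, changed_pdfs: set[str]) -> list[str]:
    """Return the changed PDFs that are not expected for this feature."""
    expected = FEATURE_PDFS.get(feature, set())
    return sorted(changed_pdfs - expected)
```

A non-empty result would be the trigger for a notification from the CI job.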

Inserting build numbers or some similar ID from the CI environment into the PDF metadata would also help.

The challenge with comparing PDFs is all the noise you get from layout changes (white space) and from info in headers and footers such as dates and version numbers.
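
One possible way to reduce that noise, sketched under the assumption that the text has already been extracted from both versions: mask anything that looks like a date or a version number and collapse white space before comparing. The regular expressions are illustrative only and would need tuning to the actual headers and footers.

```python
import re

# Illustrative patterns (assumptions): ISO dates, dd/mm/yyyy or dd.mm.yyyy
# dates, and dotted version numbers such as 2.3.1 or v10.0.
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{1,2}[./]\d{1,2}[./]\d{2,4}\b")
VERSION = re.compile(r"\bv?\d+\.\d+(?:\.\d+)*\b", re.IGNORECASE)


def normalize(text: str) -> str:
    """Mask dates and version numbers, then collapse runs of white space."""
    text = DATE.sub("#", text)
    text = VERSION.sub("#", text)
    return " ".join(text.split())


def same_content(old_text: str, new_text: str) -> bool:
    """True if the two texts differ only in masked values or white space."""
    return normalize(old_text) == normalize(new_text)
```

Both kinds of value are masked with the same placeholder so that a string matched as a version in one build and as a date in the next does not count as a difference.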

And if you go the convert-PDF-to-text-before-comparing route, would it not be better to compare the intermediate FO files rather than waste time going through the entire publishing pipeline first?

We are not looking to replace our current CI environment. Extending the current build logic is not a problem, but I'm not sure what the new logic should look like.

Bergfrid Skaara Dias

On Wed, Oct 26, 2016 at 6:48 PM, Stefan Seefeld <stefan@seefeld.name> wrote:
On 26.10.2016 11:04, Bergfrid Skaara wrote:
> Hi,
> We use Git to version control our modular DocBook XML code base. I'd
> like to enforce stricter change management than what simply inspecting
> the Git log manually offers. Specifically, I want to trace each
> modular DocBook XML file that has been changed up to the PDFs that will
> be changed as a result.
> Tracing the ancestor files through a sequence of xi:includes is
> trivial. My challenges are:
> 1. Profiling. I need to trace ancestor elements taking profiling into
> consideration.
> 2. Entities. We use entities extensively for both aliases and reused
> text. Is there a way to track effects of changed entities without
> starting with a brute force search of all DocBook XML files using that
> entity?
> Are there any tools, standalone or add-ons to oXygen, that support
> this or similar behavior, or am I better off writing my own script? In
> case of script, which option is better: XSLT or any scripting language
> facilitating text parsing?

I'm not quite sure what you mean by "change management", what it is
that you want to enforce, or what exactly you want to trace.

Generating a PDF from XML sources typically requires some build logic,
so I think the best you can do is use that very build logic and then
compare (or validate) the generated PDF (or any intermediate formats,
such as FO). That can easily be done in a CI environment (such as
Travis-CI), so you can fully automate it and have the same process
executed for each push.



      ...ich hab' noch einen Koffer in Berlin...
