Subject: Re: [docbook-apps] Find unused XML files in a project
Maybe this Python script would be of some use:
https://github.com/fbolton/sibin

My doc library does lots of re-use with xi:includes, and images are also referenced all over the place. To interface with 'publican' (Red Hat publication tool), though, I need to have all my XML files under one directory and all my image files under one images/ subdirectory. The sibin tool consolidates all of my books (under the publican/ directory) so that they have the tidy structure that 'publican' likes.

Sounds like a similar kind of problem to the one you are dealing with.

Oh! But one thing to watch out for: the script also converts 'olink' elements into plain HTTP 'link' elements. You will probably want to disable that part of the script.

Cheers,
Fintan

On 31/07/2014 07:20, Nordlund, Eric wrote:
> Hello docbook-apps.
>
> I have a large set of projects that I am looking to scrub for unused
> graphics and XML files prior to sending off to localization.
>
> Some of my colleagues have created some very basic bash and batch
> scripts to scan through the folders and find files that aren’t
> referenced in any of the source files so we can delete them, but I worry
> that these scripts don’t catch everything (unused XML files in the base
> directory that reference images will ‘bless’ these images) and we could
> still have extraneous files left over or accidentally delete important
> ones unknowingly.
>
> Each project has a book.xml file that is the gold master for the
> outputs. If the book.xml file or any of its includes doesn’t reference a
> file in the project, it’s safe to delete. I was hoping that I could use
> xmllint to tell me which files are loaded when I try to validate the
> book.xml, but I haven’t found the magic formula yet.
>
> I’ve tried the following command to reference all of the loaded files
> during a pass, but it doesn’t seem to list the image files referenced,
> which is mostly the point of this exercise, and I get a lot of noise
> from the module files for the DTD on every include.
> $ xmllint --load-trace book.xml --xinclude --noout &> test1
>
> Has anyone had a similar problem to solve? Am I going about this the
> right way?
>
> Thanks, and I’m open to any suggestion. If bash and xmllint don’t work
> here, I am partial to Python as an alternative. Just saying.
>
> *Eric Nordlund*
> Senior Technical Writer
> Amazon Web Services
> Ph: 206-266-8048 | ericn@amazon.com

--
Fintan Bolton
Content Services | Red Hat, Inc.
fbolton@redhat.com
home office. +49-89-14347132
blog: http://docinfusion.blogspot.com
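
The rule stated in the quoted message (a file is safe to delete if neither book.xml nor any of its includes references it) can be sketched in Python roughly as follows. This is a minimal illustration, not the sibin script and not anything posted in the thread. It assumes that files are pulled in through xi:include href attributes, that images are referenced through fileref attributes, and that relative paths resolve against the file containing the reference. The function names collect_referenced and find_unused are made up for this example. Files that the standard-library parser cannot read (for instance, files using entities declared in an external DTD) are reported with a warning and not followed, so their references would be missed.

```python
import os
import sys
import xml.etree.ElementTree as ET

# Namespace-qualified tag name of xi:include as ElementTree reports it.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def collect_referenced(root_file):
    """Return absolute paths of every file reachable from root_file.

    Assumption: XML is included via xi:include/@href and images are
    referenced via @fileref, both relative to the containing file.
    """
    referenced = set()
    pending = [os.path.abspath(root_file)]
    while pending:
        xml_path = pending.pop()
        if xml_path in referenced:
            continue  # already visited; also guards against include cycles
        referenced.add(xml_path)
        try:
            tree = ET.parse(xml_path)
        except (ET.ParseError, OSError) as err:
            print(f"warning: could not parse {xml_path}: {err}", file=sys.stderr)
            continue
        base = os.path.dirname(xml_path)
        for elem in tree.iter():
            if elem.tag == XI_INCLUDE:
                href = elem.get("href")
                if not href:
                    continue
                target = os.path.normpath(os.path.join(base, href))
                if elem.get("parse", "xml") == "xml":
                    pending.append(target)  # follow nested includes
                else:
                    referenced.add(target)  # parse="text": used, not followed
            else:
                fileref = elem.get("fileref")
                if fileref:
                    referenced.add(os.path.normpath(os.path.join(base, fileref)))
    return referenced


def find_unused(project_dir, root_name="book.xml"):
    """List files under project_dir that are never reached from root_name."""
    project_dir = os.path.abspath(project_dir)
    referenced = collect_referenced(os.path.join(project_dir, root_name))
    unused = []
    for dirpath, _dirnames, filenames in os.walk(project_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if path not in referenced:
                unused.append(os.path.relpath(path, project_dir))
    return sorted(unused)
```

Calling find_unused("path/to/project") returns project-relative paths of candidates for deletion. Because the walk starts only from book.xml, an orphaned XML file does not 'bless' the images it references. The list would also include anything in the project tree that is not DocBook content (build scripts, stylesheets, version-control metadata), so it is best reviewed by hand before deleting anything.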