

Subject: RE: [opendocument-users] Automated link checking


Something like the quick-and-dirty script attached. It uses lpOD 0.9.2, and
I've also included a small test file.

Usage: python odflinkchecker.py test.odt


Of course it isn't perfect: code written on a Monday morning over a quick
coffee is hardly bullet-proof. Hardening it is an exercise for the reader :-)

Anyway, some things to keep in mind:

- if you're behind a proxy, you may need to set the http_proxy environment variable
- OOo creates empty Configurations2/...xml files, which your XML parser may not like
- ODF documents can be embedded (think charts in spreadsheets, spreadsheets
in presentations and text documents, etc.)
- lpOD's getpart uses "content" as an abstraction for "content.xml" (ODF in a zip)
or the office:document-content part (flat XML), but not for the embedded parts
(e.g. Object 1/content.xml)
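
To make the points about empty files and embedded parts concrete, here is a minimal sketch using only the Python standard library. It is not the attached script and does not use lpOD; the function name extract_links and the decision to keep only http(s) targets are assumptions made for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Clark-notation name of the xlink:href attribute used by ODF hyperlinks.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def extract_links(odf_file):
    """Return sorted, unique http(s) link targets from a zipped ODF file.

    Hypothetical helper: looks at the top-level content.xml and at the
    content.xml of every embedded object (for example "Object 1/content.xml").
    """
    links = set()
    with zipfile.ZipFile(odf_file) as package:
        for name in package.namelist():
            # Only content parts are parsed, so other parts (such as the
            # empty Configurations2 files) are never handed to the parser.
            if name != "content.xml" and not name.endswith("/content.xml"):
                continue
            data = package.read(name)
            if not data.strip():
                continue  # a zero-byte part would make the parser raise
            root = ET.fromstring(data)
            for element in root.iter():
                href = element.get(XLINK_HREF)
                if href and href.startswith(("http://", "https://")):
                    links.add(href)
    return sorted(links)
```

Usage would be something like extract_links("test.odt"). A flat XML document (a single office:document-content file instead of a zip) is not handled by this sketch.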

Best regards

Bart

________________________________________
From: Hanssens Bart [Bart.Hanssens@fedict.be]
Sent: Sunday, August 08, 2010 3:47 PM
To: Chris Puttick; opendocument-users
Subject: RE: [opendocument-users] Automated link checking

Chris,

Hmm, I think it shouldn't be that hard to write one using the lpOD (Python) library.
The odf_xmlpart class has a get_element_list(xpath_expr) method, so that might
do the trick:

http://docs.lpod-project.org/level0.html
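
For readers without lpOD at hand, the same idea (query the content part with an XPath expression, then read each hit's link target) can be sketched with the standard library's ElementTree, which supports a limited XPath subset. This is a stand-in, not lpOD's API; the function name hyperlink_targets is hypothetical, and it assumes the content.xml text has already been read from the document.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath expression below.
NAMESPACES = {
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "xlink": "http://www.w3.org/1999/xlink",
}


def hyperlink_targets(content_xml):
    """Return the xlink:href value of every text:a element, in document order."""
    root = ET.fromstring(content_xml)
    href = "{%s}href" % NAMESPACES["xlink"]
    # ".//text:a" selects all text hyperlinks at any depth below the root.
    anchors = root.findall(".//text:a", NAMESPACES)
    return [a.get(href) for a in anchors if a.get(href)]
```

Only text:a hyperlinks are matched here; links carried by other elements would need their own expression.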

Bart

________________________________________
From: Chris Puttick [chris.puttick@thehumanjourney.net]
Sent: Sunday, August 08, 2010 10:02 AM
To: opendocument-users
Subject: [opendocument-users] Automated link checking

Hi

Does anyone know of an automated link checker for ODF files? Use case: materials for a lecture that contain large numbers of hyperlinks, which were obviously (!) verified when the material was prepared, but a year later online resources might have moved or disappeared altogether. So I need something like http://linkchecker.sourceforge.net/ but with the ability to feed in an ODF file as the starting point.

Ideally, obviously, there would be a plugin for OpenOffice, KOffice, etc., but something that could batch run on a directory of ODF files would be great. I've come up with a semi-manual process that satisfies the immediate need ("odt2html *.odt", "linkchecker *.html"), but it would be more broadly useful and neater if there was a tool that could just strip the URLs straight from the content.xml of each ODF file and had a GUI.
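
The batch idea (pull the URLs straight from the content.xml of each ODF file in a directory, then test them) might look roughly like the sketch below. It uses only the Python standard library; the names links_in, check_url and check_directory, the HEAD request, the 10-second timeout and the "*.od[tspg]" file pattern are all assumptions for illustration, and there is no GUI. urllib picks up proxy settings such as http_proxy from the environment by default.

```python
import urllib.error
import urllib.request
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def links_in(odf_path):
    """Collect http(s) links from the top-level content.xml of one ODF file."""
    with zipfile.ZipFile(odf_path) as package:
        root = ET.fromstring(package.read("content.xml"))
    hrefs = {element.get(XLINK_HREF) for element in root.iter()}
    return sorted(h for h in hrefs if h and h.startswith(("http://", "https://")))


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return (ok, detail). `opener` is injectable so this can be tested offline."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return True, str(response.status)
    except OSError as error:  # URLError and HTTPError are subclasses of OSError
        return False, str(error)


def check_directory(directory, opener=urllib.request.urlopen):
    """Yield (file name, url, ok, detail) for each link in each ODF file found."""
    for path in sorted(Path(directory).glob("*.od[tspg]")):
        for url in links_in(path):
            ok, detail = check_url(url, opener)
            yield path.name, url, ok, detail
```

Looping over check_directory("lectures") and printing each tuple would give a simple report. Some servers reject HEAD requests, so a real tool would probably fall back to GET before declaring a link dead.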

Cheers

Chris

--
Chris Puttick
CIO
Oxford Archaeology: Exploring the Human Journey
Direct: +44 (0)1865 980 718
Switchboard: +44 (0)1865 263 800
Mobile: +44 (0)7908 997 146
http://thehumanjourney.net


------
Files attached to this email may be in ISO 26300 format (OASIS Open Document Format). If you have difficulty opening them, please visit http://iso26300.info for more information.


---------------------------------------------------------------------
To unsubscribe, e-mail: opendocument-users-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: opendocument-users-help@lists.oasis-open.org



odflinkchecker.py

test.odt


