Subject: Re: [docbook-apps] tagged and accessible PDF document with DocBook
On 24/04/2017 17:24, Holger Bast wrote:
I started on writing a specification document that maps the DocBook elements to the necessary PDF structure elements. Well, I started with FOP (based on PDF v1.5) as basis, but I think that the other processors act the same way. It would be great if someone familiar with XEP, Antenna, DocMill, etc. could have a look at the document and give some feedback. The document can be found over here: https://docs.google.com/document/d/1DrADbdNjFEeXBv5n7LNIBrKsI1Kl-oqY5jkN-U8K5p8/edit?usp=sharing
1. In Section 4.1.4.1, Strong vs. weak block-level structures in PDF files, you say 'DocBook already provides a very strong structure' [1], but then your examples use 'H1' to 'H5' PDF tags. The definition of 'Strongly structured' that you copied from the PDF 1.5 reference says to use 'H' in strongly structured PDF files. Section 7.4.4, Unnumbered headings, of ISO 14289 includes both "'H' ... should be used in strongly structured documents" and "Documents that are strongly structured may use numbered headings.", so ISO 14289 would also rather that you use 'H' in strongly structured PDF.

2. It really shouldn't be necessary to specify that 'fo:static-content' is tagged as 'Artifact'. It should just happen, as specified in Section 7.8, Page headers and footers, of ISO 14289.

3. Similarly, it shouldn't be necessary to supply PDF tags for 'fo:list-block', 'fo:list-item', 'fo:list-item-label', and 'fo:list-item-body': the XSL-FO Formatter should be providing the right tags for those FOs. AH Formatter will do it, and it seems from the code in Section 3.1.2, Automatic tagging by Apache FOP, that FOP will do it.

4. AFAICT, PDF tags are case-sensitive, so you probably should use the specified forms in your examples, e.g., 'Document' instead of 'DOCUMENT'.

5. Within your 'TOCI' for a ToC entry, you should use 'Lbl' for the entry's title, 'NonStruct' for the leader, and 'Reference' for the page number citation.

6. My understanding of the 'NonStruct' tag changes on alternate days, but you might be able to use it on some of the 'fo:block' (that you can't magic away with your post-processing) to indicate that the 'fo:block' has 'no inherent structural significance'.

7. Putting PDF tag names in @role won't do anything in AH Formatter. If you need to override the default PDF tag [2] for a particular FO, you should use @axf:pdftag [3].

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgraham@antenna.co.jp

[1] Though true, I think that strongly/weakly is a binary distinction, so I don't think you can have a 'very strongly' or a 'mildly strongly' document.
[2] https://www.antennahouse.com/product/ahf64/ahf-pdf.html#taggedpdf
[3] https://www.antennahouse.com/product/ahf64/ahf-ext.html#axf.pdftag
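
P.S. To make points 1, 5, 6, and 7 above more concrete, here is a rough, untested sketch of how the overrides might look in the FO for AH Formatter. It is a fragment of a flow, not a complete document. The id 'ch1' and the text content are made up for illustration, and I am assuming that @axf:pdftag is honoured on 'fo:leader' and 'fo:page-number-citation' as well as on block-level FOs, which should be checked against [3] before relying on it.

```xml
<!-- Sketch only: fragment of an XSL-FO flow; ids and text are hypothetical. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">

  <!-- Points 1 and 7: unnumbered 'H' for a strongly structured document,
       set with @axf:pdftag rather than @role -->
  <fo:block id="ch1" axf:pdftag="H">Introduction</fo:block>

  <!-- Point 6: a wrapper block with no inherent structural significance -->
  <fo:block axf:pdftag="NonStruct">
    <fo:block>Body paragraph.</fo:block>
  </fo:block>

  <!-- Point 5: one ToC entry; tagging the leader and the citation this way
       is an assumption, not something I have verified -->
  <fo:block axf:pdftag="TOCI" text-align-last="justify">
    <fo:inline axf:pdftag="Lbl">Introduction</fo:inline>
    <fo:leader axf:pdftag="NonStruct" leader-pattern="dots"/>
    <fo:page-number-citation axf:pdftag="Reference" ref-id="ch1"/>
  </fo:block>
</fo:block>
```

Note the tag names keep their specified case (point 4), and nothing here touches 'fo:static-content' or the list FOs, since the formatter should tag those by itself (points 2 and 3).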