Subject: GSoC Project Idea: integrated LaTeX output support for the stylesheets
Hi,
I plan to apply to this year's Summer of Code and I wanted to share my idea with you for discussion. I've been using DocBook both for personal documents (CV, bachelor's and master's theses, papers, etc.) and in the FreeBSD Project. I'm usually satisfied with the output that Apache FOP creates, but I believe PDF generation should be better supported. The only usable open source XSL FO renderer is Apache FOP (xmlroff is very immature and is not actively developed), so we have no alternatives. Besides, I have had problems with FOP on large documents (it crashes). As a result, in FreeBSD we are still using DocBook 4.2/4.5 with the DSSSL stylesheets to render PDF. To summarize our concerns: the lack of alternatives could mean technology/vendor lock-in, FOP does not work with some of our big documents, and Java is a heavy dependency.
I've worked with XSL FO and quite like how it works, but given these factors I am afraid of basing a serious project on XSL FO and FOP. Because of the complexity of XSL FO, it is not easy to write a renderer, so this situation will probably not change in the near future. LaTeX, in turn, has been around for a long time; people use it, they are familiar with it, and it is very well supported. Probably no one would consider it a risk to base a documentation project on LaTeX.
But DocBook is more semantic and I like it much more. Also, the XHTML and EPUB output of DocBook is of really high quality. So I'd like to combine the advantages of DocBook and LaTeX and create stylesheets that produce LaTeX output that can be used for printable formats. I think other people probably feel the same way, and having this functionality would improve DocBook's acceptance in the industry.
There are two projects, db2latex and dblatex, which provide such functionality, but they do not integrate well with the existing stylesheets, and dblatex also introduces a new dependency, Python. I'd like to create a new solution that is purely XSLT-based and integrates with the existing DocBook XSL facilities (titlepage, I18N, etc.). My idea is to first define an XML serialization of TeX (TeXML or a slightly revised version of it), which has the same abstract syntax as TeX but is an XML document. Then I'd create the actual stylesheets that transform DocBook documents into this XML representation of TeX, which would be transformed into real TeX in a second pass. So the first (more complex) pass would produce XML and only the second pass would output plain text. XSLT is not the best tool for producing plain text, but this approach would mitigate that problem and at the same time avoid introducing new dependencies.
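To make the second pass more concrete, here is a rough sketch of what it has to do. It is written in Python only because that is quick to try out; the proposal itself is to do this step in XSLT. The element names (`TeXML`, `cmd`, `parm`, `opt`, `env`) are a simplified vocabulary modeled loosely on TeXML and should be read as assumptions, not as the final format, and `to_tex` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Characters that are special in TeX and must be escaped in text nodes.
TEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def escape(text):
    """Escape TeX special characters, one character at a time."""
    return "".join(TEX_ESCAPES.get(ch, ch) for ch in (text or ""))


def serialize(node):
    """Turn one element of the (assumed) XML TeX vocabulary into TeX text."""
    # Text content is escaped; child elements are serialized recursively.
    body = escape(node.text) + "".join(
        serialize(child) + escape(child.tail) for child in node
    )
    if node.tag == "cmd":    # <cmd name="x">...</cmd>  ->  \x...
        return "\\" + node.get("name") + body
    if node.tag == "parm":   # mandatory argument       ->  {...}
        return "{" + body + "}"
    if node.tag == "opt":    # optional argument        ->  [...]
        return "[" + body + "]"
    if node.tag == "env":    # <env name="x">...</env>  ->  \begin{x}...\end{x}
        name = node.get("name")
        return "\\begin{" + name + "}" + body + "\\end{" + name + "}"
    return body              # root element and anything unknown: content only


def to_tex(xml_source):
    """Second pass: XML representation of TeX in, plain TeX out."""
    return serialize(ET.fromstring(xml_source))


if __name__ == "__main__":
    sample = (
        '<TeXML><cmd name="section"><parm>Costs &amp; 100%</parm></cmd>'
        '<env name="itemize"><cmd name="item"/> a_b</env></TeXML>'
    )
    print(to_tex(sample))
```

The point is that all escaping and all TeX syntax live in this one small, generic step, so the DocBook-specific stylesheets of the first pass never have to deal with backslashes and braces as text. The sketch does not handle whitespace after commands without arguments or any TeX modes; a real second pass would have to.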
This is a big project, but I believe that the summer is enough to create something stable and useful, even if it does not have the same feature-completeness as the XHTML and XSL FO output formats. Last summer I ported DocBook Slides to DocBook 5.0 and created stylesheets for XHTML and XSL FO output. I also have DocBook and XML/XSLT experience from working on the FreeBSD documentation.
Do you also think it is useful? Any potential mentors? Please share your thoughts.