
Subject: Re: [docbook] Braille


I have been doing some research into Braille and found many resources on the
Internet that have been most helpful in giving me some insight into the scale
of the problem and the technical challenges posed.

While doing my research I encountered this gem, XML TO BRAILLE, from Computers 
to Help People, Inc. (CHPI) [http://www.chpi.org/]. It is free and open
source; a tarball is available from http://www.chpi.org/xml2brl-0.2.tar.gz.
Currently xml2brl only works on Linux, but press releases indicate that they 
are working on a Windows port. Simple tests on my box have given great 
results. I have yet to get the time to perform tests on complex documents. 
What I like is that xml2brl is easy to use and optimized for technical 
literature.

For your convenience I attached the README file for xml2brl. I hope 
attachments can go to this list. If they do not, either download the tarball 
or send me mail and I will send them to you off list.

-- 
Sean Wheller
Technical Author
sean@inwords.co.za
http://www.inwords.co.za
Registered Linux User #375355

                              THE xml2brl PROGRAM

   This  is Release 0.2 of the xml2brl program. Changes from the previous
   release are listed in the file ChangeLog. This README file details any
   changes  in  usage.  The  most notable is that the program now handles
   some MathML.

   If  you are reading the plain-text README file, you may find it useful
   to  load  README.html  into  your  browser. This will enable you to go
   directly  to the sites where you can download the libraries upon which
   this  software  depends. Once you have installed xml2brl, you can also
   get  a  braille copy by running README.html through the program. It is
   written in xhtml, which is an xml flavor.

   The  braille  translation  part  of  the  xml2brl  program is based on
   BRLTTY.  All  the necessary BRLTTY files for use in the U.S. have been
   included  in  the  package. However, if you need different contraction
   tables or different text tables, you must obtain them from BRLTTY. You
   can     download     the     latest    version    of    BRLTTY    from
   [1]http://dave.mielke.cc/brltty.

   Besides BRLTTY, this software depends on the following libraries:
   glib [2]ftp://ftp.gtk.org/pub/gtk/v2.4
   libxml2 [3]http://www.xmlsoft.org
   gdome2 [4]http://gdome2.cs.unibo.it

   You  must  download the latest versions of these libraries and install
   them in the above sequence.

   The  program  accepts  input  files in either xml or plain text and in
   many  natural languages (which may be in UTF-8 Unicode) and produces a
   brf  file  suitable for printing directly on an embosser. The brf file
   has  the  same  format  as  the files on Web-Braille and should behave
   exactly the same.

   xml  files must be well-formed. They are transcribed as specified by a
   semantic-actions  (.sem) file. If no such file exists for a given root
   element,  a  prototype  file  is created. Its name is formed by adding
   ".sem"  to  the  name of the root element, for example, "dtbook3.sem".
   The  user must then edit this file to obtain proper transcription. The
   program  will  print a warning message if the editing step is omitted.
   Instructions  on  how  to  do  this  editing  are given in the section
   "SEMANTIC-ACTIONS FILES" in this document.

   The  program  tests  whether a file is xml. If not, it assumes a plain
   text file. In this file, lines may be of any length. Paragraphs should
   be  separated by blank lines. Lines within paragraphs are concatenated
   before  translation, with blanks in place of newlines. If a blank line
   is desired in the output, use three blank lines.
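
   The plain-text rules above can be sketched in a few lines of Python.
   This is only an illustration of the behaviour described in this
   README, not the program's actual code: the function name is made up,
   and it assumes that a run of three or more blank lines is what
   requests one blank line in the output.

```python
def join_paragraphs(text):
    """Group plain-text lines into paragraphs, following the README's rules."""
    paragraphs = []
    current = []     # lines of the paragraph being collected
    blank_run = 0    # number of consecutive blank lines seen so far
    for line in text.splitlines():
        if line.strip():
            # Assumption: three or more blank lines request a blank output line.
            if blank_run >= 3:
                paragraphs.append("")
            blank_run = 0
            current.append(line.strip())
        else:
            # A blank line ends the current paragraph, if there is one.
            if current:
                # Lines are concatenated with blanks in place of newlines.
                paragraphs.append(" ".join(current))
                current = []
            blank_run += 1
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

   Each returned string is one paragraph ready for translation; an empty
   string stands for a requested blank line.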

   Whether  the  file  is  xml or plain text, paragraphs are indented two
   spaces.  There is a braille page number in the lower right-hand corner
   of each page. If an xml file contains print page numbers, and this has
   been specified in the semantic-actions file, a page-separator line is
   placed  between  print pages, and the print page number appears in the
   upper right-hand corner, preceded by the letters a, b, etc.

   The command line is:
   xml2brl inputfile outputfile
   If  you  omit  both  inputfile  and  outputfile  the program acts as a
   filter,  taking input from stdin and delivering output to stdout. This
   enables  xml2brl to be used in a chain of printer drivers, with output
   directly  to an embosser, if desired. If you wish to specify an output
   file  but  take  input  from  stdin,  use  a  minus  sign  in place of
   inputfile.  Options  are  set in a configuration file discussed in the
   section "CONFIGURATION FILE".
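
   For readers who call xml2brl from a script, the three documented
   invocation forms can be summarised as follows. This is a small
   sketch; the helper name build_command is hypothetical, and the case
   of an input file with no output file is rejected only because this
   README does not describe it.

```python
def build_command(inputfile=None, outputfile=None):
    """Return the argument list for one of the documented xml2brl forms."""
    if inputfile is None and outputfile is None:
        # Filter mode: input from stdin, output to stdout.
        return ["xml2brl"]
    if outputfile is None:
        # Not described in the README, so this sketch refuses to guess.
        raise ValueError("an input file without an output file is not documented")
    # A minus sign in place of inputfile means "read from stdin".
    source = inputfile if inputfile is not None else "-"
    return ["xml2brl", source, outputfile]
```

   The resulting list can be handed to Python's subprocess module; in
   filter mode the document would be supplied on standard input.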

   The author wishes to acknowledge his debt to the BRLTTY team. To learn
   more     about     BRLTTY     go     to    its    official    website,
   [5]http://dave.mielke.cc/brltty. The section "FILES" below tells which
   files have been copied from BRLTTY.

   Like  BRLTTY, this software is under the GNU General Public License
   (GPL). The non-BRLTTY portions are copyright by the author, John J.
   Boyer, director@chpi.org. The libraries listed previously are all part
   of the GNOME project and are under the GNU Lesser General Public
   License (LGPL). Details are given in the file COPYING.

INSTALLATION

   This  is  an  alpha  release. Therefore, it is best to install it in a
   subdirectory  of  your  home  directory.  To  do  this,  download  the
   distribution tarball into your home directory, then type
   tar xfz xml2brl-xxx.tar.gz
   This  will  create  the  directory xml2brl-xxx, where xxx is a version
   number.   After   installing   any  necessary  libraries,  go  to  the
   xml2brl-xxx  directory  and  type "make". This will create the xml2brl
   program. If you wish to re-create the program, first type "make clean"
   and then "make".

   Before  you try to run the program, execute the following statement at
   the command prompt:
   export LD_LIBRARY_PATH='/usr/local/lib'
   You may wish to add this command to your .bashrc script.

SEMANTIC-ACTIONS FILES

   These  files  tell  xml2brl how to handle your documents. Whenever the
   program encounters a new root element, it creates a prototype semantic
   actions file. Each line in this file has two columns. The first column
   is  the  word  "no",  signifying  that  no  semantic  action  has been
   specified.  The  second  column  may  contain one of the following: an
   element  name;  an  element  name, followed by a comma, followed by an
   attribute  name;  an element name, followed by a comma, followed by an
   attribute  name,  followed  by  a  comma,  followed  by  the first few
   characters  of an attribute value. The program prints a message saying
   it  is  creating this file, then terminates. Semantic files have names
   composed of the root element name and '.sem'.
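
   To make the layout of a prototype file concrete, here is a rough
   Python sketch that derives such entries from a small document. It is
   not the program's own code. Several details are assumptions: that the
   two columns are separated by a space, that all three entry forms are
   written for every attribute, and that "the first few characters" of a
   value means four (the constant VALUE_PREFIX_LEN is made up for the
   example).

```python
import xml.etree.ElementTree as ET

VALUE_PREFIX_LEN = 4  # assumption: the README only says "the first few characters"


def prototype_entries(xml_text, prefix_len=VALUE_PREFIX_LEN):
    """Return (file name, lines) for a prototype semantic-actions file."""
    root = ET.fromstring(xml_text)
    seen = []  # keeps first-seen order and avoids duplicate entries

    def add(key):
        if key not in seen:
            seen.append(key)

    for elem in root.iter():
        add(elem.tag)                                        # element name
        for name, value in elem.attrib.items():
            add(f"{elem.tag},{name}")                        # element,attribute
            add(f"{elem.tag},{name},{value[:prefix_len]}")   # element,attribute,value prefix

    # The file name is the root element name plus ".sem".
    sem_name = root.tag + ".sem"
    # Every entry starts with "no": no semantic action has been specified yet.
    return sem_name, [f"no {key}" for key in seen]
```

   For a document whose root element is "book", the sketch returns the
   name "book.sem" together with one "no ..." line per distinct entry.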

   To  get  xml2brl  to transcribe your document correctly, you must edit
   the semantic file, replacing the word "no" in the first column with an
   appropriate  semantic action, such as "para" for paragraph, "heading1"
   for  the  main  heading,  etc.  The file sem_enum.h contains a list of
   valid  semantic  actions, most of which should be self-explanatory. If
   you  rerun  the  program without editing the semantic-actions file, it
   prints  a  message saying that the output will be unformatted. You can
   add  comments  to  the  file  by  using a number sign (#) as the first
   non-blank character in a line.
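
   An edited semantic-actions file can be read back with very little
   code. The following Python sketch assumes that whitespace separates
   the two columns; the function names are hypothetical and the real
   parser lives in semantics.c. It also shows how entries still marked
   "no" can be listed before a run.

```python
def parse_semantic_actions(text):
    """Map each second-column entry to its semantic action."""
    actions = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ("#" as first non-blank character).
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # first column, then the rest of the line
        if len(parts) != 2:
            continue                 # ignore lines without a second column
        action, key = parts
        actions[key] = action
    return actions


def unedited_entries(actions):
    """Return the entries that still carry the placeholder action "no"."""
    return [key for key, action in actions.items() if action == "no"]
```

   An entry such as "heading1 title" maps the element "title" to the
   action "heading1", while "no para,role" is reported as unedited.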

   If  you transcribe a new document with the same root element, but with
   additional  element  names,  attribute  names or values, these will be
   added  to  the  end  of  the  semantic-actions  file, preceded by the
   comment "#appended entries". You may then edit the new entries. If you
   wish  the program to continue to take no action for an entry, leave it
   unchanged. Do not comment it out; doing so will cause the program to
   add it to the end of the file again as a new entry.
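
   The append behaviour, and the reason commenting out an entry does not
   help, can be pictured with this Python sketch. It assumes that
   commented lines are simply ignored when the file is read, which would
   explain why a commented-out entry is treated as missing; the function
   names are hypothetical and the real logic is in semantics.c.

```python
def known_keys(sem_text):
    """Collect the second-column entries of all non-comment lines."""
    keys = set()
    for raw in sem_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments are ignored, so their entries count as unknown
        parts = line.split(None, 1)
        if len(parts) == 2:
            keys.add(parts[1])
    return keys


def append_new_entries(sem_text, document_keys):
    """Append entries found in a document but absent from the file."""
    known = known_keys(sem_text)
    missing = [key for key in document_keys if key not in known]
    if not missing:
        return sem_text
    lines = sem_text.rstrip("\n").split("\n")
    lines.append("#appended entries")
    lines.extend(f"no {key}" for key in missing)
    return "\n".join(lines) + "\n"
```

   With a file containing "heading1 title" and a commented-out
   "#no note", a document that uses "note" gets a fresh "no note" line
   appended, which is the effect the paragraph above warns about.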

   Several semantic-actions files are provided with the program. There is
   one each for dtbook3 files, such as those produced by Bookshare.org;
   xhtml files, with or without included MathML; Microsoft Word files
   exported as xml; and docbook files.

CONFIGURATION FILE

   As   mentioned   previously,   options   for  xml2brl  are  set  by  a
   configuration  file.  This  file is called "xml2brl.cfg" and resembles
   the  semantic-actions  files.  Each  line has two columns: a keyword,
   such  as  CellsPerLine, and a value, such as 40. Comments are preceded
   by "#". The keywords should be self-explanatory.
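
   As an illustration of the two-column layout, a configuration file of
   this kind could be read as in the Python sketch below. It assumes
   that whitespace separates keyword and value and that "#" starts a
   comment anywhere on a line. Only CellsPerLine comes from this README;
   the real reader is readconfig.c, and parse_config is a made-up name.

```python
def parse_config(text):
    """Read keyword/value pairs from xml2brl.cfg-style text."""
    options = {}
    for raw in text.splitlines():
        # Assumption: everything from "#" to the end of the line is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # a keyword without a value is skipped in this sketch
        keyword, value = parts
        # Store whole numbers as integers, everything else as text.
        options[keyword] = int(value) if value.isdigit() else value
    return options
```

   Given the line "CellsPerLine 40", the sketch yields a dictionary in
   which CellsPerLine is the integer 40.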

FILES

   The  following  files  have  been copied from BRLTTY, unchanged except
   as noted: brldefs.h brl.h countries.cti ctb_compile.c ctb_definitions.h
   ctb.h ctb_translate.c en-us-g2.ctb misc.h tbl.c tbl.h text.nabcc.tbl.
   The exceptions are as follows. The line "include countries.cti" in
   en-us-g2.ctb has been changed to "include specsym.cti". The misc.c
   file was cut down to only the functions needed by xml2brl, and these
   functions were considerably modified.

   The following files were produced by the author:

   brffilt.c:  A  small filter for viewing brf files on a braille display
   with  translation  mode in BRLTTY turned off. It can also be used as a
   prototype  for  writing  other filters. To compile it, use the command
   line "gcc -o brffilt -O2 -Wall brffilt.c"

   ChangeLog: log of changes made from release to release

   COPYING: Detailed license

   dtbook3.sem: Semantic-actions file for books from Bookshare.org

   en-us-mathtext.ctb: Translation table for math documents

   examine_document.c: Traverses the DOM tree to determine
   characteristics  of  the  document,  such as whether it contains math.
   Also does preprocessing.

   html.sem: Semantic-actions file for xhtml documents

   Makefile: For compiling the whole program.

   readconfig.c: Reads and processes the configuration file

   readconfig.h: Header file for above

   README: plain-text version of the following

   README.html: This file.

   read_TextTable.c: Basically a wrapper for the functions in tbl.c

   semantics.c:  Contains  functions  for handling semantic-actions files
   and tables

   semantics.h: Header file for semantics.c; includes sem_enum.h

   sem_enum.h:  list  of valid semantic actions. Note that if you change
   this file you must recompile the entire program.

   sem_rout.c: Contains non-trivial semantic routines or routines that may
   vary with natural language

   sem_rout.h: Header file for above

   specsym.cti: Special symbols needed in translation of xml files

   transcribe_chemistry.c: Handles chemical formulas in DOM tree

   transcribe_document.c:  This  is the basic transcription routine which
   traverses    the    DOM    tree    and   calls   transcribe_paragraph,
   transcribe_math, etc., as needed.

   transcribe_graphic.c: Handles SVG graphics in the DOM tree

   transcribe_math.c: Handles MathML and other xml math notations

   transcribe_music.c: transcribes music notation expressed in xml

   transcribe_paragraph.c:  Handles  "paragraphs", including headings, in
   the DOM tree

   transcriber.c:   Contains   the   low-level   transcription  routines,
   including the routine for transcribing plain text.

   transcriber.h: Header file for above

   w_wordDocument.sem: Semantic-actions file for Microsoft Word documents
   exported as xml

   xml2brl.c: The main program.

   xml2brl.cfg: Configuration file

   xml2brl.h: Header file for main program

References

   1. http://dave.mielke.cc/brltty
   2. ftp://ftp.gtk.org/pub/gtk/v2.4
   3. http://www.xmlsoft.org/
   4. http://gdome2.cs.unibo.it/
   5. http://dave.mielke.cc/brltty


