Subject: DOCBOOK-APPS: sgml auto-indenter
Known problems: the script will break line-specific environments,
because it re-justifies all non-tag text. So far the script is quite
general: it does not recognize specific tags, so it could be used for
any XML or SGML, not just DocBook. Is there any way to recognize
literal text independent of the DTD? By leading whitespace, for
example, or trailing whitespace? Or I could indent tags only and leave
all non-tag text unjustified and unindented.
----Cut Here------
#!/usr/bin/perl -w
#
# sb: the sgml beautifier
# indents non-empty sgml tags
# usage: sb filename or sb < filename or | sb
# author: Kevin M. Dunn (kdunn@hsc.edu)
# license: anyone is free to use this for any purpose whatever
#
$jl = 80; #text will be justified to 80 characters/line
$nl = 0;
$sp = 0;
$newline = ""; # hack to prevent extraneous blank first line
$space[0] = "";

separate_tags();
get_tags();
indent_tags();
unlink ("$$.tmp"); # remove temporary file
print "\n"; # add final newline to output

sub separate_tags {
    open(FILETMP, ">$$.tmp");
    while (<>){
        $_ =~ s/</\n</g;
        $_ =~ s/>/>\n/g;
        print FILETMP "$_";
    }
    close(FILETMP);
}

sub get_tags {
    open(FILETMP, "$$.tmp");
    while (<FILETMP>){
        $word = $_;
        $word =~ s/[> ].*//;
        chomp($word);
        if ( $word =~ /^<\/.*/ ){
            $tag2{$word} = 1;
            $word =~ s/\///;
            $tag1{$word} = 1;
        }
    }
}

sub indent_tags {
    open(FILETMP, "$$.tmp");
    while (<FILETMP>){
        chomp($_);
        $word = $_;
        $word =~ s/[> ].*//;
        if ( $tag1{$word} ){
            print "\n$space[$sp]$_";
            $nl = $jl; # force new line on next line of input
            $sp++;
            if ( ! $space[$sp] ){
                $space[$sp] = $space[$sp-1] . " ";
            }
        } elsif ( $tag2{$word} ){
            $sp--;
            print "\n$space[$sp]$_";
            $nl = $jl; # force new line on next line of input
        } elsif ( $word =~ /<.*/ ) {
            print "$newline$space[$sp]$_";
            $newline = "\n"; # hack to prevent extraneous blank first line
            $nl = $jl; # force new line on next line of input
        } elsif ( length($_) > 0 ) {
            justify();
        }
    }
}

sub justify {
    @words = split;
    $nw = @words;
    for ($i = 0; $i < $nw; $i++ ){
        $ll += length($words[$i]) + 1 + $nl; # line length if this word is added
        if ($ll < $jl){ # if short enough, print it
            print "$words[$i] ";
            $nl = 0;
        } else { # if line is too long, start a new one
            print "\n$space[$sp]$words[$i] ";
            $nl = 0;
            $ll = length($space[$sp] . $words[$i]) + 1;
        }
    }
}
----Cut Here------
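
For comparison, the "indent tags only" alternative raised above might
look roughly like the sketch below. It is not part of sb: the function
name indent_tags_only and its optional indent-unit argument are made
up for illustration. Like sb, it treats a tag as a container only if a
matching closing tag appears somewhere in the input, and it assumes
that no ">" occurs inside comments or attribute values. Non-tag text
is copied through without rewrapping or indenting, but each tag is
still moved to a line of its own, so newlines next to tags inside a
line-specific environment are not preserved exactly.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of sb): indent tags only and copy all
# non-tag text through as written.
sub indent_tags_only {
    my ($input, $unit) = @_;
    $unit = " " unless defined $unit;    # indent step (an assumption)

    # Split into tags and text runs, keeping the tags as pieces.
    my @pieces = grep { length } split /(<[^>]*>)/, $input;

    # First pass: record which element names have a closing tag.
    my %has_close;
    for my $p (@pieces) {
        $has_close{$1} = 1 if $p =~ m{^</([^\s/>]+)};
    }

    # Second pass: put each tag on its own indented line.
    my ($depth, $out) = (0, "");
    for my $p (@pieces) {
        next if $p =~ /\A\s*\z/;    # drop whitespace-only runs between tags
        if ($p =~ /\A</) {
            my ($slash, $name) = $p =~ m{^<(/?)([^\s/>!?][^\s/>]*)};
            # "Paired" means the name has a closing tag somewhere and
            # this is not an XML-style empty tag such as <x/>.
            my $paired = defined $name && $has_close{$name} && $p !~ m{/>\z};
            $depth-- if $paired && $slash && $depth > 0;
            $out .= "\n" if length $out && $out !~ /\n\z/;
            $out .= ($unit x $depth) . $p . "\n";
            $depth++ if $paired && !$slash;
        } else {
            $out .= $p;             # text: neither justified nor indented
        }
    }
    return $out;
}
```

To use it as a filter in the same way as sb, one could slurp the input
and print the result, for example with
print indent_tags_only(do { local $/; <> });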
--
Kevin M. Dunn
kdunn@hsc.edu
Department of Chemistry
Hampden-Sydney College
HSC, VA 23943
(804) 223-6181
(804) 223-6374 (Fax)