Subject: Re: [docbook-apps] Stripping comments
Here's a quick perl solution that doesn't read everything into
memory and seems to handle some of the edge cases. Try it out on a
few things to verify that everything is okay before completely
trusting it, though. :)
Copy the lines between '------------' into a file (say
strip_xml_comments.pl).
If you are on Unix, first make the script executable:
chmod 755 strip_xml_comments.pl
Make a backup copy of any and all files that you'll be using. (The
script should work fine as is, but it's *MUCH* better to be safe than
sorry. :)
Now you should be able to run the script on a copy of your input file
(on Unix, prefix the script name with ./ if the current directory is
not in your PATH).
strip_xml_comments.pl my_xml_input_file.xml
The script edits the input file in place and keeps its own backup copy
of the original, with '.orig' appended to the file name. (Please don't
just rely on this feature -- make your own backup.)
Verify that everything looks okay and integrate it into your
application stream.
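
One rough way to do that verification is to check that no comment
delimiters are left in the output. The following is only a sketch: the
helper name count_comment_delimiters is made up for illustration and is
not part of the stripping script, and it will also flag a bare '-->' in
ordinary text or a '<!--' inside a CDATA section, so treat a non-zero
count as a prompt to look at the file, not as proof of a problem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the stripping script): count the
# comment delimiters found in a string.
sub count_comment_delimiters {
    my ($text) = @_;
    my $count = () = $text =~ /<!--|-->/g;
    return $count;
}

# Check every file named on the command line.
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $text = do { local $/; <$fh> };   # slurp; fine for a one-off check
    close $fh;
    my $left = count_comment_delimiters($text);
    print "$file: ",
        ($left ? "$left comment delimiter(s) remain" : "clean"), "\n";
}
```

Run it on the stripped copy, and on the '.orig' backup for comparison.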
Here's the script
----------------------
#!/usr/bin/perl -w -i.orig
#
# NB: Delete the '.orig' portion if backup copies are not desired
#
# Delete XML comments.
# NB: Comment-like text inside CDATA sections is removed as well.
#
use strict;

my $in_comment = 0;    # True while inside a multi-line comment

#
# Go through every line of every file given on the command line
#
while( <> ) {
    #
    # Inside a multi-line comment: skip lines until the closing delimiter
    # shows up, then remove everything up to and including it.
    #
    if( $in_comment ) {
        next unless s/^.*?-->//;
        $in_comment = 0;
    }
    #
    # Remove comments that open and close on this line
    #
    s {
        <!--    # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        -->     # Match the closing delimiter.
    } []gsx;
    #
    # An opening delimiter that is still left starts a multi-line comment.
    # Keep the text in front of it and drop the rest of the line.
    #
    if( s/<!--.*//s ) {
        $in_comment = 1;
    }
    print;    # Print whatever is left of the current line
}
----------------------
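
To try a few edge cases without touching real files, a line-by-line
approach like the script's can be wrapped in a function and fed sample
lines. This is only a sketch for experimenting: strip_comments is a
made-up name, and the function is a stand-in for testing, not the script
itself. It keeps the text in front of a comment that opens mid-line and
the text after one that closes mid-line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test wrapper: strip XML comments from a list of lines,
# one line at a time, and return the result as a single string.
sub strip_comments {
    my @lines      = @_;
    my $in_comment = 0;
    my $out        = '';
    for my $line (@lines) {
        if ($in_comment) {
            # Still inside a multi-line comment: look for its end.
            next unless $line =~ s/^.*?-->//s;
            $in_comment = 0;
        }
        # Remove comments that open and close on this line.
        $line =~ s/<!--.*?-->//gs;
        # An opener that is left over starts a multi-line comment.
        $in_comment = 1 if $line =~ s/<!--.*//s;
        $out .= $line;
    }
    return $out;
}

# Sample input: one inline comment and one comment spanning three lines.
print strip_comments(
    "<a>x<!-- one -->y\n",
    "<b/> <!-- two\n",
    "still two\n",
    "--> <c/>\n",
    "</a>\n",
);
```

Swap in lines from your own files to see how the tricky spots come out.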
Note that the code is a simple modification of one of the examples
from the perlre man page (http://perldoc.perl.org/perlre.html).
Hopefully this will suit your purposes!
kells
>
> ----- Original Message -----
> From: "Paul Moloney" <paul_moloney@hotmail.com>
> To: <docbook-apps@lists.oasis-open.org>
> Sent: Thursday, March 29, 2007 6:45 AM
> Subject: [docbook-apps] Stripping comments
>
>
> >
> > One task I have is to package our source XML files for use by
> > integrators; one thing I'd like to do is first strip the comments from
> > these files, as they may contain sensitive information.
> >
> > I was thinking that this could be done by processing each file through
> > Saxon using a stylesheet which strips out comments and outputs the XML
> > again. But rather than risk reinventing the wheel, I was wondering if
> > anyone out there has implemented a DocBook comment stripper in their
> > build process?
> >
> > Thanks,
> >
> > P.
> > --
> > View this message in context:
> > http://www.nabble.com/Stripping-comments-tf3486783.html#a9734912
> > Sent from the docbook apps mailing list archive at Nabble.com.
>