Subject: Re: [docbook-apps] Stripping comments
Here's a quick perl solution that doesn't read everything into memory and seems to handle some of the edge cases. Try it out on a few things to verify that everything is okay before completely trusting it, though. :)

Copy the lines between '----------------------' into a file (say strip_xml_comments.pl).

If you're on Unix, make the script executable first:

chmod 755 strip_xml_comments.pl

Make a backup copy of any and all files that you'll be using. (The script should work fine as is, but it's *MUCH* better to be safe than sorry. :)

Now you should be able to run the script on a copy of your input file:

strip_xml_comments.pl my_xml_input_file.xml

(On Unix, prefix the script name with ./ if the current directory is not in your PATH.)

The script edits the file in place and keeps a backup copy of its own with '.orig' at the end of the name. (Please don't just rely on this feature -- make your own backup.)

Verify that everything looks okay and integrate it into your application stream.

Here's the script

----------------------
#!/usr/bin/perl -w -i.orig
#
# NB: Delete the '.orig' portion if backup copies are not desired
#
#
# Delete XML comments.
#
#
# Go through every file given on the command line
#
$in_comment= 0;

while( <> )
{
    #
    # Inside a multi-line comment: skip lines until the closing delimiter
    # shows up, then remove everything up to and including it.
    #
    if( $in_comment )
    {
        next unless s/^.*?-->//;
        $in_comment= 0;
    }

    #
    # Match inline comments
    #
    s {
        <!--    # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        -->     # Match the closing delimiter.
    } []gsx;

    #
    # Start of a multi-line comment: keep any text before the opening
    # delimiter and drop the rest of the line.
    # NB: All other in-line comments have already been removed
    #
    if( s/<!--.*// )
    {
        $in_comment= 1;
        next unless /\S/;    # Nothing but whitespace left on this line
    }

    print;    # Print whatever is left of the current line
}
----------------------

Note that the code is a simple modification of one of the examples from the perlre man page (http://perldoc.perl.org/perlre.html).

Hopefully this will suit your purposes!
kells

> ----- Original Message -----
> From: "Paul Moloney" <paul_moloney@hotmail.com>
> To: <docbook-apps@lists.oasis-open.org>
> Sent: Thursday, March 29, 2007 6:45 AM
> Subject: [docbook-apps] Stripping comments
>
> > One task I have is to package our source XML files for use by
> > integrators; one thing I'd like to do is first strip the comments from
> > these files, as they may contain sensitive information.
> >
> > I was thinking that this could be done by processing each file through
> > Saxon using a stylesheet which strips out comments and outputs the XML
> > again. But rather than risk reinventing the wheel, I was wondering if
> > anyone out there has implemented a DocBook comment stripper in their
> > build process?
> >
> > Thanks,
> >
> > P.
> > --
> > View this message in context:
> > http://www.nabble.com/Stripping-comments-tf3486783.html#a9734912
> > Sent from the docbook apps mailing list archive at Nabble.com.