Today someone asked in the Usenet group comp.lang.perl.modules for examples of using XML::Parser. I gave a link to "Finding the number of unique XML elements", which has a small example of using the XML::Parser module. I also recommended having a look at the other XML-related Perl modules available on CPAN.
The original poster followed up on my reply, wondering "how to weed out processing statements and comments", so I posted a small working example using XML::Parser:
use strict;
use warnings;
use XML::Parser;
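# The Stream style prints the document back out; the handlers below
# override what happens for comments and processing instructions.
my $parser = XML::Parser->new(
    Style    => 'Stream',
    Handlers => {
        Comment => \&xml_comment,
        Proc    => \&xml_pi,
    },
);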
$parser->parse( <<'XML' );
<foo>
<?some_processing_instruction?>
<bar>
some text
<!-- comment -->
</bar>
</foo>
XML
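# Called for each comment; returning without printing drops it.
sub xml_comment {
    return;
}
# Called for each processing instruction; it is dropped the same way.
sub xml_pi {
    return;
}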
The Perl program gives the following output:
<foo>
<bar>
some text
</bar>
</foo>
I also recommended having a look at XML::Twig, since its constructor (new) lets you choose how comments are handled with the comments option ('drop' (the default), 'keep', or 'process') and how processing instructions are handled with the pi option ('drop', 'keep' (the default), or 'process').
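As a rough illustration of those two options, here is a minimal sketch that runs the same sample document through XML::Twig. It is not the code I posted to the newsgroup; the variable names ($twig, $output) are my own, and it assumes XML::Twig is installed from CPAN. Both options are set explicitly, although 'drop' is already the default for comments. Whitespace in the result may differ from the XML::Parser output above, because XML::Twig handles insignificant whitespace in its own way.

```perl
use strict;
use warnings;
use XML::Twig;

# Ask the twig to throw away both comments and processing instructions.
# 'drop' is already the default for comments; it is spelled out for clarity.
my $twig = XML::Twig->new(
    comments => 'drop',
    pi       => 'drop',
);

$twig->parse( <<'XML' );
<foo>
<?some_processing_instruction?>
<bar>
some text
<!-- comment -->
</bar>
</foo>
XML

# sprint returns the parsed document as a string.
my $output = $twig->sprint;
print $output, "\n";
```

Changing either option to 'keep' should leave the corresponding construct in the output, which makes it easy to compare the settings side by side.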