John Bokma Perl

Element id for given parent id

using XML::Parser

Some time ago someone asked on Usenet for a Perl program that would report the value of a given attribute (id) for a given XML element (Comp), but only if this element had a given parent element (entities). I decided to reply with a solution using XML::Parser, but at the time I did not have enough time to write and test it.

Much later I wrote and tested the following Perl program:

# element-id-for-given-parent.pl
#
# reports the value of the id attribute for each Comp element if and
# only if the element has the entities element as a parent.
#
# $Id$ 

use strict;
use warnings;

use XML::Parser;

my @element_stack;
my $parser = XML::Parser->new(

    Handlers => {

        Start => \&start,
        End   => \&end,
    }
);

$parser->parse( <<'XML' );
<entities>
  <Comp id="123">
    <Description max_length="" reference_to="" type="multiline_string"/>
    <Clone max_length="" reference_to="Comp" type="reference_list">
      <Comp id="129" />
    </Clone>
  </Comp>
  <Comp id="124">
    <Description max_length="" reference_to="" type="multiline_string"/>
  </Comp>
</entities>
XML

exit;


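sub start {

    my ( $expat, $element, %attrval ) = @_;

    # the top of the stack is the parent of the current element
    if ( $element eq 'Comp' and $element_stack[ -1 ] eq 'entities' ) {

        print "$attrval{ id }\n";
    }

    # the current element becomes the parent of whatever follows
    push @element_stack, $element;
}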


sub end {

    my ( $expat, $element ) = @_;

    pop @element_stack;
}

Explanation of how the Perl program works

First, the program declares an element stack. Each time the start of an element (the start tag) is encountered, the name of the element is pushed onto the stack, and each time the end of an element (the end tag) is encountered, the name of the element is removed from the stack. The pushing and popping are done by the Start and End handlers respectively.
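
The stack discipline can be tried without a parser. What follows is a minimal sketch, assuming a hand-written list of events in place of the real callbacks; the @events array and the parents_seen subroutine are made up for illustration and are not part of XML::Parser:

```perl
use strict;
use warnings;

# Hypothetical event list standing in for the parser callbacks:
# each entry is [ 'start' or 'end', element name ].
my @events = (
    [ start => 'entities' ],
    [ start => 'Comp' ],
    [ start => 'Clone' ],
    [ start => 'Comp' ],
    [ end   => 'Comp' ],
    [ end   => 'Clone' ],
    [ end   => 'Comp' ],
    [ end   => 'entities' ],
);

# Returns one string per start event: the element name and its parent
# (the top of the stack at that moment, or '(none)' for the root).
sub parents_seen {

    my @list = @_;
    my @stack;
    my @seen;

    for my $event ( @list ) {

        my ( $type, $name ) = @$event;

        if ( $type eq 'start' ) {

            my $parent = @stack ? $stack[ -1 ] : '(none)';
            push @seen, "$name <- $parent";
            push @stack, $name;
        }
        else {

            pop @stack;
        }
    }

    return @seen;
}

print "$_\n" for parents_seen( @events );
```

Run as is, this prints each element next to its parent. The inner Comp is reported with Clone as its parent, which is why a check on the top of the stack skips it.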

Next, the program creates a new XML::Parser object with the Start handler set to a reference to the start subroutine and the End handler set to a reference to the end subroutine.

In the start subroutine the name of the current element is checked before it is pushed onto the stack. If it is 'Comp', the element on top of the stack (its parent) is checked, and if that one is named 'entities' the value of the 'id' attribute is printed.

Note that the element name is tested before the top of the stack: the and operator evaluates its left operand first and skips the right one if the left is false. Since the root element of the XML document is 'entities', there is always at least one element on the stack when the current element is 'Comp'.
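
If the root element could itself be the element of interest, the stack would be empty at that point and reading its last item would give undef. A small sketch of an explicit guard, assuming a hypothetical helper named has_parent that is not in the program above:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the top of the stack (the parent of
# the element about to be pushed) has the given name, 0 otherwise.
# The leading @$stack test short-circuits, so $stack->[ -1 ] is never
# read when the stack is empty.
sub has_parent {

    my ( $stack, $name ) = @_;

    return ( @$stack and $stack->[ -1 ] eq $name ) ? 1 : 0;
}

print has_parent( [], 'entities' ), "\n";
print has_parent( [ 'entities' ], 'entities' ), "\n";
print has_parent( [ 'entities', 'Comp', 'Clone' ], 'entities' ), "\n";
```

The three calls cover an empty stack, a direct 'entities' parent, and a deeper element whose parent is 'Clone'. Only the second returns 1.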

Finally, the parse method of the XML::Parser object is called with some well-formed XML, given as an example in a so-called here-document.

Output of the program

123
124