Perl recursive sub lister

Thursday, March 1, 2007 | 2 comments

I needed a small Perl program to provide me with a list of all subs defined in several modules in order to plan code refactoring of the project I was working on. And so I wrote a small Perl program which I'll explain in parts. At the end of this blog entry you'll find the complete program.

The script starts with the classic shebang to specify which program has to interpret the file and is followed by two compiler pragmas. One to enable strictures and one to enable warnings. Sometimes you see the -w switch at the shebang but I recommend to use use warnings instead because it's much more flexible:

#!/usr/bin/perl

use strict;
use warnings;

If you're looking for help with your Perl program, on Usenet for example, be very aware that most experienced Perl programmers refuse to look at the rest of your program when both these pragmas are missing.

Traversing a directory tree

Since I wanted the list to include all subs for all modules I needed a way to traverse a specified directory and all subdirectories. I decided to have the start directory specified on the command line. By using the shift function, which defaults to @ARGV at file scope, the first argument can be obtained. If none is given, the program uses die to report a simple usage message:

my $dir = shift;
defined $dir or die "usage: subscan.pl [DIR]\n";

For the traversing itself I used the File::Find module. The traverse is started by calling the find function with the first parameter a reference to a subroutine, and the second parameter the directory to start the traverse with. A reference to a subroutine is the name of the subroutine prefixed with \&. Note that the & is not part of the subroutine name. Moreover don't prefix subroutine calls in general with & unless you require the side-effects the & causes (and most of the time you don't).

use File::Find;

find( \&show_subs, $dir );

Reading an entire file into a scalar (slurping)

In order to find the names of all subs in a Perl module (or Perl program) I decided to read the entire file into a scalar and do a global matching over all lines inside the show_subs subroutine. There are several ways to do slurping in Perl, even in a single line. However, I strongly recommend to use the File::Slurp module, which you probably have to install via CPAN first. If you are using Active Perl the following line will install File::Slurp:

ppm install File-Slurp

Add to the top of the Perl program use File::Slurp; and you can use read_file to slurp an entire file.

    my $perl = read_file( $_ );

Global matching (g) treating the scalar as containing multiple lines (m) is used to capture the names of the subs. If there are no subs found, the show_subs returns early. Otherwise, the name of the file (stored in $_) is printed, followed by an overview. The map function is used to put four spaces in front of the name and a new line character after it, resulting in each sub name printed indented with four spaces on a line of its own.

    my @subs = $perl =~ /^\s*sub\s+(\S+)/gm;
    @subs or return;

    print "  $_\n", map { "    $_\n" } @subs;

Returning early

I prefer to return early from a subroutine if there is no need to stay longer in it. Some people prefer a single exit point but in my experience such a very limiting rule results in a huge clutter of if and else statements.

If the object is an directory I wanted the pathname - stored in $File::Find::name - to be printed. Since I use subversion for version control I don't want to traverse into subversion related directories. This is achieved by setting $File::Find::prune to true. Hence the code became:

    if ( -d ) {

        if ( $_ eq '.svn' ) {

            # don't traverse into subversion related directories
            $File::Find::prune = 1;
            return;
        }

        # for directories, only print path
        print "$File::Find::name\n";
        return;
    }

Note that -d defaults to $_, which contains the name of the current file object, and that the current directory is set to the directory containing this object.

Two more tests are done: one to check if the file object is a file, and one to check if the file has one of the following Perl related extensions: pl, pm, or plx.

    -f $_         or return;  # if not a file
    /\.p(m|lx?)$/ or return;  # if not the right extension

The complete Perl program

The complete code for the recursive sub lister follows below, enjoy!

#!/usr/bin/perl

use strict;
use warnings;

use File::Slurp;
use File::Find;

my $dir = shift;
defined $dir or die "usage: subscan.pl [DIR]\n";

find( \&show_subs, $dir );

sub show_subs {

    if ( -d ) {

        if ( $_ eq '.svn' ) {

            # don't traverse into subversion related directories
            $File::Find::prune = 1;
            return;
        }

        # for directories, only print path
        print "$File::Find::name\n";
        return;
    }

    -f            or return;  # if not a file
    /\.p(m|lx?)$/ or return;  # if not the right extension

    my $perl = read_file( $_ );

    my @subs = $perl =~ /^\s*sub\s+(\S+)/gm;
    @subs or return;

    print "  $_\n", map { "    $_\n" } @subs;
}

Also today

Code refactoring (overview)

Please post a comment | read 2 comments, latest by John Bokma | RSS feed