Hand coding an RSS 2.0 feed in Perl
October 9, 2019
One requirement I have for tumblelog is that
the Perl version generates output identical to that of the
Python version. This means that sometimes I have to hand
code a function that is available in a library for, say, Python, but
works differently in the Perl equivalent.
When I was working on adding an RSS feed to the Perl version I decided
to use XML::RSS at first. But I didn't like the generated output,
and I had little to no control over it. Next I gave
XML::Writer a spin, but couldn't match the output of the Python
version, which used lxml.etree. So I decided to hand code both the
Perl and the Python version: it's a simple feed, and XML::Writer
just added a few wrapper functions.
my @MON_LIST = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
my @DAY_LIST = qw( Sun Mon Tue Wed Thu Fri Sat );
sub create_rss_feed {
    my ( $days, $config ) = @_;
    my @items;
    my $todo = $config->{ days };
    for my $day ( @$days ) {
        my ( $url, $title, $description )
            = get_url_title_description( $day, $config );
        my $end_of_day = get_end_of_day( $day->{ date } );
        # RFC #822 in USA locale
        my $pub_date = $DAY_LIST[ $end_of_day->_wday() ]
            . sprintf( ', %02d ', $end_of_day->mday() )
            . $MON_LIST[ $end_of_day->_mon ]
            . $end_of_day->strftime( ' %Y %H:%M:%S %z' );
        push @items, join( '',
            '<item>',
            '<title>', escape( $title ), '</title>',
            '<link>', escape( $url ), '</link>',
            '<guid isPermaLink="true">', escape( $url ), '</guid>',
            '<pubDate>', escape( $pub_date ), '</pubDate>',
            '<description>', escape( $description ), '</description>',
            '</item>'
        );
        --$todo or last;
    }
    my $xml = join( '',
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">',
        '<channel>',
        '<title>', escape( $config->{ name } ), '</title>',
        '<link>', escape( $config->{ 'blog-url' } ), '</link>',
        '<description>', escape( $config->{ description } ),'</description>',
        '<atom:link href="', escape( $config->{ 'rss-feed-url' } ),
        '" rel="self" type="application/rss+xml" />',
        @items,
        '</channel>',
        '</rss>',
        "\n"
    );
    my $path = $config->{ 'rss-path' };
    path( "$config->{ 'output-dir' }/$path" )
        ->append_utf8( { truncate => 1 }, $xml );
    $config->{ quiet } or print "Created '$path'\n";
    return;
}
The code is quite straightforward. The calculation of the publication date is explained in "RFC #822 and RFC #3339 dates in Perl".
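To illustrate the idea behind the publication date, here is a minimal Python sketch of the same approach: the day and month names come from fixed lists instead of strftime's %a and %b, so the result does not depend on the current locale. The function name rfc822_date and the sample date are made up for this example; it is not necessarily how the Python version of tumblelog does it.

```python
from datetime import datetime, timedelta, timezone

MON_LIST = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
# datetime.weekday() counts from Monday = 0, unlike the Perl list above,
# which starts at Sunday.
DAY_LIST = "Mon Tue Wed Thu Fri Sat Sun".split()


def rfc822_date(dt):
    """Format a timezone-aware datetime as an RFC 822 date (hypothetical helper)."""
    return (
        DAY_LIST[dt.weekday()]
        + ", {:02d} ".format(dt.day)
        + MON_LIST[dt.month - 1]
        # Only numeric directives here, so the locale has no influence.
        + dt.strftime(" %Y %H:%M:%S %z")
    )


# Sample value: end of day on October 9, 2019 in a UTC+2 timezone.
end_of_day = datetime(
    2019, 10, 9, 23, 59, 59, tzinfo=timezone(timedelta(hours=2))
)
print(rfc822_date(end_of_day))
# Wed, 09 Oct 2019 23:59:59 +0200
```

For a timezone-aware datetime, email.utils.format_datetime from the standard library returns the same string, which makes for a handy cross-check.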
The escape function is also hand coded to make sure its output is
identical to the escape function in Python's html module:
sub escape {
    my $str = shift;
    for ( $str ) {
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
        s/"/&quot;/g;
        s/'/&#x27;/g;
    }
    return $str;
}
It uses a for loop to alias $str to $_. As the s operator
defaults to $_ this saves some typing and, in my opinion, is
clearer.
Note that a single quote is replaced with a hexadecimal character
reference (&#x27;) instead of &apos; for maximum compatibility; see
also "Character entity references in HTML 4".
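For reference, this is what the Python side produces. The sketch below calls escape from Python's html module and compares it with a hand-coded port of the Perl function above; escape_by_hand and the sample string are made up for this example.

```python
from html import escape


def escape_by_hand(text):
    """Hypothetical Python port of the Perl escape function."""
    # '&' must be replaced first; otherwise the ampersands introduced
    # by the later replacements would be escaped a second time.
    for char, entity in (
        ("&", "&amp;"),
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
        ("'", "&#x27;"),
    ):
        text = text.replace(char, entity)
    return text


sample = """Tom & Jerry's <b>"best"</b>"""
print(escape(sample))
# Tom &amp; Jerry&#x27;s &lt;b&gt;&quot;best&quot;&lt;/b&gt;
print(escape_by_hand(sample) == escape(sample))
# True
```

The order of the substitutions matters in both languages: moving the ampersand replacement to the end would turn &lt; into &amp;lt;.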
If you are interested in the rest of the source code you can download it from GitHub.