John Bokma's Hacking & Hiking

Fixing broken links in my RSS feed

August 28, 2016

Today, while reviewing the HTML site stats report that GoAccess generated from this web site's current access log, I noticed blog-related URLs that were missing the /blog/ part. Oops.

So I checked the Perl program to which I added the feed creation code yesterday and found the issue in this code snippet:

    $rss->add_item(
        title       => $_->{ title },
        link        => URI->new_abs( $_->{ url }, $home )->as_string(),
        description => $_->{ desc },
        dc          => {
            date => "$_->{ date }T23:00Z",
        }
    ) for @$entries;
}

which is part of the write_rss sub in my program.

Since $home has the value http://johnbokma.com/ assigned to it, and the value associated with the url key of each item in the array that $entries refers to is relative to http://johnbokma.com/blog/, the blog part is missing from the value associated with the link key. In short, I messed up, for which my apologies.

I fixed this code by introducing a new scalar variable, $blog, and assigning to it an absolute URL referring to the location of my blog, to which the value associated with the url key is relative. Converting that value to an absolute URL using URI->new_abs() with $blog as the base provides the correct link.
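The effect of the base URL can be seen in isolation with a minimal sketch. The entry URL used here is a hypothetical example and not taken from the actual entries; only $home and $blog match the values described above.

```perl
use strict;
use warnings;
use URI;

# Hypothetical entry URL, relative to the blog directory (an assumption).
my $url = '2016/08/28/example-post.html';

my $home = 'http://johnbokma.com/';
my $blog = URI->new_abs( '/blog/', $home );

# Resolving against $home loses the /blog/ part: this is the bug.
my $wrong = URI->new_abs( $url, $home )->as_string();

# Resolving against $blog keeps it: this is the fix.
my $right = URI->new_abs( $url, $blog )->as_string();

print "$wrong\n";    # http://johnbokma.com/2016/08/28/example-post.html
print "$right\n";    # http://johnbokma.com/blog/2016/08/28/example-post.html
```

Note that the trailing slash in /blog/ matters: without it, the last path segment of the base would be replaced instead of extended.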

The modified code, which uses the XML::RSS module, is as follows:

sub write_rss {

    my ( $filename, $entries ) = @_;

    my $home = 'http://johnbokma.com/';
    my $blog = URI->new_abs( '/blog/', $home );
    my $rss = XML::RSS->new( version => '1.0' );
    $rss->channel(
        title       => 'John Bokma - freelance Perl Programmer',
        link        => $home,
        description => 'John Bokma, a freelance Perl programmer'
            . ' living in Mexico',
    );

    $rss->add_item(
        title       => $_->{ title },
        link        => URI->new_abs( $_->{ url }, $blog )->as_string(),
        description => $_->{ desc },
        dc          => {
            date => "$_->{ date }T23:00Z",
        }
    ) for @$entries;

    $rss->save( $filename );
    return;
}

To be honest, the actual problem is that I store relative URLs and pass those around. It's likely better to always use absolute URLs, and make them relative where it makes sense. I plan to change this when I rewrite the program, as what I currently use is mostly an ugly wrapper around Pandoc, which converts the Markdown files I write to HTML for this blog.
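As a rough sketch of that idea, assuming the entries were stored with absolute URLs, the URI module's rel method could produce a relative URL only at the point where one is wanted. The entry URL below is again a hypothetical example, and this is not how the current program works.

```perl
use strict;
use warnings;
use URI;

my $blog = URI->new( 'http://johnbokma.com/blog/' );

# Hypothetical absolute URL of a blog entry (an assumption).
my $entry = URI->new(
    'http://johnbokma.com/blog/2016/08/28/example-post.html'
);

# Make the URL relative to the blog only where that makes sense,
# for example for links between pages inside the blog itself.
my $relative = $entry->rel( $blog );

print "$relative\n";    # 2016/08/28/example-post.html

# Going back to the absolute form gives the original URL again.
my $absolute = $relative->abs( $blog );
print "$absolute\n";    # http://johnbokma.com/blog/2016/08/28/example-post.html
```

With this approach the feed code could use the stored absolute URL directly, and no base URL would need to be passed around.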