John Bokma MexIT

Using Perl to process fields

Tuesday, February 22, 2005

Kees de Koster asked for a solution to his problem in the Dutch newsgroup nl.comp.os.linux.programmeren. Given the following text input:

19970522000000=1.68,1.68,1.63,1.66,50384
19970523000000=1.66,1.68,1.63,1.68,51213
19970526000000=1.68,1.68,1.61,1.68,28505
19970527000000=1.68,1.68,1.63,1.68,52228
19970528000000=1.66,1.68,1.63,1.68,118430
19970529000000=1.66,1.68,1.61,1.68,260400
19970530000000=1.66,1.66,1.59,1.63,151400
19970602000000=1.63,1.68,1.59,1.68,408740

multiply each of the first four numbers after the '=' character on each line by 4. The fifth number is left unchanged.

An awk solution was posted by Eric Moors and a Perl solution by Kees Pol. I replied to the latter posting with some recommendations / critique, and a very short Perl program based on the awk solution by Eric:

#!perl -naF[=,]
printf "%14s=%0.02f,%0.02f,%0.02f,%0.02f,%d\n",
    $F[ 0 ], ( map { $_ * 4 } @F[ 1..4 ] ), $F[ 5 ];

How the Perl program works

The first line of the Perl program specifies three options that do most of the work. They are explained below:

The -n switch makes Perl read the file(s) named after the script line by line, or standard input if no files are given, i.e. it assumes the following loop around the script:

LINE:
while ( <> ) {

    ...  # the script goes here
}

The -a switch turns on autosplit mode (works only in combination with -n or -p). The implicit split into the @F array is the first thing done inside the implicit while loop shown above:

LINE:
while ( <> ) {

    @F = split( ' ' );
    ...  # the script goes here
}
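
To illustrate what the default autosplit does, here is a small sketch. The variable names and the whitespace-separated line are made up for this example. It shows that split( ' ' ) splits on runs of whitespace and ignores leading whitespace, and that a line from the input above, which contains no whitespace, ends up as a single field:

```perl
use strict;
use warnings;

# One line of the sample input; it contains no whitespace
# apart from the trailing newline.
my $sample = "19970522000000=1.68,1.68,1.63,1.66,50384\n";

# What -a does by default: split on runs of whitespace,
# skipping any leading whitespace.
my @default_fields = split ' ', $sample;

# A made-up whitespace-separated line, for comparison only.
my @spaced_fields = split ' ', "  1.68 1.63\t1.66\n";

printf "sample line: %d field(s)\n", scalar @default_fields;
printf "spaced line: %d field(s)\n", scalar @spaced_fields;
```

With the default split the whole sample line stays in $F[ 0 ], so the input in this article needs a different split pattern.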

The -F switch specifies the pattern to split on when the -a switch is used. In the above script I split on , or = using a character class. This breaks each input line shown above into 6 elements, of which the second through the fifth are multiplied by 4.
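
As a rough sketch of what the three switches do together, the short program can be written out as an ordinary script without any switches. The subroutine name scale_line and the @input array are made up for this illustration; the original program reads its input from the files named on the command line instead. Unlike autosplit, this version removes the newline with chomp before splitting.

```perl
use strict;
use warnings;

# Hypothetical helper: handles one input line the way -a and -F[=,]
# plus the printf do in the short program.
sub scale_line {
    my ( $line ) = @_;

    chomp $line;

    # The explicit form of -F[=,]: split on '=' or ','.
    my @F = split /[=,]/, $line;

    # Same format as the original: the first field as a string,
    # fields 2..5 multiplied by 4, the last field unchanged.
    return sprintf "%14s=%0.02f,%0.02f,%0.02f,%0.02f,%d",
        $F[ 0 ], ( map { $_ * 4 } @F[ 1..4 ] ), $F[ 5 ];
}

# Two lines of the sample input, hard-coded so the sketch runs
# on its own.
my @input = (
    "19970522000000=1.68,1.68,1.63,1.66,50384\n",
    "19970523000000=1.66,1.68,1.63,1.68,51213\n",
);

print scale_line( $_ ), "\n" for @input;
```

For the two lines above this prints:

19970522000000=6.72,6.72,6.52,6.64,50384
19970523000000=6.64,6.72,6.52,6.72,51213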

Note: I used %14s instead of %14i since the first value on each input line doesn't fit in the native integer that %i and %d use on the platform I use, Windows XP. When I used %14i, the number -1 was displayed in the first column of the result.
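
The following sketch shows why the first field is too large. It assumes that the Perl I used stores integers in 32 bits; the variable names are made up for this example. On a Perl built with 64-bit integers the value would presumably fit, so the -1 is not expected to show up there.

```perl
use strict;
use warnings;
use Config;

my $stamp = '19970522000000';    # first field of the first input line
my $max32 = 2**31 - 1;           # largest signed 32-bit integer

# Size in bytes of the integers this particular perl was built with.
printf "integer size of this perl: %d bytes\n", $Config{ ivsize };

print "$stamp is larger than $max32\n" if $stamp > $max32;

# Formatting the field as a string avoids the integer conversion.
printf "%14s\n", $stamp;
```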
