Can One Write Readable and Maintainable Perl?

Perl’s flexibility helps you avoid writing superfluous code.

The answer to this simple but somehow controversial question is an emphatic yes! Unfortunately, there is a lot of bad Perl out there owing to Perl’s history of being the language of getting things done in the 90s. It is easy for a newcomer to feel overwhelmed by such examples.

One can avoid that feeling by learning only from Perl that does not look like gibberish.

I decided to learn Perl a little late or, maybe, at just the right time: I had all the tools to learn good habits right from the get-go.

This was soon after 5.8 was released. It was a nice coincidence, because I avoided most of the bad habits one might have picked up by first getting things done in Perl 4 and earlier.

By that time, we had modules, lexical filehandles, an object system, and many other niceties I take for granted today.

Looking back at history, it was the right time to try out Perl. As Larry says, Perl 5 introduced everything else, including the ability to introduce everything else.

That is, CPAN was already there.

Searching the web to find out how to parse CGI parameters would invariably lead one to awful, nasty, cargo-cult code such as:

# Don't use: Bad Perl example
for (split /&/, $ENV{QUERY_STRING}) {
   ($key,$val) = split /=/;
   $val =~ s/\+/ /g;
   $val =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
   $arg{$key} = $val;
}

Members of comp.lang.perl.misc pointed out I should use CGI.pm. The benefits of being able to use $cgi->param('username') were obvious.

When it came to generating HTML, one could find examples such as:

print <<HTML;
<table>
<tr><td>$key</td><td>$val</td></tr>
</table>
HTML

or, even

print table({-border=>undef},
           caption('When Should You Eat Your Vegetables?'),
           Tr({-align=>'CENTER',-valign=>'TOP'},
           [
              th(['Vegetable', 'Breakfast','Lunch','Dinner']),
              td(['Tomatoes' , 'no', 'yes', 'yes']),
              td(['Broccoli' , 'no', 'no',  'yes']),
              td(['Onions'   , 'yes','yes', 'yes'])
           ]
           )
        );

but, again, the advantages of being able to write

$template->param(vegetable_schedule => [
    {
        name => 'Tomatoes',
        breakfast => 'no',
        lunch => 'yes',
        dinner => 'yes',
    },
]);

that is, separating content from code, seemed clear.

When people claim Perl is basically -f>@+?*<.-&'_:$#/%!, they are not thinking of the Perl I learned with the guidance of people who had already solved many problems and shared them with the rest of us.

The underlying machinery of Perl has improved vastly, and amazing new modules are released every day. Perl itself is up to 5.18. But the fact remains that if you took advantage of all the ways Perl 5 gave you to write maintainable code, this is just a natural progression.

The main principles of writing maintainable code in Perl are essentially the same as the principles of writing maintainable code in any other language. Start by avoiding global variables and by naming your variables, objects, functions, and methods in meaningful ways. Declare and use everything in the smallest applicable scope.
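As a minimal sketch of those principles in Perl (the sub name count_long_words and the sample data are hypothetical, chosen only for illustration): strict and warnings catch accidental globals, and every variable is declared with my in the smallest scope that needs it.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the words longer than a given minimum length.
sub count_long_words {
    my ($words, $min_length) = @_;    # lexical copies of the arguments

    my $count = 0;
    for my $word (@$words) {          # $word exists only inside this loop
        $count++ if length($word) > $min_length;
    }
    return $count;
}

# Illustrative data only.
my @words = qw(a list of reasonably descriptive names);
my $long_word_count = count_long_words(\@words, 4);
print "$long_word_count\n";
```

Nothing here leaks outside the sub or the loop, so a reader can understand each piece without scanning the rest of the file.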

Perl can be terse. I tend to think of that as the language mostly staying out of my way, stepping in only to do what I want.

For example, compare the following Java snippet:

Multiset<Integer> lengths = HashMultiset.create();
for (String string : strings) {
  if (CharMatcher.JAVA_UPPER_CASE.matchesAllOf(string)) {
    lengths.add(string.length());
  }
}

with the Perl code:

my $lengths = Set::Bag->new;
for my $string (@strings) {
    if ($string =~ m{\A \p{Uppercase_Letter}+ \z}x) {
        $lengths->insert(length($string) => 1);
    }
}

The only real difference between them is the use of the regular expression match m{\A \p{Uppercase_Letter}+ \z}x. I could have presumed the existence of CharMatcher->upper_case->matches_all_of and written something in the same spirit, but somehow, that felt similar to writing Integer.One.addTo(counter).

I could have stored lengths in a hash instead of using Set::Bag, but if I am going to carry out set-theoretic operations on sets of frequencies of all-caps words from other documents, I can write more maintainable code this way than if I had replicated methods from Perl’s FAQ list.
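For comparison, here is a sketch of the plain-hash alternative using only core Perl. The contents of @strings and the name %count_for_length are made up for illustration.

```perl
use strict;
use warnings;

my @strings = qw(PERL Java CPAN perl XS);   # illustrative data only

# Map each length to the number of all-uppercase strings of that length.
my %count_for_length;
for my $string (@strings) {
    if ($string =~ m{\A \p{Uppercase_Letter}+ \z}x) {
        $count_for_length{ length $string }++;
    }
}

for my $length (sort { $a <=> $b } keys %count_for_length) {
    print "$length: $count_for_length{$length}\n";
}
```

This is enough for counting, but unions, intersections, and differences of such bags would have to be written by hand.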

You could also write:

$lengths->insert($_ => 1) for
    map length,
    grep m{\A \p{Uppercase_Letter}+ \z}x,
    @strings;

which uses the much-maligned topical variable $_ but arguably is still more readable than:

Function<String, Integer> lengthFunction = new Function<String, Integer>() {
  public Integer apply(String string) {
    return string.length();
  }
};
Predicate<String> allCaps = new Predicate<String>() {
  public boolean apply(String string) {
    return CharMatcher.JAVA_UPPER_CASE.matchesAllOf(string);
  }
};
Multiset<Integer> lengths = HashMultiset.create(
  Iterables.transform(Iterables.filter(strings, allCaps), lengthFunction)
);

As the source of the Java snippets also notes, you are better off with the for loop version. My point is that sometimes explicitly writing out everything does not help with readability. To be able to read the Perl version, you only need to know that:

  1. map transforms
  2. grep selects
  3. m{…} matches.

This is one example where having more than one way to do it works to improve readability of your code.
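To see the three building blocks separately, here is a small sketch that splits the pipeline into named steps. The sample data and the variable names @all_caps and @lengths are assumptions made for illustration.

```perl
use strict;
use warnings;

my @strings = qw(PERL Java CPAN perl XS);   # illustrative data only

# grep selects: keep only the strings made up entirely of uppercase letters.
my @all_caps = grep { m{\A \p{Uppercase_Letter}+ \z}x } @strings;

# map transforms: turn each selected string into its length.
my @lengths = map { length $_ } @all_caps;

print "@all_caps\n";
print "@lengths\n";
```

Naming the intermediate lists costs two extra lines, and whether that is worth it depends on who will be reading the code.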

Perl is not the best language for everything. No language is perfect. What you use depends only partially on the actual features of a language. I only want to point out that well written Perl is readable.

When one is first learning a language, extra verbosity might be useful. Some people might more readily understand a code example if everything is spelled out. Consider this C# snippet for reading a file line by line:

string line;

// Read the file and display it line by line.
System.IO.StreamReader file =
   new System.IO.StreamReader(@"c:\test.txt");
while((line = file.ReadLine()) != null)
{
   Console.WriteLine (line);
}

file.Close();

The equivalent Perl code is:

use autodie;
open my $file, '<', 'c:\test.txt';

while (my $line = <$file>) {
    print $line;
}
close $file;

System.IO.StreamReader might seem clearer than using '<' as the second argument to open. The IDE will help you write that code with the fewest possible keystrokes. But, when you are looking at a wall of code consisting of such class and method names, will you be able to pick the parts of the code that really matter?

Perl’s flexibility helps you avoid writing superfluous code.

How can you write readable Perl?

Spend time on examples you can read. It can be fun to decipher cryptic code, but it is easier to gain good habits by emulating what you find readable. Just as you wouldn’t want to learn good C style from IOCCC entries, JAPHs are not good examples to use in production code.

Do try Perl::Critic. Do read Perl blogs, check out the Perl area on Stack Overflow, and think about how to improve others’ code. Follow Perlbuzz, Perl Weekly, Effective Perl, and other sources of timely Perl-related information.

When in doubt, ask. Learn from others’ experiences.
