Perl code statistics with PPI

Sometimes there’s nothing better to do on a Sunday morning than read the Google Testing Blog. A recent post suggested that methods (or functions, subroutines, etc.) should be made shorter to make testing easier: because a short method does less than a longer one, it’s usually easier to test.

Normally I would cite improved readability and flexibility as the main reasons to prefer short methods, but ease of testing seems just as good. Code that’s difficult to read or difficult to test will always be difficult and costly to maintain. In fact, right now, all over the world, there are thousands of maintenance programmers bent over in prayer reciting their litany: Please, write code that can be read rather than deciphered…

The post made me speculate on how long my methods are. The only code I’ve written recently that’s wholly my own work is a test harness in Perl. This is the confidential intellectual property of my employer and is busy delivering sustainable competitive advantage, so I can’t post it on this blog. Hopefully some statistics on the code won’t dilute shareholder value too much.

It turns out there’s a Perl module for doing just this sort of thing: PPI. There’s a cool introduction to PPI by Adam Kennedy (the module author) on perl.com. PPI makes something basic like finding the average length of subroutines very easy, so I hacked up a script to do just that.
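For illustration, here is a minimal sketch of how such a script might look with PPI. It is not the original sublength.pl: the helper names sub_lengths and report are made up for this example, and skipping BEGIN-style blocks and forward declarations is an assumption about what should count as a subroutine. Anonymous subs are not counted either, since PPI does not parse them as PPI::Statement::Sub.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max min sum);
use PPI;

# Return the length in lines of each named subroutine in a PPI document.
# (Hypothetical helper, not taken from the original script.)
sub sub_lengths {
    my ($doc) = @_;

    # find() returns an array reference, or a false value if nothing matches.
    my $subs = $doc->find('PPI::Statement::Sub') || [];

    my @lengths;
    for my $sub (@$subs) {
        # Assumption: BEGIN/END-style blocks and "sub foo;" forward
        # declarations are not counted as subroutines.
        next if $sub->isa('PPI::Statement::Scheduled');
        next if $sub->forward;

        # Lines spanned = newlines in the sub's source text, plus one.
        my $text = $sub->content;
        push @lengths, 1 + ($text =~ tr/\n//);
    }
    return @lengths;
}

# Print summary statistics for one Perl source file.
# (Hypothetical helper, not taken from the original script.)
sub report {
    my ($file) = @_;
    my $doc = PPI::Document->new($file)
        or die "Could not parse $file\n";
    my @lengths = sub_lengths($doc);
    die "No named subroutines found in $file\n" unless @lengths;

    printf "Number of subroutines: %d\n", scalar @lengths;
    printf "Length of longest sub: %d\n", max(@lengths);
    printf "Length of shortest sub: %d\n", min(@lengths);
    printf "Average length of sub: %.2f\n", sum(@lengths) / @lengths;
}

# Report on every file named on the command line.
report($_) for @ARGV;
```

Because the count comes from each sub’s source text, it includes the header line, the closing brace, and any comments or blank lines inside the body.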

My test harness script is a bit less than 1000 lines:

$ wc -l foo.pl
914 foo.pl

Running the sublength.pl script on it gave the following results:

$ ./sublength.pl foo.pl
Number of subroutines: 46
Length of longest sub: 50
Length of shortest sub: 4
Average length of sub: 15.76

My subroutines are pretty short with an average length of just 15.76 lines, which includes the header and the opening and closing braces on separate lines. As I’ve said before, I find long, deeply-nested methods difficult to understand, so I just don’t write them like that.

More generally, I like the idea of measuring the complexity or readability of a codebase with metrics. You could even analyse a repository commit-by-commit and see whether each commit has increased or decreased readability. Then you can have graphs showing the change in readability over time. Awesome! Sure, it sounds like overkill, but folks much smarter than me have emphasised the importance of readability:

“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.” (Donald Knuth, “Literate Programming”)
