rroller 0.1.0 released

Way back in 2001 I was in the third year of my Computer Engineering degree, and one of the better courses that year was on network programming. The lectures were kinda lame- I don’t need someone to explain to me how to use the socket API- but the course textbook was W Richard Stevens’ classic Unix Network Programming, Vol 1. Cool! Unlike most textbooks, this was much used in the first couple of jobs I had after uni.

Also cool was the major assignment we had to complete for the course. I don’t remember the exact requirements, but I think it boiled down to “write a program that uses TCP/IP”. Around the same time I’d stumbled across Jeff Lander’s demo billiards game that accompanied his article in the September 1999 issue of Game Developer Magazine. I and a few other guys hacked some very dodgy multiplayer networking into it. There were still only two balls in the game (it’s just an example, remember) but the networking worked well enough for a 2 minute demo in front of the lecturer so I think we got pretty good marks.

I recall the project being a great learning experience, as one of the guys in the group was Simon Ratner, a very smart guy and great programmer, and I certainly picked up a few things by comparing the code I wrote with the code he wrote to replace it.

After recovering from the semester I started thinking it would be fun to write a proper pool game with advanced features like 3D physics and more than two balls. I made a start on the physics code using the techniques described in the CMU lecture notes on physically based modeling, but abandoned the project in early 2002.

A few months ago I thought it would be worth putting together whatever I had and releasing it as a GPL physics demo- no pool game yet unfortunately. This ended up taking longer than the original work in 2002, but at least I have something to show for it. I’ve put up some screenshots, tarballs and RPMs here.

Writing rroller was fun, but it’s going to be the last totally useless coding side-project I do. The next one will be more ambitious and hopefully good for more than keeping me amused on Sunday afternoons.

If only lawyers could write Perl modules

With so many powerful programming languages freely available, it’s very common for large software systems to use more than one. Write some C, use a scripting language and do some database access and there’s three already. Even if the deployed code is only in one language, test scripts and harnesses often use another. Multiple languages are a good thing if it means the right tool is used for the right job.

But there are annoyances. In particular, violations of the Single Point of Truth (SPOT) rule are common. For example, here’s a C++ enum containing error codes:

enum FooErrors
{
    FOOERR_OK = 0,
    FOOERR_FILE_NOT_FOUND = 1,
    FOOERR_IT_JUST_BORKED = 2,
    // further constants follow
};

If the same constants are used by a part of the system written in a different language, the cheap and cheerful solution is just to declare them again:

class FooErrors
{
    public static final int FOOERR_OK = 0;
    public static final int FOOERR_FILE_NOT_FOUND = 1;
    public static final int FOOERR_IT_JUST_BORKED = 2;
    // contains all the same constants as in C++
}

This is fine as far as it goes, and I’ll admit that most developers are dealing with bigger problems than a few duplicate declarations. But if the constants are used by more than two languages, keeping everything in sync becomes a maintenance burden.

I’d like a simple script that acts as a sort of poor-man’s IDL compiler, reading a text file containing names and values and spitting out nicely-formatted declarations in a variety of languages. A new constant would be added by updating the text file, running the script and committing the modified source files to the nearest version control system.
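To make the idea concrete, here is a minimal sketch of such a script in Python. Everything about it is an assumption for illustration: the "NAME = VALUE" file format, the function names, and the choice to emit only the C++ and Java declarations shown above. It is not an existing tool.

```python
# Hypothetical definition format: one "NAME = VALUE" pair per line.
# Blank lines and lines starting with '#' are ignored.
DEFINITIONS = """\
# error codes shared by the C++ and Java parts of the system
FOOERR_OK = 0
FOOERR_FILE_NOT_FOUND = 1
FOOERR_IT_JUST_BORKED = 2
"""


def parse_constants(text):
    """Return a list of (name, value) tuples in file order."""
    constants = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        constants.append((name, int(value)))
    return constants


def emit_cpp_enum(type_name, constants):
    """Render the constants as a C++ enum declaration."""
    lines = ["enum " + type_name, "{"]
    lines += ["    {} = {},".format(name, value) for name, value in constants]
    lines.append("};")
    return "\n".join(lines) + "\n"


def emit_java_class(type_name, constants):
    """Render the constants as a Java class of static final ints."""
    lines = ["class " + type_name, "{"]
    lines += [
        "    public static final int {} = {};".format(name, value)
        for name, value in constants
    ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    constants = parse_constants(DEFINITIONS)
    print(emit_cpp_enum("FooErrors", constants))
    print(emit_java_class("FooErrors", constants))
```

Supporting another language would mean adding one more emit function, and the text file stays the single point of truth.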

This seems like such an obvious thing to do that I was sure I’d find a Perl module or seven on CPAN to do it. But I couldn’t find anything.

What I did find were two patents describing almost exactly this idea. The first, US Patent 7143400 titled Configuration description language value management method and system, contains this in the summary:

… the present invention fills this need by providing a method and a system for centralizing the maintenance of name value pairs for defining constants and properties used by different portions of a program, where the different portions are of a different programming language type.

The second, US Patent 6964038 titled Constant values in mixed language programming environments, is described as:

a method of and apparatus for maintaining consistency between header files for differing computer program languages. More particularly, the invention relates to automatically generating one or more header files in a programming language based on a header file in a different programming language.

The assignee of the first patent is Sun Microsystems, Inc and the assignee of the second is the Hewlett-Packard Development Company, L.P.

I haven’t read the patents thoroughly, so I guess there could be some patent-worthy ideas in them. Maybe. The thing that irks me is that for what it cost in lawyers to file these two patents you could build the finest constant-generating system the ‘Net has seen, supporting a bunch of languages with all the bells and whistles. And you might just get something useful for your money.

If you know of a good free tool- potentially infringing or not- for this simple problem, please comment.

Perl code statistics with PPI

Sometimes there’s nothing better to do on a Sunday morning than read the Google Testing Blog. A recent post suggested that methods (or functions, subroutines, etc.) should be made shorter to make testing easier- because a short method does less than a longer one, it’s usually easier to test.

Normally I would cite improving readability and flexibility as the main reasons to prefer short methods, but ease of testing seems just as good. Code that’s difficult to read or difficult to test will always be difficult and costly to maintain. In fact right now, all over the world, there are thousands of maintenance programmers bent over in prayer reciting their litany: Please, write code that can be read rather than deciphered…

The post made me speculate on how long my methods are. The only code I’ve written recently that’s wholly my own work is a test harness in Perl. This is the confidential intellectual property of my employer and is busy delivering sustainable competitive advantage, so I can’t post it on this blog. Hopefully some statistics on the code won’t dilute shareholder value too much.

It turns out there’s a Perl module for doing just this sort of thing: PPI. There’s a cool introduction to PPI by Adam Kennedy (the module author) on perl.com. PPI makes doing something basic like finding the average length of subroutines very easy, so I hacked up a script to do just that.

My test harness script is a bit less than 1000 lines:

$ wc -l foo.pl
914 foo.pl

Running the sublength.pl script on it gave the following results:

$ ./sublength.pl foo.pl
Number of subroutines: 46
Length of longest sub: 50
Length of shortest sub: 4
Average length of sub: 15.76

My subroutines are pretty short with an average length of just 15.76 lines, which includes the header and the opening and closing braces on separate lines. As I’ve said before, I find long, deeply-nested methods difficult to understand, so I just don’t write them like that.
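For anyone who wants to try the same measurement without Perl, here is a rough Python analog. It is not sublength.pl and it doesn’t use PPI: it’s a sketch that leans on Python’s standard ast module to measure Python functions instead, and the function names and the sample source are made up for illustration.

```python
import ast

# Made-up sample source to measure.
SAMPLE = '''\
def short():
    return 1


def longer(x):
    if x < 0:
        return -x
    return x
'''


def function_lengths(source):
    """Return the length in lines of every function defined in source."""
    lengths = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available from Python 3.8 onwards.
            # The count includes the "def" line itself.
            lengths.append(node.end_lineno - node.lineno + 1)
    return lengths


def summarise(lengths):
    """Return count, longest, shortest and average, or None if empty."""
    if not lengths:
        return None
    return {
        "count": len(lengths),
        "longest": max(lengths),
        "shortest": min(lengths),
        "average": round(sum(lengths) / len(lengths), 2),
    }


if __name__ == "__main__":
    print(summarise(function_lengths(SAMPLE)))
```

As with my Perl numbers, the lengths include the header line, so even a one-statement function counts as two lines.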

More generally, I like the idea of measuring the complexity or readability of a codebase with metrics. You could even analyse a repository commit-by-commit and see whether each commit has increased or decreased readability. Then you can have graphs showing the change in readability over time. Awesome! Sure, it sounds like overkill, but folks much smarter than me have emphasised the importance of readability:

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do. (Donald Knuth, Literate Programming, 1984)

Springysim 0.1.0 release

Springysim is an interactive 2-D physics simulation that I wrote in 2000 for a university assignment. I did a bit more work on it in 2003 after I graduated, always intending to make it publicly available. I finally pulled my finger out and got it done. If you do nothing else, wander over to the screen shots page and check out what it can do.

Springysim is wholly useless for any practical activity, but I think most programmers can relate to the flash of inspiration where you think: “Holy Crap! I just have to write a program that does that!”

How to really write an assert macro

Charles Nicholson has a great post up over at Power of Two Games on how to write a proper assert macro in C++.

I’m a big fan of C++, but the fact that Charles has managed to write a 1500-word article on how to write an assert macro is kinda worrying. It never occurred to me before, but maybe programmers like C++ because it’s complicated, not in spite of it.

I hate nesting (not the bird kind)

I’ve been re-reading a few of my favourite programming books over the last couple of months, and I just have to mention this part of The Practice of Programming, by Brian Kernighan and Rob Pike. Not because it’s particularly earth-shattering, but simply because it’s the best explanation I’ve read of how to lay out if/else control structures properly.

Sound trivial? It should be, but poor control structure layout is one of my pet hates. I see this sort of thing so often in real code and it grates every time:

if(argc == 3)
    if((fin = fopen(argv[1], "r")) != NULL)
        if((fout = fopen(argv[2], "w")) != NULL) {
            while((c = getc(fin)) != EOF)
                putc(c, fout);
            fclose(fin); fclose(fout);
        }
        else
            printf("Can't open output file %s\n", argv[2]);
    else
        printf("Can't open input file %s\n", argv[1]);
else
    printf("Usage: cp inputfile outputfile\n");

Actually, I’m being too generous. Imagine the same control structure spread over 200 lines and you’re getting close to the monsters I’ve slain in the past. Kernighan and Pike have this to say about the code:

The sequence of ifs requires us to maintain a mental pushdown stack of what tests were made, so that at the appropriate point we can pop them until we determine the corresponding action (if we can still remember).

The authors suggest that the correct way to lay out the code for maximum readability is instead:

if(argc != 3)
    printf("Usage: cp inputfile outputfile\n");
else if((fin = fopen(argv[1], "r")) == NULL)
    printf("Can't open input file %s\n", argv[1]);
else if((fout = fopen(argv[2], "w")) == NULL) {
    printf("Can't open output file %s\n", argv[2]);
    fclose(fin);
}
else {
    while((c = getc(fin)) != EOF)
        putc(c, fout);
    fclose(fin);
    fclose(fout);
}

Note how inverting the boolean tests (== becomes != and vice versa) puts each error message right next to the test that triggers it, which greatly simplifies the code. I find that if this technique is used in conjunction with short, well-contained functions, it’s often not necessary to use else at all. Instead, if a test fails, the function returns immediately.
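Here’s a small sketch of that early-return style. It’s in Python rather than C to keep it short, and the copy_file function and its exit codes are my own invention for illustration, not something from the book.

```python
import shutil


def copy_file(argv):
    """Hypothetical cp-like helper: returns 0 on success, non-zero on failure."""
    if len(argv) != 3:
        print("Usage: cp inputfile outputfile")
        return 1

    try:
        fin = open(argv[1], "rb")
    except OSError:
        print("Can't open input file %s" % argv[1])
        return 2

    try:
        fout = open(argv[2], "wb")
    except OSError:
        print("Can't open output file %s" % argv[2])
        fin.close()
        return 3

    # Every test has passed, so the real work sits at the outermost level.
    with fin, fout:
        shutil.copyfileobj(fin, fout)
    return 0
```

There isn’t a single else in the function: each failed test is dealt with on the spot, and the happy path reads straight down the page.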

Kernighan and Pike provide a well-worded summary of the principle:

The rule is to follow each decision as closely as possible by its associated action. Or, to put it another way, each time you make a test, do something.