Web app performance tuning

I’ve done a fair bit of performance optimization on server applications and device drivers, but not much with web applications, since I don’t normally do this type of development. Today was different.

The problem was that Roundup, our issue tracking system, was slow. Really slow. Loading a simple page could take up to 10 seconds. When it’s a tool you use all day, this starts to be annoying very quickly.

One of the many cool things about Roundup is that it can use a number of different storage backends- sqlite, mySQL, PostgreSQL, etc. We are using sqlite, and I suspected that since the number of issues is growing rapidly, we had outgrown it and needed to move to a “proper” database.

I decided to migrate to the MySQL backend, as it’s the database I’m most familiar with. I duly exported all the data from sqlite, set up a test installation of MySQL, and then imported everything into MySQL as described in the Roundup documentation.

I rushed to my browser to check the speed improvement, but it was pretty much the same. More thought was required.

I hypothesized that since Roundup is running on same server as our version control system, the slow page loads were being caused by the heavy disk IO associated with checkouts, updates, tagging and the like. I begged my friendly neighborhood system administrator for a spare machine and installed Roundup and MySQL on that.

The spare machine has dual 2.80GHz Xeons with Hyperthreading, 512KB cache and 2GB of memory, so it should have plenty of grunt for a little issue tracker like ours. Unfortunately, it didn’t improve the page load times either.

Hmmm…..

I installed the wonderful htop on the spare machine to get a better picture of what was going on. Requesting a page in my browser, I could watch the Roundup server process and mysqld using up massive gobs of CPU over the full 5-10 seconds that the request took to run. I’m thinking: what the hell is it doing?

I spent some time wandering around the Roundup source tree, but nothing seemed to leap out at me. I checked out the database using phpMyAdmin, but all the tables seemed to have indexes where they should be. Not that I would really know…

I decided it would help if I knew what queries were running on the database. A few minutes looking at the MySQL manual led me to this page that describes how to activate the general query log. When enabled, all queries run by msqld will be logged to a specified file. For some reason this must be done with a command-line parameter rather than using the configuration file, so I edited /etc/init.d/mysqld to pass the -l flag to mysqld when it starts.

This was a good trick. I requested one page. The resulting query log file was was 1.4 MB in size and full of select statements. Lots of select statements:

$ grep -i select query.log | wc -l
15492

So rendering a single page required the database to run 15,000 queries! That might, just might, explain the slow page loads.

Once I had the log, it was fairly easy to figure out what was generating all the queries. On the “issue” page, there is a drop-down menu that allows specifying the issue’s parent issue. Creating a parent/child relationship betwen issues is often useful to break down large units of work into smaller, more managable tasks. The problem here was that the drop-down menu was populated with the “title” of every issue in the system - all 2,500 of them. Due to the way the database abstraction works in Roundup, around 6 queries are necessary to retrieve each issue.

Commenting out the offending menu in the page template (Roundup uses the Template Attribute Language from Zope) reduced the number of queries required to display a page from 15,000 to 200. Page load times were correspondingly improved.

If I has more experience with web applications I would have been able to sort this problem out much faster. Using the MySQL query log was the key step that allowed me to understand the problem. It’s my own fault though- I disobeyed the first rule of performance optimization - measure first, then make changes. Don’t assume you know where the problem is.

UPDATE: Looks like this is a known issue.

Leave a Comment