Day 3, LCA 2008

linux.conf.au 2008 is now old news. But I’ve got all these notes lying around, and I’m not letting them go to waste.

Wednesday at linux.conf.au started off with Bruce Schneier’s keynote on Reconceptualising Security. Since the talk hit Slashdot about 3 hours later, I won’t rehash the contents, which you can now watch online. As an avid reader of Bruce’s blog, I was already familiar with some of the ideas he mentioned, such as the very cool lemon market explanation for poor quality security products.

The main theme of the talk was that there are conceptually two parts to security - there’s the feeling of security, and then there’s the reality of security. Although Bruce has spent plenty of time denigrating security products that provide the feeling and not the reality, he presented a more balanced view this time. He made a good case that just making people feel more secure is often important, mostly because humans can be pretty poor at evaluating the seriousness of threats, and overestimating dangers can cause as many problems as underestimating them.

The first session after the keynote on both Wednesday and Thursday was a tutorial. I’ve seen Shane Stephens present at SLUG before, and the stuff that he and others have been doing on Annodex is very cool so I fronted up to his tute on “Building a video remixing web-site using Annodex“.

Shane demonstrated how to build a simple web app that allows non-destructive editing of video using the Annodex browser plugin. He concentrated on the client side combining CSS, HTML and Javascript to implement transport and editing controls - playing clips, marking the end points of clips, creating sequences of clips and displaying a progress bar. Shane did a great job of making the presentation easy to follow, even for someone as web developmentally-challenged as me. The server side was implemented using Pylons and Mako templates. Easily-digestible portions of the Python code were shown at strategic junctures which helped in understanding how the whole application hung together.

One thing I particularly liked was the separation of concerns used in the design of the Annodex plugin. For example, it doesn’t provide any controls for the user, but any kind of crazy control can be created by a designer. An audience member asked if a default set of controls would be provided for developers who don’t want to build their own- Shane said this may happen, but it would simply be a Javascript library, not a part of the plugin itself.

On a slight tangent, towards the end of the tutorial this very cool SVG demo from Chris Double was mentioned. You’ll need to download a developer release of Firefox from here if you want to check it out.

After lunch it was back to kernel-land with Jonathan Corbet of LWN presenting the now-traditional Kernel Report. A lot of the material would be familiar to regular LWN readers, including work done by Greg Kroah-Hartmann and Jonathan looking at the people and companies contributing patches to the kernel. A good summary of this can be found here.

What blew me away were the numbers describing the pace of kernel development over 2007. In that year just passed, 30,100 changesets were merged into the kernel that changed over 2 million lines of code. 750,000 lines of code were added, a rate of over 2,000 lines per day. It would be interesting to know how those numbers stack up against other large open source codebases (and certain proprietary operating systems), but I doubt there are many projects sustaining such a rapid pace.

Jonathan gave a brief summary of the experimentation with the kernel development process that’s occurred in recent years, and I don’t think ther’s much argument that process improvements are one of the main factors allowing such a high rate of change. From time to time it is still claimed that for Linux to become a real contender it needs a stable binary module API, and I think Linux pays price for not having one - but not having one is another reason why things can move fast.

That, and the total 1337ness of the core kernel developers :-)

During question time, it was asked if the rate of patches meant that another “Linus doesn’t scale” problem may be imminent. Jonathan thought not: “Linus at this point seems to be living a pretty easy life to the point where he will go off and flame people on the Git list instead.” However, he thinks it’s possible that there may be “Andrew Morton doesn’t scale” issues because he does a lot of the integration work. Apparently some people are researching cloning to deal with this.

After Andrew Tannenbaum’s fantastic keynote on Minix 3 last year, it was good to see that microkernels did not slip from the agenda in 2008. Gernot Heiser from Open Kernel Labs took the opportunity to present a talk entitled Do Microkernels Suck?. This was a response to a paper [pdf] presented by Christoph Lameter at the Ottawa Linux Symposium in 2007: Extreme High Performance Computing or Why Microkernels suck.

The title of Christoph’s talk is quite glib and it’s not a general denunciation of microkernels. The most important claim in the paper is that a microkernel architecture will prevent an OS from scaling to the very large systems of up to 4000 processors on which Linux now runs. Christoph was intimately involved in work done at SGI to allow Linux to scale to these machines, and in the conclusion of his paper says that:

A monolithic operating system such as Linux has no restrictions on how locking schemes can be developed. A unified address space exists that can be accessed by all kernel components. It is therefore possible to develop a rich multitude of synchronization methods in order to make best use of the processor resources. The freedom to do so has been widely used in the Linux operating system to scale to high processor counts…

It seems that microkernel based designs are fundamentally inferior performance-wise because the strong isolation of the components in other process contexts limits the synchronization methods that can be employed…

Linux would never have been able to scale to these extremes with a microkernel based approach because of the rigid constraints that strict microkernel designs place on the architecture of operating system structures and locking algorithms.

To me this seems like a pretty strong claim in a paper that was written after working on Linux. Gernot rebutted the claim first with a series of graphs showing the Tornado microkernel scaling very well up to 16 processors. These results were presented in a paper published in 1999, so they’re somewhat dated now. I don’t think that showing a microkernel that scales to 16 processors necessarily proves anything about whether it could scale to 4096. YMMV. Gernot also argued that

  1. Synchronization in a well-designed system is local to subsystems
  2. There is no reason why subsystems can’t share memory, even if microkernel-based

Although I guess a system that employed (2) would no longer be a “strict microkernel design”.

My take on this dustup is basically this: Christoph describes the process of making a operating system scale as one of incremental improvement - find a bottleneck, fix it, repeat. Since the most recent paper Gernot had on this was from 1999, it seems that no-one has put a whole lot of effort into doing this with a microkernel. Once they have, we’ll know the answer either way. Right now the arguments are just a bit too hand-wavy on both sides.

Still, it was a fun presentation and although there was quite a bit of giggling from the back of theater throughout the talk (some people apparently find microkernels vastly amusing), the applause at the beginning and end of the talk was very enthusiastic. Although it’s a Linux conference, people seem to appreciate talks on more general OS topics as well.

For the last session of the day I really wanted to see Timothy Terriberry explain the inner workings of the Ogg Theora video codec, but a fire alarm went off about 5 minutes in. After evacuation I wandered over to hear Carl Worth and his unscheduled co-presenter Eric Anholt talk on X Acceleration that finally works.

The first part of the talk outlined work done over the past few years to allow the X Render extension to take advantage of the features provided by graphics hardware for accelerating 2D operations such as fills, alpha blending and scaling. 2D acceleration was originally provided in X by XAA, the XFree86 Acceleration Architecture, but this didn’t provide the operations that modern desktop applications require. Later, the X Render Extension was written by Keith Packard in an afternoon to provide image compositing operations but did not allow for hardware acceleration.

Acceleration designed to support X Render was first implemented in KDrive, an experimental X Server as KAA- Keith’s Acceleration Architecture. This was ported to the standard (then XFree86) X server by Eric to become EXA - Eric’s Acceleration Architecture. Carl complained that he was spending “Way too much time talking about history”, but I found it interesting and definitely required to put the later parts of the talk in context.

After the background material, the talk covered recent work to improve the performance of the Intel i965 graphics chipset driver. Intel has really come to the party on this one, releasing the required technical documents to Redhat under an NDA.

This was very fortunate as a ton of work has been done on this driver since. It was based on earlier Intel graphics drivers, but apparently the differences are large enough that good performance requires a quite different approach. To this end, Dave Airlie converted the i965 driver composite operation to use TTM, an in-kernel memory manager for graphics device memory, and batch buffers, a method of queuing multiple commands for efficient execution by the graphics device. This involved major modifications to the driver but unfortunately resulted in a major decrease in performance. However, Eric realized recently that this is caused by overly enthusiastic cache-flushing and we should see major performance increases soon.

Other operations apart from composite are already performing well. Carl showed a simple demo application written by Keith in his very own Nickel language that showed a big increase in speed for rescaling an image of a Penguin.

Overall this talk was entertaining and useful, although I found it difficult to keep up at times since my knowledge of graphics hardware and X server architecture is limited. At least I now know what a GART is.

Leave a Comment