Day 2, LCA 2008

Full on. Today felt like about three conference days in one. Between the Distro Roundup, the Kernel Mini-conf lightning talks, the Kernel Panel discussion and other sessions I must have heard close to 20 people speak.

A lifetime ago at 8:30 this morning I sat down at breakfast across the table from Paul McKenney. Now to me he seemed like just a J. Random Bearded Hacker, but he’s actually the main guy behind the RCU implementation in the Linux kernel. Val Henson introduced him this afternoon at the kernel lightning talks as “one of the best computer science researchers I know”. Apparently he’s already done too many talks on RCU, so to get on the conference schedule this year he’s talking on his involvement in adding concurrency to the terribly exciting C++0x standard. Now that’s one talk I will be attending. Don’t you wish you were at LCA now?

I was thus slightly late for the first session of the day as I lost track of time chanting “We are not worthy” while Mr McKenney was trying to eat his cereal. I wandered into the Distro Roundup where community members representing various distros gave an overview of the history and current status of their distribution. Representatives from Oracle, Mandriva and Gentoo gave useful reports in the time that I was present. Mr Debian spent some time talking about the difficult political/ideological issues that have caused friction within the Debian community - how to deal with firmware “binary blobs” and the status of documentation covered by the GNU Free Documentation License. Binary blobs are not just an issue for Debian, but because of the project’s strict adherence to the Debian Free Software Guidelines, they have taken the problem very seriously and now will not ship such non-free firmware. Similarly, Debian regards the GNU FDL as a non-free license. It was clear from the talk that not all members of the community agree with these decisions, so the controversy could continue in spite of the current policies.

After morning tea I stuck with the Distro summit to hear Shane Owenby, Senior Director for Linux and Open Source at Oracle, talk on “Why would a large corporate create their own distro?” I should probably have migrated to the Kernel Mini-conf at this point, but Shane was an engaging speaker and it was interesting to hear about Oracle’s goals for their Linux products apart from making money. Oracle wants to promote the adoption of Linux in the data centre by lowering the barriers to entry, which, given the size and scope of their customer base, they’re uniquely positioned to do. Shane engaged in some lively discussion with Bdale Garbee on Oracle’s Premier Backporting service. Bdale’s question, I think, concerned how Oracle can backport fixes to stable releases when other ISVs will only guarantee their applications on certain (unpatched) Oracle Enterprise Linux versions. No clear answer was given to this.

These and other discussions made Shane’s talk go overtime, so Jonathan Oxer didn’t have time for the full version of his very useful talk on Release Monkey. Simplifying, this is a set of scripts to help build packages for more than one distribution. This is a very common problem for small ISVs who want to distribute their products for Linux, as the time and cost in building for multiple distros can be prohibitive. I’ve stumbled over Release Monkey before when I was looking for a solution to just this problem for one of my previous employers. We were attempting to distribute a single product for Suse, Redhat 9, Debian 3.0, etc, etc and it was not a pleasant experience. James cooked up a system that worked pretty well, but I think there is a real need for a ready-made, full-featured tool for this task.

Jonathan emphasised that one of the main problems when packaging for multiple distros is that there’s no good way to capture the metadata required: stuff like package dependencies, version numbers, build instructions, etc. Release Monkey has adopted the (hackish) solution of using the Debian metadata and munging it for other distros. In our case, we maintained separate files for each type of package: .spec files for building RPMs and control/rules files for building Debian packages. This obviously introduced some maintenance overhead. Jonathan suggested that the ideal solution would be to define a distro-agnostic metadata format, but little progress has been made on this so far.
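To make the "one source of metadata" idea concrete, here is a minimal sketch in Python. It is not how Release Monkey works (that munges the Debian metadata), and the `PACKAGE` dictionary, its keys and the package names are all made up for illustration. The only real things are the field names emitted into the Debian control stanza and the RPM spec preamble.

```python
# Hypothetical distro-agnostic metadata. The keys and values are invented
# for this sketch; no existing tool is known to use this layout.
PACKAGE = {
    "name": "exampled",
    "version": "1.2.0",
    "summary": "Example daemon",
    # Dependency names differ between distro families, so keep both.
    "depends": {"deb": ["libc6", "zlib1g"], "rpm": ["glibc", "zlib"]},
}


def render_debian_control(meta):
    """Render a minimal binary-package stanza for debian/control.

    The version is deliberately absent: Debian source packages take it
    from debian/changelog, not from debian/control.
    """
    lines = [
        f"Package: {meta['name']}",
        "Architecture: any",
        f"Depends: {', '.join(meta['depends']['deb'])}",
        f"Description: {meta['summary']}",
    ]
    return "\n".join(lines) + "\n"


def render_rpm_spec_preamble(meta):
    """Render the preamble tags of an RPM .spec file."""
    lines = [
        f"Name: {meta['name']}",
        f"Version: {meta['version']}",
        "Release: 1",
        f"Summary: {meta['summary']}",
    ]
    # RPM accepts one Requires: tag per dependency.
    lines += [f"Requires: {dep}" for dep in meta["depends"]["rpm"]]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_debian_control(PACKAGE))
    print(render_rpm_spec_preamble(PACKAGE))
```

Even this toy shows where the pain is: the dependency names have to be listed once per distro family, and build instructions are not covered at all.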

At this point I’d had my fill of distro-talk so I wandered over to the Kernel Mini-conf hoping to hear Arnd Bergmann talk on “How not to invent kernel interfaces”, but his talk had been moved to 9:15 so I missed it. Instead I listened to Jörn Engel speak on “Cache-efficient Data Structures”. This is a very interesting topic but since I missed the start of the talk, I couldn’t quite follow the comparative performance numbers he had on his slides. There were a few interesting comments from the audience, including from Dave Miller and Linus (no link required). Dave is the kernel networking maintainer and knows a few things about hash tables as they are used extensively in the network subsystem for stuff like holding socket descriptors. Discussion followed on the problems involved in resizing hash tables. Currently several (large) hash tables are allocated at kernel boot time in one of two sizes, depending on the memory installed on the system. Some thought has been given to making these resizable at runtime to allow for both minimal memory usage and best performance, but synchronization issues make this very difficult. It sounds like there’s a fun project here for anyone who’s game enough.
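For anyone wondering why resizing is the hard part, here is a toy user-space sketch of a chained hash table that doubles its bucket array once it gets too full. It is not kernel code and has nothing to do with the kernel's actual tables. The class name, the `max_load` threshold and the doubling policy are my own assumptions. The point is the `_resize` method: every entry moves to a new bucket, so in a kernel any concurrent reader would have to be kept away from (or made tolerant of) a half-moved table.

```python
class ChainedHashTable:
    """Toy single-threaded hash table with separate chaining."""

    def __init__(self, buckets=8, max_load=2.0):
        self._buckets = [[] for _ in range(buckets)]
        self._count = 0
        self._max_load = max_load  # average chain length that triggers a resize

    @staticmethod
    def _index(key, nbuckets):
        return hash(key) % nbuckets

    def put(self, key, value):
        bucket = self._buckets[self._index(key, len(self._buckets))]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite, count unchanged
                return
        bucket.append((key, value))
        self._count += 1
        if self._count / len(self._buckets) > self._max_load:
            self._resize(2 * len(self._buckets))

    def get(self, key, default=None):
        bucket = self._buckets[self._index(key, len(self._buckets))]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default

    def _resize(self, nbuckets):
        # Every entry is rehashed into a fresh array. With no locking this
        # is only safe because nothing else can look at the table meanwhile.
        new_buckets = [[] for _ in range(nbuckets)]
        for bucket in self._buckets:
            for key, value in bucket:
                new_buckets[self._index(key, nbuckets)].append((key, value))
        self._buckets = new_buckets

    def bucket_count(self):
        return len(self._buckets)


if __name__ == "__main__":
    table = ChainedHashTable(buckets=4, max_load=1.0)
    for port in range(20):
        table.put(port, f"socket-{port}")
    print(table.get(7), table.bucket_count())
```

Fixing the size at boot sidesteps all of this, at the cost of either wasting memory or living with long chains.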

After lunch I stuck with the Kernel Mini-conf to hear Jesse Barnes from Intel’s Open Source Technology Center talk on “Enhancing Linux Graphics”, or alternatively “Why Graphics on Linux suck and what we are doing about it”. Jesse described some of the major enhancements that are taking place to rationalize the motley assortment of software components involved in graphics on a Linux system: the kernel fb layer, DRM, X, Mesa, DirectFB, etc. This work (described here) will enable graphics without X, since things like modesetting will be handled by the kernel. From comments made by Dave Airlie, this is something of a holy grail for the graphics guys. Perhaps more importantly, Jesse’s work will finally allow displaying a “Blue Penguin of Death” when a kernel oops occurs, the absence of which has long hampered Linux’s ability to compete with rival operating systems.

Next up was Joshua Root from Gelato UNSW talking on “The state of the Elevator I/O scheduling in Linux”. The Gelato guys want to create documentation to help system administrators choose and tune an IO scheduler. Obviously, the performance of the 4 different schedulers in the kernel varies greatly with different load profiles. In particular Gelato have been looking at IO scheduler performance when software and hardware RAID are in use. Along the way they have found (and fixed) a number of bugs in the schedulers.
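As an aside for anyone who wants to check which scheduler a disk is using: the kernel exposes the list through sysfs at `/sys/block/<device>/queue/scheduler`, with the active one in square brackets. The sketch below parses that format in Python. The function names are mine, and the sample line in the usage note is what I would expect on a current 2.6 kernel rather than something shown in the talk.

```python
from pathlib import Path


def parse_scheduler_line(text):
    """Split the sysfs scheduler line into (active, available).

    The active scheduler is the one wrapped in square brackets.
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_scheduler(device):
    """Read the scheduler line for a block device such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


if __name__ == "__main__":
    # Sample input rather than a live read, so this runs anywhere.
    print(parse_scheduler_line("noop anticipatory deadline [cfq]\n"))
```

Switching schedulers at runtime is a matter of writing one of the listed names back to the same file as root.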

One thing I didn’t realize is the number of tools available for doing this kind of performance analysis on Linux. The blktrace tool (built into the kernel) can record everything that is happening in the block layer for later analysis using btt, the block trace timeline tool. btreplay can replay an event trace recorded with blktrace, or iomkc can be used to generate a Markov chain model of the trace so that workloads can be reproduced (or emailed) in kB rather than GB. Joshua showed some graphs (Yay!) of his performance results. Interestingly, while the more complex schedulers (anticipatory and CFQ) give better throughput in most situations, the simpler schedulers can give much lower average latency in some tests. As with much performance analysis, “it depends”.
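The Markov chain trick is worth a quick illustration. The sketch below is not iomkc and makes no claim about how that tool works internally. It just shows the general idea in Python with a made-up trace of "R" (read) and "W" (write) events: count how often each event follows each other event, then walk those counts at random to produce a new trace with similar statistics. The table of counts is tiny compared with the trace it came from, which is where the kB-versus-GB saving comes from.

```python
import random
from collections import Counter, defaultdict


def build_chain(events):
    """Count first-order transitions: chain[a][b] = times b followed a."""
    chain = defaultdict(Counter)
    for current, following in zip(events, events[1:]):
        chain[current][following] += 1
    return chain


def generate(chain, start, length, rng=None):
    """Generate up to `length` events by a weighted random walk."""
    if rng is None:
        rng = random.Random()
    state = start
    out = [state]
    while len(out) < length:
        successors = chain.get(state)
        if not successors:
            break  # dead end: this state was never followed by anything
        states = list(successors)
        weights = [successors[s] for s in states]
        state = rng.choices(states, weights=weights)[0]
        out.append(state)
    return out


if __name__ == "__main__":
    # Hypothetical trace; a real one would carry sizes, offsets and timing.
    trace = ["R", "R", "W", "R", "R", "W"]
    chain = build_chain(trace)
    print(dict(chain))
    print(generate(chain, "R", 12, random.Random(0)))
```

A real model would need a much richer notion of "state" than read versus write, but the compression argument is the same.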

This blog post has now dragged on far too long, and I still haven’t covered the very interesting kernel lightning talks or the kernel developers’ panel. I’ve got extensive notes on both, but they’ll have to wait.