Oregon Summer School: Further Notes
I've been putting off writing up the rest of the lectures from the Oregon Summer School because I seem to have lost my trusty notebook where all the brilliant things were intoned. But I guess it's gone on long enough; the world must know what happened at Oregon Summer School 2006.
Some of the presenters I mentioned before talked on more than one topic. Michael Hicks, after talking about futures, went on to discuss some work on "dynamic software update" or DSU. Hicks has developed a set of techniques for dynamically loading a new version of an existing program, and switching it over while it runs, with zero downtime. This includes translating old data representations to new ones. As I listened to the talk, I kept thinking, "OK, sure, we're going to design our code in this highly restricted way, and then as long as we're willing to bend over backwards to do some tricky retrofitting, then hypothetically we might be able to update the code without downtime."
I was wrong! Hicks and his team have actually taken three years' worth of updates to sshd and vsftpd (amounting to a couple dozen updates, including some whole-number revisions) and updated running instances with each successive version, all without crashing or taking the server down. I was quite astonished that these techniques could be applied to changes that have already been made in the wild. Of course, they had to write some code to translate in-memory data structures on the fly, but they didn't have to re-architect the application to make it fit. Everyone in the seminar room woke up when Hicks put up the slide listing all the versions, with their dates, that had been dynamically loaded into these servers.
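To make the "translate in-memory data structures on the fly" part concrete, here is a toy sketch of the idea in Python. It is not Hicks's system, which works on running C programs; the Server class, the two session layouts, and the transform function are all names I made up for illustration. The point is just that an update swaps in new code and converts the live data at a moment when nothing is in the middle of using it.

```python
from dataclasses import dataclass


@dataclass
class SessionV1:
    # Hypothetical old layout: host and port packed into one string.
    user: str
    host_port: str


@dataclass
class SessionV2:
    # Hypothetical new layout: host and port stored separately.
    user: str
    host: str
    port: int


def transform_session(old: SessionV1) -> SessionV2:
    """State transformer: convert one live object from the old layout to the new."""
    host, _, port = old.host_port.rpartition(":")
    return SessionV2(user=old.user, host=host, port=int(port))


def handle_v1(s: SessionV1) -> str:
    return f"v1 {s.user}@{s.host_port}"


def handle_v2(s: SessionV2) -> str:
    return f"v2 {s.user}@{s.host} port {s.port}"


class Server:
    """Toy long-running server holding live session state."""

    def __init__(self):
        self.sessions = [SessionV1("alice", "example.org:22")]
        self.handler = handle_v1

    def serve_once(self):
        return [self.handler(s) for s in self.sessions]

    def update(self, new_handler, transform):
        # Assumed to be called between requests, when no handler is running,
        # so old code never sees new data and vice versa.
        self.sessions = [transform(s) for s in self.sessions]
        self.handler = new_handler


server = Server()
before = server.serve_once()
server.update(handle_v2, transform_session)  # no restart, sessions survive
after = server.serve_once()
print(before, after)
```

The hard parts that this sketch waves away (finding a safe update point in real code, and getting the transformer right for every structure that changed) are presumably where the actual research effort went.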
I would be interested to see whether these DSU techniques turn out to be a good software-engineering tradeoff in the long run. Most of the time, just having an extra machine to handle load while you bounce individual servers to the new version is a cheap way to get the same result. And you still have the challenge of writing your updates so that they're compatible on the wire: you can update sshd's internal structures on the fly, but updating the protocol might be more challenging. Also, to be slightly critical, sshd and vsftpd together make a pretty constrained class of software: slow-changing servers that mainly wait for connections and spawn off processes to handle them. Would this work for a more sophisticated system like a fancy real-time game system, where the gamers are actively interacting through the system?
Matthew Flatt argued for programming-language features inspired by OS features. The case was reasonably compelling: an IDE like DrScheme needs to run user programs in a special bomb-proof box, so that user programs can't impinge on the workings of DrScheme itself. This extends to lots of issues: device ownership, memory consumption, non-termination. Flatt argued for an abstraction called a "custodian" that manages all those resources together; killing the custodian frees up all the resources it manages. At the same time, he wants to enable sharing of data between programs, as an OS might allow. This makes the memory-management problem much harder, of course, since you need a policy for determining which custodian is "charged" for a block of memory, when it's shared between many. Flatt outlined a policy, whose details I didn't get, which apparently works in his setting.
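Here is roughly how I picture a custodian, as a small Python sketch. This is my own toy, not Flatt's design or DrScheme's actual API: the Custodian class, its method names, and the byte-counting scheme are assumptions made up for illustration, and it sidesteps the hard question of who gets charged for shared memory.

```python
import io


class Custodian:
    """Toy custodian: owns a group of resources and a memory budget."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit
        self.used = 0
        self.resources = []
        self.alive = True

    def register(self, resource):
        # Anything with a close() method can be placed under this custodian.
        if not self.alive:
            raise RuntimeError("custodian has been shut down")
        self.resources.append(resource)
        return resource

    def charge(self, nbytes):
        # Account for an allocation; going over budget kills everything.
        if self.used + nbytes > self.memory_limit:
            self.shutdown()
            raise MemoryError("custodian exceeded its memory limit")
        self.used += nbytes

    def shutdown(self):
        # Killing the custodian releases every resource it manages.
        for resource in reversed(self.resources):
            resource.close()
        self.resources.clear()
        self.used = 0
        self.alive = False


# One custodian per user program, so the IDE itself stays unaffected.
user_program = Custodian(memory_limit=100)
port = user_program.register(io.StringIO())
user_program.charge(60)
try:
    user_program.charge(60)  # over budget: the custodian shuts down
except MemoryError:
    pass
print(port.closed, user_program.alive)
```

In this version every charge goes to exactly one custodian. Once two programs can share a block of memory, charge() needs a policy for deciding which of them pays, and that is the part of Flatt's talk I didn't catch.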
Sandhya Dwarkadas talked about transactional memory from the hardware point of view. Unfortunately, her talk was pitched in the vocabulary of computer architects, so I didn't understand any of it! At a high level, what I took away was that transactional memory might be easy for processor makers to provide, by taking advantage of the cache-coherency systems that are already being included in multiprocessor machines.
Jeff Foster talked about another system for statically detecting race conditions, like Flanagan's for Java, but this time for C code. It amounts to a kind of pointer alias analysis, and the details are very complicated. A question that wasn't raised, which just occurred to me: Why was alias analysis necessary in C but not in Java? I think the answer will be that the Java system may assume that most accesses to data members are from within the class definition (and thus are not by reference).
Shaz Qadeer had the true misfortune of presenting last, after we'd patiently sat through 48 hours of lectures. For myself, I know I didn't follow his (or Jeff Foster's) presentation as closely as most of the others. Someone has to go last, I guess. Qadeer's presentation was on model-checking concurrent software. Some of the material he presented was basic model-checking stuff (like "What is LTL?") but he quickly jumped ahead to cover fancy techniques for state-space reduction. I'm always surprised when speakers do that. If you assume that I don't know the basics, then why do you expect me to absorb those along with some advanced material in one lecture? If you want to focus on the advanced stuff, then why not just say, "This is for people who already know X," and just give a quick refresher for X? The advanced students were probably bored while we newbies asked questions about LTL, and we newbies got bored once our intuition had been outstripped and we couldn't follow the lecture closely anymore.
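For fellow newbies, the basic idea of model-checking a concurrent program fits in a few lines of Python. This is my own minimal sketch, not anything from Qadeer's slides: it models two threads that each do a non-atomic "read the counter, then write counter + 1", explores every interleaving breadth-first, and checks an invariant in every reachable state (in LTL terms, a property of the form "always p"). The state encoding and function names are made up for illustration, and there is no state-space reduction at all.

```python
from collections import deque


def successors(state):
    """Yield every state reachable by letting one thread take one step."""
    pcs, tmps, counter = state
    for t in (0, 1):
        if pcs[t] == 0:
            # Step 1: the thread reads the shared counter into its local tmp.
            new_pcs = pcs[:t] + (1,) + pcs[t + 1:]
            new_tmps = tmps[:t] + (counter,) + tmps[t + 1:]
            yield (new_pcs, new_tmps, counter)
        elif pcs[t] == 1:
            # Step 2: the thread writes tmp + 1 back to the shared counter.
            new_pcs = pcs[:t] + (2,) + pcs[t + 1:]
            yield (new_pcs, tmps, tmps[t] + 1)
        # pcs[t] == 2 means the thread has finished and takes no more steps.


def check(initial, invariant):
    """Breadth-first search; return a shortest trace to a violating state, or None."""
    seen = {initial}
    queue = deque([(initial, [initial])])
    while queue:
        state, path = queue.popleft()
        if not invariant(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None


def no_lost_update(state):
    # Once both threads are done, the counter must have been bumped twice.
    pcs, _, counter = state
    return pcs != (2, 2) or counter == 2


initial = ((0, 0), (0, 0), 0)
trace = check(initial, no_lost_update)
for step in trace:
    print(step)
```

The checker returns a counterexample trace in which both threads read 0 before either one writes, so the final counter is 1. With two threads and two steps each, the state space is tiny. Real programs blow up quickly, which is why the state-space reduction techniques matter so much.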
All in all, the quality of the presentations at the Summer School was quite high. I was surprised that I could follow about 40 of the 48 hours of lectures, and got something out of almost every one (the previous 48 seminars I'd attended didn't have half that hit rate).
We also had a great time: Jim Allen's nightly walks around Eugene were quite nice, and we always ended up at a pub (if you like beer, they have lots of good ones in Oregon [my favorite: the Black Butte Porter, not to everyone's taste]). I met loads of people there and really enjoyed it. To PhD students in the US and abroad, I'd suggest you go to the Oregon Summer School in future years.