[UPDATE: The broken link to "The Essence of Form Abstraction" has been fixed. Sorry!]
For the first time in three years of research, I'm really proud of a paper that I worked on with my research group: The Essence of Form Abstraction, by Ezra Cooper, Sam Lindley, Philip Wadler and Jeremy Yallop, to be published in APLAS 2008. The paper shows how you can create compositional HTML forms which package up their data in any desired type. It shows an OCaml implementation but we also use the idea in our web-centric language, Links.
Though this is the first good group work that I've contributed to, the others have been doing some great work. In particular, Jeremy, Sam, and Phil did some nice work characterizing the expressive power of idioms, arrows, and monads. They first defined a metalanguage, like Moggi's monadic metalanguage, which can encompass all three notions, in The Arrow Calculus, by Sam Lindley, Philip Wadler and Jeremy Yallop (unarchived). Then they showed the hierarchy of the three, under which idioms are the loosest interface but therefore their language is the strictest to program with, and monads are the strictest interface, therefore their language is the most free, and arrows can be seen as lying in between. The work is described in Idioms are Oblivious, Arrows are Meticulous, Monads are Promiscuous (ibid.), MSFP 2008.
To summarize this, you can see the arrow calculus as a syntax like the do notation of Haskell:
do y <- f x
z <- g y x
return (h x y z)
Now to put the result crudely, monads allow expressions like the above, where the value bound at each step can be used in all future steps (this makes them "promiscuous"). The use of arrows restricts this so that the value bound at each step can be used only at the next step, as in:
do y <- f x
z <- g y x
return (j x z) -- y cannot be used here
You can still use arrow operations to manually pass data forward through each step, but this is a nuisance. Hence Lindley, Wadler and Yallop coined the term "meticulous" to describe arrows.
The idiom laws restrict this further so that each value can be used only in a final return clause (the return clause is "oblivious"):
do y <- f x
z <- k x -- y cannot be used here
return (h x y z)
An astute reader will note that return is not required in do syntax. What happens if there is no return clause? Then you can add one:
do y <- f x
z <- k x
q
is equivalent to
do y <- f x
z <- k y x
result <- q
return result
But either way the idiom laws require that q cannot depend on y or z.
This shows that the three notions progressively restrict (and fairly naturally) the ways that you can program with them. But, in a familiar appearance of contravariance, the notions are reverse-ordered in how restrictive they are to create. An idiom, being the most restrictive on its user, is least restrictive to its writer. Thus you can model some computational notions as idioms which you could not model as monads. This fact came in handy in our latest paper, where we used idioms to model form abstraction.
I encourage you all to read about idioms, arrows and monads, as it's turning out to be a very nice theory.
SQL keeps turning out to be more sensible than I thought it was.
If you're like me, you thought that GROUP BY clauses partitioned your columns into "grouping columns" and "non-grouping columns" and that the MUST to be processed with aggregate functions (count, sum, max, min, average) and that the second MUST NOT be processed that way. This would have been a problem for converting Peyton Jones and Wadler's Comprehensive Comprehensions to SQL, since that technique uses a ncie uniform treatment of group-by clauses: they transform all the query's bound variables to list type for the remainder of the query. Example—here's a simple query that takes a table, groups it on values from columna a, and returns pairs of the value from a and the sum of the list of corresponding column b values:
select a, sum(b) from t group by a
Fine and simple. And as you may know, the following produces an error:
select a, b from t group by a
Because we've grouped on a, we can't just select column b within each grouping because there are multiple column b values. The DBMS will insist that we use an aggregate function to convert this collection to a primitive-type value. Such are the restrictions of the relational model.
But Peyton Jones and Wadler's proposal simply treats all columns, when grouped, as lists. So
[(a, b) | (a, b) <- t, group by a]
is a perfectly legal expression of type [([A], [B])] (letting A and B be the raw column types of a and b, resp.). What they expect you to do, to make SQLizable expressions, is to apply some other function to reduce each column, a typical example being
[(the a, sum b) | (a, b) <- t, group by a]
which corresponds to the first SQL query above: it collects the values from a and matches these with the sums of corresponding values from b. But the above syntax allows you to do things like this:
[(sum a, sum b) | (a, b) <- t, group by a]
which seems to correspond to SQL that applies an aggregate function to a grouped column as well as an ungrouped column. "Whither SQL!?" I worried! Yet all was not lost. SQL treats grouped columns as being alternately bag-typed or primitive-typed, depending on the context. So the following is a valid query:
select sum(a), sum(b) from s group by a
Huzzah! This shows SQL to be in closer accordance with a sensible, uniform, orthogonal languge such as Peyton Jones and Wadler's than I'd imagined. In other ways, too, I keep finding it easier and easier to make SQL queries out of difficult FP expressions by using little-known SQL features. Cheers, guys.
That's Microsoft, as always, blending the sublime and the ridiculous.
(Thanks also for the bogus etymology of "covariance": "In mathematics, covariance is a measure of the degree to which two variables move up or down together. The term was co-opted by the OO crowd to describe..." Way to overlook category theory, guys!)
(Salad with) Steve Jenson lays it down for you: the null vs. undefined nonsense in JavaScript. Better recognize.
Several recent language designs have offered a unified language for programming a distributed system; we call these “location-aware” languages. These languages provide constructs that allow the programmer to control the location (the choice of host, for example) where a piece of code should run, which can be useful for security or performance reasons. On the other hand, a central mantra of web engineering insists that web servers should be “stateless”: that no “session state” should be maintained on behalf of individual clients—that is, no state that pertains to the particular point of the interaction at which a client program resides. Thus far, most implementations of unified location-aware languages have ignored this precept, usually keeping a process for each client running on the server, or otherwise storing state information in memory. We show how to implement a location-aware language on top of the stateless-server model.
The presentation shows functional programming in a good light; I have one quibble so far: The 3rd slide of the 2nd talk ("MapReduce Theory and Implementation") says, "Functional operations do not modify data structures: They always create new ones." And then, "Original data still exists in unmodified form." But this is an implementation detail; the compiler needn't keep the original data if it it's not going to be used.
More helpful would be to note that, in a functional language, a variable's value never changes, thus making it easy to read the code. Whether input data stays around after an operation is a different issue, and one the compiler can easily handle better than humans, on average.
It's interesting to read Church's papers to see how he's approaching the project. Initially, he's trying to clarify the notion of a free variable in logic, or rather to chuck it out in favor of a clear discipline for binding variables. Because he's looking at foundational issues in logic, he's at great pains to identify his assumptions.
A minor historical fact: alpha- and beta-conversion are defined in these early Church papers, but not with those names; they're identified as I-conversion and II-conversion (there is a conversion rule III, which looks like the reverse of beta-conversion).
Here's a stimulating article: Why Most Published Research Findings Are False by John P. A. Ioannidis (PLoS Medicine, 2005). It focuses on research that aims to find statistical relationships in data, and asserts that most such relationships claimed in the literature are in fact false. Distilling the discussion, I find these compelling reasons why it would be so:
This last point is particularly damning—few things are more essential to the scientific method than reproducible experiments, yet the article blithely says (and I readily believe) that most biomedical studies are not reproduced. In fact, the competitive publication cycle works against this: merely to confirm an existing result is not very publishable; to contradict an existing result may be publishable, but this means, as Ioannidis notes, that there can be an alternation of positive, then negative, then positive, then negative results on a particular question, as each team becomes interested in upsetting the last published result. Far from amassing independent evidence on a question, this is just another selection bias that works against the scientific process.
Interestingly, the article is wholly unscientific. Without stating its assumptions, it works these assumptions through to conclusions. Along the way, it presents a bunch of formulas, which add the gloss of analysis to what is essentially a work of persuasive writing—but I don't buy the formulas, which include unobservable (and perhaps ill-defined) quantities such as "the ratio of true relationships to no-relationship pairs in a field" and "the false-negative error rate." (How amusing would it be if this article were a false research finding?) But methodology aside, I do believe it: that many, it not most, published research findings are false.
I'd be interested to see someone look at the issue in other kinds of fields—fields that aren't quantitative, for example. In the field of programming languages, people make a lot of claims that are justified by proofs. How often are these proofs actually correct, I wonder? And how often are the claims correct? Moreover, much PL research is not even "claim-based," as such. Many papers simply present a feature or technique and tout its virtues—and don't make falsifiable claims, at all. And often, this is the most valuable research: someone tried something and described the experience. We can learn from others' experience, even without proven claims.
How do we assess the value of a research field, such as that of programming languages? How do we know when we're doing a good job?
A Haskell predicate that tests whether an expression raises an error (for my future reference):
import Control.Exception
raises :: a -> IO Bool
raises expr =
Control.Exception.catch (
return . (const False) =<< Control.Exception.evaluate expr
) (return . (const True))
Poplar's users and potential users had mixed feelings about the syntax. Even aspects we consider successful were not universally appreciated. No one was ever sure what the precedence rules were or should be."Har!—James H. Morris, Eric Schmidt, Philip Wadler. Experience with an Applicative String Processing Language, POPL. 1980.
It currently says, "Any imperative algorithm is expressible in these languages with a logarithmic slowdown in the worst case." David Hopwood notes in the discussion that "Imperative algorithms are expressible with no asympotic slowdown," using some clever/complicated technique involving mutable arrays (citing a caml-list posting by one Tom Breuel).
All of this sounds a bit strange to me--as far as I know, there's no conclusive reason to think that functional languages must execute a particular algorithm more slowly than an imperative language. (Where does this "logarithmic slowdown" come from, anyway?) But I don't have a mastery of all the relevant literature. Can someone else jump in and straighten this out?
Here's a rather good paper of the sort that I wish someone had shown me when I first started as a grad student: Conception, evolution, and application of functional programming languages, by Paul Hudak (1989). It gives a good history of how functional languages arose (from Lisp, ISWIM, KRC, and so on) and coalesced (into ML and Haskell and the like), and describes in some detail the salient features of (and issues within) functional programming, including pattern-matching, algebraic datatypes, memoization, nondeterminism, and of course, type-inference and polymorphism. It also includes a short bit on formal semantics, which might give an neophyte the flavor of such a formalism without getting too heavy. The paper is fairly long, but it covers a lot and is written in the old style of clear, readable prose with a minimum of specialized notation.
I found the paper from a link on Wikipedia.
Other papers that fall into this class ("classic papers I wish I'd seen at the beginning") include "The Next 700 Programming Languages" and "The Mechanical Evaluation of Expressions,"[1] both by Peter J. Landin. (I'll post more as I think of them.)
Tree papers that I gladly did read at the beginning are "Definitional Interpreters for Higher-Order Programming Languages" (1972—the original typewritten version is better than the one typeset in TeX that you find sometimes), "The discoveries of continuations" (2005), and "Gedanken" (1970), all by John C. Reynolds. The latter introduces (?) the nice idea of an escape construct (now often called let/cc) and "references" as first-class values (or is that older?).
[1] Sadly no longer available online.
This is sharp. Maybe web frameworks (or <cough /> web-centric languages) should provide support for easy memoization.What do ETags do? They allow you, when you fetch a Web resource, to send a short string along with the request saying “this is the signature you sent me for the version I know about”; then the server can look and see if the signature is still good, and if so not bother sending the data redundantly. Which is potentially a big, big win.
But also potentially not. If you look at actual real-world servers that are maxed out, they’re mostly maxing out on computation or back-end database traffic, not on bandwidth to the Net. So the saving that would be really valuable would be if you didn’t have to rebuild the requested page. Unfortunately, both Rails and Django rebuild the requested page, then compute the signature, then check the ETag. Which is extra run-time work in order to conserve a resource (Web bandwidth) that probably isn’t the problem.
It looks like the authors are concerned more with the phenomenon of web apps deployed on mobile devices and inside other web apps, rather than "ordinary" browser–server apps. Who knows what will come of it. [1] Fonseca, J. M. C., Prendes, I. M., Soriano, J., Hierro, J. J. Declarative Models for Ubiquitous Web Applications. April, 2007.
- A declarative expression language notation is not specified in (X)HTML. As a result developers have to code dynamic aspects of the user interface by means of server side scripts or Javascript DOM [DOM] manipulation.
- (X)HTML classic forms [Forms] are severely limited in terms of typing information. For example, It cannot be specified that an input element accepts a date, a number or a patterned text. Furthermore, due to limitations in the submission formats, applications have to convert everytime, typically at server side, strings to the correct type expected by the application logic.
- Lack of data binding mechanisms. Data binding is a technique that allows to associate data between the model and user interface components in a declarative way. Developers typically write server side scripts or Javascript AJAX-based code to populate user interface components with data. As a result, in (X)HTML-based user interfaces there is no clear separation between the model, the view and the controller.
It's the best thing since "To Dissect a Mockingbird."
It seems likely a kid could understand and enjoy playing with this system; I do wonder about the "color rule," though: I suspect young kids would find that tricky and not see the point—someone should do an experiment.
It might be hard to gather much information from these slides without the accompanying performance, but you might enjoy looking at the drawings or trying to run the code samples.
So yesterday I went down to Amazon's development office in Edinburgh, and gave a talk on Links, at the invitation of Andrew Birkett. The talk went well and I was quite pleased with the response and the sharp questions.
One of the biggest concerns that the Amazon people had, which we haven't tried to address in Links, is failure: how the language allows them to handle failure. In Amazon's systems, there are loads of interdependent services that depend on each other to display a single page, and any of them might go wrong; the developers described to me a wide variety of alternate responses that they might want to give, depending on the situations. For example, in some instances, a page feature might collapse down to nothing (disappearing from the page) when it fails. Other times, if a service doesn't respond, the front-end web software might use cached data to display a feature.
This came up in regard to Links because of the way we're making client-server (Ajax) calls simple RPC calls. The question there is, what happens if the remote call doesn't finish successfully, either because the server code couldn't get the right data, or because the network mechanisms themselves failed; how in Links can we handle that anomaly? Part of the answer might be an ordinary exception mechanism, which we can support fairly easily, but we should think more about how Links programmers deal with exceptional situations.
The crowd was very friendly and engaged with the talk quite deeply, I think. They fed me plenty of questions on particular points. Many of these had to do with various kinds of possible failure, as I mentioned; another theme was metaprogramming, prompted because they noticed a certain amount of repetition in my slides (e.g. (foo=foo, bar=bar) when constructing records; I do hope we can improve on that).
I gather they [present-day Amazon developers] do most of their work in Java, but they weren't thrown off when I started talking about continuations or any of the functional-programming idioms that we use in Links. There were a few self-confessed language afficionados in the crowd, but they weren't the only ones who didn't miss a beat at the functional idioms, the tuple types, or the unusual kinds of variable binding we use for our "formlets" (oh, reader—you don't know about our formlets—I'll have to post about that soon).
Between the talk itself and chatting with developers outside of it, I had a quite nice time. Thanks to Andrew and all of the others for listening so well & treating me nicely, even though I'm an academic!
UPDATE: Slides from the talk are available.
This is good because the old one was short on helpful material for beginners. I had the same problem when I started learning Erlang as I had with Perl: I was dead in the water trying to get a simple program to run, for hours or days—until, with Perl, I found Learning Perl. With Erlang it was just a tremendous amount of banging my head against the wall. The first hours of learning a new language, for me, are often spent with trivial frustrations, like figuring out what "main" is called, or how to include the standard library. Maybe that's because I usually don't start with examples, the way many people do. Rob Sayre says the book is good.
A recent tech report from Berkeley surveys the state of parallel computing [via Tim Bray]—particularly with an eye to the importance and difficulty of coding for multicore processors (one of which is helping me author this article). The bulk of the article deals with hardware design but also discusses programming systems (that's my bailliwick) and application design issues.
I've not read the whole thing yet, but I've already found some surprises:
These locking schemes are notoriously difficult to program, as the programmer has to remember to associate a lock with every critical data structure and to access only these locks using a deadlock-proof locking scheme. Locking schemes are inherently noncomposable and thus cannot form the basis of a general parallel programming model. Worse, these locking schemes are implemented using spin waits, which cause excessive coherence traffic and waste processor power.Transactional memory seems like a clear win, though it's not widely implemented and needs to get some experience behind it.
We believe that future successful programming models must be more human-centric. They will be tailored to the human process of productively architecting and efficiently implementing, debugging, and maintaining complex parallel applications on equally complex manycore hardware.This seems to me quite wise, although it's still unclear whether we can understand the cognitive (or psychological) processes in programming well enough to design for them.
Section 5.4 seems a haunting parable for language designers:
Programming languages, compilers, and architectures have often placed their bets on one style of parallel programming, usually forcing programmers to express all parallelism in that style. Now that we have a few decades of such experiments, we think that the conclusion is clear: some styles of parallelism have proven successful for some applications, and no style has proven best for all.
There seems to be a lot of wisdom collected here, which cuts across subfields of CS—as in a list of which "dwarfs" are commonplace in which application domains. This appears to be a quite useful reference for those of us who don't work in any particular application domain but make systems for programmers to use. I nudge you gently in the direction of this article.
The language called "Processing" is a Java-like language, designed for coding up lovely interactive graphics. It compiles to Java and it's easy to deploy your apps on the web for people to enjoy them. Many of the demos on the site are mouth-wateringly impressive—check them out before you read on.
What the Processing people have done really well is to think about all the primitives that you'll want when coding up graphics demos—so they have good ways of playing with color, controlling the opacity of a shape, getting nice bezier curves, rotating and skewing, plus fairly good type-drawing and 3D operations. They've also thought through the programmer startup cost, making it easy to sit down and start using the thing without much knowledge.
The area where Processing is weaker is in the ergonomics of development. It uses an imperative style where you have to build up values and images in sequential steps. It is clumsy, for example, to create a list of dots and then render each one to the screen; clumsier still to express how they change or move over time, since you need to write out the frame-by-frame changes, rather than writing it as a function of time or a system of equations. Many facets of the API are controlled using peculiar beginFoo() and endFoo() calls, which affect how the statements in between behave. Heaven help you if you call beginCamera() and leave off the endCamera() call. These drawbacks make errors likely and have a cost in debugging time and code-reading time.
There is an alternative way to make such demos, in the name of "functional reactive animation" (FRP). In the FRP style, you write the behavior of each thing and each shape as a function expressing how it changes over time. You have at your disposal the raw parameters of time, and various input values, such as mouse position and key presses, so you can make the shapes react to these things.
I'll describe briefly what FRP programs look like (if you already know, skip the next few paragraphs). In the imperative style, you forcibly update persistent variables and so to understand the behavior you need to look at all the ways those variables can be updated over time, which can be quite complex. By contrast, in the FRP style, the behavior of an object is completely defined by an expression, so you can understand that behavior just by studying that expression—a more compact way of appreciating the behavior.
For example, to define a dot that moves at a constant rate in a circle around a point (centerX, centerY), I might write it something like this:
dotPosn = (centerX, centerY) + (radius * (sin time), radius * (cos time))
(I'm positing an operation of summing two vectors—a simple thing to define in many languages, though a bit cumbersome in Java.) The above defines the moving center of the dot I want to draw; drawing it at each animation frame would require just an expression as simple as this:
myDot = circle(dotPosn, 1, redColor)
That is, draw a circle centered at the value determined by dotPosn, with radius 1, in the color red (defined elsewhere). Objects can also be defined in a mutually-recursive way, so that their behaviors depend on one another.
So FRP is an appealing paradigm for writing interactive graphical demos. But none of the existing implementations are as appealing as Processing. The original implementation, Fran, is a closed-source Windows-only application (How's that for a conversation killer?) and is no longer being actively developed. The newer FrTime (part of DrScheme) is a much more serious contender, as it is open-source and runs on all the major platforms. Yampa is another, which I haven't had the chance to dip into.
These are good efforts; but there are still some pieces missing. Processing apps can be deployed on the web very easily—which makes the pay-off for writing them so much higher. Writing a graphics demo and showing it to your friends at your next pizza party might be fun. Writing a graphics demo and slapping it on the web is a whole lot more fun. See Linerider and the great popularity it has had over the last few months.
Another problem is that most FRP implementations are very conservative: they tend to offer the minimum set of tools that makes it theoretically possible to implement anything in the space. Processing takes a different approach: it tries to offer the minimal toolbox that you're likely to reach for in the course of making a demo. So FrTime gives you accum-e, which can probably be used to express any discrete time-changing behavior imaginable; but Processing gives you shininess (for setting a solid object's surface properties) and strokeCap (for adjusting the shape of the ends of lines).
A great Summer of Code project, or Master's project, would be to compile FRP code from one of the above languages down to the JVM, so you can deploy it over the web. Another one: take FrTime or Yampa and beef it up with all these nice graphics operations, to add the kind of vocabulary that graphic designers and animators like to use. The latter project would be straightforward from a CS point of view, but it would require attention to detail and a good sense of visual aesthetics. The former is more of a CS research project, and might require some clever thinking because of the apparent mismatch between functional-style code and the JVM's computational model. But there are plenty of other languages that compile to the JVM, so it's not beyond the pale.
If I were just starting now on my PhD, I'd find a way to include this in my work, or I'd put off the PhD until I could get this done! Tinkering with graphics demos and slapping them up on the web is the kind of hacking I'd really like to be doing. As it is, I'll have to leave it to some young whippersnapper, alas.
At first it caught some good bugs. Then it started succeeding on 100 random tests every time I ran it, so I feel reasonably certain that my algorithm is correct. I played around with the test-data generator to make sure it was generating terms that were deep enough and rich enough & to make sure that it never diverges, creating arbitrarily large terms. QuickCheck is reasonably nice this way.
So here's the Haskell code for my lambda-lifting algorithm, together with the QuickCheck tests (at the bottom). Note that there are some funny things in the syntax that aren't explained; they'll have to stay unexplained until I get done with the actual research I'm doing! Any day now...
The only downside is that I have to make a few adjustments to the current publishing engine, which is Perl, and to the client-side stuff, which is of course JavaScript. I keep putting : in front of constant strings. And forgetting parentheses. Especially empty parentheses. And semicolons. Especially semicolons. Bloody stupid useless semicolons.
Tim sounds like the ideal user for Links.
Oh, but Links has semicolons.
Some of the presenters I mentioned before talked on more than one topic. Michael Hicks, after talking about futures, went on to discuss some work on "dynamic software update" or DSU. Hicks has developed a set of techniques for dynamically loading a new version of an existing program, and switching it over while it runs, with zero downtime. This includes translating old data representations to new ones. As I listened to the talk, I kept thinking, "OK, sure, we're going to design our code in this highly-restricted way, and then as long as we're willing to bend over backwards to do some tricky retrofitting, then hypothetically we might be able to update the code without downtime."
I was wrong! Hicks and his team have actually taken three years' worth of updates to sshd and vsftpd (amounting to a couple dozen updates, including some whole-number revisions) and updated running instances with each successive version, all without crashing or taking the server down. I was quite astonished that these techniques could be applied to changes that have already been made in the wild. Of course, they had to write some code to translate in-memory data structures on the fly—but they didn't have to re-architect the application to make it fit. Everyone in the seminar room woke up when Hicks showed the slide showing all the versions, with their dates, that had been dynamically loaded into these servers.
I would be interested to see whether these DSU techniques turn out to be a good software-engineering tradeoff in the long run. Most of the time, just having an extra machine to handle load while you bounce individual servers to the new version is a cheap way to get the same result. And you still have the challenge of writing your updates so that they're compatible on the wire: you can update sshd's internal structures on the fly, but updating the protocol might be more challenging. Also, to be slightly critical, sshd and vsftpd together make a pretty constrained class of software: slow-changing servers that mainly wait for connections and spawn off processes to handle them. Would this work for a more sophisticated system like a fancy real-time game system, where the gamers are actively interacting through the system?
Matthew Flatt argued for programming-language features inspired by OS features. The case was reasonably compelling: an IDE like DrScheme needs to run user programs in a special bomb-proof box, so that user programs can't impinge on the workings of DrScheme itself. This extends to lots of issues: device ownership, memory consumption, non-termination. Flatt argued for an abstraction called a "custodian" that manages all those resources together; killing the custodian frees up all the resources it manages. At the same time, he wants to enable sharing of data between programs, as an OS might allow. This makes the memory-management problem much harder, of course, since you need a policy for determining which custodian is "charged" for a block of memory, when it's shared between many. Flatt outlined a policy, whose details I didn't get, which apparently works in his setting.
Sandhya Dwarkadas talked about transactional memory from the hardware point of view. Unfortunately, her talk was pitched in the vocabulary of computer architects, so I didn't understand any of it! At a high level, what I took away was that transactional memory might be easy for processor makers to provide, by taking advantage of the cache-coherency systems that are already being included in multiprocessor machines.
Jeff Foster talked about another system for statically detecting race conditions, like Flanagan's for Java, but this time for C code. It amounts to a kind of pointer alias analysis, and the details are very complicated. A question that wasn't raised, which just occurred to me: Why was alias analysis necessary in C but not in Java? I think the answer will be that the Java system may assume that most access to data members are from within the class definition (and thus are not by reference).
Shaz Qadeer had the true misfortune of presenting last, after we'd patiently sat through 48 hours of lectures. For myself, I know I didn't follow his (or Jeff Foster's) presentation as closely as most of the others. Someone has to go last, I guess. Qadeer's presentation was on model-checking concurrent software. Some of the material he presented was basic model-checking stuff (like "What is LTL?") but he quickly jumped ahead to cover fancy techniques for state-space reduction. I'm always surprised when speakers do that. If you assume that I don't know the basics, then why do you expect me to absorb those and with some advanced material in one lecture? If you want to focus on the advanced stuff, then why not just say, "This is for people who already know X," and just give a quick refresher for X? The advanced students were probably bored while us newbies asked questions about LTL, and us newbies got bored once our intuition had been outstripped and we couldn't follow the lecture closely anymore.
All in all, the quality of the presentations at the Summer School was quite high. I was surprised that I could follow about 40 of the 48 hours of lectures, and got something out of almost every one (the previous 48 seminars I'd attended didn't have half that hit rate).
We also had a great time: Jim Allen's nightly walks around Eugene were quite nice, and we always ended up at a pub (if you like beer, they have lots of good ones in Oregon [my favorite: the Black Butte Porter, not to everyone's taste]). I met loads of people there and really enjoyed it. To PhD students in the US and abroad, I'd suggest you go to the Oregon Summer School in future years.
Cormac Flanagan presented his technique for checking & inferring that Java programs observe a locking discipline that gives desired atomicity properties. A motivation for this is that Java's synchronized keyword allows you to protect a block with a lock; but it is up to you to make sure that all of the right locks are held--in general it can be hard to tell whether a piece of code is atomic with respect to the rest of the program. Flanagan's system allows you to annotate a method or a block with atomic; a static checker then infers whether it is truly atomic by virtue of the locks it holds (viz-a-viz other bits of code in the program). The analysis is somewhat conservative, in that it may reject programs that are actually correct, but the techniques seem to lead you to write the kind of lock-based code that is ordinarily used in practice; Flanagan's team has run the checker successfully on large bodies of threaded benchmark code, and has even found bugs in the Java standard library (e.g. with the append function on Strings). The biggest drawback to this work is that it still relies on locks, and deadlock can still occur.
Dan Grossman gave an nice survey of possible semantics for transactions. Here again, programmers would wrap blocks of code with an atomic keyword, but now we are proceeding from semantics to implementation, rather than the other way around. Some of the semantic questions surround the interaction of transactions with exceptions, the nesting of transactions, and the distinction between weak and strong atomicity [1]. Dan convinced me that when an exception escapes an atomic block, you should not roll back the transaction. One good reason for this (among many) is that it preserves "serial elision"[2]: if you erase the atomic keywords, you get a sequential program that behaves the same as the original program would behave in a sequential context.
Strong and weak atomicity are distinguished by how you treat reads and writes that are not protected by an atomic block. An interesting tidbit is that Haskell's STM system moots the distinction by cleanly separating transactional memory from non-transactional memory (they have different types). This means that the low-level implementation can provide only weak atomicity, but at the language level there is no risk of other threads altering transactional memory.
Dan's thesis is that if you provide explicit threads and mutable shared memory, *then* transactions are a great solution to the problems that arise—but that it's not clear whether threads with shared memory are the best model of concurrency.
To contrast, we've had two presentations on alternate approaches to concurrency. Charles Leiserson presented Cilk, a conservative, faithful extension of C that offers declarative parallelism. Designed for big parallel algorithms (think FFT, matrix multiplication, etc.), this language allows you to tag subexpressions for evaluation in separate threads—the principal mode of communication is simply the return value of the subexpression. This model removes the chance of race conditions and deadlocks (Although the full power of C is still available so you can still shoot yourself in the foot). The language design seems reasonably elegant (just a few keywords are added) and it has the property they call "serial elision"; a drawback is that return values need to be used in specific ways (e.g., assigning directly to a local variable) and there are ways to use it unsafely (e.g., trying to use a spawned return value before the spawned thread has actually returned).
Leiserson also gave some very simple and interesting ways to analyze the parallelism of an algorithm, which gives you a good guideline to how much speedup you can expect as you add more processors. Essentially, you need to add up the total amount of work done as well as the length of the critical path (the longest dependency chain) and look at the ratio. I hope to post more about this another time.
Matthew Flatt of PLT Scheme fame (currently at University of Utah) gave a really neat demo of the Concurrent ML primitives in Scheme, in realtime: he built up a program iteratively, running it each time, while we watched. This worked surprisingly well. At times, it was easy to get confused, but asking lots of questions was a strategy that allowed me to grasp the ideas. The concurrency primitives are a lot like pi-calculus, in that they allow synchronous rendezvous and a "choice" combinator. This sublanguage is pure in the way that Erlang is: locks are not needed. Of course, the synchrony means the primitives are hard to implement (efficiently and reliably) in a distributed setting.
Michael Hicks presented "futures," a language feature first implemented in MULTILISP; futures permit declarative concurrency like Cilk, but the language handles some of the necessary mechanics: futures automatically collect or wait on result values when they're needed, whereas Cilk requires the programmer to explicitly wait for the results (and dies if this is done incorrectly!).
[1] Weak atomicity requires only that transactions are atomic with respect to one another; strong atomicity requires that they be atomic even with respect to code to non-atomic blocks. The latter is, of course, much harder to implement.
[2] So-called by the Cilk people.
My only beef is that Hayes applies the outworn, yet still circulating, classification scheme which distinguishes four types of languages: imperative, functional, object-oriented, and logic. These qualities are more like independent axes for measuring programming languages, rather than discrete categories. Most object-oriented languages are imperative, but some are functional (e.g. OCaml, CLOS and, I'd argue, Haskell). Logic programming has been screwed into Scheme (viz. Kanren in Scheme, probably others). Some functional languages have imperative features (Common Lisp, SML), while some "imperative" languages have functional features (viz. higher-order primitives (map, grep, filter) in Perl/Python/Ruby).
This categorization scheme should be abandoned and the communities entrenched in their positions should cross borders and see what the others have to offer. All the cool languages are doing it.
Final note: The article partly exemplifies Wadler's Law, in that a whole page is given over to the discussion of syntax: semicolons, comments, and identifier conventions.
#haskell:
• vincenz btw notes that #haskell is one of the greatest channels around, you ask questions in other channels and if you're not 100% accurate, they shoot you down instead of guessing what you really meant
I'll be pretty interested to see how all this pans out.
Most of his objections don't hold for Links, though. Let me go through them one by one.
Griffiths says that code with web continuations will simply stop executing at some points, when the user walks away, and that will cause (a) debugging confusion and (b) problems with resource management. In Links, the latter is not a problem, since we store all the state of a continuation on the client.
The confusion that might result from having functions that just silently quit in the middle may indeed cause trouble. Griffiths offers that at least it'ss predictable where this will occur; but not necessarily. I may write a function one day that always completes, and I'd be happy with that. But then some function that I call may change so that it serves a page and doesn't return until the user does something. My own function will now have the same problem, and I may not understand why. This is an issue, but it may be possible to add some language feature that forces programmers to declare this kind or behavior, thus being aware of the problem and catching it at compile time.
I don't know what Thread Affinity is, but it sounds like a problem that crops up in Java specifically. I know there's a lot of voodoo around how to treat threads in Java.
In Links we haven't decided exactly how to present threads on the server, but we don't expect any voodoo. Links shoud be purely functional except with respect to IO (files and databases will be mutable, of course). As a result, the identity of the thread that runs a piece of code shouldn't matter.
Griffiths worries that web requests can generally come in to any machine in a farm, but the freezing of continuations might not be so fluid. In Links, they are fluid: continuations are serialized completely to the client, so when the request comes in to a farm, it doesn't matter what machine it's assigned to.
The issue here is that user's behavior is not essentially linear, and thus a control construct that assumes a linear path for the user would be inappropriate. The particular problems are that: (a) a user can go back, causing code to be executed any number of times, and (b) a user can branch in multiple directions.
In Java, of course, this is a major worry, since Java code tends to be quite dependent on mutable objects. In that context, it could be really hard to figure out why x = 0; y = f(x); x = x + 5 ended up with, say, 15 in x. Links greatly mitigates this problem by encouraging a functional style, so that the only possible value for x after that sequence would be 5. Links libraries will be written in a pure-functional style, so you won't be required to run into the above problems just by using basic libraries.
On the other hand, he has a point: user behavior on the web is not fundamentally linear, so why do we propose a linear control construct? This is a very good question, and it's one that we don't have a great answer to, yet.
My own thinking is that a linear page flow is occassionally the right abstraction, and it would be very pleasant if we could give you a super-simple way to write such a page flow. You'll never have a page that has only one way out—you'll always have a Cancel button or a Help link or something, so we've got to figure out the right way to handle the edge cases. There's research to be done, for sure.
To summarize, Griffiths makes some good points about problems with web continuations in Java and in general. With Links, we're creating a new language, which we hope will prove that certain language features make these problems much much easier. The Java community won't be able to go right out and implement these features, but it might influence future decisions, and perhaps whatever comes after Java will learn from these problems and from our solutions.