
100s of Results

The most exciting thing I've come across lately is the survey paper "Hundreds of Impossibility Results for Distributed Computing" by Faith Fich and Eric Ruppert. It shows what an exciting field distributed computing is: so many strong results covering a small number of models.

The paper is a good introduction to the field in general, giving a description of some of the key problems (consensus, broadcast, leader election, resource allocation), the important models of distributed computation (synchronous/asynchronous, message-passing/shared-memory, faulty/non-faulty), and also a concise presentation of some of the clever proof techniques that are used to establish impossibility results.

Researchers in this area have a fairly clear task in front of them: present algorithms that solve one of the problems on one of the models—or present a proof that there is no algorithm for that problem and model. Any such result is an exciting, solid advancement: either something is possible, and we've learned how to do it, or it is not possible and one must strengthen the underlying model in order to achieve it.

Of course, there is no shortage of generality in the field—it's not all about individual algorithms. There is an appealing and powerful result known as Herlihy's hierarchy, or the consensus hierarchy, which states that primitives in the shared-memory model can be characterized by the number of processes for which they can be used to solve (wait-free) consensus. Some primitives, such as read/write registers, cannot solve it for more than one process. Others, such as test-and-set, can be used to build consensus algorithms for two processes, but no more. Certain primitives, such as compare-and-swap, are universal: they can solve consensus for any number of processes. What's more, consensus itself is universal: any other shared object can be implemented on top of a system that solves consensus. Characterising the primitives in this way has clear implications for building real systems.
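To make the hierarchy concrete, here is a sketch (my own illustration, not from the Fich–Ruppert paper) of the classic two-process wait-free consensus protocol built from a single test-and-set object—the construction that places test-and-set at level two of the hierarchy. The class names are mine; a lock stands in for the hardware atomicity of a real test-and-set instruction.

```python
import threading

class TestAndSet:
    """Atomic test-and-set: returns the old value and sets the flag to True.
    The lock merely simulates the atomicity a real TAS instruction provides."""
    def __init__(self):
        self._flag = False
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old = self._flag
            self._flag = True
            return old

class TwoProcessConsensus:
    """Wait-free consensus for two processes (ids 0 and 1) from one TAS object."""
    def __init__(self):
        self._tas = TestAndSet()
        self._proposals = [None, None]  # one announcement slot per process id

    def decide(self, pid, value):
        self._proposals[pid] = value       # announce my proposal first
        if not self._tas.test_and_set():   # first TAS wins the race
            return value                   # winner decides its own value
        return self._proposals[1 - pid]    # loser adopts the winner's proposal

# Two threads propose different values; both must decide the same one.
c = TwoProcessConsensus()
results = {}
t0 = threading.Thread(target=lambda: results.update({0: c.decide(0, "a")}))
t1 = threading.Thread(target=lambda: results.update({1: c.decide(1, "b")}))
t0.start(); t1.start(); t0.join(); t1.join()
```

The key ordering argument: a process writes its proposal *before* its test-and-set, so by the time the loser's test-and-set returns True, the winner's proposal is already visible. The FLP-style impossibility results show that no such protocol exists if all you have is read/write registers.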

Contrast this with the theory of "process calculus." Those who know me know that I am not at all fond of this vein of research—I believe its merits are unproven after decades of research. The field of process calculus has hundreds of theorems, no doubt—but these theorems are all about the bisimilarity of one process calculus with another, or about when two individual systems can be considered equivalent. A result of Milner's shows that any CCS formula can be factored uniquely into a parallel composition of non-concurrent processes. But this is a statement about the process calculus itself, rather than about the systems that it could model. The field seems to spend all its time proving that its formalisms are consistent or not entirely unrealistic, rather than cutting to the chase and proving interesting statements about computational phenomena.

Proofs of the kind summarized in Fich and Ruppert's paper must surely be considered theory, rather than systems research: they are strong mathematical arguments about a broad class of phenomena. But the proofs in "distributed computing" are eminently useful, whereas those in "process calculus" are largely abstracted to the point of dubious utility.

If anyone reading this knows more about process calculus/process algebra, and wants to challenge me on this, please do. I've not read the process algebra literature as deeply as I've read the distributed computing literature. I'd love to be convinced that I'm wrong.


Disclaimer: I don't have an answer for your question. I have similar misgivings about the research on process calculi.

But I'm not sure the argument demonstrates that process calculi are to blame. It sort of sounds like the "Java's type system sucks so untyped languages are better than typed languages" argument. If the research on process calculi is too theoretical, that doesn't necessarily mean process calculi couldn't be useful.

My intuition about process calculi such as pi calculus, and why I find them attractive, is that they are built around the notion of abstraction in programming. You can combine processes modularly and you can wrap up compound processes into seemingly atomic processes whose internal transitions are unobservable. This is exactly the idea of building abstractions: we build larger code from smaller code but at a higher level of abstraction we think of it as a black box. This is the sanest and most scalable approach to building software.

(It was the pi calculus that led me to consider message-passing concurrency over shared-memory concurrency, for much the same reason I prefer functional programming to imperative programming.)

Of course, none of this answers your question. Just because the pi calculus kinda resembles Erlang doesn't make it as useful as Erlang! Your question is: what useful things can you prove with process calculi? I don't know. Maybe they're still too young, or maybe they're old enough to start raising eyebrows.

Thanks for the thought-provoking post.

Hi Dave,

Thanks for dropping by, and sorry for the long delay. I was a little busy with something!

I didn't say that the research on process calculus is too theoretical. I said that it hasn't got any applications. I would also like to say that CCS and pi-calculus don't have the right to call themselves theories of concurrent computing.

The crux of my argument is that pi-calculus has these things which are *called* processes, but which are not in fact processes as I understand them from doing concurrent programming. For example, in pi-calculus, in order to simulate lambda-calculus (and in order to do simple computations), you have to create new processes. This already begins to break the similarity between pi-calculus "processes" and the threads of control in a concurrent system. How could you show, for example, that only a certain "process" can access a certain resource? The identity of the real process is not clear in the pi-calculus model. When single processes are modeled using multiple processes, the model has broken from the thing it is modeling.

Just to clarify that it's not the *level* of theory that is the matter, let me say this: Complexity theory can be quite theoretical, but has a relationship with what can and cannot be done on various kinds of computers. Pi-calculus doesn't have a similar relationship with concurrent systems, in my view.

I've recently discovered more realistic process calculi; see my latest post for details.
