Sunday, August 5, 2007

Building distributed applications using Sinfonia

Authors: Marcos K. Aguilera, Christos Karamanolis, Arif Merchant, Mehul Shah, and Alistair Veitch

Paper: http://www.hpl.hp.com/techreports/2006/HPL-2006-147.pdf

So...this paper presents a system called Sinfonia that is designed to be infrastructure for building distributed systems. Sinfonia provides three features:
  1. a global address space, sort of -- addresses are tuples of the form (node ID, address), and the values at those addresses can be read and written
  2. "minitransactions" -- transactions restricted to be short-lived, consisting of distinct compare, read, and write phases (in that order), and only able to manipulate values in the global address space (see the sketch after this list)
  3. notifications -- a node may request to be notified when changes occur in the store within a particular range of addresses
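
To make the minitransaction structure concrete, here's a rough sketch of what the client-side API might look like. To be clear, this is my own illustration in Go, not the paper's actual interface; all the names here (Minitransaction, AddCompare, Exec, and so on) are hypothetical.

```go
package main

import "fmt"

// Addr identifies a word in Sinfonia's global address space:
// a (memory node ID, offset) pair.
type Addr struct {
	Node   int
	Offset int
}

// Item pairs an address with a byte value, for compare and write sets.
type Item struct {
	Addr Addr
	Data []byte
}

// Minitransaction collects compare, read, and write items. Everything is
// declared up front; Exec ships the whole thing to the memory nodes at once.
type Minitransaction struct {
	Compares []Item // commit only if these locations hold these values
	Reads    []Addr // locations to read back if the compares pass
	Writes   []Item // locations to update atomically on commit
}

func (t *Minitransaction) AddCompare(a Addr, want []byte) { t.Compares = append(t.Compares, Item{a, want}) }
func (t *Minitransaction) AddRead(a Addr)                 { t.Reads = append(t.Reads, a) }
func (t *Minitransaction) AddWrite(a Addr, data []byte)   { t.Writes = append(t.Writes, Item{a, data}) }

// Exec would run the commit protocol against the memory nodes. Stubbed
// here; it returns the read results and whether the transaction committed.
func (t *Minitransaction) Exec() (map[Addr][]byte, bool) {
	// ... network protocol elided ...
	return nil, false
}

func main() {
	// Example: atomic test-and-set spanning two nodes -- the writes apply
	// only if the lock word on node 0 still reads as "free" (zero).
	var t Minitransaction
	t.AddCompare(Addr{Node: 0, Offset: 64}, []byte{0})
	t.AddWrite(Addr{Node: 0, Offset: 64}, []byte{1})
	t.AddWrite(Addr{Node: 1, Offset: 128}, []byte("payload"))
	_, ok := t.Exec()
	fmt.Println("committed:", ok)
}
```
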
I'm not familiar enough with the distributed theory literature to be able to tell how interesting or novel their protocol is; my suspicion is that it's neither. More specifically, as most systems papers do, this one describes the protocol in a rather ad-hoc manner that makes it difficult to get an overall feel for what is actually happening in the system. They spend a lot of time detailing a variant of two-phase commit in which, oddly, the failure of a coordinator can be tolerated (i.e., the system won't block) but the failure of a participant can't. The rationale for this design decision is that the coordinator ends up being a client of the system (i.e., an application server), whereas the participants are actual Sinfonia nodes. It seems a bit odd to me that they designed a protocol that essentially tolerates no failures among the nodes that actually hold the data. This is supposedly not an issue because Sinfonia nodes have hot backups: decisions within a minitransaction made by the primary are forwarded to the backup, which can take over if the primary dies. I suppose I can buy this as a design decision if transactions are small and short-lived, but I don't really see why it's interesting.
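
To make that asymmetry concrete, here's roughly how I understand the commit logic; again, this is my own Go sketch with made-up names, not the paper's protocol verbatim. The key point is that the coordinator keeps no durable state: the outcome is purely a function of the participants' votes, which the participants themselves log, so a recovery process can re-derive the decision by polling participants after a coordinator crash. A crashed participant, by contrast, stalls everything until its backup steps in.

```go
package sinfonia

// Vote is a participant's phase-1 response: it locks the minitransaction's
// addresses, evaluates its compare items, and durably logs the vote.
type Vote int

const (
	VoteCommit Vote = iota
	VoteAbort
)

// Participant is whatever RPC surface a memory node exposes (hypothetical).
type Participant interface {
	Prepare(txid int64) (Vote, error) // lock, check compares, log vote
	Commit(txid int64)                // apply writes, release locks
	Abort(txid int64)                 // discard writes, release locks
}

// runCommit is the coordinator (an application node) driving one
// minitransaction. Note the asymmetry: the coordinator logs nothing, so
// its crash is harmless -- recovery can poll the participants' logged
// votes and finish the protocol. But if a participant never answers
// Prepare, we're stuck waiting until that node's hot backup takes over.
func runCommit(txid int64, parts []Participant) bool {
	// Phase 1: collect votes. Any abort vote (or error) aborts everything.
	for _, p := range parts {
		v, err := p.Prepare(txid)
		if err != nil || v == VoteAbort {
			for _, q := range parts {
				q.Abort(txid)
			}
			return false
		}
	}
	// Phase 2: all voted commit, so the outcome is commit. Because each
	// participant logged its vote, this decision is recoverable even if
	// the coordinator dies right here.
	for _, p := range parts {
		p.Commit(txid)
	}
	return true
}
```
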

Also, as a side note, I'm always a bit dubious of the concept of backup servers. Are you really better off drawing a distinction between primary and backup servers, as opposed to integrating all the machines you _would_ have employed as backups into the system itself? That is, are you better off with n primaries and n backups, or with a single system of 2n peer nodes? It seems like the latter would give you greater flexibility, in general, in allocating your resources. I suppose primary and backup are simply logical roles, and any given physical server could play multiple roles at once, but if you do that you might find yourself unintentionally compromising your failure-correlation assumptions (e.g., several of your logical backups land on a single physical machine, which then crashes, thus screwing you).

Anyway, moving on...the notifications don't seem that interesting except insofar as they interact with the minitransactions, and the paper doesn't really talk about that interaction. As far as I can tell, the semantics of the notifications are very weak. If that's the case, does employing them alongside the minitransactions undermine the supposedly strong semantics the minitransactions are meant to give you? I suppose if you treat notifications as purely advisory then you should be okay, but isn't that ultimately the same as programming against a system with exactly the weak semantics you were trying to avoid?
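
If notifications really are advisory, the safe usage pattern is presumably something like the following; this is again my own sketch with an invented API (watch, readValidated), not anything the paper specifies. The notification only tells you *when* to look; a minitransaction read is what actually establishes a consistent view.

```go
package sinfonia

// Notification says "something in [Start, End) on this node may have
// changed"; it carries no guarantee about the current value.
type Notification struct {
	Node       int
	Start, End int
}

// watch and readValidated are hypothetical stand-ins for Sinfonia's
// notification registration and for a compare-validated minitransaction
// read. Bodies elided.
func watch(node, start, end int) <-chan Notification { /* ... */ return nil }
func readValidated(node, offset int) ([]byte, bool)  { /* ... */ return nil, false }

// pollOnNotify treats notifications as purely advisory: each one is a
// hint to go re-read the data through a minitransaction rather than a
// fact to act on directly.
func pollOnNotify(node, start, end int, apply func([]byte)) {
	for range watch(node, start, end) {
		// Re-read via a minitransaction instead of trusting any payload
		// the notification might carry; skip if the read raced with a
		// concurrent update and failed validation.
		if data, ok := readValidated(node, start); ok {
			apply(data)
		}
	}
}
```
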

Regardless, I suspect their system is very practical and works just fine. I just don't really see a new contribution here. The contribution would come in the form of easing application development and making the resulting end-systems more robust, but without a real-world deployment to point to, that point is basically lost.