7:00 PM, Thursday, 18 March 2010
MIT Room E51-315
Special Relativity and the Problem of Database Scalability
Jim Starkey
NimbusDB, Inc.
Conventional wisdom says that relational – or even transactional – databases can’t scale. Like most conventional wisdom, this opinion rests on an unstated assumption: that database consistency depends on transaction serializability. Serializability is a property of databases implying that transactions leave the database in the same state whether they run concurrently or serially in some order. Serializability is analogous to the Newtonian assumption of a universal reference frame – that each transaction moves the database from one fixed state to another. At any given point in time, there is a single, fixed database state. Serializability is a sufficient condition for database consistency. But is it a necessary condition? And if not, what are the ramifications?
An alternative is an Einsteinian universe where transactions each see a database state consistent with results of other transactions observed as committed when the transaction started, but ignore changes made by transactions not yet reported as committed. As long as a transaction cannot update a row that it cannot see and formal database constraints are enforced, the database remains consistent. Because the database is observable only through the reference frame of a transaction and each transaction may have a different reference frame, the database itself does not and, unless quiesced, cannot have a single fixed state.
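The visibility rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not NimbusDB's implementation: the names `Database`, `Transaction`, `Version`, and `WriteConflict` are hypothetical, everything lives in one process, rollback is not modeled, and a writer that meets an invisible newer version fails immediately rather than waiting for the other transaction to finish.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


class WriteConflict(Exception):
    """Raised when a transaction tries to update a row version it cannot see."""


@dataclass
class Version:
    value: str
    writer: int  # id of the transaction that created this version


class Database:
    """Hypothetical in-memory store keeping every version of every row."""

    def __init__(self) -> None:
        self.next_id = 1
        self.committed: Set[int] = set()
        self.versions: Dict[str, List[Version]] = {}

    def begin(self) -> "Transaction":
        # The reference frame: the set of transactions observed as
        # committed at the moment this transaction starts.
        txn = Transaction(self, self.next_id, frozenset(self.committed))
        self.next_id += 1
        return txn


class Transaction:
    def __init__(self, db: Database, txn_id: int, snapshot: FrozenSet[int]) -> None:
        self.db = db
        self.id = txn_id
        self.snapshot = snapshot

    def _visible(self, version: Version) -> bool:
        # A version is visible if this transaction wrote it, or if its
        # writer was already committed when this transaction started.
        return version.writer == self.id or version.writer in self.snapshot

    def read(self, key: str) -> Optional[str]:
        # Walk from newest to oldest and return the first visible version.
        for version in reversed(self.db.versions.get(key, [])):
            if self._visible(version):
                return version.value
        return None

    def write(self, key: str, value: str) -> None:
        chain = self.db.versions.setdefault(key, [])
        # Refuse to update a row whose newest version is invisible here.
        if chain and not self._visible(chain[-1]):
            raise WriteConflict(f"transaction {self.id} cannot see the latest '{key}'")
        chain.append(Version(value, self.id))

    def commit(self) -> None:
        self.db.committed.add(self.id)
```

In this sketch, two transactions started at different times can read different values for the same row, and neither is wrong: each reads within its own reference frame. The store never needs to agree on one global state; it only needs to detect the case where a transaction attempts to overwrite a version it could not see.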
Ultimately, consistency requires communication. A relativistic database, however, has a lower – and less demanding – communication requirement to detect inconsistency than a Newtonian database that must prevent inconsistency.
NimbusDB is a relativistic, transactional, relational database that scales elastically to many servers. Architecturally, NimbusDB is a near-total break from a 30-year legacy of database implementation, separating persistent storage from processing nodes, eliminating fixed assignments of data to nodes, and replacing disk files with a collection of distributed objects.
The talk will discuss how a shift in thinking about database concurrency enables a radical change in database architecture and the resulting evolution of NimbusDB.
Jim Starkey’s career spans database history from the Datacomputer on the fledgling ARPAnet to his most recent startup, NimbusDB, Inc. His innovations range from the date data type to the blob to multi-version concurrency control. At DEC, Starkey created the Datatrieve family of products, the DEC Standard Relational Interface, and Rdb/ELN, and designed the software architecture for DEC’s database machine.
After DEC, Starkey founded Interbase Software, which produced the first commercial implementations of heterogeneous networking, blobs, triggers, two-phase commit, database events, etc. After InterBase, he developed a graphical development environment for database applications. In 2000, Starkey started Netfrastructure, a platform for Web applications including a relational database, integrated search, a Java virtual machine, and a context-sensitive page generator – unfortunately, a little ahead of its time.
NimbusDB, Inc., founded by Starkey in summer 2008, is a re-start of relational database technology, targeted at an elastic cloud of computers. The premises of NimbusDB are that a database system should become more capable by plugging in more computers, that a database system need never go down, and that recovery from human error is just as important as recovery from hardware and software failure.
This joint meeting of the Boston Chapter of the IEEE Computer Society and GBC/ACM will be held in MIT Room E51-315. E51 is the Tang Center on the corner of Wadsworth and Amherst Sts. and Memorial Dr.; it's mostly used by the Sloan School. You can find it on the MIT campus map. Room 315 is on the 3rd floor.
Up-to-date information about this and other talks is available online at https://ewh.ieee.org/r1/boston/computer/. You can sign up to receive updated status information about this talk and informational emails about future talks at https://mailman.mit.edu/mailman/listinfo/ieee-cs, our self-administered mailing list.
For more information, contact Peter Mager (p.mager AT computer.org).
Updated: Jan 26, 2010.