Monte Carlo-ing Your Eventual Consistency Bets

Jan 10, 2012   //   by Karen Lopez   //   Blog, Database

One of the features of not-only-SQL (NoSQL) data storage systems is the concept of eventual consistency (via Wikipedia):

Eventual Consistency… means that given a sufficiently long period of time over which no changes are sent, all updates can be expected to propagate eventually through the system and all the replicas will be consistent.

For those of us coming from a transactional system point of view, eventual consistency can be mind-boggling at first. Data presented in an inconsistent manner is usually seen as a data quality failure, something to be avoided. But in non-transactional systems that inconsistency can be worth the trade-off for speed and scalability. Think about your Facebook page for a minute: how bad would it be if one of your friends’ updates was not visible to you at the same time it was visible to someone else, but eventually you’d be able to see that update?

Paul Cannon has a great write-up on using tools to estimate your eventual consistency with Cassandra:

"The best part is that they also provided the world with an interactive demo, which lets you fiddle with N, R, and W, as well as parameters defining your system’s read and write latency distributions, and gives you a nice graph showing what you can expect in terms of consistent reads after a given time.

See the interactive demo here.

This terrific tool actually runs thousands of Monte Carlo simulations per data point (turns out the math to create a full, precise formulaic solution was too hairy) to give a very reliable approximation of consistency for a range of times after a write."
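To make the idea concrete, here is a minimal sketch of that kind of Monte Carlo estimate. It is not the demo's actual code. It assumes a simplified model in which every network hop has an independent, exponentially distributed latency, and the function name, parameters and default values are all hypothetical choices for illustration. N is the number of replicas, R the number of replies a read waits for, and W the number of acknowledgements a write waits for.

```python
import random

def simulate_consistency(n, r, w, t_ms, trials=10_000, mean_latency_ms=5.0, seed=0):
    """Estimate the chance that a read started t_ms after a write commits sees it.

    Assumption: each hop's latency is an independent exponential draw with the
    given mean. Real clusters will have different latency distributions.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_latency_ms
    consistent = 0
    for _ in range(trials):
        # Per-replica one-way delays: write request, write ack, read request, read reply.
        write_arrive = [rng.expovariate(rate) for _ in range(n)]
        write_ack = [rng.expovariate(rate) for _ in range(n)]
        read_arrive = [rng.expovariate(rate) for _ in range(n)]
        read_reply = [rng.expovariate(rate) for _ in range(n)]

        # The write commits once the coordinator has received w acknowledgements.
        ack_times = sorted(write_arrive[i] + write_ack[i] for i in range(n))
        commit = ack_times[w - 1]
        read_start = commit + t_ms

        # The read coordinator uses the first r replies it receives.
        by_reply_time = sorted(range(n), key=lambda i: read_arrive[i] + read_reply[i])
        first_r = by_reply_time[:r]

        # The read is consistent if any of those replicas already had the write
        # when the read request reached it.
        if any(write_arrive[i] <= read_start + read_arrive[i] for i in first_r):
            consistent += 1
    return consistent / trials
```

Calling `simulate_consistency(3, 1, 1, t_ms)` for a range of `t_ms` values gives the kind of curve the demo plots: the estimated probability of a consistent read as a function of time since the write. In this model, any setting with R + W greater than N always returns 1.0, because the replicas read must overlap the replicas that acknowledged the write.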

Being able to plan your architecture to best fit the business need is what is important, not necessarily data purity at the cost of speed or reliability. Again, that sounds weird to a profession that has focused on fighting to keep data integrity on the radar of management, but the best design decisions are made by balancing cost, benefit and risk. Those of us in the data world need to understand that eventually consistent is often the best solution. Even if it feels weird.

Having tools that help us understand how to best architect the trade-offs is the first step in delivering the right data consistency for what the business needs.
