Marked As Pertinent

rails, design, art, business by BJ Clark

NoSQL: If Only It Was That Easy

The biggest thing in web apps since “rails can’t scale” is this idea that “your rdbms doesn’t scale.” This has gone so far as to be dubbed the coming of age for “nosql” with lots of blog posts and even a meetup. Indeed, there are many promising key-value stores, distributed key-value stores, document oriented dbs, and column oriented db projects on the radar. This is *definitely* a great thing for the web application scene and this level of variety will definitely open doors for organizations large and small in the near and long term.

However, along with these great tools, an attitude that “the rdbms is dead” has popped up, and while that may be true in the long run, in the short term, it’s definitely premature.

What is scaling?

First, lets get a couple things straight:

to scale (third-person singular simple present scalespresent participle scalingsimple past and past participle scaled)

  1. (transitive) To change the size of, maintaining proportion.
    We should scale that up by a factor of 10.
  2. (transitive) To climb.
    Hilary and Norgay were the first known to have scaled Everest.
  3. (intransitive) (computing) To tolerate significant increases in throughput or other potentially limiting factors.
    That architecture won’t scale to real-world environments.

The first thing we need to agree on is what it means to scale. According to our definitions above, it’s to change the size while maintaining proportions and in CS this usually means to increase throughput. What scaling isn’t: performance.

performance (plural performances)
The act of performing; carrying into execution or action; execution; achievement; accomplishment; representation by action; as, the performance of an undertaking of a duty.
In Computer science: The amount of useful work accomplished by a computer system compared to the time and resources used. Better Performance means more work accomplished in shorter time and/or using less resources

performance (plural performances)

  1. The act of performing; carrying into execution or action; execution; achievementaccomplishment; representation by action; as, the performance of an undertaking of a duty.
  2. In Computer science: The amount of useful work accomplished by a computer system compared to the time and resources used. Better Performance means more work accomplished in shorter time and/or using less resources

In reality, scaling doesn’t have anything to do with being fast. It has only to do with size. If your request takes 12 seconds, it doesn’t matter, it only matters that you can do 1 per second, 10 per second, 100 per second, 1000 per second, etc. of that 12 second query.

Now, scaling and performance do relate in that typically, if something is performant, it may not actually need to scale. Your upper limit may be high enough you don’t even need to worry about scaling. The problem with rails is not that it doesn’t scale (it happens to scale pretty easily), it’s that you have to scale it almost immediately. The problem with RDBMS isn’t that they don’t scale, it’s that they are incredibly hard to scale. Ssharding is the most obvious way to scale things, and sharding multiple tables which can be accessed by any column pretty quickly gets insane. Furthermore, you might be able to use something other than a RDBMS that you won’t need to scale because it’s more performant or efficient at doing the work you’re currently doing in a RDBMS.

So NoSQL then. . .

At my previous job, my co-workers and I evaluated most of the current nosql solutionsto varying degrees. All of the projects have been evaluated for both use as simple “tables” of data, such as storing a single type of key/value data, as well as a document db to be used as our primary data store. They have a curious data set which consists of a single set of parent objects with many child objects that relate 1-1 or 1-n with this set of objects (I call these primary objects). We then have a secondary set of objects that store changes made to both the primary objects that we primarily use for auditing. Our current db setup is standard master-slave replication in mysql with 1 master and up to 3 slaves depending on usage. The primary objects are mostly changed via UPDATES and the secondary objects are all inserted at the end of the tables. We also have a few random other data sets which loosely relate to the primary parent objects.

To get a couple out of the way, I’m not going to cover memcached (because it’s not a db), memcachedb (general sentiment that it is immature), couchdb (because we didn’t want to use map/reduce to pull information and there are questions about it’s performance and replication), dynomite (seen as immature),  Amazon SimpleDB (because of size limits), or Lightcloud (seen as immature). As far as the ones that we I deemed immature, I’m sure there are people out there using these things and having a great time, but our research into them, and word of mouth from others who have tried them kept us from really going deep.

Tokyo *

url: http://tokyocabinet.sourceforge.net/
type: Key/Value store with Full Text Search*
Conclusion: Doesn’t scale.

We liked Tokyo Tyrant so much, we put it in production. In fact, every request to AboutUs.org hits Tokyo. One of the uses is as a persistent memcached replacement for caching 10 million+ wiki pages (as a json document of all the pieces of our page, which comes out to around 51gb(edited) of data), and it works great. It runs on a single server, it serves up a single type of data, very quickly, and has been a pleasure to use. We keep other ancillary data sets on some other servers too, and it’s great for this. Tokyo Tyrant is a great example of very performant software, but it doesn’t scale. If you’d like to make it scale, it’s not very hard, you scale it exactly like Memcached (by some sort of application side hashing of keys). You can have as many servers as you’d like, but you can’t easily add servers to a cluster (increase in size while maintaining proportion) and therefore, you can’t tolerate significant increases in throughput. The good news it that here “significant”  is relatively massive, and you probably won’t need to scale it any time soon.

We tried to insert 160mil 2k -20k documents into a single Tokyo Tyrant server, and performance quickly dropped off and kept going down. You could have had a nice holiday skiing on the graph of inserts/sec. This is pretty typical of anything that you write to a disk. The more you write, the slower it goes. We could distribute the write easily, because Tokyo doesn’t scale.

Tokyo does support replication, and a few other great things, but these don’t make for scaling.

* We don’t use the full text search, so I can’t comment there.

Redis

url: http://code.google.com/p/redis/
type: Key/Value store with collections and counters
Conclusion: Doesn’t scale.

Redis is also awesome like Tokyo. I would say the two are pretty comparable as simple k/v stores. The counters and collections are AWESOME, and if I was still at AboutUs, I think I’d be pushing to move a couple pieces of the infrastructure to Redis. I have less to say about Redis because I haven’t used it in production, but it looks great if it fits your bill. Does it scale? No. It, just as memecached and tokyo tyrant, can do sharding by handling it in the client, and therefore, you can’t just start adding new servers and increase your throughput. Nor is it fault tolerant. Your redis server dies, and there goes that data. And just as tokyo tyrant and memcached, you probably won’t ever need to try to scale it. Redis also supports replication.

Project Voldemort

url: http://project-voldemort.com/
type: Distributed Key/Value store
Conclusion: Scales!

Voldemort is a very cool project that comes out of LinkedIn.  They seem to even be providing a full time guy doing development and support via a mailing list. Kudos to them, because Voldemort, as far as I can tell, is great. Best of all, it scales. You can add servers to the cluster, you don’t do any client side hashing, throughput is increased as the size of the cluster increases. As far as I can tell, you can handle any increase in requests by adding servers as well as those servers being fault tolerant, so a dead server doesn’t bring down the cluster.

Voldemort does have a downside for me, because I primarily use ruby and the provided client is written in java, so you either have to use JRuby (which is awesome but not always realistic) or Facebook Thrift to interact with Voldemort. This means thrift has to be compiled on all of your machines, and since Thrift uses Boost C++ library, and Boost C++ library is both slow and painful to compile, deployment of Voldemort apps is increased significantly.

Voldemort is also intersting because it has pluggable data storage backend and the bulk of it is mostly for the sharding and fault tolerance and less about data storage. Voldemort might actually be a good layer on top of Redis or Tokyo Cabinet some day.

Voldemort, it should be noted, is also only going to be worth using if you actually need to spread your data out over a cluster of servers. If your data fits on a single server in Tokyo Tyrant, you are not going to gain anything by using Voldemort. Voldemort however, might be seen as a good migration path from Tokyo * when you do hit that wall were performance isn’t enough.

MongoDB

url: http://www.mongodb.org
type: Document Database
Conclusion: Doesn’t scale (yet!)

MongoDB is not a key/value store, it’s quite a bit more. It’s definitely not a RDBMS either. I haven’t used MongoDB in production, but I have used it a little building a test app and it is a very cool piece of kit. It seems to be very performant and either has, or will have soon, fault tolerance and auto-sharding (aka it will scale). I think Mongo might be the closest thing to a RDBMS replacement that I’ve seen so far. It won’t work for all data sets and access patterns, but it’s built for your typical CRUD stuff. Storing what is essentially a huge hash, and being able to select on any of those keys, is what most people use a relational database for. If your DB is 3NF and you don’t do any joins (you’re just selecting a bunch of tables and putting all the objects together, AKA what most people do in a web app), MongoDB would probably kick ass for you.

Oh, and did I mention that, of all the NoSQL options out there, MongoDB is the one of the only ones being developed as a business with commercial support available? If you’re dealing with lots of other people’s data, and have a business built on the data in your DB, this isn’t trivial.

On a side note, if you use Ruby, check out MongoMapper for very easy and nice to use ruby access.

Cassandra

url: http://incubator.apache.org/cassandra/
type: Column Database
Conclusion: Probably scales

Cassandra is another very promising project that I wouldn’t use yet. Cassandra came out of Facebook and seems to be in use there powering search in your inbox. It’s described as a distributed key/value store, where values can be collections of other key/values (called column families). It is definitely supposed to scale, and probably does at Facebook, by simply adding another machine (they will hook up with each other using a gossip protocol), but the OSS version doesn’t seem to support some key things, like loosing a machine all together. They are also in the midst of changing how the basic datastructures are stored on disk, and I don’t know that I’d trust my data to this sexy db until those things are worked out, which should be soon.

Cassandra also seems like a contender for a primary database or RDBMS replacement, as soon as it matures. The scaling possibilities are very attractive, and complex data structures shouldn’t be hard to model in it. I’m not going to go any deeper on cassandra because Evan Weaver did a great job of it here, but I will say that Cassandra is very promising and we were (when I left) looking at it very closely at AboutUs.org.

Amazon S3

url: http://aws.amazon.com/s3/
type: key/value store
Conclusion: Scales amazingly well

You’re probably all like “What?!?”. But guess what, S3 is a killer key/value store. It is not as performant as any of the other options, but it scales *insanely* well. It scales so well, you don’t do anything. You just keep sticking shit in it, and it keeps pumping it out. Sometimes it’s faster than other times, but most of the time it’s fast enough. In fact, it’s faster than hitting mysql with 10 queries (for us). S3 is my favorite k/v data store of any out there.

MySQL

url: http://www.mysql.com
type: RDBMS
Conclusion: Doesn’t Scale

Now you are probably like “Dude, what?!? You got some SQL in this NoSQL article”. I’ve got news for you guys, mysql is a pretty bad ass key/value store. It can do everything that Tokyo and Redis can do, and it really isn’t that much slower. In fact, for some data sets, I’ve seen MySQL perform ALOT faster than Tokyo Tyrant (I’ll post my findings in a follow up). For most applications (and say, FriendFeed), MySQL is plenty fast and it’s familiar and ubiquitous. I’m sure the NoSQL guys reading this will all be saying “Yeah, but we are dealing with more data than MySQL can handle”. Well, you might be dealing with more data than mysql used as a RDBMS might be able to handle, but it’s just as easy or easier to shard MySQL as it is Tokyo or Redis, and it’s hard to argue that they can win on many other points.

Conclusion

So, does RDBMS scale? I would say the answer is: not any worse than lots of other things. Most of what doesn’t scale in a RDBMS is stuff people don’t use that often anyway. And does NoSQL scale: a couple solutions do, most don’t. You might even argue that it’s just as easy to scale mysql (with sharding via mysql proxy) as it is to shard some of these NoSQL dbs. And I think it’s a pretty far leap to declare the RDBMS dead.

The real thing to point out is that if you are being held back from making something super awesome because you can’t choose a database, you are doing it wrong. If you know mysql, just used it. Optimize when you actually need to. Use it like a k/v store, use it like a rdbms, but for god sake, build your killer app! None of this will matter to most apps. Facebook still uses MySQL, a lot. Wikipedia uses MySQL, a lot. FriendFeed uses MySQL, a lot. NoSQL is a great tool, but it’s certainly not going to be your competitive edge, it’s not going to make your app hot, and most of all, your users won’t give a shit about any of this.

What am I going to build my next app on? Probably Postgres. Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files. Maybe I’ll start hacking on Maglev. I’ll use whatever is best for the job. If I need reporting, I won’t be using any NoSQL. If I need caching, I’ll probably use Tokyo Tyrant. If I need ACIDity, I won’t use NoSQL. If I need a ton of counters, I’ll use Redis. If I need transactions, I’ll use Postgres. If I have a ton of a single type of documents, I’ll probably use Mongo. If I need to write 1 billion objects a day, I’d probably use Voldemort. If I need full text search, I’d probably use Solr. If I need full text search of volatile data, I’d probably use Sphinx.

If there’s anything to take away from the NoSQL debate, it’s just to be  happy there’s more tools, because more cool tools = more win for everyone.

About these ads

Written by BJ Clark

August 4, 2009 at 7:38 pm

68 Responses

Subscribe to comments with RSS.

  1. I just can’t let the FUD of boost go.

    1. Thrift only depends on a relatively old version of boost 1.33.1+, which is available in binary packages on most platforms that already have it installed. It takes less than a minute to install the package.

    2. Thrift only needs some headers of Boost library. NO compilation of entire boost libraries is required. A header only install takes less than a minute.

    vicaya

    August 10, 2009 at 2:23 pm

  2. >>If I need to write 1 billion objects a day, I’d probably use Voldemort.

    and

    >>Voldemort, it should be noted, is also only going to be worth using if you actually need to spread your data out over a cluster of servers.

    so why not S3 which “scales *insanely* well”?

    freelancer

    August 11, 2009 at 2:13 am

  3. The gist of what I got from this article is that some of the distributed DBs out now are “high-throughput, high-latency”. That’s pretty much true — for now. HBase is as fast at scans as an RDBMS now, and Streamy.com serves pages directly from it.

    The problem is that you can’t *do* anything with these databases besides high-latency calculations and low-latency gets/scans.

    I think that’s about to change with some new functionality being added to HBase that allows rapid aggregation — in real-time. Check it out:

    Bradford

    August 11, 2009 at 11:16 am

    • i think it varies by product. things coming from a map/reduce heritage will naturally be oriented towards throughput more than latency.

      but some solutions are good for low latency — e.g. i think mongodb is.

      dwight_mongodb

      August 11, 2009 at 11:18 am

  4. I am loving mongodb right now, it’s awesome.

    mongolove

    August 12, 2009 at 8:07 pm

  5. […] NoSQL: If Only It Was That Easy – A great review of a number of NoSQL databases […]

  6. Thanks for the great information. As I have been primarily studying Hadoop I am naturally curious as to why HBase and the rest of the Hadoop family were not considered.

    Keith Tomas

    August 17, 2009 at 6:10 am

  7. Very informative!

    There’s a Dr. Dobbs’ article that examines database scalability. It discusses NoSQL databases but also looks at tuple spaces, in-memory databases, XML, RDF and SQL databases.

    “Databases in the Cloud: Elysian Fields or Briar Patch?” is at

    http://www.ddj.com/database/218900502

    CinCy

    August 17, 2009 at 9:17 am


Comments are closed.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: