I should also mention that Meteor's hiring software developers in Seattle. If this is interesting to you, shoot us an email at jobs@meteorsolutions.com.
This is a good way to look at it, I think. People often ask if I'd use CouchDB again if I were building another similar system - I would, but I'd restrict it to the subset of the problem best suited to CouchDB's strengths and quirks.
With regard to more advanced views: you'd be surprised how far you can get with the built-ins (_sum, _count, _stats); I've built a non-trivial data backend (on Cloudant, natch) using almost nothing but _sum reduces. Abusing the reduce with complex calculations doesn't seem to be worth it from either a disk space or a query performance standpoint.
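For anyone who hasn't used the built-ins, here is a rough local emulation in Python of what _sum, _count and _stats compute per key. It is only a sketch: the function names and sample rows are made up for illustration, it is not CouchDB's API, and it ignores rereduce entirely.

```python
from collections import defaultdict

def group_rows(rows):
    """Group (key, value) pairs the way a grouped view query would."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    return grouped

def builtin_sum(values):
    # Emulates _sum: total of the emitted values.
    return sum(values)

def builtin_count(values):
    # Emulates _count: number of emitted rows.
    return len(values)

def builtin_stats(values):
    # Emulates _stats: sum, count, min, max and sum of squares.
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sumsqr": sum(v * v for v in values),
    }

def reduce_by_key(rows, reducer):
    """Apply one reducer to each key's values."""
    return {key: reducer(vals) for key, vals in group_rows(rows).items()}

# Hypothetical rows, as a map function might emit them: (month, amount).
rows = [("2010-05", 3), ("2010-05", 4), ("2010-06", 10)]

print(reduce_by_key(rows, builtin_sum))
print(reduce_by_key(rows, builtin_count))
print(reduce_by_key(rows, builtin_stats))
```

A lot of reporting-style queries reduce to "emit a number, sum it per key", which is why _sum alone goes a long way.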
This reads like marketing-blogspam to me. These two projects solve totally different problems, and the author acknowledges it, yet he still goes on to highlight benchmarks where VoltDB is 10-20x faster than Cassandra. Of course it's going to be! It's an in-memory store!
So why benchmark against Cassandra? It's got a lot of buzz around it, of course. What better way to shove your name into the "NoSQL" ring? Blech.
According to their white paper, a traditional DBMS apparently spends 35% of its time on buffer management, 17% on logging, another 19% on latching and finally 21% on locking. This leaves only 7% for "useful work". In comparison, VoltDB has 95% capacity for useful work.
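A quick sanity check of those figures, using only the percentages as quoted in this comment (not taken from the white paper itself). The four overheads add up to 92%, which would leave 8% rather than 7%, so one of the quoted numbers is probably rounded or misremembered.

```python
# Percentages exactly as quoted in the comment above.
overheads = {
    "buffer management": 35,
    "logging": 17,
    "latching": 19,
    "locking": 21,
}

total_overhead = sum(overheads.values())
useful_work = 100 - total_overhead

print(f"overhead: {total_overhead}%, left for useful work: {useful_work}%")
```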
"Cassandra writes to disk. VoltDB is an in-memory database. So I gave both systems plenty of RAM to hold the data set and turned Cassandra's consistency settings pretty low."
So in memory in both cases. Cassandra has to write a log, but not synchronously.
I think Cassandra always writes to the commit log before returning success.
Sure, but the write itself is not the significant part (it is likely to be cached in memory); the question is how often the commit log is synced to disk. I'm no Cassandra expert, but I believe the default is to fsync() the commit log periodically while allowing operations to return successfully before an fsync() has occurred. There's also a mode that requires an fsync() before returning success for an operation.
"Cassandra's example configuration shows CommitLogSync set to periodic, meaning that we sync the commitlog every CommitLogSyncPeriodInMS ms, so you can potentially lose up to that much data in a crash ... You can also select "batch" mode, where Cassandra will guarantee that it syncs before acknowledging writes, i.e., fully durable mode"
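To make the difference between the two modes concrete, here is a toy append-only log in Python. It is a sketch of the idea only, not Cassandra's implementation: the class and attribute names are invented, and the periodic check happens on each append here, whereas a real server would more likely sync from a background thread.

```python
import os
import tempfile
import time

class ToyCommitLog:
    """Hypothetical append-only log with 'periodic' and 'batch' sync modes."""

    def __init__(self, path, mode="periodic", sync_period_ms=1000):
        if mode not in ("periodic", "batch"):
            raise ValueError("mode must be 'periodic' or 'batch'")
        self.mode = mode
        self.sync_period = sync_period_ms / 1000.0
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.last_sync = time.monotonic()
        self.unsynced = 0    # acknowledged writes not yet fsync'ed
        self.sync_count = 0  # how many fsync() calls we have made

    def _sync(self):
        os.fsync(self.fd)
        self.last_sync = time.monotonic()
        self.unsynced = 0
        self.sync_count += 1

    def append(self, record):
        """Write a record; return how many acked writes a crash could lose."""
        os.write(self.fd, record + b"\n")
        self.unsynced += 1
        if self.mode == "batch":
            # Fully durable: fsync before acknowledging the write.
            self._sync()
        elif time.monotonic() - self.last_sync >= self.sync_period:
            # Periodic: only fsync once the sync period has elapsed.
            self._sync()
        return self.unsynced

    def close(self):
        self._sync()
        os.close(self.fd)

with tempfile.TemporaryDirectory() as tmp:
    log = ToyCommitLog(os.path.join(tmp, "demo.log"), mode="periodic",
                       sync_period_ms=60_000)
    for i in range(3):
        at_risk = log.append(b"write %d" % i)
    print("periodic mode, writes at risk after 3 appends:", at_risk)
    log.close()
```

In periodic mode the append returns as soon as the data is handed to the OS, so everything written since the last fsync() is at risk in a crash. Batch mode pays for an fsync() on every acknowledgement instead.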
Cassandra has very fine-grained control over just about everything to do with consistency and durability. I believe you can pick your desired level of consistency at access time.
>These two projects solve totally different problems
No they don't. This is the wavy-hands NoSQL defensive shield that reeks of insincerity. If you show Cassandra or Redis or some other solution replacing a MySQL install, well that's just awesome, but don't dare compare if it doesn't come out the winner.
A lot of people have workloads that could run equally well on VoltDB, a classic RDBMS, or Cassandra. There are workloads that only fit in specific silos, but they are less universal than you imply.
>So why benchmark against Cassandra? It's got a lot of buzz around it, of course. What a better way to shove your name into the "NoSQL" ring. Blech.
Okay this is just silly. Cassandra is the big name in the "next gen database" world -- of COURSE any new entrant is going to compare against it.
That's like benchmarking Berkeley DB vs. MySQL. They are on totally different levels of complexity. You can't compare memory-only db performance against a disk based db, period.
>You can't compare memory-only db performance against a disk based db, period.
But...you can. What do you mean you can't compare? Clearly you can, however mortified you might be at that prospect.
A reasonable motorcycle can go from 0-60 in about 4 seconds. A reasonable car can do it in about 9 seconds. If you need to carry two passengers, the car is your only option and the comparison doesn't matter to you, but to a lot of people it's interesting if ultimately they just want to get from A to B as quickly as possible. Then again, if you want to transport goods, maybe you need a truck, or a train.
This is so silly. Wait -- hand wavy -- that's right, nothing can be compared to Cassandra but pure love itself.
I don't think it's hand-wavy. I think you're upset about something else related to Cassandra that perhaps you read recently -- not tlack's comment. Suggesting that it would make more sense to compare an in-memory data store to another in-memory data store seems perfectly valid.
"This is so silly. Wait -- hand wavy -- that's right, nothing can be compared to Cassandra but pure love itself."
C'mon, man. That doesn't further discussion. That sort of statement serves only to incite anger.
> I think you're upset about something else related to Cassandra that perhaps you read recently
Huh? No, I love Cassandra. She's a beaut.
tlack didn't say "it would make more sense to compare an in-memory data store to another in-memory data store". They said "You can't compare memory-only db performance against a disk based db, period." There's a pretty profound difference between those two statements.
Of course you can compare them. You can compare the speed of Oracle on a huge RAC cluster vs. text files on an Amiga 500 floppy drive. But no one would, because it's stupid and worthless. I guess that's what I meant: this is a stupid and worthless article.
I didn't mean to claim anyone will be struggling to decide between VoltDB and Cassandra and then choose VoltDB based on the benchmarks we did. I think that's as ridiculous as you do.
Our point, which perhaps I made poorly, was twofold.
1. You can be both fast and SQL. Nothing about the language itself was ever the bottleneck.
2. VoltDB isn't just for big complicated transactions. You can use SQL for KV-type workloads and still perform well.
There are 100 other reasons to pick one data layer over another, and the best tool will be different for different problems.
Why not compare it to Redis instead? Both are in-memory snapshotting stores. Sure, Redis does less, but it is still a closer match. Then again, Redis would probably be faster.
Redis does not support partitioning, so it is even more apples to oranges than VoltDB vs. Cassandra. Both VoltDB and Cassandra rely on adding nodes to scale.
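For readers unfamiliar with what partitioning buys you, here is a deliberately naive sketch in Python: hash each key and take it modulo the node count. This is an assumption-laden illustration with made-up names, not the scheme either VoltDB or Cassandra actually uses, and plain modulo hashing reshuffles most keys whenever the node count changes.

```python
import hashlib
from collections import Counter

def partition_for(key, num_nodes):
    """Map a key to a node index using a stable hash (naive modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

def distribution(keys, num_nodes):
    """Count how many keys land on each node."""
    return Counter(partition_for(k, num_nodes) for k in keys)

# Hypothetical key set.
keys = [f"user:{i}" for i in range(1000)]

for nodes in (2, 4):
    print(nodes, "nodes ->", sorted(distribution(keys, nodes).items()))
```

Each node only has to hold and serve its own share of the keys, which is the sense in which adding nodes adds capacity.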
I suspect that the wakemates themselves aren't doing the mobile dev work, but are outsourcing it and spending their time on the difficult stuff - getting the hardware operational and approved. In that case, if you've got the money, why not get them all developed simultaneously?
The reason you don't do them all simultaneously is that they will all likely suck, and then you have to redo three apps instead of one.
I would think you would want to do one mobile app just right, and then port it to other platforms. You might do a spec for all three up front so you think through each platform, but why not push one platform, iterate, and then apply what you've learned before building the others?
Meteor Solutions (http://www.meteorsolutions.com) is hiring! We're looking for a couple of sharp devs to join a small team building cool stuff in Seattle. Come hack on Python, JavaScript and CouchDB! Send an email to banderson@meteorsolutions.com if you're interested.