Internet-of-stuff startup dumps NoSQL for ... SQL?
NoSQL tastes great at first but lacks proper nutrients, says startup cloud whiz
In a surprising move, one startup has been forced to migrate its data out of a trendy NoSQL database and into a traditional relational one after running into numerous technical issues with the fluffy new tech.
Revolv makes hardware that helps connect wireless devices together and gives the user control over them. "The Revolv set-up wizard makes it easy to find and connect to the wireless devices you already own and have up and running, like lights, locks, thermostats, speakers, smart-plugs, shades, sensors and more. Once connected, you can easily create your own Actions based on an array of triggers, such as GeoSense, time of day, and sensors," the company explains.
This means that although it doesn't have to deal with the fearsome data processing rates of consumer giants such as Twitter or Facebook, it does have a complicated set of data groupings – "one to one relationships and one to many relationships and many to many relationships," explained Butcher.
Its database needs to model these relationships as well, and this is where NoSQL broke down for it.
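For a rough sense of what those three kinds of relationship look like in a relational schema, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a full relational database, and every table and column name in it is hypothetical rather than anything Revolv has described.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE hubs (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One to one: the UNIQUE foreign key allows at most one settings row per hub.
CREATE TABLE hub_settings (
    hub_id   INTEGER NOT NULL UNIQUE REFERENCES hubs(id),
    timezone TEXT NOT NULL
);
-- One to many: a hub controls many devices, each device has one hub.
CREATE TABLE devices (
    id     INTEGER PRIMARY KEY,
    hub_id INTEGER NOT NULL REFERENCES hubs(id),
    kind   TEXT NOT NULL
);
CREATE TABLE actions (
    id           INTEGER PRIMARY KEY,
    trigger_kind TEXT NOT NULL
);
-- Many to many: a link table pairs actions with devices in both directions.
CREATE TABLE action_devices (
    action_id INTEGER NOT NULL REFERENCES actions(id),
    device_id INTEGER NOT NULL REFERENCES devices(id),
    PRIMARY KEY (action_id, device_id)
);
"""


def build_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

The point of the sketch is that the database itself rejects a second settings row for the same hub, or a device that points at a hub that does not exist. In a schema-less store, checks like these typically fall to application code.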
Like many young companies, Revolv had dived head-first into a non-relational database (first MongoDB and later Amazon's DynamoDB system) due to the ease of use and flexibility of the technology.
But as the startup grew, it ran into scaling and architectural issues that caused it lots of headaches for little gain.
Eventually it had to move to a traditional relational system (PostgreSQL), but it learned some valuable lessons about new NoSQL technologies along the way.
"You think you're biting off the easier solution, and part of the reason why is you're discounting the background stuff," Butcher told The Register. "A schema-less object storage database made it possible to ramp up and build something quickly without having to do a lot of the maintenance work you do on a relational DB. One of the factors that we misunderstood was we thought we could get better performance off of it."
Though DynamoDB is fast and easy to use, it lacks some features that are found in traditional relational databases. It also demands that queries be written in its own syntax, which was easy at first – but as Revolv scaled from 40 to 1,000 customers this became more and more difficult to deal with.
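The trade-off can be seen in miniature in the sketch below, which answers one question ("which kinds of device does each action touch?") two ways. It is an illustration under stated assumptions, not Revolv's code: sqlite3 stands in for a relational database, plain Python dictionaries stand in for a key-value store (this is not DynamoDB's API), and all names and data are made up.

```python
import sqlite3

# Hypothetical data: device kinds and (action, device) links are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE devices (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
CREATE TABLE action_devices (action_id INTEGER NOT NULL, device_id INTEGER NOT NULL);
INSERT INTO devices VALUES (1, 'light'), (2, 'lock'), (3, 'thermostat');
INSERT INTO action_devices VALUES (10, 1), (10, 2), (11, 2), (11, 3);
""")


def kinds_per_action_sql(conn):
    """One declarative query: the database performs the join and ordering."""
    rows = conn.execute("""
        SELECT ad.action_id, d.kind
        FROM action_devices AS ad
        JOIN devices AS d ON d.id = ad.device_id
        ORDER BY ad.action_id, d.kind
    """)
    result = {}
    for action_id, kind in rows:
        result.setdefault(action_id, []).append(kind)
    return result


# The same data as key-value lookups (plain dicts, not a real store's API).
devices_by_id = {
    1: {"kind": "light"},
    2: {"kind": "lock"},
    3: {"kind": "thermostat"},
}
links = [
    {"action_id": 10, "device_id": 1},
    {"action_id": 10, "device_id": 2},
    {"action_id": 11, "device_id": 2},
    {"action_id": 11, "device_id": 3},
]


def kinds_per_action_manual(links, devices_by_id):
    """The application performs the join itself, one lookup per link."""
    result = {}
    for link in links:
        device = devices_by_id.get(link["device_id"])
        if device is None:
            # Dangling reference: nothing upstream guarantees the device exists.
            continue
        result.setdefault(link["action_id"], []).append(device["kind"])
    for kinds in result.values():
        kinds.sort()
    return result
```

Both functions return the same mapping here. The difference is where the join logic lives: in the query text for the first, and in hand-written loops that grow with every new relationship for the second.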
"For some reason we discount SQL as easy to write because the syntax can be a little terse, but when it's between that and writing 50 lines for a query, it's easy to write," Butcher said.
The NoSQL DynamoDB system had other problems, Butcher said, relating to how easily the company could process data stored in the system.
"DynamoDB was incredibly inflexible for us unless we started building out a Hadoop cluster or something like that," he explained. "From my background – HP and things like that – when you bite off a MapReduce solution to a problem you're buying into a pretty heavy overhead, there's a devops investment, a financial investment, architectural issues."
PostgreSQL, by comparison, is much easier for the company to deal with and hire for. In hindsight, Butcher says that DynamoDB's "schema-lessness was one of the factors that let us bootstrap an application in a month and a half," but to grow to a significant scale of tens of thousands of customers, the company believes it needs to be on a traditional relational system.
That doesn't mean it's leaving Amazon, mind – it's using Amazon's PostgreSQL relational database service so it doesn't have to take on the expense of managing its own hardware.
Perhaps the greatest issue Revolv faced stemmed from the youth of the current crop of NoSQL technologies.
"Frankly, it's frustrating to have to build and rebuild even trivial tools for NoSQL databases," Butcher wrote in his blog post. "And it's equally frustrating to find so many knowledge gaps and documentation shortages. So much time is spent on figuring out and doing the mundane. The SQL database world just looks much better on this front."
We here on El Reg's database cluster are sure that NoSQL technologies have some great benefits (the flexibility of a document-oriented system like MongoDB, or the resilient ring-storage structure of Riak, for instance), but many startups risk being spiked by arcane problems that come about as they scale. ®