MySQL's growing NoSQL problem
Web application payback
Open ... and Shut
Just a few short years ago, MySQL was the undisputed king of the open-source database hill. But with the NoSQL market emerging at an 82 per cent compound annual growth rate (CAGR), it's looking like MySQL may get bulldozed by its NoSQL peers.
While this shift toward NoSQL provides an interesting commentary on where the industry is heading, it's even more instructive about the frenetic pace of innovation that open source is driving.
It says little about Oracle's stewardship of MySQL.
By most accounts, Oracle has taken good care of MySQL, investing resources to improve the technology and continuing to foster its community. As Matthew Aslett, research manager with 451 Research, notes: "The MySQL ecosystem is now arguably more healthy and vibrant than it has ever been, with a strong vendor committed to the core product, and a wealth of alternative and complementary products and services on offer to maintain competitive pressure on Oracle."
Aslett shared a few thoughts in a presentation entitled "Whatever Happened to MySQL?" at the Open Source Business Conference, held in San Francisco earlier this week (I'm its programme chair). He said the majority of cases where companies abandoned MySQL in the wake of Oracle's ownership suggested Oracle's management of MySQL has been positive or, at worst, neutral.
This is why 451 Research pegs the MySQL market at $664m by 2015, growing at a healthy 40 per cent CAGR:
What isn't healthy in that chart, however, is the rise of the NoSQL database market. Aslett was quick to point out that only 12.7 per cent of those companies abandoning MySQL were directly replacing it with a NoSQL database, and he went further, suggesting that even this represents an infinitesimal drop in MySQL's installed base. The most common replacement for MySQL? PostgreSQL. But even this is a tiny sliver of the overall MySQL installed base.
Running the numbers, Aslett pointed out that NoSQL vendors collectively claim 900 paying customers. If 25 per cent of these reflect replacements of MySQL, that still represents less than 1.5 per cent of the estimated installed base of Oracle's MySQL paying customers, or less than 0.002 per cent of the total estimated MySQL installed base.
In other words, NoSQL is not making a dent in MySQL's installed base.
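The arithmetic behind Aslett's figures can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the numbers quoted above: the implied base sizes are lower bounds derived from those percentages, not figures 451 Research published, and the constant and function names are purely illustrative.

```python
# Figures quoted from Aslett's presentation.
NOSQL_PAYING_CUSTOMERS = 900
REPLACEMENT_SHARE_PCT = 25      # "if 25 per cent of these reflect replacements"
PAYING_BASE_CEILING_PCT = 1.5   # "less than 1.5 per cent" of paying MySQL customers
TOTAL_BASE_CEILING_PCT = 0.002  # "less than 0.002 per cent" of all MySQL installs


def implied_minimum_base(count, ceiling_pct):
    """Smallest base for which `count` is at most `ceiling_pct` per cent of it."""
    return count * 100 / ceiling_pct


# Hypothetical scenario from the article: a quarter of NoSQL customers replaced MySQL.
replacements = NOSQL_PAYING_CUSTOMERS * REPLACEMENT_SHARE_PCT // 100
min_paying_base = implied_minimum_base(replacements, PAYING_BASE_CEILING_PCT)
min_total_base = implied_minimum_base(replacements, TOTAL_BASE_CEILING_PCT)

print(f"MySQL replacements among NoSQL customers: {replacements}")
print(f"Implied paying MySQL customers: more than {min_paying_base:,.0f}")
print(f"Implied total MySQL installs: more than {min_total_base:,.0f}")
```

On those assumptions, 225 replacements imply more than 15,000 paying MySQL customers and a total installed base of more than 11 million, which is why the loss barely registers.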
Where NoSQL does pose a clear and present danger to MySQL is in the web application market, where MySQL made its name. Few are going to rip and replace a database for existing applications, but new applications are increasingly going the NoSQL route. As 451 Research notes: "NoSQL database technologies are largely being adopted for new projects that require additional scalability, performance, relaxed consistency and agility."
Back in 2009 then-MySQL chief executive Marten Mickos argued that Oracle should be allowed to buy MySQL as part of its bid for Sun Microsystems, as MySQL didn't directly compete with Oracle. As he said: "MySQL is growing like crazy. That hasn't hurt Oracle. MySQL works for Web-based applications. Oracle is for older, legacy applications."
Today those same web-based applications are increasingly being built with NoSQL, and less and less with MySQL.
This is actually pretty amazing when we stop to consider just how quickly this shift occurred. While NoSQL was first introduced as a concept in 1998, it really wasn't until 2009 that it emerged as a real trend. At that time, MySQL was the undisputed leader in open-source databases. But this dominance has had a short shelf-life, as Aslett presented using a series of 451 Research headlines:
“MySQL was very much the crown jewel of the open source database world.”
– May 2008
“There are relatively few choices for Oracle's rivals to respond to its ownership of MySQL.”
– May 2009
“The database market is awash with open source databases with lightweight architectures targeted at web applications.”
– April 2011
From no real alternatives to MySQL to an overabundance, and in just two years. That's an amazingly fast shift, and it says a great deal about how open source drives innovation.
Think about the important technologies that move the industry today. CIOs cite cloud and Big Data as their top-two priorities in 2012, according to a recent InformationWeek survey:
CIOs top two budget priorities for 2012? Cloud and Big Data, according to new InformationWeek survey data twitter.com/mjasay/status/…— Matt Asay (@mjasay) May 24, 2012
Both cloud and Big Data are overwhelmingly fueled by open source, whether it's Hadoop, Pig, Linux, or OpenStack. While open source has certainly invaded the data centre, it's really in the cloud that it dominates, as Bryan Che, senior director of product management at Red Hat, declares:
Open source is certainly at the foundation in terms of building out cloud technologies. If you take a look at market share in the server space, as you look at traditional data centers, about 70 per cent are running on the Windows platform and about 30 per cent are running Linux. As you take a look at what operating systems people are choosing to build applications on in the cloud, the ratio flips completely.
The concept of cloud has been around for a while, but it didn't really hit its stride until relevant open-source projects made it cost-effective to build and run a cloud. Similarly, we've had data mining and warehousing for many years, but it wasn't until Hadoop changed both the economics and performance of mining Big Data that the trend really came into its own.
We are now in a hyper-innovation mode in which open source is no longer really competing against proprietary software, which can't keep up, but is instead competing with other open-source projects. There is rampant competition within the database market, as noted above, but also within Hadoop, for example, and between a wide array of open-source technologies. For the customer, it means that it's getting harder to choose between alternatives, but it also means those alternatives are getting better all the time, faster than ever before. ®
Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.
Insert pithy witticism here
Why would people pick MySQL in the first place? Because "everybody else does it", and they don't know any better. As the M in LAMP it sees a lot of automatic deployment even if a different approach might've been a better idea. It happens.
Why would people move to NoSQL? Because it's the hype du jour, or because they (think they) need "web scale", or because they have other applications where tackling them with "traditional SQL" risks running head first into a brick wall of conflicting requirements. Or maybe because you can get better performance with comparable guarantees if you're willing to put up with the fiddliness that's inevitably part and parcel of the relative immaturity of the new field.
Why would people move from MySQL to PostgreSQL? Because they realised they really do need guarantees like ACID, which MySQL only "follows"*, not actually really completely provides. And half a guarantee isn't a guarantee, even if most casual use is almost certain never to run into the corner cases where MySQL drops the ball.
So the field is changing. Is this a bad thing? By no means. MySQL is a nice hobbyist tool with bells and whistles and things. It serves lots of basic tasks pretty well. It even scales to a certain degree, thanks to lots and lots of people pushing it onward. But if you need something better, well, you need something better.
I disagree it's getting harder to choose, actually. You can still pick MySQL, or anything else you fancy, and hope to the skies above that it's good enough. Since the field tends towards improvement, the chances of picking something that'll be reasonably close to good enough are continually improving. And if that doesn't work you can (nearly) always throw more hardware at the problem until it goes away, and that happens often enough too.
The way to get a really good fit is still the same as before: Understand the problem, the sort of queries you do on what sort of data and so on, understand the theory of operation, and let your choice be driven by your requirements. With ever more options, the chances are there'll be one or a few "close enough" and maybe some will be open source too.
The problem then is more that you have to understand enough of the various options to make an informed choice. Since most of the NoSQL wave are more or less niche products, you can ignore them if you're in a different niche. That then leaves the more generic ones to choose from, and if you are informed you can make an informed choice.
As a trendwatching piece the article is a tad pathetic. CIOs shouldn't worry about this sort of thing. They should instead be hiring people who bring the right know-how to their projects and who'll pick the right tool for the job. Which, short of projects that are indeed best served by yet another LAMP stack, rarely would be MySQL. But then, there's still CIOs that hire MCSEs and A+ certificees, so MySQL will have a future for quite a while yet. No, MySQL was never my first choice, as you probably can tell. Popularity alone doesn't get the job done, and when all's said and done, it's the work that counts.
* That's what the manual says. It also says that constraints clauses are parsed but ignored. That chucks the C in ACID right out the window. If you need constraints at all, you need something that can provide that functionality reliably, not just when it feels like it.
No zero-sum brawl here
"What isn't healthy in that chart, however, is the rise of the NoSQL database market."
Somewhere in all this good news you lost me. I'm thinking you somehow were meaning to say MySQL now has competition, and that that's bad for MySQL. Then you show me a chart that says they quadruple revenue in 4 years. I'm supposed to shed tears because that could'a been quintuple instead?
Then you quote two sources two years apart that show markets explode in unexpected directions, and try to quote an estimate for 3 years from now?
Sounds like all good news. I can remain a fuddy-duddy bigSQL user while others can be castor-oil powered nanonuSQL users. There's choices we never imagined, and mores'a'coming. Think of all the intermural developer slagging possible. Fun fun fun!
BTW: Surely it's okay to have an article cover two interrelated subjects, without pretending it is just one overheated subject. "Outlook for MySQL and NoSQL" is the subject here, not "Orcs rule, goblins drool!"
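For what it's worth, the "quadruple in 4 years" reading of the chart is easy to sanity-check. A rough Python sketch, assuming the four-year window mentioned in the comment above (the article itself gives only the 2015 target and the growth rates, so the starting figure below is back-calculated, not quoted, and all names are illustrative):

```python
MYSQL_CAGR = 0.40            # per the article: "a healthy 40 per cent CAGR"
NOSQL_CAGR = 0.82            # per the article: "82 per cent" CAGR
MYSQL_2015_REVENUE_M = 664   # per the article: $664m by 2015
YEARS = 4                    # assumption: the window used in the comment


def growth_multiple(cagr, years):
    """Total growth factor after compounding `cagr` annually for `years`."""
    return (1 + cagr) ** years


mysql_multiple = growth_multiple(MYSQL_CAGR, YEARS)
nosql_multiple = growth_multiple(NOSQL_CAGR, YEARS)
# Back-calculated starting point; not a figure from the article.
implied_start_m = MYSQL_2015_REVENUE_M / mysql_multiple

print(f"MySQL multiple over {YEARS} years: {mysql_multiple:.2f}x")
print(f"NoSQL multiple over {YEARS} years: {nosql_multiple:.2f}x")
print(f"Implied MySQL starting revenue: ${implied_start_m:.0f}m")
```

Compounded over four years, 40 per cent comes to roughly 3.8x, so "quadruple" is about right; 82 per cent over the same window is close to 11x.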
Re: Insert pithy witticism here
MySQL is only good enough if you don't care about consistent performance (thanks to a shitty optimiser, table-level locking and inefficient joins on MyISAM tables), can tolerate corruption of data from something as simple as reindexing, can live without intuitive behaviour (for example, extend a column type, say int to bigint, and unlike most RDBMSes any index using that column will still treat it as an int), and can put up with downright bullshit in the documentation (integrity is only ever "an application level concern", which they claimed since MyISAM doesn't support foreign key checking). That's just the tip of the iceberg. InnoDB is poorly documented (try performance tuning it), and can drop indexes entirely without warning - a bug that bit me.
As for PostgreSQL, it's used all over the place, which is why it supports a growing number of large consultancies and companies selling services built on it. It hasn't been a primarily academic project since the mid-90s, and sorry, you show your ignorance if you think that the features PostgreSQL supports that either MySQL doesn't or only does so poorly are niche ones. For example, PostgreSQL has a mature, performant and very well featured full text indexing plugin. MySQL in comparison has a piss poor full text indexing feature (a single compiled-in stop word list, for example) which only works on MyISAM tables, not the InnoDB ones you use if you care anything about your data.
As for switching from MySQL to Monty's hobby project MariaDB - good luck with that. At least Oracle have competent developers.
To most people it's not.
To Matt Asay - the penguin only knows how this man's mind works.
I've been trying to make sense of it for months now, and all I can see is a list of faddy brain spasms and wannabe-corporate non-sequiturs.
@Robert Long 1
I am far from being an academic - I see myself as a software/systems engineer. I work for businesses that have real money to lose if their valuable data gets corrupted. I feel really shitty if I have to run "repair table" on a MyISAM table filled with expensively generated data. And I feel even shittier if the number of rows is smaller than before the "repair" action.
In one of my last jobs we had a 60GB MyISAM DB and I felt like being very smelly and brown in that job. We did nothing exotic and had only minimal query load (less than 10 queries/s).
I assume all your consulting jobs are department-level DBs which can basically be burned down without much damage. Or worse, you don't realise the problem.
So from an engineering point of view, Postgres is in many ways (integrity, query optimization, feature set) superior to MySQL. And that translates into much better economics in the long run, as your trousers won't be pulled down because MySQL fscked up your mission-critical data. There is a reason some people even spend lots of money on the Ora RDBMS. They want insurance against Business Destruction.