Feeds

Facebook open sources live MySQL makeover

Schema reschemed on fly

3 Big data security analytics techniques

Facebook has open sourced a new MySQL utility that lets the social networking colossus update its database indexes and juice query times without staging the changes on test servers. With the tool – known as Online Schema Change for MySQL, or OSC – it can update indexes on live servers.

In the past, according Facebook MySQL guru Mark Callaghan, the company needed a good six months to roll index updates across its sea of MySQL servers. Now, it needs no more than a few days. "This lets us make schema changes much, much faster," Callaghan tells The Register. "And the benefit from the changes is that database queries will be faster."

At Facebook, MySQL is the primary repository for user data, with InnoDB the accompanying storage engine. "When it comes to user data, the workload is transaction processing – many simple database queries with a lot of concurrency," Callaghan explains. "And that workload is something that MySQL, along with InnoDB, excels at supporting. Performance is very good on that workload."

Nonetheless, the company is always working to improve performance, creating new indexes or updating existing indexes as a means of juicing query response time. For Facebook, such changes take to long when they're done through ALTER TABLE statements or the "fast index create" tool offered by the InnoDB plugin in MySQL 5.1. One other option is to first stage changes on test servers: you make the change on an extra server, and then you move the live workload onto that extra server when the change is complete.

But such juggling doesn't suit the sort of massive infrastructure that backs a social networking site serving over 500 million users. "We run a, well, large number of servers," Callaghan says, "and we prefer to not have a large number of extra servers."

All Callaghan will say is that the company runs "X thousands" of MySQL servers. "X" is such a large number, the company needed a way of making index changes on live machines. There's an existing tool that allows for live updates – "oak-online-alter-table," from independent developer Shlomi Noach – but Facebook has merely used this as a starting point.

"We wanted to do some things slightly different," Callaghan says. "And we do a lot in PHP, which [Noach's] tool doesn't use. It's a great tool, but it just wasn't right for us."

Written by Facebook engineer Vamsi Ponnekanti, Open Schema Change lets the company update indexes without user downtime, according to Callaghan. "Previously, we could make an update quickly on a small number of machines, but to do it on all of the machines, it took six months," Callaghan says, explaining that the company could only make updates during off-peak hours. "Now we can do it on all of the machines at pretty the same time and we can do it in about half a day.

"The difference is that there is no downtime, so we have more flexibility in scheduling when we do it."

When updating indexes, Facebook makes copies of its MySQL tables, and this alone can take a few hours. In essence, the OSC utility gives the company a way of accommodating changes that were made to the table while it was being copied.

"We need some way of applying the changes that we missed after the copy was started," he said. OSC does so using database triggers. "When the copy finishing, we replay all of the changes that were logged by the triggers, and then we briefly – for a fraction of a second – block access to the table and then we switch the original table with the copy."

This allows the company to make a change whenever it likes – even in the midst of peak workloads.

The code for OSC is derived from Shlomi Noach's tool, and like Noach's tool, it has been open sourced under a BSD license. You can find the code here.

At Facebook, the tool has been in place for about four weeks. And it's been used to make index changes the company has been planning for well over a year.

Facebook has developed other systems for other data storage and retrieval tasks, including the open source Cassandra for inbox search, the Hadoop-friendly SQL-like language known as Hive for analytics, and the proprietary Haystack for photos. But at least for the time being, MySQL will remain the tool of choice for user data.

"I personally don't have the time [to build a replacement]," Callaghan says. "All of my time is dedicated to improving quality of service and reliability of MySQL. So if it's going to be done, it's not going to be done by me." ®

Update: This story has been updated to mention ALTER TABLE statements and the "fast index create" InnoDB plugin tool as possible options for making index changes.

SANS - Survey on application security programs

More from The Register

next story
This time it's 'Personal': new Office 365 sub covers just two devices
Redmond also brings Office into Google's back yard
Oh no, Joe: WinPhone users already griping over 8.1 mega-update
Hang on. Which bit of Developer Preview don't you understand?
Microsoft lobs pre-release Windows Phone 8.1 at devs who dare
App makers can load it before anyone else, but if they do they're stuck with it
Half of Twitter's 'active users' are SILENT STALKERS
Nearly 50% have NEVER tweeted a word
Internet-of-stuff startup dumps NoSQL for ... SQL?
NoSQL taste great at first but lacks proper nutrients, says startup cloud whiz
Ditch the sync, paddle in the Streem: Upstart offers syncless sharing
Upload, delete and carry on sharing afterwards?
New Facebook phone app allows you to stalk your mates
Nearby Friends feature goes live in a few weeks
Microsoft TIER SMEAR changes app prices whether devs ask or not
Some go up, some go down, Redmond goes silent
Batten down the hatches, Ubuntu 14.04 LTS due in TWO DAYS
Admins dab straining server brows in advance of Trusty Tahr's long-term support landing
prev story

Whitepapers

SANS - Survey on application security programs
In this whitepaper learn about the state of application security programs and practices of 488 surveyed respondents, and discover how mature and effective these programs are.
Combat fraud and increase customer satisfaction
Based on their experience using HP ArcSight Enterprise Security Manager for IT security operations, Finansbank moved to HP ArcSight ESM for fraud management.
The benefits of software based PBX
Why you should break free from your proprietary PBX and how to leverage your existing server hardware.
Top three mobile application threats
Learn about three of the top mobile application security threats facing businesses today and recommendations on how to mitigate the risk.
3 Big data security analytics techniques
Applying these Big Data security analytics techniques can help you make your business safer by detecting attacks early, before significant damage is done.