Twitter: It's the end of the sysadmin as we know it

Grow some Ganglia

Web 2.0 Expo: Twitter says it's the end of the sysadmin as we know it.

Speaking this morning at the annual Web 2.0 Expo in San Francisco, Twitter operations man John Adams warned sysadmins they won't succeed on today's intertubes unless they learn to do a bit more than system administration. Sysadmin 2.0, he said, must develop certain talents for analyzing data.

"This is a whole new world," Adams said. "For the longest time, people ran large data systems on a kind of ad hoc basis. We're in a world now where so many people are depending on the real-time web. A system administrator is not just a system administrator anymore. You have to use analytics. You have to grab data. You have to look at where a site is trending and where things are going so you can scale...

"If you don't start doing this work early, if you don't start collecting this data early, you will fail."

That may be stating the obvious. But there you have it.

It's no secret that Twitter endured its own scaling problems in its earlier days, as the digerati embraced the Web2.0rhea service en masse. But in mid-2008, the company ported a portion of its core code from Ruby on Rails to Scala - a new-age programming language that combines functional and object-oriented techniques - and in 2009, according to net-research outfit comScore, the micro-blogger rode a 1,358 per cent traffic leap.

Adams believes the company has handled such growth in large part because it uses open source tools like Ganglia to track performance across the site's back-end infrastructure. Currently, Adams explained, he and his ops team track about 15,000 points of site performance.

"If you're collecting data, you want to look at the aggregate. It's about how many areas across the site where you see errors. A server isn't going to tell you much. It's about the overall application," Adams explained. "We do this in as near-real-time as possible."

Well, we would hope so.

Adams acknowledged that Twitter has endured its fair share of performance problems - and in the past, the company has freely admitted that its original infrastructure was ill-suited to rapid scaling - but he insists that careful preparation got the company to where it is today. "In the first year, the site was kind of unstable, but what we were doing was trying to ramp up to a position where we are planning the future."

Such data tracking, he said, helped sidestep the so-called Twitpocalypse, the Y2K of Web2.0rhea. Many suspected that third-party clients would start failing when the unique identifier attached to each Tweet reached first the signed 32-bit integer limit (2^31) and then the unsigned limit (2^32). Mining site data, Adams said, his crew accurately predicted when it would hit these limits - which meant they knew just how soon they needed to make any infrastructure changes.
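
Adams did not say how the prediction was made. As a rough illustration only, the simplest version of such a forecast is a straight-line extrapolation from two observations of the highest ID issued. The function name, the sample figures, and the assumption of a constant growth rate are all hypothetical and not taken from Twitter.

```python
from datetime import datetime, timedelta

# Largest values a 32-bit signed / unsigned integer can hold.
SIGNED_32_MAX = 2**31 - 1
UNSIGNED_32_MAX = 2**32 - 1


def predict_crossing(samples, limit):
    """Estimate when an ever-growing ID will reach `limit`.

    `samples` is a list of (datetime, highest_id_seen) pairs, oldest first.
    Assumes (hypothetically) that IDs keep growing at the average rate
    observed between the first and last sample.
    """
    (t0, id0), (t1, id1) = samples[0], samples[-1]
    rate = (id1 - id0) / (t1 - t0).total_seconds()  # IDs per second
    if rate <= 0:
        raise ValueError("IDs are not growing; cannot extrapolate")
    seconds_left = (limit - id1) / rate
    return t1 + timedelta(seconds=seconds_left)


# Made-up observations: 100 million new IDs over ten days.
observations = [
    (datetime(2009, 1, 1), 1_000_000_000),
    (datetime(2009, 1, 11), 1_100_000_000),
]
signed_eta = predict_crossing(observations, SIGNED_32_MAX)
unsigned_eta = predict_crossing(observations, UNSIGNED_32_MAX)
```

On a real site growth is rarely linear, so a forecast like this would need to be recomputed as new data arrives.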

Over the past six months, he said, the site's most important infrastructure upgrade was the adoption of the Ruby on Rails application server known as Unicorn. The company still uses Ruby on the front-end, and Unicorn has provided an estimated 30 per cent performance improvement over the previous setup, based on Apache and a server called Mongrel.

Adams compares Unicorn to a grocery store where a single line feeds a row of cashiers - as opposed to the traditional setup where each cashier has a separate line. With Mongrel, each worker had its own request queue. With Unicorn, there's a single request queue, and requests are fed to workers as they become available.

"Ordinarily, if you're standing in line at a grocery store, you don't have any idea how quickly each cashier is going to move people through. You have ten random lines. You stand in one line. And you have no idea how long you're going to wait," he said. "The other model - which pretty much describes Unicorn - is where you have one line, and when a cashier is done, it grabs the next person from that one line. This creates a very rapid response."

He added that the 30 per cent performance boost from Unicorn affords only a "few more days of scaling" on a site like Twitter - but, he says, "every small amount helps."
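
The grocery-store comparison can be tried out with a toy model. The sketch below is not how Unicorn or Mongrel dispatch requests internally. It only compares mean waiting time for one shared queue against one queue per worker, and it assumes, purely for illustration, that the per-worker setup hands out requests round-robin.

```python
import heapq


def mean_wait_shared_queue(requests, n_workers):
    """One line, many cashiers: each request goes to whichever worker
    frees up first. `requests` is a list of (arrival, service_time)
    pairs sorted by arrival time."""
    free_at = [0.0] * n_workers  # min-heap of times each worker is next free
    heapq.heapify(free_at)
    total_wait = 0.0
    for arrival, service in requests:
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)
        total_wait += start - arrival
        heapq.heappush(free_at, start + service)
    return total_wait / len(requests)


def mean_wait_per_worker_queues(requests, n_workers):
    """One line per cashier: each request is bound to a worker on arrival
    (round-robin here, an assumption) and waits for that worker only."""
    free_at = [0.0] * n_workers
    total_wait = 0.0
    for i, (arrival, service) in enumerate(requests):
        worker = i % n_workers
        start = max(arrival, free_at[worker])
        total_wait += start - arrival
        free_at[worker] = start + service
    return total_wait / len(requests)


# Four requests arrive together; the first one is slow.
burst = [(0.0, 10.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
shared = mean_wait_shared_queue(burst, n_workers=2)         # 0.75
separate = mean_wait_per_worker_queues(burst, n_workers=2)  # 2.75
```

In this made-up burst, the request stuck behind the slow one in its own line waits ten time units, while the shared line routes everything around the busy worker.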

Adams's other big piece of advice for aspiring Sysadmins 2.0 was to avoid - as much as possible - pulling data from disk. "Another discovery that we made when trying to increase the scale of Twitter was that disk is the new tape," he said. "With any sort of social networking operation - juggling followers, sending mail, etc. - disk is extremely slow."

Twitter has worked around the disk problem with, yes, memcached, the open source distributed-memory object-caching system. But he also warns that there's such a thing as too much memcached. "You can't rely too heavily on memcached," he said. "If you put too much data into memcached, you enter this problem where if the memcached server goes down, you have to take all the data out of your database and reinsert it into memcached. You want to cache in a balanced way."
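
The cold-cache problem Adams describes is easy to see in a minimal read-through cache. In the sketch below, plain Python dictionaries stand in for both memcached and the database, and the class and method names are hypothetical. It is not Twitter's code or the memcached client API.

```python
class CacheAsideStore:
    """Read-through cache: serve from memory, fall back to the database."""

    def __init__(self, database):
        self.database = database  # dict standing in for the slow, disk-backed store
        self.cache = {}           # dict standing in for memcached
        self.db_reads = 0         # how many reads had to go to the database

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and repopulate the cache.
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def lose_cache(self):
        """Simulate the cache server going down and coming back empty."""
        self.cache.clear()


store = CacheAsideStore({"a": 1, "b": 2})
store.get("a")
store.get("a")  # served from cache, no database read
store.get("b")
reads_before_outage = store.db_reads  # 2

store.lose_cache()
store.get("a")
store.get("b")
reads_after_outage = store.db_reads   # 4: every key is fetched again
```

The more data that lives only in the cache layer, the larger the burst of database reads after an outage, which is one way to read the advice to cache in a balanced way.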

In addition to memcached, MySQL, and the open-source distributed database Cassandra, Twitter leans on a message-queue server known as Kestrel and a kind of follower database known as FlockDB. Both were developed at the company, and both have now been open sourced.

In one sense, Adams and his team exhibit a certain Google-like quality. They believe in data. But their Googleness goes only so far. They're also committed to open sourcing their back end. ®
