Social networks talk hidden architectures
Social networks are almost everywhere. Even if you're not actually on one, it's becoming impossible to avoid hearing about them, and it's often the same networks that keep popping up, such as Facebook or MySpace.
While they might be well known, though, the companies tend not to discuss the architectures that underpin their services. Are they running some clever RESTful science behind the scenes or are they just using a vanilla combo of PHP on lots and lots of Windows or Linux servers?
Facebook, MySpace, Digg and Ning recently shared their trials and tribulations at the QCon conference in San Francisco, California.
Dan Farino, chief systems architect at MySpace.com, said his site started with a very small architecture and scaled out. He focused on monitoring and administration on a Windows network and the challenge of keeping the system running on thousands of servers.
"Yes, we run Windows!" he said. "It's actually a pretty good server platform. IIS is a pretty good web server. Tuned properly, it's going to serve pages; it's not going to crash or tip over when it gets Slashdotted with two requests. It's pretty solid. What isn't solid about Windows is the large-scale management tools."
MySpace relies on 4,500-plus Windows-based web servers. A middle-tier cache has been added, but it's still "basically a bunch of servers from an operational perspective," Farino said.
A key challenge for MySpace was to come up with tools for quickly collecting data when there's a problem, so that problems could be analyzed and avoided in the future. To collect the operational data needed for proper analysis, Farino developed a custom performance monitoring system that tracks CPU usage, requests queued, requests per second, and similar information in real time across the company's server farm.
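The article gives no details of how Farino's monitoring system is built, so the following is only a rough sketch of the general idea: take per-server readings of the metrics mentioned above and roll them up into farm-wide figures. The `Sample` schema, the host names, and the queue threshold are all assumptions made for illustration, not anything MySpace described.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One point-in-time reading from a single web server (hypothetical schema)."""
    host: str
    cpu_percent: float
    requests_queued: int
    requests_per_second: float


def summarize(samples):
    """Roll per-server readings up into farm-wide figures."""
    return {
        "servers": len(samples),
        "avg_cpu_percent": mean(s.cpu_percent for s in samples),
        "total_requests_queued": sum(s.requests_queued for s in samples),
        "total_requests_per_second": sum(s.requests_per_second for s in samples),
    }


def hot_servers(samples, max_queued=50):
    """Return hosts whose request queue exceeds an assumed threshold."""
    return [s.host for s in samples if s.requests_queued > max_queued]


# Made-up readings for three servers; a real farm would stream thousands.
samples = [
    Sample("web001", 35.0, 4, 120.0),
    Sample("web002", 92.0, 210, 40.0),
    Sample("web003", 41.0, 6, 130.0),
]
print(summarize(samples))
print(hot_servers(samples))
```

In this toy data, one server has a long request queue and low throughput, which is the kind of outlier an operator would want surfaced quickly when something goes wrong.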
Digg.com is the largest content aggregator on the Net, with 3.5 million registered users. Lead architect Joe Stump gave a peek behind the scenes of a system that handles about 15,000 requests per second and reports serving approximately 26 million unique visitors per month.
Stump described the system's innards as "an architecture in transition."
"I call it that because Digg started out as a harebrained idea," Stump said. "It wasn't one of those projects where you start out saying, what happens if in a year I'm doing 300 thousand diggs a day and 15 billion common diggs?"
Digg.com uses MySQL, but on top of that is a library designed to interact with the DBMS in a specialized manner, Stump said. "Scaling is specialization," he added. "You can't just take a commercial product off the shelf, throw it into production and hope it works. The way you normally scale things is to take a few different components, layer something on top of it that has your specialization stuff in it."
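Stump did not say what Digg's library actually does, but the layering he describes can be sketched in general terms. The example below shows one common form of specialization, a thin data-access layer that routes each query to one of several databases by user ID, so that application code never picks a database itself. The class name, table layout, and modulo routing rule are all hypothetical, and in-memory SQLite databases stand in for MySQL servers so the sketch runs on its own.

```python
import sqlite3


class ShardedStore:
    """Hypothetical data-access layer that hides shard selection from callers."""

    def __init__(self, shard_count):
        # In-memory SQLite databases stand in for separate database servers.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute("CREATE TABLE diggs (user_id INTEGER, story_id INTEGER)")

    def _shard_for(self, user_id):
        # Assumed routing rule: simple modulo on the user ID.
        return self.shards[user_id % len(self.shards)]

    def add_digg(self, user_id, story_id):
        conn = self._shard_for(user_id)
        conn.execute("INSERT INTO diggs VALUES (?, ?)", (user_id, story_id))
        conn.commit()

    def diggs_by_user(self, user_id):
        conn = self._shard_for(user_id)
        rows = conn.execute(
            "SELECT story_id FROM diggs WHERE user_id = ? ORDER BY story_id",
            (user_id,),
        )
        return [story_id for (story_id,) in rows]


store = ShardedStore(shard_count=4)
store.add_digg(user_id=7, story_id=101)
store.add_digg(user_id=7, story_id=205)
store.add_digg(user_id=8, story_id=101)
print(store.diggs_by_user(7))
```

The point of the layer is that the routing rule lives in one place: if the site's access patterns change, the specialization can change with them without touching every caller.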
Facebook has evolved into a huge social network. It has more than 120 million active users and 10 billion photos, and serves up 50 billion page views per month.