Social networks talk hidden architectures

Back-stage bytes

Social networks are almost pervasive. Even if you're not on one yourself, it's becoming impossible to avoid hearing about them, and often it's the same networks that keep popping up, such as Facebook or MySpace.

While they might be well known, though, the companies tend not to discuss the architectures that underpin their services. Are they running some clever RESTful science behind the scenes, or are they just using a vanilla combo of PHP on lots and lots of Windows or Linux servers?

Facebook, MySpace, Digg and Ning recently shared their trials and tribulations at the QCon conference in San Francisco, California.

Dan Farino, chief systems architect at MySpace.com, said his site started with a very small architecture and scaled out. He focused on monitoring and administration on a Windows network and the challenge of keeping the system running on thousands of servers.

"Yes, we run Windows!" he said. "It's actually a pretty good server platform. IIS is a pretty good web server. Tuned properly, it's going to serve pages; it's not going to crash or tip over when it gets Slashdotted with two requests. It's pretty solid. What isn't solid about Windows is the large-scale management tools."

MySpace relies on 4,500-plus Windows-based web servers. A middle-tier cache has been added, but it's still "basically a bunch of servers from an operational perspective," Farino said.

Data challenge

A key challenge for MySpace was to come up with tools for quickly collecting data when there's a problem, so that those problems could be analyzed and avoided in the future. To collect the operational data needed for proper analysis, Farino developed a custom performance monitoring system that tracks CPU usage, requests queued, requests per second, and similar information in real time across the company's server farm.
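
To make the idea concrete, here is a minimal sketch of how per-server readings of this kind (CPU, requests queued, requests per second) might be rolled up into one farm-wide snapshot. It is not MySpace's system, whose internals the talk did not detail: the `Sample` type, the `summarize` function, the field names and the `queue_alert` threshold are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sample:
    """One reading from one web server (hypothetical field names)."""
    host: str
    cpu_percent: float
    requests_queued: int
    requests_per_second: float


def summarize(samples, queue_alert=50):
    """Roll per-server samples up into a single farm-wide snapshot.

    queue_alert is an assumed threshold: hosts with at least this many
    queued requests are listed as candidates for a closer look.
    """
    if not samples:
        raise ValueError("no samples to summarize")
    return {
        "servers": len(samples),
        "avg_cpu_percent": mean(s.cpu_percent for s in samples),
        "total_requests_per_second": sum(s.requests_per_second for s in samples),
        "max_requests_queued": max(s.requests_queued for s in samples),
        "hot_hosts": sorted(
            s.host for s in samples if s.requests_queued >= queue_alert
        ),
    }
```

Calling `summarize` on the latest sample from each server gives operators one view of the whole farm, plus a short list of hosts whose request queues are backing up.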

Digg.com is the largest content aggregator on the Net, with 3.5 million registered users. Lead architect Joe Stump gave a peek behind the scenes of a system that handles about 15,000 requests per second and reports serving approximately 26 million unique visitors per month.

Stump described the system's innards as "an architecture in transition."

"I call it that because Digg started out as a harebrained idea," Stump said. "It wasn't one of those projects where you start out saying, what happens if in a year I'm doing 300 thousand diggs a day and 15 billion common diggs?"

Digg.com uses MySQL, but on top of that is a library designed to interact with the DBMS in a specialized manner, Stump said. "Scaling is specialization," he added. "You can't just take a commercial product off the shelf, throw it into production and hope it works. The way you normally scale things is to take a few different components, layer something on top of it that has your specialization stuff in it."

Facebook has evolved into a huge social network. It has more than 120 million active users and 10 billion photos, and serves up 50 billion page views per month.
