Revealed: Inside super-soaraway Pinterest's virtual data centre
How to manage a cloud with 410TB of cupcake pictures
It's every startup's dream: to be growing faster than Facebook without having to build a Facebook-sized server farm.
Pinterest is an online picture pinboard for organising your favourite snaps and sharing them. It was founded by Ben Silbermann, Paul Sciarra, and Evan Sharp in March 2010, and it's growing like crazy with just 12 employees. It raised $74.5m in three rounds of funding in the past year, yet the only thing that Pinterest isn't doing is buying warehouses of servers.
Speaking at the AWS Summit in New York earlier this month, Ryan Park, operations and infrastructure leader at Pinterest, gave a sneak peek into the Pinterest data centre, which runs on the AWS cloud.
According to ComScore data cited by Park in his presentation, Pinterest had 17.8 million monthly unique visitors as February came to a close. According to ComScore, it took the Tumblr blog-hosting service 30 months to break through 17 million uniques; Twitter took 22 months; Facebook took 16 months; YouTube (now part of Google) took 12 months; but Pinterest only took nine months after opening up its service in May 2011. And that is with an invite-only beta programme.
Among other things, the Pinterest pinboard uses Amazon's S3 object storage to keep the photos and videos that its millions of users have uploaded. Between August last year and February this year, Pinterest has grown its capacity on S3 by a factor of 10, and server capacity on the EC2 compute cloud is up by nearly a factor of three, according to Park, from about 75,000 instance-hours to around 220,000.
"Imagine if we were running out own data centre, and we had to go through a process of capacity planning, and ordering hardware, and racking that hardware, and so on," said Park in his keynote at AWS Summit.
"It just would not have been possible scale fast enough – especially with such a small team. Until about a month ago, I was the only operations engineer at the whole company."
Park walked through the basic architecture of the Pinterest application and the virtualised iron underneath it, and then explained how the company's use of autoscaling and different kinds of compute instances on AWS have evolved over time.
The Pinterest application stack has five basic pieces:
There are 150 high-core EC2 instances that run the Python web application servers that power Pinterest, which has deployed the Django framework for its web app. Traffic is balanced across these 150 instances using Amazon's Elastic Load Balancer service. Park says that the ELB service has a "great API" that allows Pinterest to programmatically add capacity to the Python-Django cluster and also take virtual machines offline that way if they are not behaving or need to be tweaked.
The Pinterest data centre on the AWS cloud also has 35 other EC2 instances running various other web services that are part of the pinboard site, and it also has another 90 high-memory EC2 instances that are used for memcached and Redis key-value stores for hot data, thereby lightening the load on the backend database.
There are another 60 EC2 instances running various Pinterest auxiliary services, including logging, data analysis, operational tools, application development, search, and other tasks. For data analysis, Pinterest is using the Elastic MapReduce Hadoop cluster service from Amazon. This costs a few hundred dollars a month, which is cheaper than having two engineers babysit a real Hadoop cluster, explained Park.
"And better than that, we are also able to experiment with new services like this, very easily and with very low risk. There's no big sales process or big up-front costs when we are trying something out. And so we can try experiments and see what works and what doesn't."
The genius of the setup is that you find what doesn't work and move on, and when you find something that does work, you can scale up capacity to support it quickly.
The Pinterest setup has a MySQL database cluster that runs on 70 master nodes on 70 standard EC2 instances plus another 70 slave database instances in different AWS availability zones for redundancy.
This database is shared into thousands of pieces, with each shard having users' account information and pins and boards within it. Each shard has thousands of users, and the site never runs queries that will span shards. When the shard architecture for the MySQL backend was launched last November, Pinterest had eight master-slave pairs. It has split three times since then, with 64 pairs right now and another 6 running other databases relating to the site but not to the pinboard and users accounts.
The S3 file storage currently has 8 billion objects in it, which weigh in at 410TB.
When Park made the presentation he showed at the AWS Summit, presumably a month or so ahead of the show, the company had only 80 web application servers, so the following data is based on those machines, not the 150 it had as of the end of April.
Initially, like any other data centre manager, Pinterest went out and provisioned its web server farm to be able to meet peak capacity and then have 25 per cent or so head room on top of that for crazy spikes:
Pinterest is still largely an American Midwest phenomenon (that's changing, of course), and so at night, the provisioned EC2 images supporting those Python application servers were just spinning their clocks, doing nothing useful except giving Amazon money. So Pinterest turned on the autoscaling feature of EC2, allowing AWS to automatically dial up and down instances with some headroom built in:
The average reduction in web server instances using autoscaling was 40 per cent over the course of a single day, and because CPU-time is money on AWS, it saves about 40 per cent for the web server farm.
Here's how the costs break down over the course of a day:
At the peak, Pinterest is spending $52 per hour to support its web farm, and late at night when no one is using the site too much, they are spending around $15 an hour for the web farm, said Park.
To push the costs down even further, Pinterest has figured out how to use a mix of reserved, on demand, and spot EC2 capacity for the web farm:
Basically, that baseline capacity needed to support users who are hitting the site in the wee hours of the American timezones are reserved up front, which has a lower per-unit capacity cost. Then the expected capacity for the day workload is acquired with the normal on-demand instances, which you pay for by the hour. Peaks are paid for through spot EC2 instances, which generally cost less than the on-demand instances.
Pinterest has created a watchdog service to work with Elastic Load Balancer to make sure it is never more than a few EC2 instances shy of safe capacity for reserved and on-demand instances. The upshot is that its peak web server costs are under $35 per hour, down from $52 per hour, and costs drop all along the curves that plot out the day.
Makes you want to get a case of beer and sit around with your buddies and form a startup, don't it? ®
Re: And how much does 410TB of storage on S3 cost?
"The same amount or not much more will buy you 410 x 3TB drives (so you can store 3 copies of everything for redundancy), and it's a one-off cost."
Yes, but then you have to have something to put those drives in, so you have to buy a storage array. That's another "one-off" cost. Also, we'll assume you want some RAID protection built in, and SATA drives take forever to rebuild, so you'll want RAID 6, plus some hot spares, so figure you'll need to buy some extra drives. Most enterprise storage vendors, however, are going to charge you a premium for the disks, so the price of the disks will jump rather dramatically from a bog-standard SATA drive. Even assuming you go with a low-cost vendor, you're still going to have to power and cool the whole thing, and the vendor is going to want you to pay for ongoing support and maintenance. If you want performance to scale beyond the paltry capabilities of your 500 or so SATA drives, you'll actually need to buy some extra drives to wide-stripe the i/o, and/or some Flash based storage to act as a front end for the hot blocks, further driving up cost.
Finding the price/performance sweet spot in terms of cost is difficult and made harder by the fact that most storage vendors are, ahem, not entirely forthcoming with the true amount of usable storage and performance that you actually get once you've provisioned it all. It sounds like Pinterest have taken capital expenditures out of the budget almost entirely and moved everything into operational expenses, and their operational expenses are spent on the actual capacity instead of trying to design, provision, and maintain a data center.
Over time, I would expect that it would be cheaper to build out a data center and storage array, but storage remains a surprisingly tricky and expensive resource, especially as it scales, so I would want to crunch those numbers *very* carefully, with an eye towards deliberately ignoring the TCO calculations foisted off by incumbent server and storage vendors.
Looks like the only thing Pinterest has done of any interest is be a case study for AWS.
Re: And how much does 410TB of storage on S3 cost?
dont forget that you have a lot more in costs than just the drives. lets just say that you are going to wind up using 250 2TB drives by the time that you are done to have 410 TB, plus RAID and hotspare. Each one of these drives will run about 800 in my experience (of course it could be less in this kind of quantity, but humor me). So that is 200,000 right there. That doesn't include the trays and SAN headers. So that is probably about 21 trays for disks, at around 5,000 a piece, so a total of about another 100,000. For the SAN headers to handle this, I imagine that you are talking another 100-200K, lets go midway to 150,000. So far we are at 450,000. We will also be using at least 2 cabinets, very possibly 3 to store it. Lets go with 3 so that is a small amount of about 10,000. Then you have your UPSs, not sure on the total, but I am sure it is another 10,000 easy. dont forget the SAN switches and HBAs that you will need, so that will be another 50,000. Now you need support contract costs, space to lease for storing the racks, someone to be able to manage it, installation costs. Then you have to option if you want to have it replicated, and then you can double everything. dont forget that you need a system admin to manage this now, which will cost 100K annually.
Anyways, I see it being at least around 1/2 a million up front costs for this and quite easily much more, and that is not counting any of the recurring costs. I don't know what kind of performance Amazons storage offers, but if is higher performance I can see the numbers rapidly climbing. Anyways Enterprise storage and consumer storage are two different things so the cost is not as outrageous as it seems.