Bitbucket's Amazon DDoS - what went wrong

A cautionary cloud tale

After a DDoS brought down Bitbucket's web-based code-hosting service for more than 19 hours over the weekend, Jesper Nøhr speculated the attack had exposed a flaw in the sky-high Amazon infrastructure that hosts the site. Nøhr - who runs Bitbucket - has since spoken to an "Amazon executive" about the attack, and according to his account of the conversation, his earlier speculation was right on the money.

Bitbucket runs its entire site on Amazon's Elastic Compute Cloud (EC2), which provides scalable processing resources, and it uses Amazon's Elastic Block Store (EBS) to store its database, log files, user data, and more. EBS provides persistent storage for EC2 server instances. The problem, according to Jesper Nøhr, is that the storage system operates on a network channel that's exposed to the outside internet.

Bitbucket's Amazon setup worked well enough until late last Friday, when Nøhr realized EBS was "virtually unavailable." The outage persisted for more than 16 hours, in part because both Nøhr and Amazon's support reps assumed there was some sort of problem with EBS. According to Nøhr, the first rep he spoke to attributed the slowdown to the fact that EBS is a shared resource used by other bandwidth-hungry Amazon customers. In a statement sent to The Reg, Amazon gives a similar story.

"Over the weekend, one of our customers reported a problem with their Amazon Elastic Block Store (EBS)," the statement reads. "This issue was limited to this customer’s single Amazon EBS volume and other customers were not affected. We did not immediately look beyond the reported problem and spent too much time focusing on what was believed to be an issue with the Amazon EBS volume."

But as it turns out, Bitbucket's Amazonian infrastructure had been DDoSed. "We were attacked. Bigtime. We had a massive flood of UDP [User Datagram Protocol] packets coming in to our IP, basically eating away all bandwidth to the box," Nøhr wrote on his blog. "So, basically a massive-scale DDOS. That’s nice."

Once the cause of the problem was determined - more than 16 hours after the attack started - Amazon blocked the offending traffic, and things were soon back to normal. But Nøhr - and so many other netizens who followed the story - couldn't understand why a DDoS attack tied up what should have been "internal" storage resources.

Nøhr guessed that Bitbucket's storage sits on the same network interface that connects the site to the outside world, and according to Nøhr, this has been confirmed by Amazon. "We were speculating whether all the traffic was on the same interface, and [the Amazon EC2 executive] told us this was true," Nøhr told The Reg.

According to Nøhr, Amazon also told him that the company's Quality of Service technology - meant to prioritize the storage traffic - did not work as the company expected. "They said they were supposed to prioritize EBS traffic over other traffic so we wouldn't be bogged down by external traffic," Nøhr says. "But they admitted it wasn't working the way they wanted it to."
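
To see why prioritization matters when storage and outside traffic share one interface, consider the toy model below. It is only an illustrative sketch, not a description of Amazon's actual network or QoS implementation: the function name, the packet rates, and the link capacity are all made-up assumptions, and the queues are unbounded where real hardware would drop packets. It compares a plain first-in-first-out link with one that always sends storage packets first.

```python
from collections import deque


def storage_delivery_ratio(ticks, capacity, storage_rate, flood_rate, prioritize):
    """Fraction of storage packets that get across a shared link.

    Hypothetical model: each tick, `flood_rate` flood packets and
    `storage_rate` storage packets arrive, and the link can send
    `capacity` packets. All numbers are illustrative assumptions.
    """
    high = deque()  # storage traffic, used only when prioritize=True
    low = deque()   # everything else (all traffic in FIFO mode)
    delivered = 0
    for _ in range(ticks):
        # Arrivals: the flood lands first, then the storage packets.
        low.extend(["flood"] * flood_rate)
        (high if prioritize else low).extend(["storage"] * storage_rate)

        # Departures: strict priority drains the high queue first.
        budget = capacity
        while budget and (high or low):
            packet = high.popleft() if high else low.popleft()
            delivered += packet == "storage"
            budget -= 1
    return delivered / (ticks * storage_rate)


if __name__ == "__main__":
    shared = dict(ticks=100, capacity=20, storage_rate=10, flood_rate=90)
    print("FIFO:    ", storage_delivery_ratio(prioritize=False, **shared))
    print("Priority:", storage_delivery_ratio(prioritize=True, **shared))
```

With these invented numbers, the FIFO link delivers only a fifth of the storage packets while the flood is running, whereas the prioritized link delivers all of them. That gap is roughly what a working QoS scheme is supposed to buy, and what Nøhr says Amazon admitted it was not getting.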

Amazon has not responded to a request for comment on this specific issue. But an earlier statement from the company doesn't contradict what Nøhr has said.

"What we ultimately found was not a problem with Amazon EBS, but rather that the customer’s Amazon EC2 instance was receiving a very large amount of network traffic," the statement reads. "This large flood of traffic overwhelmed the networking of the customer’s single Amazon EC2 instance and caused performance to degrade on all I/O operations on the instance. Once we properly diagnosed the problem, we worked with the customer to put measures in place to help mitigate the unwanted traffic they were receiving."

Like many, Scott Morrison - chief architect and VP of engineering at Layer 7, a company that offers an outside security solution for Amazon's so-called cloud - finds it rather hard to believe that Amazon would put EBS on an outside net connection. "It seems like [EBS] shouldn't be externally accessible," he tells The Reg. "It's bizarre. That's sort of like making NFS mounts accessible outside your firewall - something you would never do."

The other problem with Amazon's setup, according to Jesper Nøhr, is that customers like him have no way of viewing the DDoS traffic hitting their sites - i.e. they have no way of identifying an attack. What's more, he says, Amazon told him that even the "Gold" support reps he initially spoke to didn't have a way of viewing the traffic.

"[Amazon] said that there is a department at Amazon that monitors such traffic, but [Amazon] said the first line of support can't see it," Nøhr says. "In short, you can't really see into the problem, because Amazon's Web Services is kind of a black box."

None too surprisingly, Layer 7's Scott Morrison calls this "a huge problem." Again, Amazon did not respond to a request for comment on this particular issue.

On Friday, Nøhr paid $400 to get access to Gold support. To Amazon's credit, it has told Nøhr it will refund the money. And though he questions Amazon's setup, he feels that the company ultimately responded quite well to the problem. "Amazon has been very transparent with us and very apologetic. I don't want their name to be dragged through the mud."

Amazon does tell The Reg that such an attack might have been avoided if Bitbucket had been using additional Amazon services, such as the recently announced Elastic Load Balancing and Auto-Scaling. And Nøhr says the company told him much the same.

Nøhr says the company also told him that in the future, it would provide additional information about web traffic to customers and support personnel in an effort to better identify such attacks.

Nonetheless, says Layer 7's Scott Morrison, all this should serve as a cautionary tale for those eyeing the, um, cloud. "This is exactly what people have been warning about in the cloud for a while," he says. "Sure enough, here is the perfect example." ®
