Amazon cloud still on fritz after 36 hours

'All hands on deck'

Amazon's cloud is still on the fritz, a day and a half after the company first reported connection problems, latency issues, and increased error rates across the service. But on Friday morning, the company said that full service should be restored for a "majority" of users by the afternoon Pacific time.

"We continue to see progress in recovering volumes, and have heard many additional customers confirm that they're recovering. Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours," the company said in a post to its Amazon Web Services status page.

In some cases, Amazon said, it will take longer to restore data. With these volumes, the company is having to restore backups it made to its own S3 online storage service on Thursday.

The problems began in the early hours of Thursday morning Pacific time. At 1:41 am, Amazon said on its status page that it was investigating connectivity issues with its EC2 (Elastic Compute Cloud) service, which provides on-demand access to processing power via the web. The outage brought down several websites that run atop the service, including Quora, Sencha, Reddit, and Foursquare.

The outage also affected Amazon's Elastic Block Store, Relational Database Service, and Elastic Beanstalk services. And according to one post from the company, it all began with a "networking event" that triggered a large amount of re-mirroring of EBS volumes in the "East region" of Amazon Web Services. Amazon divides its so-called infrastructure cloud service into multiple geographic regions, and it guarantees 99.95 per cent availability within each region.
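To put that 99.95 per cent figure in perspective, here is a rough back-of-the-envelope sketch of how much downtime such a guarantee allows. It assumes the percentage is measured over a 365-day year, which this article does not state, and it treats the 36 hours in the headline as total unavailability, which may overstate the impact for many customers. The function name is purely illustrative.

```python
def allowed_downtime_hours(availability_pct: float, period_hours: float) -> float:
    """Return the hours of downtime permitted in a period at a given availability."""
    return period_hours * (1 - availability_pct / 100)


# Assumption: the guarantee is measured over a 365-day year.
HOURS_PER_YEAR = 365 * 24

budget = allowed_downtime_hours(99.95, HOURS_PER_YEAR)
outage_hours = 36  # the figure from the headline, taken at face value

print(f"Allowed downtime per year at 99.95%: {budget:.2f} hours")
print(f"A {outage_hours}-hour outage is about {outage_hours / budget:.1f}x that allowance")
```

Under those assumptions the yearly allowance works out to a little under four and a half hours, so an outage of a day and a half would exceed it several times over.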

Some regions, including the East region, are divided into multiple "availability zones". For years, Amazon has said that these zones are "insulated" from each other's failures. But yesterday's outage spread across zones in the East region. Amazon has never said how these zones are designed. It's unclear whether they're locations in separate data centers or not.

"We can assure you that all-hands are on deck to recover as quickly as possible," the company said late last night. ®

Update

Amazon has now said that a majority of volumes have indeed been restored. "These volumes were recovered by ~1:30pm PDT," the company said at 2:15pm Pacific time. "We mentioned that a 'smaller number of volumes will require a more time consuming process to recover, and we anticipate that those will take longer to recover.' We're now starting to work on those."
