We sense a great disturbance in the Salesforce: File-slinging feature breaks down for more than 12 hours

It felt as if millions of SaaS customers suddenly cried out in frustration, then headed off for the weekend early


It has been a rough morning for Salesforce here in San Francisco: part of the cloud giant's sprawling empire fell over and stayed down for more than 12 hours. In fact, it stumbled over so hard, CEO Marc Benioff's techies are having to apply fixes to servers manually.

The SaaS titan told customers on Thursday night its Files service, used to store and share documents, was struggling worldwide to, well, read and write files. It's now Friday lunchtime, and some data remains inaccessible as engineers continue to roll out repairs.

"The technology team continues to manually update a subset of servers that need to have the fix implemented as well as test and validate the resynchronization process, which is the next phase of the resolution path," Salesforce told customers at 1030 PT on Friday.

"We continue to see a reduction in read/write alerts, and some customers are reporting that they can now read and write to new files.

"However, there is still a subset of customers being impacted by the issue. Files that were created after the incident began and before the fix was applied, however, may remain unavailable until resynchronization is complete."

The systems crashed on Thursday night at roughly 2330 PT, meaning the Files part of Salesforce fell over just as Western Europe was waking up and starting its Friday.

The outage then carried over into the working day on the US East Coast and in the Midwest, and into the morning on the West Coast, where Salesforce is based.

By 1100 PT, the cloud giant said it was nearly through the tedious process of hand-fixing servers to address the problem.

"The technology team now has less than 5 per cent of the affected servers to manually apply the fix to, and the resynchronization process is being implemented to a subset of production hosts," the biz explained. "If the resynchronization completes successfully on those hosts, the team will then start the process of implementing resynchronization across the rest of the affected environment."

As we were preparing to publish this piece, shortly after 1300 PT, Salesforce posted the following revised update, indicating the outage is ongoing as is the repair work:

"The technology team continues to manually apply the fix to a subset of the affected servers in three of the impacted data centers. All servers in the other five affected data centers have had the fix implemented.

"The synchronization process is the next phase of the resolution path that will ensure files that were created after the incident began and before the fix was applied are once again available to customers. That process was initiated within the first of the eight data centers, and on successful completion of that activity and after the technology team validates the implementation, we will then initiate that process across the remaining data centers."

Inconvenient as it is, Friday's problem pales in comparison to the nightmare outage in May that caused portions of the service to go down over a span of three days.

Hopefully, most of those affected were able to find an alternative way to move files or, failing that, at least head out for an early start to the weekend. ®
