PaaS Heroku's misleading of customers poisons the cloud

If you can't trust providers, then why outsource in the first place?

Comment Platform-as-a-service Heroku spent two years misleading customers because it was so focused on building a new product that it didn't bother to update old documentation – this is unacceptable.

The Salesforce-owned company admitted on Wednesday that customers using its 'Bamboo' application automation technology could have spent two years paying for a service that used different tech to that described in Heroku's docs, and that this tech entailed poor app performance that was nearly impossible for both Heroku and its analytics partner New Relic to detect.

An equivalent situation would be a logistics company failing to tell customers that it had stopped using a central dispatcher to direct its trucks and had moved to a system in which each driver held a list of all packages and had to work out the delivery order with the other drivers, with no central command. Worse, the change meant packages were not being delivered on time, and the company had no way of telling customers how late their packages were.

Here's what Heroku said on Wednesday in an FAQ document responding to the customer concerns* provoked by its undocumented shift from intelligent routing to random routing:

Since early 2011, high-volume Rails apps that run on Heroku and use single-threaded web servers sometimes experienced severe tail latencies and poor utilization of web backends (dynos). Lack of visibility into app performance, including incorrect queue time reporting prior to the New Relic update in February 2013, made diagnosing these latencies (by customers, and even by Heroku’s own support team) very difficult.

Q. Were the docs wrong?

A. Yes, for Bamboo. They were correct when written, but fell out of date starting in early 2011. Until February 2013, the documentation described the Bamboo router only sending one connection at a time to any given web dyno.

Q. Why didn’t you update Bamboo docs in 2011?

A. At the time, our entire product and engineering team was focused on our new product, Cedar. Being so focused on the future meant that we slipped on stewardship of our existing product.
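To see why the switch matters for single-threaded dynos, here is a rough toy simulation, not Heroku's actual router. It compares a router that holds each request until a dyno is free (the "one connection at a time" behaviour the old docs described) with one that picks a dyno at random. The function name `p99_wait`, the dyno count, the load figure and the exponentially distributed service times are all assumptions made purely for illustration.

```python
import random


def p99_wait(policy, n_dynos=10, n_requests=20000, load=0.7, seed=1):
    """Return the 99th-percentile queueing delay (seconds) in a toy model.

    Every parameter here is an illustrative assumption, not a Heroku figure.
    """
    workload = random.Random(seed)     # identical arrivals for both policies
    routing = random.Random(seed + 1)  # only consulted by the random policy
    mean_service = 0.1                 # assumed average request time, seconds
    arrival_rate = load * n_dynos / mean_service

    free_at = [0.0] * n_dynos          # when each single-threaded dyno is next idle
    now = 0.0
    waits = []
    for _ in range(n_requests):
        now += workload.expovariate(arrival_rate)
        service = workload.expovariate(1.0 / mean_service)
        if policy == "random":
            # Send the request to any dyno, busy or not.
            dyno = routing.randrange(n_dynos)
        elif policy == "intelligent":
            # Hold the request for whichever dyno frees up first, which is
            # equivalent to one shared queue feeding idle dynos.
            dyno = min(range(n_dynos), key=lambda i: free_at[i])
        else:
            raise ValueError("unknown policy: " + policy)
        start = max(now, free_at[dyno])
        free_at[dyno] = start + service
        waits.append(start - now)

    waits.sort()
    return waits[int(0.99 * len(waits))]


if __name__ == "__main__":
    for policy in ("intelligent", "random"):
        print(policy, round(p99_wait(policy), 3))
```

Under these assumptions the random policy should show a markedly longer tail wait than the intelligent one at the same load, because a request can land behind a slow one while other dynos sit idle. The exact numbers depend entirely on the made-up parameters.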

Cloud computing lives or dies on the trust a consumer can have in its provider, and for a company to make a mistake of this magnitude is unacceptable. Yes, documentation errors happen, but the negative effects they can have on customers can be much more pernicious in a service provider environment.

"Yes, we clearly messed up here, and we're paying for it in the form of community wrath and upset customers," Heroku CTO Adam Wiggins told us via email. "We're doing what we can to set it right with the community (full disclosure and open communication), and with our customers (working directly with them to get their app's performance to a good place, or anything else they want)."

Apologizing to affected customers and helping them move their applications to the new software systems is all well and good, but it doesn't deal with the lasting damage this sort of thing does to the perception of cloud computing.

When developers provision services from a cloud provider, they are outsourcing a logistics and/or business-to-business function. The proposition is that they no longer have to care (much) about the technology the provider uses to deliver the service, and they get to spend more time on whatever it is their product derives its value from.

In exchange, the provider delivers the agreed-upon service in a timely manner, while using its domain expertise to either improve performance or reduce prices.

In Heroku's case, the company offered a service early on in its life that it found it could not operate as it grew.** It didn't tell customers this and in fact moved to a less sophisticated technology (random routing) to deal with the influx of customers that came to it for its cloud services capabilities.

By doing this it misled customers and even exposed some of them to degraded application performance. For some, Heroku's cloud got worse over time, rather than better. It's like Intel producing 'reverse-Moore's Law' chips with capabilities whose efficiency halved and pricing doubled every 18 months for a certain subset of customers without telling them.

Cloud computing is meant to bring about greater efficiency through the pooling of resources and expertise. When it works, it's a distillation of the finer features of capitalism. When it doesn't work - as in this case - it makes fools of its consumers and damages other companies in its space.

If Heroku (founded: 2007) got so carried away with new products that it made a significant change in 2011 and then forgot for two years to tell anyone, until a customer wrote a blog post lambasting the company, then wouldn't it be fair to surmise that other companies could have done the same thing?

Heroku has made changes to make sure that it can never make a foul-up of this magnitude again. "Nowadays we have a whole team devoted to documentation via our Dev Center, and we treat docs as a first-class citizen," Wiggins said.

Trust is a valuable commodity, and all companies would do well to avoid cheapening it through sloppiness paired with brash ambition. ®

Bootnote

* The routing change was highlighted by disgruntled Heroku customer Rap Genius in February.

** There's nothing wrong with changing technologies when you run into unforeseen problems. Heroku is not alone in finding routing to be a difficult technology to scale - cloud giant Amazon also tried to shift from random routing to a more intelligent routing method in the mid-2000s but ran into huge problems and eventually reverted to using standard Cisco gear, according to a post on popular developer hangout Hacker News. The problem lies in failing to tell customers that you've made the change.
