AWS v Oracle: Mark Hurd schooled on how to run a public cloud that people actually use
Amazon VP takes Big Red's co-CEO to task over server boast
Amazon's AWS infrastructure boss has slapped down Oracle co-CEO Mark Hurd after the latter boasted that Big Red needs fewer data centers because its systems are, apparently, twice as good.
Writing on his personal blog this week, James Hamilton, an Amazon distinguished engineer, said the suggestion that Oracle can compete with the cloud world's Big Three – AWS, Azure and Google Cloud – by building fewer data centers with "better servers" is, to loosely paraphrase the exec, a bit bonkers.
In a Fortune interview last week, Hurd bragged:
If I have two-times faster computers, I don't need as many data centers. If I can speed up the database, maybe I need one fourth as many data centers. I can go on and on about how tech drives this.
Hurd was trying to explain why cheapskate Oracle had spent just $1.7bn on increasing its cloud data center capacity in 2016, whereas the Big Three together had blown through $31bn that year. The bigwig insisted third-tier Oracle is competitive in the market despite this scrimping.
Straight off the bat we can think of two problems with the database giant's approach: redundancy and latency. Fewer data centers means that when a large IT breakdown happens – and even AWS has epic meltdowns – the impact will be greater because you've put all your eggs in a few baskets. And if you don't have many data centers spread out over the world, customers will find their packets take longer to reach your servers than a rival's boxes. That's not particularly competitive.
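To put a rough number on the latency point, here is a back-of-the-envelope sketch in Python. It assumes signals in optical fiber travel at about two-thirds the speed of light in a vacuum, and it ignores routing detours, queuing and server time, so real round trips will be slower. The function name and the sample distances are ours, purely for illustration.

```python
# Lower bound on round-trip time imposed by distance alone.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 0.67        # assumption: typical for optical fiber


def min_rtt_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a fiber path of this length."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_seconds * 1000


# Illustrative distances only, not measurements of any provider's network.
for km in (100, 1_000, 5_000, 10_000):
    print(f"{km:>6} km: at least {min_rtt_ms(km):.1f} ms round trip")
```

No amount of server speed claws that time back: the only way to shrink it is to put a data center closer to the customer.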
Hamilton had similar thoughts, and took the opportunity to lay a few facts down on Hurd. If you're interested in the design of multi-data-center systems, it's a rare insight into Amazon's thinking.
"Of course, I don't believe that Oracle has, or will ever get, servers two-times faster than the big three cloud providers," Hamilton opened with.
"I also would argue that 'speeding up the database' isn't something Oracle is uniquely positioned to offer. All major cloud providers have deep database investments but, ignoring that, extraordinary database performance won't change most of the factors that force successful cloud providers to offer a large multi-national data center footprint to serve the world."
The Amazon man also brought up the big costs and engineering limits that arise when building extremely large data centers. At some point, energy bills, network infrastructure overheads and other factors will negate the cost benefits of throwing more servers into a single region, he said. It's yet another reason why multiple smaller centers are better than a few stuffed-to-the-gills warehouses.
AWS limits its facilities to a 25 to 30MW range, as scaling beyond that begins to diminish cost returns, we're told.
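A toy model shows why faster servers alone may not shrink the footprint much. This is our own simplification, not Hamilton's math: it assumes a provider needs whichever is larger, the number of facilities required to house its capacity (capped at 30MW each, the top of the range above) or the number of locations required for reasons other than raw capacity, such as the redundancy and latency concerns raised earlier. The demand and location figures are hypothetical.

```python
import math

FACILITY_CAP_MW = 30  # upper end of the 25 to 30MW range cited above


def facilities_needed(demand_mw: float, speedup: float, required_locations: int) -> int:
    """Larger of the capacity-driven and coverage-driven facility counts.

    demand_mw          -- hypothetical total power demand with baseline servers
    speedup            -- how many times faster the servers are (1 = baseline)
    required_locations -- hypothetical sites needed regardless of capacity
    """
    capacity_driven = math.ceil(demand_mw / speedup / FACILITY_CAP_MW)
    return max(capacity_driven, required_locations)


# Hypothetical provider: 600MW of baseline demand, 40 locations needed for coverage.
for speedup in (1, 2, 4):
    print(f"{speedup}x servers: {facilities_needed(600, speedup, 40)} facilities")
```

In this made-up case the count stays at 40 however fast the servers get, because coverage rather than capacity sets the floor. Speed only helps once capacity is the binding constraint.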
Hamilton also noted the logistical issues that arise when a cloud provider relies too heavily on "last mile" networks to carry traffic for entire regions, rather than building lots of individual facilities connected via a private backbone, as Amazon prefers to do. He also said businesses prefer to use nearby centers not just for latency reasons but also for legal reasons: an organization in one country may not be able to store particular data in, say, the United States, so having a healthy choice of facilities scattered across the world is more customer friendly than a limited number.
"Some cloud computing users really want to serve their customers from local data centers and this will impact their cloud provider choices. In addition, some national jurisdictions will put in place legal restrictions that make it difficult to fully serve the market without a local region," Hamilton said.
"Even within a single nation, there will sometimes be local government restrictions that won't allow certain types of data to be housed outside of their jurisdiction. Even within the same country [they] won't meet the needs of all customers and political bodies."
The comments underscore just how divided the various cloud compute providers remain in their approaches from both an engineering and business perspective. Oracle, for example, has opted to push its cloud as part of a larger Exadata server brand, while Amazon focuses on the reliability and scale of its AWS network, and Google pushes its Cloud to businesses by promising link-ups to its G Suite and AdWords offerings.
Finally, the top cloud players are all using customized chips tuned for performance to get an edge over their competitors.
"Oracle is hardly unique in having their own semiconductor team," said Hamilton.
"Amazon does custom ASICs, Google acquired an ARM team and has done custom ASIC for machine learning. Microsoft has done significant work with FPGAs and is also an ARM licensee. All the big players have major custom hardware investments underway, and some are even doing custom ASICs. It’s hard to call which company is delivering the most customer value from these investments, but it certainly doesn’t look like Oracle is ahead.
"We will all work hard to eliminate every penny of unneeded infrastructure investment, but there will be no escaping the massive data center counts outlined here nor the billions these deployments will cost. There is no short cut and the only way to achieve excellent world-wide cloud services is to deploy at massive scale." ®
AWS held a summit for customers in San Francisco on Wednesday, where it announced a bunch of stuff – a lot of it you'll have seen previewed at re:Invent in November. The announcements include a DynamoDB accelerator, the availability of Redshift Spectrum for running really large S3 storage queries, EC2 F1 instances with FPGAs you can program, AWS X-Ray with Lambda integration, the arrival of Lex, and Amazon Aurora with PostgreSQL compatibility.
The F1 instances are pretty interesting. One startup in this space to watch is UK-based AWS partner Reconfigure.io, which is offering an alpha-grade toolchain to build and run Go code on the Xilinx UltraScale Plus FPGAs attached to F1 virtual machines. That's much nicer than fooling around with hardware languages to accelerate bits of your codebase in silicon.