Is Amazon cooking up cloudy big data service?

More than Elastic MapReduce

There's speculation and scuttlebutt that cloud computing juggernaut Amazon is pondering fluffing up a big data crunching service as part of its Amazon Web Services subsidiary.

A blog at the New York Times suggests that the retailing giant, which has some of the best analytics in the world to run its online operation, might want to apply that expertise to an AWS service, much as it sells raw infrastructure and various computing, storage, networking, and database services today.

The idea certainly makes sense, particularly for the hundreds of thousands of businesses that use AWS for all or part of their IT infrastructure rather than using machines in their own data centers.

Amazon has a number of unique assets that it can bring to bear should it decide to offer a big data service. First and foremost, Amazon has a wealth of data that it gathers from its own operations and its affiliates, and it makes these figures available on a very select basis to its largest retailers in a service called Amazon Retail Analytics (ARA) Premium.

This is chock full of online shopper buying patterns that Amazon makes available to help retail affiliates better peddle their products. Amazon, of course, dices and slices this data to drive its own Amazon.com online store.

If you run your own website on AWS, using the EC2 compute, S3 storage and RDS database services, you also have the advantage of having all of the big data you might want to chew on inside the Amazon firewall and on the internal AWS network, so it would be somewhat easier to bring the clickstream and log file data from your AWS services into one place to feed to more Amazon machinery.

Amazon has been peddling the Elastic MapReduce service, an implementation of the open source Hadoop data muncher, on its EC2 and S3 services since April 2009, and big webby app providers including Yelp, Foursquare, Etsy and Razorfish are all using the service. Karmasphere has even created a graphical tool that plugs into Eclipse IDEs to do queries against data stored in the Elastic MapReduce service and to manage the virtual clusters running Hadoop.
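For readers who have not met the MapReduce pattern that Hadoop implements, the following is a minimal sketch in plain Python of the kind of clickstream crunching described above. It does not use Hadoop, Elastic MapReduce or any AWS API; the log format, the sample data and the function names (`map_hits`, `reduce_hits`, `count_hits`) are all assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical clickstream sample: each line is "<visitor> <page>".
# This format is an assumption for illustration, not a real AWS log format.
SAMPLE_LOG = [
    "visitor1 /home",
    "visitor2 /cart",
    "visitor1 /cart",
    "malformed",
]


def map_hits(line):
    """Map step: emit one (page, 1) pair per well-formed log line."""
    parts = line.split()
    if len(parts) == 2:
        yield parts[1], 1


def reduce_hits(pairs):
    """Reduce step: sum the counts emitted for each page."""
    totals = defaultdict(int)
    for page, count in pairs:
        totals[page] += count
    return dict(totals)


def count_hits(lines):
    """Run the map step over every line, then reduce the combined output."""
    return reduce_hits(chain.from_iterable(map_hits(line) for line in lines))


print(count_hits(SAMPLE_LOG))  # {'/home': 1, '/cart': 2}
```

In a real Hadoop cluster the map and reduce steps would run in parallel across many machines, with the framework shuffling the intermediate pairs between them; here everything runs in a single process purely to show the shape of the computation.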

The NYT speculates that Amazon might mix in payment security, fraud detection and product recommendation services as part of this hypothetical Amazon big data service. These are big data services that Amazon has developed over its history that would no doubt be useful to its partners – and maybe even its competitors.

Speaking of competitors, there is absolutely nothing that would stop any of the current suppliers of data warehousing and big data analytics tools from setting up shop on AWS and offering such services, whether or not Amazon itself wraps up such tools as an uber-service.

Such big data vendors would not, of course, have access to the ARA data unless Amazon decided to sell it. IBM and Oracle would seem more inclined to run their various data warehousing and analytics software on their own clouds. But SAS Institute, Teradata and the handful of commercial distributors of Hadoop stacks (Cloudera, Hortonworks, MapR and EMC are in there alongside IBM and Oracle) might be tempted to offer their wares as a service on AWS.

El Reg contacted Amazon for comment on this speculation, but the company was not available at the time of publication. ®
