Is Amazon cooking up cloudy big data service?

More than Elastic MapReduce

There's speculation and scuttlebutt that cloud computing juggernaut Amazon is pondering fluffing up a big data crunching service as part of its Amazon Web Services subsidiary.

A blog at the New York Times suggests that the retailing giant, which has some of the best analytics in the world to run its online operation, might want to apply that expertise to an AWS service, much as it sells raw infrastructure and various computing, storage, networking, and database services today.

The idea certainly makes sense, particularly for the hundreds of thousands of businesses that use AWS for all or part of their IT infrastructure rather than using machines in their own data centers.

Amazon has a number of unique assets that it can bring to bear should it decide to offer a big data service. First and foremost, Amazon has a wealth of data that it gathers from its own operations and its affiliates, and it makes these figures available on a very select basis to its largest retailers in a service called Amazon Retail Analytics (ARA) Premium.

This is chock full of online shopper buying patterns that Amazon makes available to help retail affiliates better peddle their products. Amazon, of course, dices and slices this data to drive its own Amazon.com online store.

If you run your own website on AWS, using the EC2 compute, S3 storage and RDS database services, all of the big data you might want to chew on is already inside the Amazon firewall and on the internal AWS network. That would make it somewhat easier to bring the clickstream and log file data from your AWS services into one place to feed to more Amazon machinery.

Amazon has been peddling the Elastic MapReduce service, an implementation of the open source Hadoop data muncher, on its EC2 and S3 services since April 2009, and big webby app providers including Yelp, Foursquare, Etsy and Razorfish are all using the service. Karmasphere has even created a graphical tool that plugs into Eclipse IDEs to do queries against data stored in the Elastic MapReduce service and to manage the virtual clusters running Hadoop.

The NYT speculates that Amazon might mix in payment security, fraud detection and product recommendation services as part of this hypothetical Amazon big data service. These are big data services that Amazon has developed over its history that would no doubt be useful to its partners – and maybe even its competitors.

Speaking of competitors, there is absolutely nothing that would stop any of the current suppliers of data warehousing and big data analytics tools from setting up shop on AWS and offering such services, whether or not Amazon itself wraps up such tools as an uber-service.

Such big data vendors would not, of course, have access to the ARA data unless Amazon decided to sell it. IBM and Oracle would seem more inclined to run their various data warehousing and analytics software on their own clouds. But SAS Institute, Teradata and the handful of commercial distributors of Hadoop stacks – Cloudera, Hortonworks, MapR, and EMC are in there with IBM and Oracle – might be tempted to offer their wares as a service on AWS.

El Reg contacted Amazon for comment on this speculation, but the company was not available at time of publication. ®
