Is Amazon cooking up cloudy big data service?

More than Elastic MapReduce

There's speculation and scuttlebutt that cloud computing juggernaut Amazon is pondering fluffing up a big data crunching service as part of its Amazon Web Services subsidiary.

A blog at the New York Times suggests that the retailing giant, which has some of the best analytics in the world to run its online operation, might want to apply that expertise to an AWS service, much as it sells raw infrastructure and various computing, storage, networking, and database services today.

The idea certainly makes sense, particularly for the hundreds of thousands of businesses that use AWS for all or part of their IT infrastructure rather than using machines in their own data centers.

Amazon has a number of unique assets that can be brought to bear should it decide to offer a big data service. First and foremost, Amazon has a wealth of data that it gathers from its own operations and its affiliates, and it makes these figures available on a very select basis to its largest retailers in a service called Amazon Retail Analytics (ARA) Premium.

This is chock full of online shopper buying patterns that Amazon makes available to help retail affiliates better peddle their products. Amazon, of course, dices and slices this data to drive its own Amazon.com online store.

If you run your own website on AWS, using the EC2 compute, S3 storage and RDS database services, you also have the advantage of having all of the big data you might want to chew on inside the Amazon firewall and on the internal AWS network, so it would be somewhat easier to bring the clickstream and log file data from your AWS services into one place to feed to more Amazon machinery.

Amazon has been peddling the Elastic MapReduce service, an implementation of the open source Hadoop data muncher, on its EC2 and S3 services since April 2009, and big webby app providers including Yelp, Foursquare, Etsy and Razorfish are all using the service. Karmasphere has even created a graphical tool that plugs into Eclipse IDEs to do queries against data stored in the Elastic MapReduce service and to manage the virtual clusters running Hadoop.

The NYT speculates that Amazon might mix in payment security, fraud detection and product recommendation services as part of this hypothetical Amazon big data service. These are big data services that Amazon has developed over its history that would no doubt be useful to its partners – and maybe even its competitors.

Speaking of competitors, there is absolutely nothing that would stop any of the current suppliers of data warehousing and big data analytics tools from setting up shop on AWS and offering such services, whether or not Amazon itself wraps up such tools as an uber-service.

Such big data vendors would not, of course, have access to the ARA data unless Amazon decided to sell it. IBM and Oracle would seem to be more inclined to run their various data warehousing and analytics software on their own clouds. But SAS Institute, Teradata and the handful of commercial distributors of Hadoop stacks (Cloudera, Hortonworks, MapR and EMC are in there alongside IBM and Oracle) might be tempted to offer their wares as a service on AWS.

Amazon was contacted by El Reg for comment on this speculation and was not available at time of publication. ®
