Amazon parks human genome on cloud

Taps Exploits world of boffins

In 1993, meatspace bookseller Barnes & Noble started offering Starbucks coffee to augment customers' shopping experience. Not to be outdone, the Internet's largest bookseller has finally answered.

Amazon announced yesterday that it will allow free, easy access to hard-to-find datasets like the human genome - and a few other compilations that aren't nearly as exciting. With its new Public Data Sets initiative, Amazon hopes to commoditize access to this kind of data, while positioning its EC2 system as the preferred computing platform for researchers.

For users, this service is likely to speed up the science without digging too far into the meager stack of grant money. Amazon is currently hosting data sets from biology, chemistry, and economics, having hunted down files in the public domain. To expand the service, the company is asking people to contribute more data, provided that it's not proprietary.

They boast free access to the data. "Free" is about as relative a term as "pregnant," but Amazon is still taking some liberty with it. While technically you don't have to pay for the data sets, the only way you can access them is by spinning up an EC2 virtual machine instance and mounting the data as an Elastic Block Store drive. Amazon charges for EC2 instances by the hour, and Elastic Block Store drives by the I/O request. Conveniently for Amazon, these data sets are as processable as they are large.

There are already several online directories of public domain data sets, so what's the value-add? When you run an EC2 virtual instance to access the data, you can choose a machine image that contains specialized processing tools. If parallel processing is a concern, you can spin up as many instances as you need to process the data and spin them down when you're done, only paying for the resources you use.
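For the grant-conscious, the arithmetic behind "only paying for the resources you use" can be sketched in a few lines of Python. This is a back-of-the-envelope model, not Amazon's price list: the two rates are made-up placeholders, `estimate_cost` is a hypothetical helper, and the model assumes the job splits perfectly across instances and that each instance is billed per started hour.

```python
import math

# Hypothetical rates for illustration only; not Amazon's actual prices.
HOURLY_RATE_USD = 0.10          # assumed cost per instance-hour
IO_RATE_USD_PER_MILLION = 0.10  # assumed cost per million I/O requests


def estimate_cost(total_compute_hours, io_requests, instances):
    """Estimate the bill for splitting a job evenly across `instances` machines.

    Assumes perfect parallelism and that each instance is billed per
    started hour, so partial hours round up.
    """
    hours_per_instance = math.ceil(total_compute_hours / instances)
    compute = instances * hours_per_instance * HOURLY_RATE_USD
    io = io_requests / 1_000_000 * IO_RATE_USD_PER_MILLION
    return round(compute + io, 2)


if __name__ == "__main__":
    # A 100-hour job with 50 million I/O requests, on 1, 10, 20 and 30 machines.
    for n in (1, 10, 20, 30):
        print(n, estimate_cost(100, 50_000_000, n))
```

Under these assumptions the bill is the same whether the job runs on one machine for 100 hours or on 20 machines for five hours each, so the speed-up costs nothing extra. The rounding only bites when the work does not divide into whole hours, as in the 30-instance case.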

This is still a bit dubious, because Amazon isn't really doing any work here. The data sets and processing tools come from third parties. Bezos just uses them to sell Amazon Web Services. As far as exploiting the academic spirit of sharing and mutual betterment for profit, the Public Data Sets program is a winner.

While working with the data in Amazon's environment may be a bit faster and more convenient, the target market for this type of service is the notoriously tight-assed academic researcher. While a tenured professor may see a minor productivity increase by using Amazon Web Services, she can see an order of magnitude productivity increase by enslaving a graduate student to do the same work. Said graduate student, already living below the poverty line, is unlikely to spend money on Amazon. There is a time-honored tradition among these servile few of spending late nights in the computer lab slicing up US Census data because it's difficult to get SAS or SPSS licenses for their laptops.

Be that as it may, Amazon isn't making a big bet on this one. If the analytics tools provided on EC2 take good enough advantage of the parallelism offered by virtualization, perhaps graduate students everywhere will decide to drop a little money. Less time spent analyzing data means more time spent partying, remembering that you're still under thirty and without serious responsibility. ®

Ted Dziuba is a co-founder at Milo.com. You can read his regular Reg column, Fail and You, every other Monday.
