Feeds

Cloud mega-uploads aren't easy

Google, Microsoft, can't explain how to get big data into the cloud, despite rivals' import services

  • alert
  • submit to reddit

Build a business case: developing custom apps

Google and Microsoft don't offer formal data ingestion services to help users get lots of data into the cloud, and neither seems set to do so anytime soon. Quite how would-be users take advantage of the hundreds of terabytes both offer in the cloud is therefore a bit of a mystery.

Data ingestion services see cloud providers offer customers the chance to send them hard disks for rapid upload into the cloud. Amazon Web Services' import/export service was among the first such services and offers the chance to ingest up to 16TB of data, provided it is no more than 14 inches high by 19 inches wide by 36 inches deep (8Us in a standard 19 inch rack) and weighs less than 50 pounds.

Rackspace offers a similar service, dubbed Cloud Files Bulk Import. Optus, the Australian arm of telecoms giant Singtel, will happily offer a similar service. Australian cloud Ninefold does likewise, branding it "Sneakernet".

Some other cloud providers offer such a service, even if it is not productised or advertised. The Register spoke to one cloudy migrant who (after requesting anonymity) told us they borrowed a desktop network attached storage (NAS) device from their new cloud provider, bought another, uploaded data to the devices and then despatched a staffer on a flight to the cloud facility. The NASes were carry-on luggage and the travelling staffer cradled them on their lap during the flight.

It was worth going to those lengths because, as AWS points out in the spiel for its import/export service, doing so “is often much faster than transferring that data via the Internet.”

To understand why, consider the fact that headline speeds advertised on broadband connections aren't always achieved in the real world. Optus, for example, told us that while its fastest broadband connection hums along at 3-5 Gbps, the standard service level agreement “guarantees a speed of 300 Mbps, above which we would conduct fibre checks to ensure additional capacity can be reserved for the customer.” At that speed each terabyte would take about eight hours to upload, and that's with an optimistic assumption of 10% overhead and general network messiness.

It's hard to imagine how that kind of speed will be of any use for cloud services which offer petabyte-scale cloud storage, such as Azure's (or whatever it is called this week) pricing tier for amounts of data “Greater than 5 PB.” Google's BigQuery also promises to support “analysis of datasets up to hundreds of terabytes.”

Both Google and Microsoft, however, offered no details when El Reg prodded them for an explanation of just how customers can get that much data into their clouds. That's despite Microsoft telling your correspondent, in a past professional life, that it was “evaluating” such a service back in 2010.

If you think this all sounds a bit theoretical, the lack of ingestion services from the Chocolate Factory is already leading to some bizarre work-arounds.

Craig Deveson, a serial cloud entrepreneur who currently serves as CEO and Co-Founder of Wordpress backup plugin vendor cloudsafe365, says Google's lack of data ingestion services became “a genuine issue” when he worked on a Gmail migration for a large Australian software company. During that project he found the best way to get substantial quantity of old email data into Google's cloud was first to send disks to Singapore for upload into Amazon's S3 cloud storage service. Once in Amazon's cloud “we had to run a program to ingest it into Google's back end.”

Similar tricks are needed to pump lots of data into software-as-a-service providers' clouds.

Salesforce.com, for example, advised us that bulk uploads are made possible by a Bulk API which happily puts SOAP and REST to work to suck up batches of 10,000 records at a time. “Even while data is still being sent to the server, the Force.com platform submits the batches for processing,” the company said.

Pressed if disks are accepted, the company responded that “All common database products provide a capability to extract to a common file format like .csv.”

Whether anyone can afford to wait for that .csv, or other larger files, to arrive is another matter. ®

Boost IT visibility and business value

More from The Register

next story
Sysadmin Day 2014: Quick, there's still time to get the beers in
He walked over the broken glass, killed the thugs... and er... reconnected the cables*
Auntie remains MYSTIFIED by that weekend BBC iPlayer and website outage
Still doing 'forensics' on the caching layer – Beeb digi wonk
SHOCK and AWS: The fall of Amazon's deflationary cloud
Just as Jeff Bezos did to books and CDs, Amazon's rivals are now doing to it
VVOL update: Are any vendors NOT leaping into bed with VMware?
It's not yet been released but everyone thinks it's the dog's danglies
BlackBerry: Toss the server, mate... BES is in the CLOUD now
BlackBerry Enterprise Services takes aim at SMEs - but there's a catch
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
How modern custom applications can spur business growth
Learn how to create, deploy and manage custom applications without consuming or expanding the need for scarce, expensive IT resources.
Build a business case: developing custom apps
Learn how to maximize the value of custom applications by accelerating and simplifying their development.