Amazon opens doors to Kinesis: The Hotel California of the cloud

You can check in your data anytime you like, but as for leaving...

re:Invent 2013 The more data you put in a cloud, the harder it is to migrate away. And so Amazon's new "Kinesis" data ingester is a neat piece of technology, and at the same time a canny way to turn Amazon Web Services into the Hotel California of the cloud.

Kinesis was announced by the web bazaar's chief technology officer Werner Vogels in a speech at the company's re:Invent conference today. It's essentially Amazon's attempt to fire up a commercial variant of open-source data processing and messaging engines Storm, Spark Streaming, and Kafka.

The difference between Kinesis and these systems is that Amazon handles all the pesky infrastructure management and provisioning, and simply exposes the system to a developer as a service that lets the programmer pick what data to ingest, how much, and where to feed it to.

Kinesis will also compete with commercial systems, like Google's BigQuery – though the ad-slinger's streaming capabilities are rudimentary in comparison to Kinesis's data-huffing tech.

Amazon's service can "collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources," the biz wrote in a document discussing the tech. It can then stream this out rapidly to other complementary AWS technologies such as DynamoDB or Redshift for analysis or presentation.

Some of the apps this makes possible include social media data-mining software, live analytics on financial markets, or a way of feeding changing inventory information from massive stores into software that re-orders stock.

"In a few clicks and a couple of lines of code, you can start building applications which respond to changes in your data stream in seconds, at any scale, while only paying for the resources you use," the company wrote.

Admins can set up a Kinesis stream by specifying how much input and output they need in blocks of 1MBps named 'Shards', via the AWS console, API, or SDKs. The size of the stream can be adjusted without needing a restart, and data is loaded in with HTTP PUTs.
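As a rough illustration of that sizing model, here is a minimal Python sketch that works out how many Shards a given ingest rate would need. It assumes only the 1MBps-per-Shard figure quoted above; the function name `shards_for_ingest` is made up for this example, and any separate per-Shard limit on output is not covered here.

```python
import math


def shards_for_ingest(rate_mbps: float, shard_mbps: float = 1.0) -> int:
    """Return the number of Shards needed to absorb `rate_mbps` of input.

    Assumption: each Shard takes `shard_mbps` of input (1MBps per the
    figure quoted in the article). A stream needs at least one Shard.
    """
    if rate_mbps < 0 or shard_mbps <= 0:
        raise ValueError("rates must be positive")
    # Round up: a partial block of capacity still costs a whole Shard.
    return max(1, math.ceil(rate_mbps / shard_mbps))


if __name__ == "__main__":
    for rate in (0.2, 2.5, 10):
        print(f"{rate} MBps -> {shards_for_ingest(rate)} Shard(s)")
```

Because capacity is bought in whole blocks, a stream peaking at 2.5MBps would be provisioned with three Shards under this assumption.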

Data placed in Kinesis is available for analysis "within seconds" and is stored across several bit barns for 24 hours, during which time it can be "read, re-read, backfilled, and analyzed," or moved into storage services like S3 and Redshift, the company said.

Amazon has also produced a client library for the service that automates how Kinesis adapts to "changes in stream volume, load-balancing streaming data, coordinating distributed services, and processing data with fault-tolerance," the company said.

However, Kinesis is an ambitious project and perhaps more prone to performance wobbles than other AWS services. "Real-time is definitely one of the areas where we have to crack a lot of nuts still," Vogels told El Reg. We will be watching its performance closely for any wobbles.

Kinesis costs $0.015 per 'Shard' per hour, along with $0.028 per 1,000,000 PUT transactions. Initially, the tech is available from the US East Region as a "Limited Preview" that developers will need to apply for. Inbound data transfer is free and there's no cost to transmit data from Kinesis into other AWS apps.

An AWS pricing example says Kinesis could be used to create an app ingesting 10MBps of data while feeding info out to two real-time processing applications for $4.22 a day.
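The arithmetic behind that figure can be sketched from the two list prices quoted above. The Python below is a back-of-the-envelope check, not AWS's own worked example: it assumes the 10MBps stream is provisioned as ten 1MBps Shards, and it infers the PUT volume from whatever is left of the $4.22. The function name `daily_cost` is hypothetical.

```python
from decimal import Decimal

# List prices as quoted in the article.
SHARD_HOUR_USD = Decimal("0.015")       # per Shard per hour
PUT_PER_MILLION_USD = Decimal("0.028")  # per 1,000,000 PUT transactions


def daily_cost(shards: int, puts_per_day: int) -> Decimal:
    """Daily bill in USD for a stream with `shards` Shards and `puts_per_day` PUTs."""
    shard_cost = SHARD_HOUR_USD * shards * 24
    put_cost = PUT_PER_MILLION_USD * Decimal(puts_per_day) / Decimal(1_000_000)
    return shard_cost + put_cost


# Assumption: 10MBps of input maps to ten 1MBps Shards.
shard_only = daily_cost(10, 0)

# Whatever remains of the quoted $4.22 would be PUT charges.
remainder = Decimal("4.22") - shard_only
implied_puts = remainder / PUT_PER_MILLION_USD * 1_000_000

if __name__ == "__main__":
    print(f"Shard cost per day: ${shard_only:.2f}")
    print(f"Left over for PUTs: ${remainder:.2f}")
    print(f"Implied PUTs per day: {implied_puts:,.0f}")
```

Under those assumptions the Shards account for most of the daily bill, and the remainder corresponds to roughly 22 million PUTs a day. The record size and PUT rate that AWS actually used in its example are not given here, so treat the inferred volume as illustrative only.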

Kinesis looks like the outcome of a big-data service which El Reg spotted in February, and dubbed the Mystery-Amazon-Data-Service (MADS).

MADS was going to be capable of "highly available, highly reliable processing of data in near-realtime". Recruitment adverts at the time said the service would have to slurp between two and five million database records per second at launch, and eventually scale to deal with hundreds of millions – which is exactly the sort of capability Kinesis requires.

With Kinesis, Amazon is giving companies a way to analyze streams of changing web data, and to do so without operating any complex hardware or infrastructure. It also gives Amazon a service that will pour more data than ever into its cloud, fattening its margins and letting it buy more storage gear at ever lower prices. That allows Bezos & Co to make money twice: first from the customers paying for the service, and then from the discounts it can get from its equipment suppliers. ®