Ocarina makes waves with lossless image compression

Heeeeey Ocarina! Aaaha!

Designing a Defense for Mobile Applications

Interview Ocarina, the deduplication startup, is making waves with its partnerships with storage vendors, due to its unique lossless image compression technology. Yet the Ocarina founders were not wedded to image deduplication when they started up the company. How did it come about?

The way Murli Thirumale, Ocarina's CEO, tells it, the three founders had three ideas for a startup which they tested with potential customers and with consultants in a proof of concept exercise. They asked which of the three would have long-lasting and true value in the eyes of customers. The one dealing with ever-growing data storage received overwhelming customer support.

Much of this growth was due to rich media, images and videos. These file types, JPEGs, TIFFs and MPEGs and so forth, were previously thought to be uncompressible if there was to be no loss of image or video resolution. Ocarina's chief technology officer and co-founder Goutham Rao, came up with ways of doing this, of compressing the uncompressible. Thurimale says: "He didn't know you're not supposed to be able to do that."

Image and video files list picture elements and their characteristics. An Ocarina technology brief states: "Visual information is typically complex and includes large numbers of values to represent pixels, chrominance, luminance and other information for both recreating an image for the human eye to see, and storing information about the image for computer programs to be able to manipulate."

You can compress such files by getting rid of pixels, but this means losing image quality. What Rao invented was a way of recoding image and video files to store the same information in fewer bytes, with no lost pixels, with "bit-for-bit losslessness". This is Ocarina's secret technology and it's not revealing much about it.


The technology brief says Ocarina's technology: "extracts the full rich image data from an existing image file in to a Discrete Cosine Matrix (DCT space), correlates related image information like chrominance and luminance boundaries around like areas in an image, and then applies Ocarina’s patented image optimization compressor to the grouped areas. The ECO process is able to compress already-compressed JPEGs up to 40 percent, and sets of scaled images - common on web sites - up to 80 percent. Results on medical and life sciences image formats range from 40 percent to 70 percent, and results on grouped studies are better than on individual images."

Intriguingly, the DCT concept is often used in signal and image processing for lossy compression. Intriguingly again, an academic work on DCT has been written by a Dr K. R. Rao and P. Yip, entitled "Discrete Cosine Transform: Algorithms, Advantages, Applications" (Academic Press, Boston, 1990). Dr Rao is credited with being the co-inventor of the Discrete Cosine Transform but he is not related to Ocarina's Goutham Rao.

It looks as if Goutham Rao has found a way to use a method of encoding data for use in lossy compression algorithms to the opposite end.

Producing an Ocarina-encoded image or video file is much, much harder than displaying it. Production is carried out by using an Ocarina hardware appliance, an Optimizer, which comes in two models, a 2400 and a 3400. The 2400 has two quad-core Xeon 5400 processors, 16GB of RAM and two 500GB SATA disk drives.

The 3400 has the same Xeon processors but paired with 32GB of RAM and four 15,000rpm SAS drives. There is heavyweight processing going on here with a maximum bandwidth of 2TB per 24 hour day quoted by Ocarina. It can be down to 1TB a day if pure JPEGS are involved. Thirumale said: "We are CPU-bound in our optimization," and "We are in the process of benchmarking both the Nehalems and also processors with more cores."

He says that Ocarina does things to ensure it does not overwhelm the storage filer. For example, optimisation can be scheduled for off peak times, and it can be throttled back if the filer is really busy.

During optimisation an existing image or video file is read in by an Optimizer and recoded, using the DCT mathmatical method, and processed with other techniques to produce a smaller output file.

This can be read by Ocarina's ECOreader, a piece of software which sits in-line between the storage and the application needing to access the file. It can be deployed on web servers, application servers, proxy appliances, or in some cases, directly on file servers.

Each Ocarina-encoded file is self-contained and holds all the data and metadata required by any ECOreader to access it and send it on the requesting application in real-time.

Other Ocarina compression techniques

Thurimale says Ocarina's Optimizer looks inside files that can contain various objects, such as images in Word documents and PowerPoint decks and PDFs. Once it finds these it can compress them. Also, once the Optimizer has a DCT version of the data, it can compare this with already processed images and use any correlations to improve the optimisation. It doesn't specify how it does this but it sounds like a form of deduplication.

The Optimizer also looks for sets of images sharing common information, such as a sequence of CT scans. The common data is dealt with once, single-instanced in effect, and stored only once. Ocarina's brief describes this as "an example of deduplication applied at the visual information level, rather than at the block storage level."

The idea here is that sub-file-level deduplication cannot process such files or objects because it doesn't know they exist, it's not application data-aware and only sees raw blocks.

The Optimizer can deal with sets of scaled images by only storing data from the largest one and using it to recreate the smaller ones on the fly through the ECOreader. Where there are small thumbnail images which may be stored inefficiently as separate files, 2KB of data stored in an 8KB block is an example Ocarina uses, then the thumbnails can be grouped together to use storage more efficiently. In other words the small thumbnails can be grouped to fill up the minimum block size in a file server.


Thurimale says 61 percent of Ocarina deployments have been associated with an increase in disk purchases by the customer. This is because customers are using Ocarina-compressed files on disk to do work that was not possible in real time before. They'll have active data for use in creative work on videos or images and older data that had been consigned to tape. Restoring this for real-time editing work is not practical.

Talking of a movie studio customer, Thurimale said: "We were able to give them real-time access to this archive data." Having it stored on disk in an Ocarina-encoded format means they can use it in real-time and thus the creatives are more productive.

He says: "It's about tape replacement. Tape's rightful place is in a deep archive," and talks of Ocarina enabling "cheap and deep" disk storage and of it being no threat to disk storage sales.

According to him, Ocarina's products are resold by BlueArc, and have been certified by Hitachi Data Systems, HP, and Isilon. Ocarina is working with DataDirect and pursuing certification with EMC, where the focus is on Celerra, and Ocarina is being integrated using EMC's file mover API. Thurimale said: "We could work very well with Atmos."

Cloud storage provider Nirvanix is also working with Ocarina. However there is no certification with IBM or NetApp.

There have been two Ocarina funding rounds, the last for $20m in February this year with a total of $31m having been raised. Thurimale is especially pleased about the second round which took place in the middle of the recession. The company must have been able to demonstrate good potential. but no customer numbers or revenue numbers have been made public.

Thurimale says: "There are several billion files under Ocarina optimisation (and) we're demonstrating great customer traction (with) customers in production and making repeat purchases. ... We're an add-on to your storage. We're not a rip-and-replace company. The applications won't recognise that we're there." ®

The Power of One eBook: Top reasons to choose HP BladeSystem

More from The Register

next story
Apple fanbois SCREAM as update BRICKS their Macbook Airs
Ragegasm spills over as firmware upgrade kills machines
Attack of the clones: Oracle's latest Red Hat Linux lookalike arrives
Oracle's Linux boss says Larry's Linux isn't just for Oracle apps anymore
THUD! WD plonks down SIX TERABYTE 'consumer NAS' fatboy
Now that's a LOT of porn or pirated movies. Or, you know, other consumer stuff
EU's top data cops to meet Google, Microsoft et al over 'right to be forgotten'
Plan to hammer out 'coherent' guidelines. Good luck chaps!
US judge: YES, cops or feds so can slurp an ENTIRE Gmail account
Crooks don't have folders labelled 'drug records', opines NY beak
Manic malware Mayhem spreads through Linux, FreeBSD web servers
And how Google could cripple infection rate in a second
FLAPE – the next BIG THING in storage
Find cold data with flash, transmit it from tape
prev story


Designing a Defense for Mobile Applications
Learn about the various considerations for defending mobile applications - from the application architecture itself to the myriad testing technologies.
How modern custom applications can spur business growth
Learn how to create, deploy and manage custom applications without consuming or expanding the need for scarce, expensive IT resources.
Reducing security risks from open source software
Follow a few strategies and your organization can gain the full benefits of open source and the cloud without compromising the security of your applications.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Consolidation: the foundation for IT and business transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.