Startup offers penalty-free file data reduction
Magic from Swiss gnomes
Swiss startup balesio, staffed by all of nine people, has devised a penalty-free way of reducing unstructured data file sizes without altering the original file format, meaning no rehydration or decompression is needed to read the reduced size files.
Its Native File Optimisation (NFO) software technology analyses unstructured data files and re-stores their contents in a visually lossless manner with a smaller file size. The optimised and reduced files can still be read by their originating applications, such as Microsoft PowerPoint, SharePoint and Excel.
This approach is very much like that of the Dell-acquired Ocarina, which optimised and reduced the size of various image formats in a visually lossless manner but required an Ocarina reader to read the optimised files.
Christoph Schmid, balesio's chief operating officer and sales VP, says Ocarina started with image optimisation and then moved into unstructured files, whereas balesio started with Microsoft format unstructured data files – now including PowerPoint, Excel and SharePoint – and has progressed into PDFs and various image formats.
Schmid says balesio's NFO software can recover 50 to 95 per cent of an unstructured file's disk capacity by storing its contents more efficiently. For example, repeated elements on a PowerPoint deck, such as logos, need only be stored once. Imported image files often have colour and resolution attributes that can be scaled back, reducing the image's file size without compromising its visual integrity for the human eye.
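To make the "stored only once" idea concrete, here is a minimal sketch of how repeated elements inside a single file could be spotted. It is not balesio's method, which the company has not described in this detail. It relies only on the fact that modern Office files such as .pptx are ZIP containers; the function names `duplicate_members` and `build_demo_deck`, and the fake in-memory deck, are assumptions made up for illustration.

```python
import hashlib
import io
import zipfile
from collections import defaultdict


def duplicate_members(container):
    """Group the members of one ZIP-based file by content hash.

    Returns a list of name groups whose contents are byte-identical,
    i.e. candidates for being stored only once. Illustrative only.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(container) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            digest = hashlib.sha256(zf.read(info)).hexdigest()
            groups[digest].append(info.filename)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def build_demo_deck():
    """Build a fake in-memory 'deck' with the same logo embedded twice."""
    buf = io.BytesIO()
    logo = b"fake logo bytes" * 100  # stand-in for real image data
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ppt/media/image1.png", logo)
        zf.writestr("ppt/media/image2.png", logo)
        zf.writestr("ppt/media/image3.png", b"a different picture")
    buf.seek(0)
    return buf


print(duplicate_members(build_demo_deck()))
# [['ppt/media/image1.png', 'ppt/media/image2.png']]
```

Detecting byte-identical copies is only the simplest case. Scaling back image colour depth or resolution without visible change is a separate and much harder step that this sketch does not attempt.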
The company produces various FILEminimizer software applications, which can be run on PCs and servers to reduce file sizes. Developers have access to an SDK. The company started up in 2006 providing eLearning products, and then looked for ways to reduce the size of the files involved. Balesio focused on this application, became incorporated in 2008 and has shipped – wait for it – more than 4.5 million copies of its FILEminimizer software. It claims up to 2,000 large accounts are using multiple copies of its applications. The majority are in Europe with some in Japan and America, and include General Electric, Lafarge, and Hyundai. Word of mouth has been responsible, Schmid says, for multiple purchases within balesio's customers.
As a consequence of this sales volume, balesio is entirely self-funded and profitable – it is a venture-capital-free zone.
Balesio's claims are that its technology is highly efficient, completely open and has a one-shot approach, optimising and reducing a file once and forever, with no lock-in to a balesio reader. It says it optimises primary file data.
We asked how efficient it was compared to NetApp's A-SIS deduplication. Schmid said that, according to reports, A-SIS can return 60 per cent dedupe ratios for VMware virtual machine files but only 5 to 10 per cent for general unstructured data files.
We can intuitively conceive that A-SIS, as a block-level deduplication technology, is not file-content-aware as balesio is. Schmid sums up the NetApp-balesio optimisation efficiency comparison like this:
Taking a "classic" primary storage share with 75-80 per cent unstructured files, we can achieve a data reduction of 50-85 per cent of that, compared to the 5-10 per cent that NetApp A-SIS is doing. Even if the remaining 20 per cent could be massively deduped by NetApp, I believe it would not achieve our level of realised storage space savings.
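Schmid's arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch using only the percentages quoted above; the function name `overall_saving` is ours, and the 100 per cent dedupe of the remaining data is a deliberately unrealistic upper bound assumed in A-SIS's favour, not a reported figure.

```python
def overall_saving(unstructured_share, unstructured_reduction, other_reduction=0.0):
    """Fraction of the whole share's capacity that is saved.

    unstructured_share:     fraction of the share that is unstructured files
    unstructured_reduction: fraction saved on those unstructured files
    other_reduction:        fraction saved on everything else
    """
    other_share = 1.0 - unstructured_share
    return (unstructured_share * unstructured_reduction
            + other_share * other_reduction)


# balesio, using the quoted ranges and assuming nothing is saved on the rest
balesio_low = overall_saving(0.75, 0.50)
balesio_high = overall_saving(0.80, 0.85)

# A-SIS best case on the same figures: 10% on unstructured files, plus a
# hypothetical 100% dedupe of the remaining 25% (an assumed upper bound)
asis_best = overall_saving(0.75, 0.10, other_reduction=1.0)

print(f"balesio: {balesio_low:.1%} to {balesio_high:.1%}")  # 37.5% to 68.0%
print(f"A-SIS upper bound: {asis_best:.1%}")                # 32.5%
```

On the quoted numbers, then, the claim holds: even the low end of the balesio range beats an A-SIS run that dedupes the non-unstructured data away completely. The comparison stands or falls on those quoted ratios, which are vendor-supplied.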
We note balesio optimises within a file, without looking for or finding redundancies across several files or across multiple balesio instances. This is an intrinsic feature of the product as it optimises the way an application stores data and removes redundant information within a file, rather than looking for repeating patterns of data within a data stream as simple compression does, or repeating patterns of information across multiple files or block groups, as deduplication technology does.
The company says it can flatten the rate of unstructured data growth in storage capacity terms and does so, it appears, better than any other supplier. It says that it actually helps performance, instead of hindering it, because smaller files are quicker to load, faster to back up, and consume fewer network resources when sent between computers, either in or between data centres or from a data centre to a hosting centre or the cloud.
There are free trial offers of up to 12 optimisations on balesio's website. A single user FILEminimizer Office licence costs £34.95 and multi-user licence costs have volume discount curves. It seems worth a trial at least to see if you can turn your giant unstructured data files into reduced gnomic Swiss instances of their former selves. ®
Showing Microsoft how to write software
by reorganising the data and removing the bloat.
Microsoft aren't interested in better products, only products with more features.
True and not true
true and not true.
1) "Scaling back" is not what we are doing because it would mean we treat every object in the same exact way. No, what we are doing is recognizing the contents (if you wish, "interpreting" correctly the elements and objects there) and optimizing them according to what they are. The result is a visually lossless file. If we were to scale back attributes, we would not be visually lossless.
2) True, that is a customer comment. What is the true ratio for these kinds of unstructured files with internally compressed content (PowerPoint, images, etc.)?
3) It is penalty-free because you do not need a reader or any rehydration of an optimized file. The optimization itself requires performance, but only one time. Once the file is optimized, it is smaller and doesn't need to be rehydrated by any application or system. And a smaller file is also loaded faster, so after the optimization less performance is required for handling that file.
In general, our approach is totally different than dedupe. We don't look across files but INSIDE files to optimize capacity. By doing so, we create an open form of capacity savings and users can do primary dedupe and all other things in the same way after optimization.
In the demo I saw there was a balesio run against a couple of images. The resulting output files were much smaller than the input files. Their on-screen dimensions were the same and their on-screen appearance to my eyes was the same as well.
As far as I can see the optimisation technology is visually lossless (to human eyes). It does what it says on the can.
Also, just to enjoy a tart comment for a second, there was no press release, the story being based on an interview.