Mimosa adds files to archive cocktail

Archive's dusty old barn is shaken to foundations

Mimosa, an email archiving software company, is adding file archiving to its NearPoint product, striking out towards a unified archiving strategy.

Once upon a time data protection meant backup to tape. Those simple times seem a long time ago now, with tape backup's front-end restore role and back-end archive role both under sustained attack from hard-drive based products. Disk-based backup, together with virtual tape libraries (VTL), has revolutionised the former backup role of saving files in case they were inadvertently deleted. Now you can continuously protect files and restore them to any point in time.

Disk-based archiving has spread like wildfire for email. There are also products for images, particularly in healthcare with X-rays and formal PACS (Picture Archiving and Communication System) products. There are enterprise content management (ECM) products such as Documentum, which archive files and unstructured or semi-structured data, and which have developed from enterprise document management systems.

We also have search, indexing, compliance and legal discovery products, such as Recommind, Autonomy-Zantaz, and Kazeon's Information Server. What we are seeing here is the coming together of various archive content silos and the layering across them of archive platform services to ingest data into a repository, index it, search it, extract subsets for compliance or legal discovery, and provide policy-driven retention services.

Archive stack

You can conceive of an archive stack comprising four layers:

  • Data-creation source such as email, PowerPoint, Word, a blog, etc.
  • Archive ingest software such as NearPoint, which captures data and puts it into a repository.
  • Platform services to generate metadata on the content, single-instance or de-dupe it, search it, obtain and apply retention policies, and manage any storage tiering.
  • Storage hardware, meaning hard drives, optical disks, and tape.

Mimosa fizzing

Public and private-sector organisations are drowning in a sea of data, and its volumes are rising unstoppably as they are loath to throw anything away and virtually every content-creating activity gets digitised. Mimosa has grown furiously as its customers adopt email archiving so as to tame the Exchange elephant.

UK MD Brian Bennett says the privately-held, 200-person company now has 525 customers, up from 300 a year ago. In the first quarter of this year its revenues were more than in the whole of 2007. In the second quarter it doubled 2007 revenues, and it's on track to grow its business 300 percent this year. At a guess, its turnover must now be in the $20m-$50m range - Mimosa isn't saying.

Scott Whitney, Mimosa's product management VP, says NearPoint v3.5 will add file archiving from Windows file shares, with other sources possibly added later. A Mimosa agent will crawl the file share looking for files that meet user-set policies defining things like file size, creation date and time since last access, and zap files that meet the criteria into the archive, leaving a stub behind if so desired. Once a file is archived it stays archived: there is no 'ping-pong' in which a user accessing the archived copy triggers a restore, followed by a re-archive once the file has gone unaccessed for long enough.
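
To make the policy crawl concrete, here is a minimal Python sketch of how such a selection pass might work. It is not Mimosa's agent: the names ArchivePolicy and find_candidates are hypothetical, and because file creation time is not exposed portably, the sketch uses last-modified time as a stand-in for creation date.

```python
import os
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Optional

SECONDS_PER_DAY = 86400.0


@dataclass
class ArchivePolicy:
    """Hypothetical user-set policy; every field is an illustrative assumption."""
    min_size_bytes: int = 0
    min_days_since_access: float = 0.0
    min_days_since_modified: float = 0.0  # stand-in for "creation date"


def find_candidates(root: str, policy: ArchivePolicy,
                    now: Optional[float] = None) -> Iterator[Path]:
    """Walk a directory tree and yield the files that meet every criterion."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            st = path.stat()
            if st.st_size < policy.min_size_bytes:
                continue  # too small to be worth archiving
            if (now - st.st_atime) / SECONDS_PER_DAY < policy.min_days_since_access:
                continue  # accessed too recently
            if (now - st.st_mtime) / SECONDS_PER_DAY < policy.min_days_since_modified:
                continue  # modified too recently
            yield path
```

A real agent would then move each candidate into the repository and, optionally, leave a stub in its place; the sketch stops at selection.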

Files are single-instanced with any file archive candidate compared to previously archived files and email attachments. That means an old PowerPoint, already archived as an email attachment, won't take up fresh space in the repository. Instead there will just be a pointer to the existing entry.
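
Single-instancing of this kind is commonly built on content hashing. The sketch below shows the general idea under that assumption; Mimosa has not said how NearPoint compares candidates, and SingleInstanceStore and its methods are hypothetical names.

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Toy repository that keeps one copy of each distinct piece of content."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}   # content digest -> stored bytes
        self.entries: Dict[str, str] = {}   # archive entry name -> content digest

    def archive(self, entry_name: str, data: bytes) -> bool:
        """Record an entry; return True only if new content had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Whether or not the content is new, the entry is just a pointer to it.
        self.entries[entry_name] = digest
        return is_new


store = SingleInstanceStore()
store.archive("mail/attachment/deck.ppt", b"slide data")   # stored
store.archive("fileshare/sales/deck.ppt", b"slide data")   # pointer only
```

After the two calls the store holds two entries but only one copy of the content, which is the behaviour described for the old PowerPoint.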

Mimosa is using Stellent technology, which means the new NearPoint product will recognise 300 file types and can search them properly. The product will have a content monitoring capability to alert sysadmins if outbound material contains user-defined sensitive data, and the existing eDiscovery capability will be applied to it as well.

A software development kit (SDK) will be made available to developers so that more specialised applications can have data ingested, indexed, searched, extracted or managed by the NearPoint archive.


What about de-duplicating an archive? Conceptually, de-duping file data in an archive is just de-duping data. But crawling through a de-duped archive to detect data you can safely throw away is something that will take a huge amount of CPU time and must needs be done with fanatical care.

Say there is a de-duped data element which is referenced by 10,000 pointers. If the element is deleted then 10,000 references to it are lost too. So the retention crawler must understand de-dupe element pointers and these must have their own unique metadata and retention criteria. Each one is a virtual archive file and must be treated as such by the retention crawler.
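
One way to picture a pointer-aware retention crawler is the Python sketch below. It assumes each pointer carries its own retention date and that shared content may be deleted only once no live pointer refers to it. Pointer, DedupedArchive and expire are hypothetical names, not anything from NearPoint.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pointer:
    """A virtual archive file: its own name and retention date, shared content."""
    name: str
    digest: str
    retain_until: float  # timestamp after which this pointer may be dropped


class DedupedArchive:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}       # digest -> de-duped content
        self.pointers: Dict[str, Pointer] = {}  # entry name -> pointer

    def add(self, name: str, digest: str, data: bytes, retain_until: float) -> None:
        self.blobs.setdefault(digest, data)     # store the content once only
        self.pointers[name] = Pointer(name, digest, retain_until)

    def expire(self, now: float) -> List[str]:
        """Drop expired pointers, then delete content nothing refers to any more."""
        for name in [n for n, p in self.pointers.items() if p.retain_until <= now]:
            del self.pointers[name]
        live = {p.digest for p in self.pointers.values()}
        removed = [d for d in self.blobs if d not in live]
        for digest in removed:
            del self.blobs[digest]
        return removed
```

The point of the design is that deletion is driven by the pointers, never by the data element itself: the element goes only when the last of its 10,000 references has passed its retention date.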

As archives grow in size they will hold billions of objects, the majority of which will have been de-duped down to pointers. Trawling these for deletion candidates will be a vital task.

Mimosa has deals with Plasmon and Data Domain to support their storage products. It is possible that de-duplication will move up the archive stack, from storage hardware to archive platform services. Alternatively, we may see combinations of archive platform services and storage hardware emerging as single products.

The disk technology revolution sweeping through archiving is causing tectonic shifts amongst the suppliers of storage, email, backup software, data protection, compliance, legal hold, eDiscovery, enterprise content management and document management products and services. What NearPoint is doing is sure to be replicated by other archive product suppliers as they react to the forces of convergence and unification sweeping the backup-to-tape-is-everything cobwebs from archiving's dusty old barn. ®
