Isilon and a question of Big Data

Or was that ingestion?

Next gen security for virtualised datacentres

El REg How is Isilon's scale-out NAS product better than competing products from HP (IBRIX), IBM (SONAS), BlueArc, and Dell (Exanet and DSFS)?

Rob Peglar: There's much more to your question than meets the eye. Your piece of 11 April had a nice overview, though.

El Reg What role does flash have to play in scale-out filers?

Rob Peglar: A very interesting one. Flash, or fast non-volatile memory in general, has an interesting role in scale-out. Most of its impact is currently around holding metadata, and it’s quite useful for that. Isilon in particular can use flash nicely in that the backend communication path is already very fast and scalable, using Infiniband. Internodal messages traverse very quickly and efficiently. Combine this with holding node-based metadata in flash, and insuring all nodes are in sync via InfiniBand, is a solid architectural solution.

Doing metadata synchronisation using rotating disk is less efficient because of the inherent latency involved, and the interposition of write cache. However, using flash devices to quickly get metadata onto stable storage is highly efficient. At scale, this becomes an overriding concern. One can easily sync two nodes’ metadata using HDD, for example, but that is not scale-out, and adding an extra layer of file system overhead (e.g. an aggregation layer) on top of legacy file systems to simulate scale-out is highly inefficient. Scale-out starts at three nodes and goes to N. The current challenge is to increase N without adding quadratic latency, and flash helps greatly here.

A secondary role for flash in scale-out is to hold data itself which is quite read-intense, in particular big data after it has been mapped (the map stage of map/reduce) and now is undergoing processing, again mostly reads. There is much research and development going on currently in this area, especially as flash devices become denser and less expensive to procure.

El Reg Do scale-out filers need an integrated archiving/backup back-end system to store cold data, perhaps in a deduplicated form?.

Rob Peglar: In general, the answer is yes. Cold data is only one use case; the other is more strategic, i.e. the preservation of important/critical data, albeit infrequently used (‘cold’). Data of high importance must be archived not only for protection’s sake but also for legal and/or security concerns. Thus, any system, scale-out or not, must be so protected. Scale-out has a very important role to play here because it can serve as both primary and secondary repository – i.e. archive to scale-out. Archiving in particular lends itself to scale-out approaches by its very nature – typically always adding data to a permanent archive.

Archive is also typically the ‘repository of last resort’, so protection is paramount.  This is another reason why scale-out is a superior approach; it adds not only disk protection but node protection as well, thus isolating the archive at large from any set of individual failures. Isilon in particular has developed an M+N approach to scale-out, thus minimizing the probability of data loss not only due to drive (media) failure but also node failure (e.g. power outage, cable pulls, human error, etc.)

This is a superior approach to tape archive, for example, because the failure of a given tape library means the cartridges contained therein – the media of last resort – are inaccessible and must be physically removed and transported to another library of similar characteristics. This is not scale-out. Scale-out archives imply one copy of the archival data, and protection via architecture is paramount.

Secure remote control for conventional and virtual desktops

More from The Register

next story
HP busts out new ProLiant Gen9 servers
Think those are cool? Wait till you get a load of our racks
Shoot-em-up: Sony Online Entertainment hit by 'large scale DDoS attack'
Games disrupted as firm struggles to control network
Community chest: Storage firms need to pay open-source debts
Samba implementation? Time to get some devs on the job
Like condoms, data now comes in big and HUGE sizes
Linux Foundation lights a fire under storage devs with new conference
Silicon Valley jolted by magnitude 6.1 quake – its biggest in 25 years
Did the earth move for you at VMworld – oh, OK. It just did. A lot
Forrester says it's time to give up on physical storage arrays
The physical/virtual storage tipping point may just have arrived
prev story


5 things you didn’t know about cloud backup
IT departments are embracing cloud backup, but there’s a lot you need to know before choosing a service provider. Learn all the critical things you need to know.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Backing up Big Data
Solving backup challenges and “protect everything from everywhere,” as we move into the era of big data management and the adoption of BYOD.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
High Performance for All
While HPC is not new, it has traditionally been seen as a specialist area – is it now geared up to meet more mainstream requirements?