Isilon and a question of Big Data

Or was that ingestion?

El Reg How is Isilon's scale-out NAS product better than competing products from HP (IBRIX), IBM (SONAS), BlueArc, and Dell (Exanet and DSFS)?

Rob Peglar: There's much more to your question than meets the eye. Your piece of 11 April had a nice overview, though.

El Reg What role does flash have to play in scale-out filers?

Rob Peglar: A very interesting one. Flash, or fast non-volatile memory in general, has an interesting role in scale-out. Most of its impact is currently around holding metadata, and it’s quite useful for that. Isilon in particular can use flash nicely in that the backend communication path is already very fast and scalable, using InfiniBand. Internodal messages traverse very quickly and efficiently. Combining this with holding node-based metadata in flash, and ensuring all nodes are in sync via InfiniBand, makes for a solid architectural solution.

Doing metadata synchronisation using rotating disk is less efficient because of the inherent latency involved, and the interposition of write cache. However, using flash devices to quickly get metadata onto stable storage is highly efficient. At scale, this becomes an overriding concern. One can easily sync two nodes’ metadata using HDD, for example, but that is not scale-out, and adding an extra layer of file system overhead (e.g. an aggregation layer) on top of legacy file systems to simulate scale-out is highly inefficient. Scale-out starts at three nodes and goes to N. The current challenge is to increase N without adding quadratic latency, and flash helps greatly here.
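To make the "quadratic" concern concrete, here is a minimal sketch of how the number of node pairs grows with cluster size. It assumes, purely for illustration, that every node synchronises metadata directly with every other node (a full mesh); the interview does not say that this is how Isilon or any other product does it, and the function name `pairwise_links` is hypothetical.

```python
def pairwise_links(n: int) -> int:
    """Return the number of distinct node pairs in an n-node cluster.

    Assumption (not from the interview): metadata sync is all-to-all,
    so each pair of nodes is one synchronisation relationship.
    """
    if n < 0:
        raise ValueError("node count must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Doubling the node count roughly quadruples the number of pairs.
    for nodes in (3, 6, 12, 24, 48):
        print(f"{nodes:>3} nodes -> {pairwise_links(nodes):>5} pairs")
```

Under that assumption, a per-pair synchronisation cost that is negligible at three nodes is multiplied across far more pairs as N grows, which is one way to read the point about keeping each metadata commit as fast as possible.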

A secondary role for flash in scale-out is to hold data that is itself read-intensive, in particular big data that has been mapped (the map stage of map/reduce) and is now undergoing processing, which again is mostly reads. There is much research and development going on in this area, especially as flash devices become denser and less expensive to procure.

El Reg Do scale-out filers need an integrated archiving/backup back-end system to store cold data, perhaps in a deduplicated form?

Rob Peglar: In general, the answer is yes. Cold data is only one use case; the other is more strategic, i.e. the preservation of important/critical data, albeit infrequently used (‘cold’). Data of high importance must be archived not only for protection’s sake but also for legal and/or security concerns. Thus, any system, scale-out or not, must be so protected. Scale-out has a very important role to play here because it can serve as both primary and secondary repository – i.e. archive to scale-out. Archiving in particular lends itself to scale-out approaches by its very nature – typically always adding data to a permanent archive.

Archive is also typically the ‘repository of last resort’, so protection is paramount. This is another reason why scale-out is a superior approach; it adds not only disk protection but node protection as well, thus isolating the archive at large from any set of individual failures. Isilon in particular has developed an M+N approach to scale-out, thus minimizing the probability of data loss not only due to drive (media) failure but also node failure (e.g. power outage, cable pulls, human error, etc.).

This is a superior approach to tape archive, for example, because the failure of a given tape library means the cartridges contained therein – the media of last resort – are inaccessible and must be physically removed and transported to another library of similar characteristics. This is not scale-out. Scale-out archives imply one copy of the archival data, and protection via architecture is paramount.
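As a rough illustration of the M+N idea mentioned above, the sketch below works out the capacity cost and failure tolerance of a generic protection scheme. It assumes that each stripe is split into some number of data units plus some number of protection units, and that data survives as long as no more units are lost than there are protection units. The interview does not give these details, so the model, the function names, and the example figures are all assumptions rather than a description of Isilon's implementation.

```python
from fractions import Fraction


def protection_overhead(data_units: int, protection_units: int) -> Fraction:
    """Fraction of raw capacity spent on protection (assumed stripe model)."""
    if data_units <= 0 or protection_units < 0:
        raise ValueError("need at least one data unit and no negative counts")
    return Fraction(protection_units, data_units + protection_units)


def survives(failed_units: int, protection_units: int) -> bool:
    """True if the stripe is still recoverable after the given failures."""
    return failed_units <= protection_units


if __name__ == "__main__":
    # Hypothetical layout: 8 data units and 2 protection units per stripe.
    overhead = protection_overhead(8, 2)
    print(f"overhead: {overhead} of raw capacity")
    print("2 failures survivable:", survives(2, 2))
    print("3 failures survivable:", survives(3, 2))
```

In this model a failed unit could be a drive or a whole node, which matches the point that node-level protection isolates the archive from individual failures without keeping a second full copy.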
