Isilon and a question of Big Data

Or was that ingestion?

El Reg How is Isilon's scale-out NAS product better than competing products from HP (IBRIX), IBM (SONAS), BlueArc, and Dell (Exanet and DSFS)?

Rob Peglar: There's much more to your question than meets the eye. Your piece of 11 April had a nice overview, though.

El Reg What role does flash have to play in scale-out filers?

Rob Peglar: A very interesting one. Flash, or fast non-volatile memory in general, has an interesting role in scale-out. Most of its impact is currently around holding metadata, and it’s quite useful for that. Isilon in particular can use flash nicely in that the backend communication path is already very fast and scalable, using InfiniBand. Internodal messages traverse very quickly and efficiently. Combining this with holding node-based metadata in flash, and ensuring all nodes are in sync via InfiniBand, makes for a solid architectural solution.

Doing metadata synchronisation using rotating disk is less efficient because of the inherent latency involved, and the interposition of write cache. However, using flash devices to quickly get metadata onto stable storage is highly efficient. At scale, this becomes an overriding concern. One can easily sync two nodes’ metadata using HDD, for example, but that is not scale-out, and adding an extra layer of file system overhead (e.g. an aggregation layer) on top of legacy file systems to simulate scale-out is highly inefficient. Scale-out starts at three nodes and goes to N. The current challenge is to increase N without adding quadratic latency, and flash helps greatly here.
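To put a rough number on the scaling concern: if every node had to exchange metadata updates directly with every other node, the number of node pairs would grow quadratically with N. The sketch below illustrates only that naive all-to-all assumption, which the interview does not attribute to Isilon or to any other product. `pairwise_links` is a hypothetical helper name.

```python
def pairwise_links(n: int) -> int:
    """Return the number of distinct node pairs in an n-node cluster.

    Assumes a naive all-to-all sync scheme (an illustration only):
    every pair of nodes is one synchronisation path, so the count
    is n * (n - 1) / 2.
    """
    if n < 0:
        raise ValueError("node count must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Doubling the node count roughly quadruples the number of pairs.
    for nodes in (3, 6, 12, 24):
        print(nodes, pairwise_links(nodes))
```

Under this assumption, going from 12 to 24 nodes takes the pair count from 66 to 276, which is why the per-message cost of getting metadata onto stable storage matters more as N grows.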

A secondary role for flash in scale-out is to hold data that is itself quite read-intensive, in particular big data after it has been mapped (the map stage of map/reduce) and is now undergoing processing, which is again mostly reads. There is much research and development going on currently in this area, especially as flash devices become denser and less expensive to procure.

El Reg Do scale-out filers need an integrated archiving/backup back-end system to store cold data, perhaps in a deduplicated form?

Rob Peglar: In general, the answer is yes. Cold data is only one use case; the other is more strategic, i.e. the preservation of important/critical data, albeit infrequently used (‘cold’). Data of high importance must be archived not only for protection’s sake but also for legal and/or security concerns. Thus, any system, scale-out or not, must be so protected. Scale-out has a very important role to play here because it can serve as both primary and secondary repository – i.e. archive to scale-out. Archiving in particular lends itself to scale-out approaches by its very nature – typically always adding data to a permanent archive.

Archive is also typically the ‘repository of last resort’, so protection is paramount. This is another reason why scale-out is a superior approach; it adds not only disk protection but node protection as well, thus isolating the archive at large from any set of individual failures. Isilon in particular has developed an M+N approach to scale-out, thus minimizing the probability of data loss not only due to drive (media) failure but also node failure (e.g. power outage, cable pulls, human error, etc.).
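The arithmetic behind an M+N style layout can be sketched as follows. The interview does not say which letter counts data and which counts protection, so the sketch uses explicit names instead. It assumes an erasure-coding-style stripe in which any combination of lost units, up to the number of protection units, can be rebuilt. `StripeLayout` and its fields are hypothetical names, and nothing here describes Isilon's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLayout:
    """Hypothetical stripe: data units plus protection units (an assumption)."""

    data_units: int
    protection_units: int

    @property
    def total_units(self) -> int:
        # Units written per stripe, spread across drives and nodes.
        return self.data_units + self.protection_units

    @property
    def overhead(self) -> float:
        # Fraction of raw capacity spent on protection rather than data.
        return self.protection_units / self.total_units

    def survives(self, failed_units: int) -> bool:
        # Assumed rule: data is recoverable while the number of lost
        # units does not exceed the number of protection units.
        return 0 <= failed_units <= self.protection_units


if __name__ == "__main__":
    layout = StripeLayout(data_units=8, protection_units=2)
    print(layout.total_units, layout.overhead, layout.survives(2), layout.survives(3))
```

If each unit of a stripe sits on a different node, the same rule covers whole-node failures as well as drive failures, which is the point being made about node protection.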

This is a superior approach to tape archive, for example, because the failure of a given tape library means the cartridges contained therein – the media of last resort – are inaccessible and must be physically removed and transported to another library of similar characteristics. This is not scale-out. Scale-out archives imply one copy of the archival data, and protection via architecture is paramount.
