
Isilon and a question of Big Data

Or was that ingestion?


Interview Xiotech technology VP Rob Peglar has moved to Isilon, now an EMC business, to become chief technology officer (CTO) for the Americas.

We interviewed Rob and asked him questions that reveal quite a lot about Isilon's prospects, big data, the role of flash in scale-out filers, deduplication and Isilon, and what we should think about archiving data from Isilon clusters.

El Reg Why did you join Isilon?

Rob Peglar: Primarily, for a personal reason - to take the CTO Americas role.  Secondarily, significant parts of the industry are moving towards greater use of file-based storage and the resultant use (gathering, analysis, reduction) of data stored in files. Isilon is an innovator and leader in that space and I joined to help end users realize new capabilities in their use of file data as well as be a key participant in the next generation(s) of file-based storage architectures.

El Reg What does the CTO Americas do that's different from the overall CTO?

Rob Peglar: The CTO Americas role is an allied position to the corporate CTO (Paul Rutherford). Isilon has a thrice-distributed CTO function in world geographies; Americas (basically, the Western Hemisphere), EAME and Asia-Pacific (AP). These roles have an outward (i.e. towards end users and channels) function as well as an inward (i.e. towards products, roadmap, strategy, engineering, etc.) function. In my role, I will be facing customers and channels to give them a thorough understanding of not only what Isilon does, how and why we do it, and so on, but also higher-level industry trends, techniques, technologies, and executive-level briefings on the strategic implications of file data to businesses and organizations.

El Reg Is big data in general different from big data in the HPC world and, if so, how?

Rob Peglar: In general, it is. While there are some similarities – both being unstructured data, for example – there are typically differences between big data in the commercial/business world and big data in the traditional HPC/supercomputing world. I am fortunate to have experience in both worlds, dating back to 1978 on the traditional HPC side. HPC typically involves the analysis of very large but ‘fixed’ sets of data, i.e. a dataset describing an initial condition. That data is then ingested and subjected to an iterative process, typically a very large job which simulates and analyzes the forward-in-time progress of the computation, performing a certain computational model based on the initial condition.

During the job, large intermediate files are produced to save the job’s state and its data at a given time step. This process is often referred to as ‘checkpointing’. Checkpoints are taken because HPC jobs may run for weeks at a time; restarting a job from its initial condition is to be avoided, for all the obvious reasons. The end result of the HPC job may actually be very little data; just a set of results or a visualisation, computed over a given time interval. Or, the net result may be another very large dataset which would then in turn undergo yet another round of analysis, perhaps by a different job.
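
For readers who have not met checkpointing before, here is a minimal Python sketch of the idea Peglar describes. It is an illustration only, not anything Isilon or a real HPC code ships: the function names, the JSON file format and the one-line "time step" are all assumptions made for the example. The job saves its state every so often and, when restarted, resumes from the last saved step rather than from the initial condition.

```python
import json
import os
import tempfile


def save_checkpoint(path, step, state):
    """Save the job's state for a given time step (hypothetical JSON format)."""
    # Write to a temporary file first, then rename it into place, so a crash
    # mid-write cannot leave a half-written checkpoint behind.
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, path)


def run_simulation(initial_state, total_steps, checkpoint_path, checkpoint_every=100):
    """Iterate a toy model forward in time, checkpointing periodically."""
    # Resume from the last checkpoint if one exists; otherwise start from
    # the initial condition.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        step, state = saved["step"], saved["state"]
    else:
        step, state = 0, initial_state

    while step < total_steps:
        state = state + 1.0  # stand-in for one real time step of the model
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(checkpoint_path, step, state)
    return state
```

A real job would write far larger binary state, often from many nodes at once, which is why checkpoint traffic puts such a heavy write load on the storage behind it.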

Contrast this with commercial/business ‘big data’, which is generated and stored by what I call ‘constantly running’ applications, e.g. web hits, cookie-based widgets, error logs, transaction logs, streaming apps, and the like. This kind of data, while unstructured like its HPC cousin, is constantly changing and being appended to by the outside world.

Data analysis jobs in this world typically take a ‘chunk’ of this big data and attempt to reduce it for specific analysis, pattern matching, searching, and/or general data mining, seeking to understand the data itself for a business purpose. The key to this kind of big data is that it’s constantly evolving, whereas data in the HPC world typically doesn’t. Both types of big data, however, require large, reliable and – the seminal characteristic by far – scalable storage.
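
The commercial pattern, taking a 'chunk' of an ever-growing log and reducing it for analysis, can be sketched just as briefly. Again this is a hypothetical illustration: the log format (timestamp, URL path, status code, separated by spaces) and the function name are assumptions, not something from the interview.

```python
from collections import Counter


def reduce_chunk(lines):
    """Reduce a chunk of web-log lines to a hit count per URL path.

    Assumes a made-up line format: "<timestamp> <path> <status>".
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            counts[parts[1]] += 1
    return counts


# The log keeps growing; an analysis job works on one slice of it at a time.
log = [
    "2011-05-01T10:00:00 /index.html 200",
    "2011-05-01T10:00:01 /about.html 200",
    "2011-05-01T10:00:02 /index.html 404",
]
hits = reduce_chunk(log[0:3])
```

Because new lines keep arriving, the same job run an hour later sees a different chunk, which is the contrast with the fixed initial dataset of an HPC run.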

