Isilon and a question of Big Data

Or was that ingestion?

Interview Xiotech technology VP Rob Peglar has moved to Isilon, now an EMC business, to become chief technology officer (CTO) for the Americas.

We interviewed Rob and asked him questions that reveal quite a lot about Isilon's prospects, big data, the role of flash in scale-out filers, deduplication and Isilon, and what we should think about archiving data from Isilon clusters.

El Reg Why did you join Isilon?

Rob Peglar: Primarily, for a personal reason - to take the CTO Americas role.  Secondarily, significant parts of the industry are moving towards greater use of file-based storage and the resultant use (gathering, analysis, reduction) of data stored in files. Isilon is an innovator and leader in that space and I joined to help end users realize new capabilities in their use of file data as well as be a key participant in the next generation(s) of file-based storage architectures.

El Reg What does the CTO Americas do that's different from the overall CTO?

Rob Peglar: The CTO Americas role is an allied position to the corporate CTO (Paul Rutherford). Isilon has a thrice-distributed CTO function in world geographies: Americas (basically, the Western Hemisphere), EAME and Asia-Pacific (AP). These roles have an outward (i.e. towards end users and channels) function as well as an inward (i.e. towards products, roadmap, strategy, engineering, etc.) function. In my role, I will be facing customers and channels to give them a thorough understanding not only of what Isilon does, how and why we do it, and so on, but also of higher-level industry trends, techniques and technologies, along with executive-level briefings on the strategic implications of file data to businesses and organizations.

El Reg Is big data in general different from big data in the HPC world and, if so, how?

Rob Peglar: In general, it is. While there are some similarities – both being unstructured data, for example – there are typically differences between big data in the commercial/business world and big data in the traditional HPC/supercomputing world. I am fortunate to have experience in both worlds, dating back to 1978 on the traditional HPC side. HPC typically involves the analysis of very large but ‘fixed’ sets of data, i.e. a dataset describing an initial condition. That data is then ingested and subjected to an iterative process, typically a very large job which simulates and analyzes the forward-in-time progress of the computation, performing a certain computational model based on the initial condition.

During the job, large intermediate files are produced to save the job’s state and its data at a given time step. This process is often referred to as ‘checkpointing’.  Checkpoints are taken because HPC jobs may run for weeks at a time; restarting a job from its initial condition is to be avoided, for all the obvious reasons. The end result of the HPC job may actually be very little data; just a set of results or a visualisation, computed over a given time interval. Or, the net result may be another very large dataset which would then in turn undergo yet another set of analysis, perhaps by a different job.
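
The checkpoint-and-restart pattern Peglar describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `step` model, the JSON file format and the function names are hypothetical and do not come from Isilon or any real HPC code. The point is only that a job which saves its state every few time steps can resume from the last saved step rather than from its initial condition.

```python
import json
import os
import tempfile


def step(state):
    """Advance a toy simulation by one time step (hypothetical model)."""
    return {"t": state["t"] + 1, "value": state["value"] * 0.5 + 1.0}


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it into place.

    The rename means a crash during the write cannot leave a
    half-written checkpoint at `path`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, initial):
    """Return the last saved state, or a copy of the initial condition."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return dict(initial)


def run(path, initial, total_steps, checkpoint_every, fail_at=None):
    """Run the job to `total_steps`, resuming from a checkpoint if one exists.

    `fail_at` is an illustrative hook that simulates a crash at a given step.
    """
    state = load_checkpoint(path, initial)
    while state["t"] < total_steps:
        if fail_at is not None and state["t"] == fail_at:
            raise RuntimeError("simulated crash")
        state = step(state)
        if state["t"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state
```

In this sketch, a job that crashes at step 7 with a checkpoint interval of 5 restarts from step 5 on the next call to `run`, repeating two steps instead of seven.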

Contrast this with commercial/business ‘big data’ as being generated and stored by what I call ‘constantly running’ applications, e.g. web hits, cookie-based widgets, error logs, transaction logs, streaming apps, and the like. This kind of data, while unstructured like its HPC cousins, is constantly changing and being appended to by the outside world.

Data analysis jobs in this world typically take a ‘chunk’ of this big data and attempt to reduce it for specific analysis, pattern matching, searching, and/or general data mining, seeking to understand the data itself for a business purpose. The key to this kind of big data is that it’s constantly evolving, whereas data in the HPC world typically doesn’t. Both types of big data, however, require large, reliable and – the seminal characteristic by far – scalable storage.
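
The contrast with the commercial case can be sketched in a few lines of Python as well. This is a toy illustration under stated assumptions: the in-memory list standing in for a log, the event fields and the function names are all hypothetical. It shows an analysis job reducing one fixed 'chunk' of a dataset that the outside world keeps appending to.

```python
from collections import Counter


def append_events(log, events):
    """Simulate the outside world appending new records to the log."""
    log.extend(events)


def reduce_chunk(log, start, end):
    """Reduce one chunk of the log, log[start:end], to counts per event type."""
    return Counter(event["type"] for event in log[start:end])


log = []
append_events(log, [{"type": "web_hit"}, {"type": "error"}, {"type": "web_hit"}])

# Analyse the first three records as one chunk.
first_chunk = reduce_chunk(log, 0, 3)

# The log keeps growing after the chunk was taken; the earlier
# result still describes only the records it was computed from.
append_events(log, [{"type": "transaction"}, {"type": "web_hit"}])
second_chunk = reduce_chunk(log, 3, len(log))
```

Each reduction covers a bounded slice, so the analysis can keep pace with the data by processing new chunks as they arrive rather than rereading the whole log.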
