Big Blue boffins scan 10 billion files in Flash in a flash

Enticing developments at the lab

IBM and Violin have announced a great big GPFS numbers record: the software scanned 10 billion files in a flash – well, 43 minutes – using four Violin flash memory arrays.

This was 37 times faster than a previous GPFS record of scanning one billion files in three hours, but that was with the file system metadata stored, like the file data, on disk drives.

Why does this matter? IBM says it is because GPFS needs to scan the files in its filesystem so that they can be moved between storage tiers, migrated, archived, etc. This is non-production work and has to be done in the background. When done with metadata on disk, the process becomes slower and slower as the number of files in a GPFS system rises and rises. So much so that, conceivably, eventually there aren't enough hours in the day to do it.

So IBM Research tried putting the metadata on flash arrays and seeing how much faster the system went. The result is impressive, very impressive, but not that surprising. It's also a tad, well, background, since the system wasn't handling real data.

Back in February, IBM announced a wondrous SONAS SPECsfs2008 benchmark result of 403,326 SPECsfs2008 ops/sec from a single GPFS system using 1,975 hard disk drives.

EMC bounded past this with a flash-heavy VNX system doing 497,623 ops/sec, using 436 x 200GB SAS SSDs and 8 file systems.

SONAS is based on GPFS. From where El Reg sits, a re-run of the IBM SONAS SPECsfs2008 benchmark looks feasible, but this time using a few Violin Memory Arrays to hold the SONAS data and so get to the 500,000 SPECsfs2008 ops/sec area and beyond. We asked both IBM and Violin about this but didn't expect to get anything looking like a "Yes, we're doing this" reply.

Much to our surprise we received this from Bruce Hillsberg, director of storage systems, IBM Research – Almaden: "You are correct: if we were to re-run the SPECsfs benchmark on a SONAS system with using the technology described in the press release, we would see a significant performance improvement."

Enticing, isn't it? ®
