DataDirect Networks beefs up its 1.7 MILLION IOPS monster

It's big, it's bad, it's the SFA12KX

DataDirect Networks has boosted its already big, mean and fast SFA12K big data/HPC storage arrays to go faster still with the SFA12KX product, running at up to 48GB/sec from a single appliance.

Previously a fast SFA12K could run up to 40GB/sec, as seen in the Titan supercomputer complex which uses 36 of them.

The growing problem, according to DDN, is that big data and HPC applications are run across hundreds of processors which have multiple cores and many threads. A destination file system has to deal with "access to hundreds or thousands of files simultaneously, via a single file system namespace – or the effect of thousands of threads writing a single file [and this] requires substantial POSIX metadata operations that require high-speed random IOPS for optimal response." (Extract from DDN white paper, 20-page PDF.)

DDN tells us: "Top supercomputers have over 1.5 million CPU cores in their compute clusters, resulting in potentially hundreds of thousands of simultaneous file writes during checkpoint operations."

And it's going to get worse. Outside the rarefied supercomputing world the main websites have to cope with "hundreds of thousands of file accesses per second," and random file accesses at that. There's no streaming data relief available.

Faster bulk random data access from an array can only come from reducing the number of HDD accesses needed to read or write a piece of data. DDN says it's using "a state of the art, multi-threaded data integrity engine" to do this; a fine piece of jargon.

DDN SFA12KX RAID Controller architecture

The SFA12KX's RAID engine, cache engine, data movers, drivers, and schedulers are parallelised and multi-threaded. There are Active/Active controllers with Dynamic Routing instead of distributed locking. File locks are not sent between the controllers using an inter-controller link: "Each logical unit is online to both controllers, but only one controller takes primary ownership for a given logical unit at a given time. The controller that masters the logical unit caches data for the logical unit and accesses the physical disks that contain that logical unit's data."
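To make that ownership model concrete, here is a minimal Python sketch of the idea as DDN describes it: either controller can receive a request, but the work is routed to whichever controller currently owns the logical unit, so the two never need to exchange locks. The names (`ControllerPair`, `assign`, `write`) are hypothetical, and this is a toy illustration, not DDN's implementation.

```python
class ControllerPair:
    """Toy model of two controllers where each logical unit has one primary owner."""

    def __init__(self):
        self.owner = {}        # logical unit id -> owning controller index (0 or 1)
        self.cache = [{}, {}]  # one cache per controller: (lun, block) -> data

    def assign(self, lun, controller):
        """Give one controller primary ownership of a logical unit."""
        self.owner[lun] = controller

    def write(self, receiving_controller, lun, block, data):
        """Route the write to the owning controller, whoever received it."""
        # receiving_controller is deliberately ignored when choosing where
        # the data is cached: only the owner caches a given logical unit.
        target = self.owner[lun]
        self.cache[target][(lun, block)] = data
        return target


pair = ControllerPair()
pair.assign("lun0", 0)
pair.assign("lun1", 1)

# The request arrives at controller 1, but lun0 is owned by controller 0.
handled_by = pair.write(receiving_controller=1, lun="lun0", block=7, data=b"x")
print(handled_by)
```

Because only the owner ever caches a given logical unit, there is a single copy of each cached block and nothing for the controllers to lock against each other.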

RAID processing has been accelerated: "There are two parallel instances of the storage engine: one in RAID processor (RP)0 and one in RP1. Thus, the SFA12KX actually has two parallel, multi-threaded RAID engines that work simultaneously in each controller for a total of 4 RAID processors across the redundant controller pair."

"Further," continues DDN, "each RAID processor runs multiple threads that manage the SFA cache, data integrity calculations and I/O movers. Thus, as the number of storage system cores are increased, additional parallel processes can be run simultaneously and both IOPS and bandwidth will increase accordingly."

DDN also uses flash to speed LUN data access with its Storage Fusion Accelerator (SFX): "SFX cache can be allocated to a Logical Unit Number (LUN), which refers to a logical disk created from a group of real disks, or can be shared between multiple LUNs. It has the effect of front-ending the LUN with some very fast and large cache, without having to dedicate expensive SSD drives to a single LUN."

Combine this with a large DRAM cache and the number of HDD accesses is reduced.
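The effect of stacking caches in front of disk can be sketched in a few lines of Python. This is a deliberately simplified model, assuming a DRAM tier, a flash tier and no eviction at all; `TieredReader` and its fields are hypothetical names, and nothing here reflects how SFX actually manages its cache.

```python
class TieredReader:
    """Toy read path: DRAM first, then flash, then HDD. No eviction is modelled."""

    def __init__(self, dram_blocks, flash_blocks):
        self.dram = set(dram_blocks)
        self.flash = set(flash_blocks)
        self.hdd_reads = 0  # count of reads that had to go to disk

    def read(self, block):
        if block in self.dram:
            return "dram"
        if block in self.flash:
            self.dram.add(block)  # promote into DRAM for next time
            return "flash"
        # Miss in both cache tiers: this is the expensive disk access.
        self.hdd_reads += 1
        self.flash.add(block)
        self.dram.add(block)
        return "hdd"


reader = TieredReader(dram_blocks={1}, flash_blocks={2})
served_from = [reader.read(b) for b in (1, 2, 3, 3)]
print(served_from, reader.hdd_reads)
```

In this run only the first read of block 3 touches disk; the repeat is served from cache, which is the whole point of front-ending the drives.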

DDN claims the SFA12KX "delivers random IOPS [of] over 1.7 million burst to cache and over 1.4M sustained 4K IOPS to SSDs. Sequential block bandwidth performance is 48GB/s for simultaneous reads and writes."

It supports up to 1,680 drives, which can be a mix of SAS, SATA and SSD devices.

The SFA12KX line uses Intel Ivy Bridge processors and there are enough of these to run system applications inside the array. The host SFA OS "acts as a hypervisor, using technologies such as ccNUMA and KVM to control processor, core, memory, I/O and virtual disk allocations."

Both the Lustre file system and GPFS can run inside virtual machines in the SFA12KXE. DDN says this reduces the number of servers, infrastructure requirements and network connections, and streamlines I/O, reducing latency, by removing data “hops” and eliminating storage protocol conversions.

We're told the SFA12KXE uses this In-Storage Processing technology to run DDN’s EXAScaler (Lustre) and GRIDScaler (GPFS) parallel file systems, "as well as customer applications running natively within the storage array. The SFA12KXE delivers up to 23GB/s of file system performance and eliminates external servers and storage networking for a converged approach that yields significant acquisition and management savings."

DDN claims its SFA12KX and XE systems provide both high random IOPS speed and high bandwidth, using just (!) 21 SFA12KXs to reach an overall 1TB/sec of throughput. By optimising the software stack, and through judicious use of flash caches, a bulk capacity drive array can deliver lots of random data quickly to multi-threaded, multi-cored, multi-processor computing complexes. Such complexes are spreading from supercomputing and HPC into the commercial arena as big data-type apps appear, with bulk data sets that need analytic processing.
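The 21-array figure is easy to sanity-check from the per-array numbers quoted in this article. The short Python calculation below assumes decimal units (1TB = 1,000GB) and takes "4K" to mean 4,096 bytes; both are our assumptions rather than anything DDN states.

```python
import math

TARGET_GB_PER_SEC = 1000   # 1TB/sec, assuming decimal units
SFA12KX_GB_PER_SEC = 48    # per-array sequential figure quoted by DDN
SFA12K_GB_PER_SEC = 40     # previous-generation figure

# Whole arrays needed to reach the target throughput.
arrays_12kx = math.ceil(TARGET_GB_PER_SEC / SFA12KX_GB_PER_SEC)
arrays_12k = math.ceil(TARGET_GB_PER_SEC / SFA12K_GB_PER_SEC)
aggregate_gb_per_sec = arrays_12kx * SFA12KX_GB_PER_SEC

# Data rate implied by the sustained random figure: 1.4M IOPS at 4K each.
SUSTAINED_IOPS = 1_400_000
IO_SIZE_BYTES = 4096       # assumption: "4K" means 4,096 bytes
random_gb_per_sec = SUSTAINED_IOPS * IO_SIZE_BYTES / 1e9

print(arrays_12kx, arrays_12k, aggregate_gb_per_sec)
print(round(random_gb_per_sec, 2))
```

On these assumptions, 21 arrays give 1,008GB/sec, where the older SFA12K would have needed 25. The sustained random figure works out to under 6GB/sec of actual data movement, a reminder that IOPS and bandwidth measure different things.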

You can't use all-flash arrays for this; that's far too expensive. DDN would also say you can't use mainstream business storage arrays; they don't have the stack optimisations that DDN has learnt about from years of HPC deployments.

Read the SFA12KX white paper here (pdf). The SFA12KXE appliance will be available this quarter with the SFA12KX scheduled for general availability in early 2014. ®
