Alacritech apprehends an NFS anomaly

Wildly unbalanced filer I/O

Comment Alacritech claims NFS filer I/O is grossly skewed towards reads and suffers from read metadata processing that chokes controller CPUs.

It has just launched the ANX 1500, a filer-accelerating cache product built on its finding that NFS read metadata loads can overwhelm filer processors and delay file delivery.

A couple of years ago Alacritech had a 10gig Ethernet adapter nearing readiness but found that the market had moved on, wanting converged network adapters (CNA) which could do FCoE and iWARP, as well as iSCSI and TCP/IP Offload and basic Ethernet NIC'ing. It would have needed to write its own code or license IP from Emulex or QLogic and decided, according to marketing VP Doug Rainbolt, that it was "not worth it". (Ironically Emulex licenses Alacritech IP for its CNA.)

Alacritech decided to turn aside from the adapter business and, reflecting its founders' Auspex roots, look at accelerating network-attached storage (NAS) file access. Most filer shops use NFS v3. Close inspection of NFS v3 filer I/O patterns showed wild read and write asymmetry. One Fortune 500 company exhibited this pattern:

  • Reads – 52 per cent
  • Metadata (eg LOOKUP, GETATTR) – 47.96 per cent
  • Writes – 0.04 per cent

From the point of view of the filer's controller, half of its life was spent getting data off the disk drives and out to accessing host servers and the other half checking the metadata associated with read requests. Write I/O activity was basically inconsequential, particularly from the disk I/O point of view, as writes would be cached in the controller's NVRAM and re-ordered to provide near-sequential I/O. Also, for NetApp users, Rainbolt said WAFL is good for writes.
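A mix like the one above can be derived from per-procedure operation counters. The following is a minimal sketch, not Alacritech's tooling: the procedure names come from the NFS v3 specification, but the grouping into read, write and metadata classes is an assumption made for illustration, and the counter values are hypothetical numbers chosen only to reproduce the percentages quoted above.

```python
from collections import Counter

# Coarse classes for NFS v3 procedures. This grouping is an assumption
# for illustration: everything that is not READ, WRITE or COMMIT is
# counted as metadata.
READ_OPS = {"READ"}
WRITE_OPS = {"WRITE", "COMMIT"}


def classify(proc):
    """Map an NFS v3 procedure name to 'read', 'write' or 'metadata'."""
    if proc in READ_OPS:
        return "read"
    if proc in WRITE_OPS:
        return "write"
    return "metadata"


def op_mix(counts):
    """Return the percentage share of each class of operation."""
    totals = Counter()
    for proc, n in counts.items():
        totals[classify(proc)] += n
    grand_total = sum(totals.values())
    return {cls: 100.0 * n / grand_total for cls, n in totals.items()}


# Hypothetical counters, picked to match the quoted 52 / 47.96 / 0.04 split.
sample = {"READ": 520_000, "LOOKUP": 250_000, "GETATTR": 229_600, "WRITE": 400}
mix = op_mix(sample)
for cls in ("read", "metadata", "write"):
    print(f"{cls}: {mix[cls]:.2f} per cent")
```

Feeding a filer's real per-procedure counters into `op_mix` would show whether a given site has the same read-heavy skew.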

Reads cannot be re-ordered because they have to be answered as and when they come in and are randomly located on the filer's drive platters. The typical answer to this is to use high-speed drives and, if necessary, short-stroke them to minimise head movements (seek time). Both are expensive to do.

But what Alacritech realised was that the randomness of read I/O wasn't the only problem – read metadata was just as big a problem, turning a filer's processors into access bottlenecks if enough metadata checking was needed. Rainbolt said: "The controller is becoming a bottleneck before the disk drives do. The processor can't keep up ... Metadata consumes the CPU like you wouldn't believe."

If you could remove the metadata checking from the filer's CPUs and carry it out some place else, then the filer could get on with its core job of answering read requests and serving files as fast as it is capable of doing.

Alacritech and Isilon

Rainbolt said Isilon's scale-out clustered filers are affected by the same problem even though they serve lots of large files, meaning more sequential than random reads. Accessing clients store lots of Isilon-originated data in their caches and check whether their cache contents are up to date before hitting the Isilon filers with read requests, meaning the Isilon processors can also get hit with metadata requests. Isilon-type systems also struggle when faced with lots of small file requests.

Rainbolt said an example 9-node Isilon system was running 500,000 NFS metadata operations per second. Placing an Alacritech ANX 1500 front-end metadata offload engine in front of it bumped the number up to 2.6 to 3 million NFS metadata ops/sec and the Isilon served more files.
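The size of that claimed jump is easy to work out. A quick sketch using only the figures Rainbolt quoted (the variable names are ours):

```python
# Figures as quoted by Rainbolt for the example 9-node Isilon system.
baseline_ops = 500_000          # NFS metadata ops/sec without the ANX 1500
accelerated_low = 2_600_000     # lower end of the quoted range with it
accelerated_high = 3_000_000    # upper end of the quoted range with it

speedup_low = accelerated_low / baseline_ops
speedup_high = accelerated_high / baseline_ops

print(f"Claimed speed-up: {speedup_low:.1f}x to {speedup_high:.1f}x")
```

That is roughly a five to sixfold increase in metadata operations per second, on Alacritech's own numbers.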

In other words, Alacritech contends, there is generic filer processor bottlenecking going on, slowing down filer responsiveness to read requests, due to the metadata processing consequent on NFS v3 read requests.

Isilon has added flash to speed up metadata operations.

Alacritech saw an opportunity to cache filer metadata in a front-end device, its ANX 1500 – an NFS metadata offload engine in effect – and remove that burden from the filer. That means filers can stop using lots of expensive short-stroked 15K rpm drives and revert to using fewer, slower and cheaper middle-of-the-road drives.
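The general idea of a front-end metadata cache can be sketched in a few lines. This is an illustration of the concept only, not the ANX 1500's design: the class name, the time-to-live expiry policy, the 30-second default and the stand-in `filer_getattr` function are all assumptions made for the example.

```python
import time


class MetadataCache:
    """Toy front-end cache that answers attribute lookups for a back-end filer."""

    def __init__(self, backend_getattr, ttl_seconds=30.0, clock=time.monotonic):
        self._backend = backend_getattr   # callable that asks the real filer
        self._ttl = ttl_seconds           # how long a cached answer is trusted
        self._clock = clock
        self._entries = {}                # file handle -> (fetch time, attrs)
        self.hits = 0
        self.misses = 0

    def getattr(self, file_handle):
        """Return cached attributes if still fresh, else ask the filer."""
        now = self._clock()
        entry = self._entries.get(file_handle)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        attrs = self._backend(file_handle)
        self._entries[file_handle] = (now, attrs)
        return attrs

    def invalidate(self, file_handle):
        """Drop a cached entry, for example after a write to that file."""
        self._entries.pop(file_handle, None)


# Hypothetical back end that records every request reaching the filer.
backend_calls = []


def filer_getattr(file_handle):
    backend_calls.append(file_handle)
    return {"size": 4096, "mtime": 1_300_000_000}


cache = MetadataCache(filer_getattr)
for _ in range(1000):
    cache.getattr("fh-1")

print(f"client requests: 1000, requests reaching the filer: {len(backend_calls)}")
```

In this toy run a thousand client attribute checks cost the filer a single request. A real device would also have to keep its cached metadata consistent with changes made on the filer, which the sketch only hints at with `invalidate`.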

Alacritech co-founder Peter Craft said: "We created an appliance to do metadata caching and use SSD (Solid State Drives). It involves our NFS Bridge technology and uses the ASICs from our 10gig Ethernet adapter work. It is very efficient and we have very low CPU utilisation on our box."

The ANX 1500 uses these ASICs with micro-code and has a "very thin, high-performance operating system."

Alacritech and NetApp

Craft said that others had seen the file access speed problem and recognised flash as a potential solution, citing NetApp's PAM (Performance Acceleration Module, now called Flash Cache). This is a slug of flash in NetApp's FAS controllers which functions as a read cache. He said: "In SPEC results PAM systems use fewer disk drives but the top end result is the same because they are CPU-bound. Even Avere can only do 22,000 ops. We can scale to hundreds of thousands of (SPEC NFS) ops."

He is saying that NetApp filers are limited in NFS ops scalability because they become limited by CPU processing bandwidth and not disk bandwidth. Cache resolves disk bandwidth problems but sits downstream of the CPUs and doesn't fix CPU issues.
