Alacritech apprehends an NFS anomaly

Wildly unbalanced filer I/O

Comment Alacritech claims NFS filer I/O is grossly skewed towards reads and suffers from read metadata processing that chokes controller CPUs.

It has just launched the ANX 1500, a filer-accelerating cache product built on its finding that NFS read metadata loads can overwhelm filer processors and delay file delivery.

A couple of years ago Alacritech had a 10gig Ethernet adapter nearing readiness but found that the market had moved on, wanting converged network adapters (CNA) which could do FCoE and iWARP, as well as iSCSI and TCP/IP Offload and basic Ethernet NIC'ing. It would need to have written its own code or licensed IP from Emulex or QLogic and decided, according to marketing VP Doug Rainbolt, it was "not worth it". (Ironically Emulex licenses Alacritech IP for its CNA.)

Alacritech decided to turn aside from the adapter business and, reflecting its founders' Auspex roots, look at accelerating network-attached storage (NAS) file access. Most filer shops use NFS v3. Close inspection of NFS v3 filer I/O patterns showed a wild asymmetry between reads and writes. One Fortune 500 company exhibited this pattern:

  • Reads – 52 per cent
  • Metadata (eg Lookups, GETATTRS) – 47.96 per cent
  • Writes – 0.04 per cent

From the point of view of the filer's controller, half of its life was spent getting data off the disk drives and out to accessing host servers and the other half checking the metadata associated with read requests. Write I/O activity was basically inconsequential, particularly from the disk I/O point of view, as writes would be cached in the controller's NVRAM and re-ordered to provide near-sequential I/O. Also, for NetApp users, Rainbolt said WAFL is good for writes.
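A breakdown like the one above can be produced by tallying the operations in a captured NFS trace. The sketch below is only an illustration: the `CATEGORY` mapping and the `op_mix` function are our own hypothetical names, the trace is assumed to be a plain list of NFS v3 procedure names, and nothing here reflects how Alacritech actually gathered its figures.

```python
from collections import Counter

# Assumed mapping from NFS v3 procedure names to the three categories used
# in the breakdown above. Only the two metadata calls named in the article
# are listed; anything else is counted as "other".
CATEGORY = {
    "READ": "read",
    "WRITE": "write",
    "LOOKUP": "metadata",
    "GETATTR": "metadata",
}


def op_mix(ops):
    """Return the percentage share of each category in a list of op names."""
    counts = Counter(CATEGORY.get(op, "other") for op in ops)
    total = sum(counts.values())
    # An empty trace yields an empty dict, so no division by zero occurs.
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    sample_trace = ["READ", "GETATTR", "READ", "LOOKUP"]  # made-up sample
    print(op_mix(sample_trace))
```

Fed a real trace, a function of this kind would show at a glance whether a filer is spending its time on data reads, metadata checks or writes.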

Reads cannot be re-ordered because they have to be answered as and when they come in and are randomly located on the filer's drive platters. The typical answer to this is to use high-speed drives and, if necessary, short-stroke them to minimise head movements (seek time). Both are expensive to do.

But what Alacritech realised was that the randomness of read I/O wasn't the only problem – read metadata was just as big a problem, turning a filer's processors into access bottlenecks if enough metadata checking was needed. Rainbolt said: "The controller is becoming a bottleneck before the disk drives do. The processor can't keep up ... Metadata consumes the CPU like you wouldn't believe."

If you could remove the metadata checking from the filer's CPUs and carry it out some place else, then the filer could get on with its core job of answering read requests and serving files as fast as it is capable of doing.

Alacritech and Isilon

Rainbolt said Isilon's scale-out clustered filers are affected by the same problem even though they serve lots of large files, meaning more sequential than random reads. Accessing clients store lots of Isilon-originated data in their caches and check whether their cache contents are up to date before hitting the Isilon filers with read requests, meaning the Isilon processors can also get hit with metadata requests. Isilon-type systems also struggle when faced with lots of small file requests.

Rainbolt said an example 9-node Isilon system was running 500,000 NFS metadata operations per second. Placing an Alacritech ANX 1500 metadata offload engine in front of it bumped the number up to between 2.6 and 3 million NFS metadata ops/sec, and the Isilon served more files.
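For scale, the before-and-after figures Rainbolt quoted work out to roughly a five- to six-fold gain in metadata throughput. A quick check using only those numbers (the `speedup` helper is our own name, not anything from Alacritech):

```python
def speedup(before_ops_per_sec, after_ops_per_sec):
    """Ratio of accelerated throughput to baseline throughput."""
    return after_ops_per_sec / before_ops_per_sec


# Figures quoted by Rainbolt for the example 9-node Isilon system.
BASELINE = 500_000
WITH_ANX_LOW = 2_600_000
WITH_ANX_HIGH = 3_000_000

if __name__ == "__main__":
    print(speedup(BASELINE, WITH_ANX_LOW))
    print(speedup(BASELINE, WITH_ANX_HIGH))
```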

In other words, Alacritech contends, there is generic filer processor bottlenecking going on, slowing down filer responsiveness to read requests, due to the metadata processing consequent on NFS v3 read requests.

Isilon has added flash to speed up metadata operations.

Alacritech saw an opportunity to cache filer metadata in a front-end device, its ANX 1500 – an NFS metadata offload engine in effect – and remove that burden from the filer. That means filers can stop using lots of expensive short-stroked 15K rpm drives and revert to using fewer, slower and cheaper middle-of-the-road drives.
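The general idea of a front-end metadata cache can be sketched in a few lines. To be clear, this is not Alacritech's design, which the company has not described at this level: the `MetadataCache` class, the time-to-live expiry policy and the `backend_get_attrs` callable are all assumptions made for illustration. The point is simply that repeated attribute checks for the same file are answered from the cache, so only misses reach the filer's CPU.

```python
import time


class MetadataCache:
    """Hypothetical front-end cache for file attributes (illustrative only)."""

    def __init__(self, backend_get_attrs, ttl_seconds=3.0, clock=time.monotonic):
        # backend_get_attrs: callable taking a file handle and returning its
        # attributes; it stands in for a metadata request sent to the filer.
        self._backend_get_attrs = backend_get_attrs
        self._ttl = ttl_seconds      # assumed expiry policy, not from the article
        self._clock = clock
        self._entries = {}           # file handle -> (expiry time, attributes)
        self.hits = 0
        self.misses = 0

    def get_attrs(self, handle):
        """Serve attributes from cache if still fresh, else ask the filer."""
        now = self._clock()
        entry = self._entries.get(handle)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        attrs = self._backend_get_attrs(handle)
        self._entries[handle] = (now + self._ttl, attrs)
        return attrs

    def invalidate(self, handle):
        """Drop a cached entry, for example after a write to that file."""
        self._entries.pop(handle, None)
```

With a workload as read-heavy as the one described above, nearly every request is a candidate for a cache hit, which is why the approach can take so much load off the controller. How a real appliance keeps its cache consistent with the filer is a separate matter that this sketch does not attempt to answer.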

Alacritech co-founder Peter Craft said: "We created an appliance to do metadata caching and use SSD (Solid State Drives). It involves our NFS Bridge technology and uses the ASICs from our 10gig Ethernet adapter work. It is very efficient and we have very low CPU utilisation on our box."

The ANX 1500 uses these ASICs with micro-code and has a "very thin, high-performance operating system."

Alacritech and NetApp

Craft said that other people saw there was a file access speed problem and recognised flash as a potential solution, citing NetApp's PAM (Performance Acceleration Module, now called Flash Cache). This is a slug of flash in NetApp's FAS controllers which functions as a read cache. He said: "In SPEC results PAM systems use fewer disk drives but the top end result is the same because they are CPU-bound. Even Avere can only do 22,000 ops. We can scale to hundreds of thousands of (SPEC NFS) ops."

He is saying that NetApp filers cannot scale their NFS ops because they run out of CPU processing bandwidth before they run out of disk bandwidth. Cache resolves disk bandwidth problems but sits downstream of the CPUs and doesn't fix CPU issues.
