Original URL: http://www.theregister.co.uk/2009/08/27/netapp_elephant/
NetApp's missing bits: Don't we need a switch infrastructure?
There's an elephant in the room, and it don't look quite right
Comment NetApp has announced dynamic scale-out, a feature of ONTAP 8 that gives its filers multiple heads. But what wasn't announced was any description of a head-to-filer interconnect, or of any ability for multiple filers to connect to the multiple heads. So how can that work, then?
Existing highly scalable file storage systems, such as those from HP and Isilon, have an architecture with multiple heads connecting to multiple disk drive enclosures through a switched fabric. How can NetApp address the same sort of market as these suppliers if it is missing the switching fabric and ability to have many storage units connect through it to the many heads?
A classic NetApp FAS filer has a storage unit containing disk drives in shelves linked by a 4Gbit/s Fibre Channel Arbitrated Loop (FCAL) to a dual controller which runs Data ONTAP and links to application servers via Fibre Channel (FC), Ethernet (iSCSI) and Fibre Channel over Ethernet (FCoE). Add multiple heads to this mix and things must surely change downstream, in the head-storage unit interconnect area.
The heads can have the PAM II flash cache cards, with up to 4TB of the stuff. Heads so equipped are super-charged in the I/O department and start making the system, with just one filer, look unbalanced. If there are multiple heads, which implies more than two, then they need to co-operate so as to have coherent cache contents and not write to the same file at the same time.
It would help if, should one fail, the others could pick up the load. We're basically talking about some form of clustering and that requires an interconnect. What is it?
Let's call the storage unit in this multi-headed FAS filer a storage node and let's call a head a processing node. Now let's look at HP's ExDS9100.
HP's Extreme Data Storage 9100 box set has up to 10 82TB storage nodes connected by a SAS backplane to up to 16 processing nodes with PolyServe clustering software keeping the heads sweetly inter-operating. HDS' AMS storage arrays also use a SAS interconnect, with up to 32 point-to-point SAS links. The 6Gbit/s SAS standard outperforms 4Gbit/s FCAL too.
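For a sense of scale, the ExDS9100 figures quoted above can be multiplied out. This is only back-of-envelope arithmetic on the numbers in the text; the link comparison uses nominal line rates and ignores encoding and protocol overhead, which is an assumption.

```python
# Figures as quoted in the article for HP's ExDS9100.
MAX_STORAGE_NODES = 10
TB_PER_STORAGE_NODE = 82

# Maximum capacity if every storage node slot is filled.
max_capacity_tb = MAX_STORAGE_NODES * TB_PER_STORAGE_NODE

# Nominal line rates only (assumption: overheads are ignored).
SAS_GBPS = 6
FCAL_GBPS = 4
sas_advantage = SAS_GBPS / FCAL_GBPS

print(f"Max capacity: {max_capacity_tb}TB")
print(f"SAS vs FCAL nominal rate: {sas_advantage:.1f}x")
```

On these numbers a fully populated box set holds 820TB, and each SAS link has one and a half times the nominal bandwidth of a 4Gbit/s FCAL loop.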
Now let's look at Isilon's clustered IQ filers. Storage nodes are interconnected by an InfiniBand fabric to the processor (Accelerator) nodes which do I/O and processing and are clustered. It's the same basic box set - several storage nodes connected by a switching fabric to clustered processing nodes.
Pillar's Axiom has storage nodes, called bricks, linking to performance nodes called slammers with a separate management node called a pilot.
The beauty of separating storage nodes from performance nodes is that you can scale each independently. You also avoid the interface to a single storage node becoming a bottleneck and slowing things down.
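The bottleneck point can be put as a toy model: whatever the heads can push, the system delivers no more than the slower of the two tiers. The sketch below is illustrative only. The function name and every bandwidth figure in it are hypothetical, not vendor numbers, and it assumes throughput adds up linearly across nodes.

```python
def deliverable_throughput(heads, head_gbps, storage_nodes, link_gbps):
    """Return aggregate throughput in Gbit/s, capped by the slower tier.

    Toy model: assumes perfect linear scaling within each tier.
    """
    front_end = heads * head_gbps          # what the heads could serve
    back_end = storage_nodes * link_gbps   # what the storage links can feed
    return min(front_end, back_end)


# Hypothetical figures: four heads at 8Gbit/s each, 4Gbit/s per storage link.
one_storage_node = deliverable_throughput(4, 8, 1, 4)
eight_storage_nodes = deliverable_throughput(4, 8, 8, 4)

print(one_storage_node)     # capped by the single storage link
print(eight_storage_nodes)  # capped by the heads
```

With a single storage node the extra heads buy nothing, because the one back-end link sets the ceiling. Add storage nodes and the ceiling moves up until the heads become the limit.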
Let's return to NetApp and have a fresh look at its multi-headed filer setup. We see multiple performance nodes (heads) with ONTAP 8 cluster software and one storage node. There is no interconnect fabric between the heads that we know of and no ability to have multiple storage nodes (headless filers). Isn't this like a multi-headed elephant with a skinny body and only one leg? It's unbalanced and just doesn't feel right.
Given that NetApp's competitors in scale-out and highly performant filers have already added multiple storage nodes and an interconnect to link them to their interlinked and clustered performance nodes, isn't it logical that NetApp will do the same?
We know it's heading to the clouds and embracing object storage and giving its heads greatly enhanced I/O capability. This surely means that, in terms of balance, the rest of a multi-headed filer has to catch up with the extremely powerful processing and I/O-rich heads.
I submit that NetApp has a secret elephant in its multi-headed filer room, and said beast is an interconnect fabric between the multiple performance nodes plus the ability to have multiple storage nodes. Some things just can't be ignored, can they? ®