Which Ethernet SAN storage protocol is best?
iSCSI easier choice for IT shops with no Fibre Channel
You the Expert looks at Ethernet block access storage protocols, with the candidates being AoE (ATA over Ethernet), FCoE (Fibre Channel over Ethernet) and iSCSI. As well as consultant Chris Evans, there are three Reg reader experts here: Geoff Barnett, Evan Unrue and "Mike1", all asked to contribute because of the high calibre of their comments on previous articles.
The general conclusions are, firstly, that iSCSI is better than FCoE for IT shops with no Fibre Channel legacy; the ones with such a legacy should choose FCoE. Secondly, AoE is ruled out when routability or security is required. However, AoE is the lightest weight protocol of the three.
Independent storage consultant
The two most widely supported Ethernet SAN options in the market are iSCSI and FCoE (Fibre Channel over Ethernet). If you have an existing Fibre Channel deployment, you will need a bridge device for whichever of these storage technologies you choose to move to. But let's not forget that just getting a LUN attached to your device isn't the whole story: we also need to ensure resilience of connectivity, and that any snapshot-based backup technologies either continue to function or are replaced with something equivalent. Depending on your vendor, then, using FCoE may be the best option, as it allows you to keep your existing software and preserve your investment.
But what if you don't have Fibre Channel already? Is FCoE still the right choice?
iSCSI is the predominant Ethernet SAN technology in the market, and for good reason: it has been a relatively inexpensive method of providing SAN storage in a market where Fibre Channel was previously the only credible option. But iSCSI is often criticised for being slower or less efficient than Fibre Channel. That may well have been the case in the past, but today it stands as a credible contender, provided you design your storage infrastructure well.
A common misconception among IT professionals is that iSCSI doesn't need a Host Bus Adapter and that a software-based initiator will do the job just as well. While you can use a software initiator paired with a NIC that has a TOE (TCP Offload Engine), it won't perform as well. (You can also use software with FCoE on Linux or Windows; Windows requires non-Microsoft software.) This might be fine for some applications where disk usage is light, but for applications with high disk usage you should always consider an HBA; why make a busy CPU do work another piece of hardware can do better?
A key benefit of iSCSI is that it uses standard Ethernet switches: your switches could be more than five years old, but they are ready for iSCSI. Not so with FCoE, which required a number of enhancements to be made to the Ethernet specification to provide support. Consequently, FCoE may require a firmware update or even new switch fabric to be installed in your infrastructure. iSCSI is therefore generally the easier option for existing networks where there is legacy equipment but no Fibre Channel.
Whichever Ethernet SAN protocol you choose, there are a number of rules you should follow when designing your storage infrastructure that work for both:
- Ensure file and block protocols are separated onto different adapters; both NFS and SMB/CIFS are very bursty protocols that will affect the performance of iSCSI or FCoE storage connections.
- Depending on the make-up of your network and your budget, you might also consider using separate switch fabric to prevent any bottlenecks.
- Just because you can use a software initiator doesn't mean you should; both iSCSI and FCoE are sensitive to delayed writes. If your device is running a high load, use an iSCSI HBA or FCoE CNA.
Geoff Barnett is a freelance IT Infrastructure Consultant who works with Public and Private Sector clients. He is a NetApp Accredited Storage Architect and has previously provided consultancy services to IBM Global Services, Serco Civil Government and the UK Home Office. He lives in London.
Independent storage consultant
Although all Ethernet-based, the three protocols of iSCSI, FCoE and AoE (ATA over Ethernet) are very different.
iSCSI runs over TCP/IP and so benefits from TCP functionality such as congestion/flow control and routability. FCoE and AoE both operate at Ethernet layer 2 and are therefore not routable and have no inherent flow control mechanisms built in. To remedy this situation for FCoE, the Enhanced Ethernet specification has been developed (variously called DCB and CEE). This will introduce changes to the Ethernet protocol to make FCoE a reliable transport mechanism like Fibre Channel. AoE has no flow/congestion control mechanisms. You have to ask whether routing your storage traffic is important to you; if it is, then iSCSI is available today. FCoE users will need to wait until specifications are developed for routing FCoE traffic, which will surely come in the same way as FCIP and iFCP were developed to allow routing of Fibre Channel traffic. The AoE protocol doesn't appear to have any development on it and so is very unlikely to offer routing any time soon.
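To make the layering distinction concrete, here is a minimal Python sketch that labels a raw, untagged Ethernet II frame by where the storage protocol sits. The EtherType values (0x88A2 for AoE, 0x8906 for FCoE, 0x0800 for IPv4) and TCP port 3260 for iSCSI are the registered numbers; the function name and the returned labels are made up for illustration, and the sketch ignores VLAN tags and IPv6.

```python
import struct

# Registered EtherType values.
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_AOE = 0x88A2
ETHERTYPE_FCOE = 0x8906

ISCSI_PORT = 3260  # registered default iSCSI target port


def classify_frame(frame: bytes) -> str:
    """Label an untagged Ethernet II frame (hypothetical helper, not a real API)."""
    if len(frame) < 14:
        return "too short"
    # Bytes 12-13 of the Ethernet header carry the EtherType.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_AOE:
        return "AoE (layer 2, not routable)"
    if ethertype == ETHERTYPE_FCOE:
        return "FCoE (layer 2, not routable)"
    if ethertype == ETHERTYPE_IPV4:
        ip = frame[14:]
        if len(ip) < 20:
            return "IPv4 (truncated)"
        ihl = (ip[0] & 0x0F) * 4   # IPv4 header length in bytes
        protocol = ip[9]           # 6 means TCP
        if protocol == 6 and len(ip) >= ihl + 4:
            src_port, dst_port = struct.unpack("!HH", ip[ihl:ihl + 4])
            if ISCSI_PORT in (src_port, dst_port):
                return "iSCSI (TCP/IP, routable)"
        return "IPv4 (other)"
    return "other"
```

AoE and FCoE are identified directly in the Ethernet header, so nothing above layer 2 is available for a router to act on. iSCSI only shows up after unwrapping IP and TCP, which is exactly what makes it routable.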
iSCSI requires no dedicated hardware to deploy and this has been its benefit since inception. It can be implemented on standard Network Interface Cards (NICs), or TOE (TCP Offload Engine) cards that offload the iSCSI processing to dedicated hardware, reducing the load on the server's processor. AoE also uses standard NICs. FCoE requires dedicated hardware in the host in the form of Converged Network Adaptors (CNAs) and new network switches that recognise FCoE data and can also integrate with existing Fibre Channel networks.
The upshot of this is that iSCSI and AoE require very little hardware investment and can use commodity components. FCoE on the other hand requires new (and at this stage relatively expensive) CNAs and switches. This will mean additional cost in skills training and many of the negatives associated with Fibre Channel today, including interoperability matrices and relatively higher cost of ownership. At the storage array end, iSCSI is widely supported and FCoE support has been road-mapped by the major storage vendors. AoE is only supported by a single vendor, Coraid.
iSCSI and FCoE protocols have both encryption and authentication built into the protocol. FCoE uses existing Fibre Channel security methods (masking/zoning) and iSCSI can use CHAP. The AoE specification does not document any authentication or encryption capabilities. If security is important to you, then AoE is a bad protocol choice. iSCSI and FCoE are better, with FCoE offering the best security model.
The FCoE protocol continues to be developed and evolve. This is where the focus from major vendors has been placed. AoE isn't widely supported and doesn't have any development work in progress. Similarly, the iSCSI specification is relatively static. In terms of future development, the options look best for FCoE and if you currently use Fibre Channel today, this will offer a transition path to move from existing technology investments and skills.
In summary, iSCSI is widely supported and therefore a great choice for SMB environments. FCoE, whilst more complicated and expensive, offers better integration with existing Fibre Channel networks, has a better security model and has ongoing development in place. AoE stands out as an anomaly and is unlikely to be implemented in anything other than niche vendor-specific deployments.
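The comparison above can be boiled down to a small lookup. The following Python sketch encodes only the properties stated in this piece (routability, built-in security, need for dedicated hardware); the class, field and function names are invented for illustration, and a real selection would obviously weigh cost, skills and vendor support too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    routable: bool                  # can traffic cross IP networks?
    built_in_security: bool         # authentication/access control in the spec
    needs_dedicated_hardware: bool  # CNAs and FCoE-aware switches


# Values taken from the comparison in the text, not from any specification.
PROTOCOLS = [
    Protocol("iSCSI", routable=True, built_in_security=True, needs_dedicated_hardware=False),
    Protocol("FCoE", routable=False, built_in_security=True, needs_dedicated_hardware=True),
    Protocol("AoE", routable=False, built_in_security=False, needs_dedicated_hardware=False),
]


def candidates(need_routing=False, need_security=False, commodity_only=False):
    """Return the names of protocols that satisfy the stated requirements."""
    result = []
    for p in PROTOCOLS:
        if need_routing and not p.routable:
            continue
        if need_security and not p.built_in_security:
            continue
        if commodity_only and p.needs_dedicated_hardware:
            continue
        result.append(p.name)
    return result
```

For instance, `candidates(need_routing=True)` leaves only iSCSI, while `candidates(need_security=True)` rules out AoE and keeps iSCSI and FCoE.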
Chris M Evans is a founding director of Langton Blue Ltd. He has over 22 years' experience in IT, mostly as an independent consultant to large organisations. Chris' blogged musings on storage and virtualisation can be found at www.thestoragearchitect.com.
Product Specialist at Magirus UK
Although there may be an overlap in the market which each of these protocols addresses, it's important to note that each of the three protocols has an area in which it plays best.
If we take ATA over Ethernet, for example, which is the least deployed protocol of the three, it has a number of elements which work well for it and a number of downsides. AoE is lightweight and easy to configure. It doesn't utilise TCP/IP, so it has access to a larger portion of the Ethernet frame for payload and introduces less CPU load on the server to handle IO transactions. However, not using TCP/IP does have its drawbacks, as it means that storage packets cannot be routed over disparate IP networks. Some may see this as a security benefit rather than a drawback; it all depends on what the business wants to do.
As far as ATA over Ethernet goes, it has had some adoption, with the likes of Coraid, but I doubt it will see widespread adoption unless they can fill the functionality gap which they lost when they ripped out TCP/IP.
If we move onto iSCSI, which uses the SCSI command set encapsulated in IP packets, we have some more agility in terms of how and where we access storage. We do, however, sacrifice some of our packet payload to TCP/IP, meaning less data can be carried in a single packet. Neither ATA over Ethernet nor iSCSI is a lossless protocol; they both rely on re-transmission to handle packet loss and congestion. Ideally, we want an entirely lossless network when it comes to storage.
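As a rough illustration of the payload sacrificed to TCP/IP, the Python sketch below works out how much of each frame is left for data once minimum-size IPv4 and TCP headers (20 bytes each, no options) are subtracted. The 1,500-byte and 9,000-byte MTUs are assumptions chosen as the common default and a typical jumbo-frame setting; iSCSI's own PDU headers take a little more on top, which this sketch ignores.

```python
IPV4_HEADER = 20  # bytes, minimum size with no options
TCP_HEADER = 20   # bytes, minimum size with no options


def tcp_payload_per_frame(mtu: int) -> int:
    """Bytes left for TCP payload in one frame of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def payload_fraction(mtu: int) -> float:
    """Share of the MTU that carries payload rather than TCP/IP headers."""
    return tcp_payload_per_frame(mtu) / mtu


for mtu in (1500, 9000):  # assumed MTUs: standard and jumbo frames
    print(f"MTU {mtu}: {tcp_payload_per_frame(mtu)} payload bytes, "
          f"{payload_fraction(mtu):.1%} of the frame body")
```

On these assumptions the TCP/IP headers cost 40 bytes per frame, which is under 3 per cent at a 1,500-byte MTU and well under 1 per cent with jumbo frames.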
I personally still see iSCSI as geared to the mid-market, or to specific departmental deployments if larger enterprises are going to use it. There is a certain comfort factor which many larger organisations have with Fibre Channel, and there are overheads with iSCSI, introduced by TCP/IP, which mean they are not getting as much bandwidth for their buck.
Enter Fibre Channel. It is completely lossless in design and does not rely on re-transmission to handle packet loss. It works on a system called buffer-to-buffer credits to handle congestion, meaning an initiator can happily transmit data so long as there are credits available. When a write has been acknowledged, a credit is made available; when using Fibre Channel over long distance we simply increase the number of credits available. FCoE works in exactly the same way, with one difference.
As it is being carried over Ethernet, it needs to ensure that when Ethernet switches get congested they do not drop FCoE packets. They do this by applying a no-drop policy to FCoE packets, using a Class of Service identifier on FCoE packets to signify their priority. So if a switch gets congested, it can drop any packets except FCoE packets to handle that congestion. In addition, as FCoE packets are not utilising TCP/IP, they can also carry larger payloads in Ethernet frames than iSCSI.
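The buffer-to-buffer credit idea described for Fibre Channel can be sketched as a toy model in Python. This is only an illustration of the principle (send while credits remain, wait rather than drop when they run out); the class and method names are hypothetical and it does not model real Fibre Channel signalling.

```python
from collections import deque


class CreditedLink:
    """Toy model of credit-based flow control (hypothetical names)."""

    def __init__(self, credits: int):
        self.credits = credits            # frames the sender may still transmit
        self.receiver_buffers = deque()   # frames currently held by the receiver

    def send(self, frame) -> bool:
        """Transmit only if a credit is available; otherwise the sender waits."""
        if self.credits == 0:
            return False  # nothing is dropped, the frame simply is not sent yet
        self.credits -= 1
        self.receiver_buffers.append(frame)
        return True

    def receiver_release(self):
        """Receiver frees one buffer and hands a credit back to the sender."""
        frame = self.receiver_buffers.popleft()
        self.credits += 1
        return frame


link = CreditedLink(credits=2)
print(link.send("frame-1"), link.send("frame-2"), link.send("frame-3"))
link.receiver_release()      # a buffer is freed, so one credit comes back
print(link.send("frame-3"))  # now the held frame can go
```

Raising the `credits` argument corresponds to the point about long-distance links: more frames can be in flight before the sender has to pause.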
I don't see FCoE as an end-to-end deployment (yet). It plays best currently in network edge consolidation. When we introduce switches capable of FCoE conversion into the distribution layer of a network, we can consolidate FC and IP traffic at the edge of the network, meaning fewer cables, fewer network cards to manage and smaller physical and power footprints. In the datacentre world this has massive benefits, as it reduces server deployment time, which incidentally saves money. Obviously these benefits apply more to environments with existing Fibre Channel storage arrays or for IT administrators who want the benefits of Fibre Channel, without complexity, at the edge of their network.
So all three of these technologies utilise Ethernet, all can effectively make a play for the unified network approach, and all have some merit. But to summarise the pros and cons of each: FCoE requires 10 Gigabit switches, which can be costly, and dedicated Converged Network Adapters (cards which support both the FC protocol and IP), which can also be costly. iSCSI introduces increased CPU cycles onto the server unless TOE cards are deployed (again, costly), but can run over Gigabit Ethernet (as well as 10 Gigabit Ethernet), which is less expensive to deploy and is suitable for many SMB environments. ATA over Ethernet, being lighter and just as cheap as iSCSI, could potentially sit well in SMB environments but is limited in how it's deployed, being non-routable.
In my view, as each of these protocols progresses, FCoE has the most potential, as 10 Gigabit Ethernet sees greater levels of adoption and FCoE gears itself toward more of an end-to-end connectivity protocol. Storage administrators know and trust Fibre Channel and network administrators can carry on doing what they're doing.
Sometimes the best choice is to not choose
The article is about storage, and so we're talking iSCSI. iSCSI travels over Ethernet. An iSCSI connection is an Ethernet connection so for the rest of this post I'm going to say Ethernet for iSCSI. Ethernet also gives you other things, which of course you know because somehow this post came to you over Ethernet.
10G Ethernet and 8Gbit Fibre Channel are here now. You can buy them if you want to. Each adapter is going to run more than a thousand dollars, and you're going to pay again on the switch side. You'll pay twice if you're buying SFP+ modules as well. It will be a few years before we pass beyond 10Gig Ethernet and 8Gig FC, so you can have some confidence in your investment. Those standards are fairly new in fields that make large, infrequent steps, and they arrived at roughly the same time over the past two years only by coincidence. With these connections you can pay more for the connectivity than for the box with processors, but probably not including the RAM and storage.
Fibre Channel over Ethernet is a fairly new link that, despite the name, isn't just Fibre Channel over Ethernet. It's Fibre Channel and/or Ethernet over a new connection type that's not quite Ethernet. Calling it Fibre Channel and/or Ethernet over a new connection type that's not quite Ethernet yields an unwieldy acronym: FCaoEoancttnqE, and that's hard to sell so we call it FCoE. The switches can be pretty expensive - the ports are all 10Gbps. The Host Bus Adapters (HBAs) are expensive too. But you can put those in your server. The links aren't just fast in bits per second - they're also very low latency, which is even more important.
You can skip all the per-server NICs and HBAs, SFP+'s and cables. Some blade servers, like HP BL465c G7, come with integrated dual 10Gbps FCoE now. Others from several vendors come with onboard dual 10Gbps Ethernet. No adapters, SFPs or cables to buy, only one (or two) blade interconnects in the back of the chassis, and many servers (16 or so) can talk to each other with amazingly low latency high-bandwidth connections. This is pretty cool because when something goes wrong, 90 per cent of the time it's the cables.
You don't even have to build out 10Gig and 8Gig infrastructure yet, because uplinks can be slower. Choose the right interconnect modules and you can choose how much of that you want to be Ethernet, and how much Fibre Channel, and change your mind at any time. Another advantage is that the blade interconnect can be "not a switch" so if the server teams need an interconnect between their servers that they can manage without permission or interference from the network team, this is it.
The microseconds latency is the most important thing. Most of the traffic is multiplied many times in your cluster. One client form update request from the uplink turns into dozens of file requests, database reads and writes, logfile updates, SAN block writes and reads amongst your servers before a single, simple next page is returned through the uplink. If you can change dozens of 1 millisecond hops into 20 microsecond hops between request and response they add up to a perceptible improvement in responsiveness to the customer even if his bandwidth to the cluster is limited.
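The arithmetic behind that claim is easy to check. The short Python sketch below multiplies a per-hop latency by a hop count; the figure of 36 hops is purely an assumption standing in for "dozens", and the hops are assumed to happen one after another rather than in parallel.

```python
def total_latency_ms(hops: int, per_hop_us: float) -> float:
    """Total time for sequential hops, in milliseconds."""
    return hops * per_hop_us / 1000


HOPS = 36  # assumed stand-in for "dozens" of internal requests

slow = total_latency_ms(HOPS, per_hop_us=1000)  # 1 millisecond per hop
fast = total_latency_ms(HOPS, per_hop_us=20)    # 20 microseconds per hop

print(f"1 ms hops:  {slow:.2f} ms per page")
print(f"20 us hops: {fast:.2f} ms per page")
```

With these assumed numbers the internal chatter drops from 36 ms to under 1 ms per page, which is the kind of difference a user on a slow uplink can still feel.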
With other brands of server you can get the same FCoE in a Mezzanine card today, and that's almost as good - the servers might be cheaper to offset. You still get the same leverage of no SFPs, no cables, and so on, but you use up a precious Mezzanine slot. The FCoE adapters don't cost much more than either the 10G Ethernet or the 8G Fibre Channel, and definitely less than both. Dell sells these, and I'm sure IBM does too. Cisco has one for their UCS. Not sure about the others, but it seems likely.
The way these FCoE interfaces work, you can use them as any of 1Gbit Ethernet, 10Gbit Ethernet, 10Gbit FCoE, 8/4/2 Gbit Fibre Channel or 4/2/1 Gbit Fibre Channel, depending on the SFP module. So if you're using Fibre Channel now but migrating away from it, or are pure iSCSI now but might want Fibre Channel also in the future, you're covered. Some of them even have some internal "virtual connections" that allow the bandwidth to be divided up into multiple Ethernet and/or FC ports.
10Gig is the way to go in blades, and FCoE if you can swing it. Choosing has the downside risk that you might choose wrong. The nice thing about choosing FCoE adapters in your blades is that you can change your mind later.
So we're left with "What about rack servers?" Not everybody needs enough servers to justify a pair of blade chassis. Rack servers now typically come with four 1Gbps Ethernet NICs. If you need more than that - and you almost certainly do - or you need FC, you're going to need a NIC or HBA. If money's so tight that you can't think about strategy, you're going to buy the quadport 1Gbit NIC or the bare minimum FC card you can get today, and this post wasn't for you in the first place.
Here the decision point comes down to how many servers you have and whether you already have the switch. If you don't have the switch and you need to provision links for enough servers, then an FCoE switch like the Cisco Nexus 5010 at 20 ports and $11K is more likely to give good return on investment even in the short term. For both 10GbE and 10Gbit FCoE you can use relatively inexpensive copper-based cables with integrated SFPs to the top of rack and keep costs down. Those Fibre SFP+ modules are pretty spendy in the pairs you need.
Regardless, keeping your options open in the future should add some weight to the FCoE side even if it's not the most economical solution today - though that's probably harder to sell to the executive team. The break-even is at about four servers today.
If you can't put that over with the E-team you're back up there with the quadport NICs and the cheapie FC HBAs and reading this must be sheer pain. I'm sorry. Rack servers benefit as much from low latency connections to each other as blade servers do - it results in a more responsive experience to the end users, who are the point of the exercise.
FCoE is new, and for now it's a one-hop deal. Your FCoE adapter can go to an FCoE switch, but that switch has to break out the connections and diverge the paths into Ethernet and Fibre Channel. It can't yet send the traffic on to another switch still in FCoE form. The standard that allows for the second hop, routing and such things won't be ready for a year or two.
Fair notice: I don't own stock in any company mentioned. I do work for a company that sells solutions in this space including some but not all of the products mentioned, but my opinion is my own and my employer is neither responsible for it nor influenced it. I didn't get paid, nor do I stand to profit, from saying these things.
Mike1 is a Register commentator who prefers to remain anonymous so he can write freely. This piece originally appeared as a comment to the Ethernet storage article we ran in August. It is so good we figured it is well worth giving it wider exposure here.