Simplest Ethernet storage validated
ESG sees shocking simplicity and incredible cost-efficiency
Coraid's simpler-than-iSCSI Ethernet storage protocol has been validated by ESG, which found it could install and use the storage in less than two minutes and get better-than-Fibre Channel performance at a fifth of the cost.
Coraid's EtherDrive SAN storage uses the lightweight ATA-over-Ethernet (AoE) protocol to link servers and storage arrays using standard Ethernet switches. This protocol ensures lossless delivery of data packets without involving upper level network stack processes such as the TCP/IP ones used by iSCSI.
The ESG report (pdf) describes hands-on testing of the product in a virtualised server environment. It says "AoE is a simpler and more direct protocol than either iSCSI or Fibre Channel. AoE is not built on IP, TCP, or SCSI; packets are addressed to devices using their Ethernet MAC addresses and sent across the network with a minimum of overhead."
"Both Fibre Channel and iSCSI run SCSI over high level networking protocols on top of a physical network infrastructure, consuming additional overhead and processing compared to AoE, which connects servers and storage directly across the physical Ethernet layer."
AoE packets are non-routable and confined to an Ethernet LAN.
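To make the "minimum of overhead" concrete, here is a minimal Python sketch of the fixed AoE header as it is laid out in the published AoE specification: an Ethernet header with EtherType 0x88A2, followed by version/flags, error, shelf (major), slot (minor), command and tag fields. The function names and the example MAC addresses are illustrative assumptions and are not part of any Coraid tool. The sketch only builds and parses bytes; it sends nothing on the wire.

```python
import struct

AOE_ETHERTYPE = 0x88A2                      # EtherType registered for AoE
AOE_HEADER = struct.Struct("!6s6sHBBHBBI")  # network byte order, no padding
CMD_ATA = 0                                 # issue an ATA command
CMD_QUERY_CONFIG = 1                        # query config information (discovery)


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_aoe_header(dst_mac, src_mac, shelf, slot, command, tag,
                     version=1, flags=0):
    """Pack the Ethernet header plus the common AoE header (hypothetical helper)."""
    ver_flags = ((version & 0x0F) << 4) | (flags & 0x0F)
    error = 0  # only meaningful in responses
    return AOE_HEADER.pack(mac_to_bytes(dst_mac), mac_to_bytes(src_mac),
                           AOE_ETHERTYPE, ver_flags, error,
                           shelf, slot, command, tag)


def parse_aoe_header(frame):
    """Unpack the fields written by build_aoe_header (hypothetical helper)."""
    (dst, src, ethertype, ver_flags, error,
     shelf, slot, command, tag) = AOE_HEADER.unpack(frame[:AOE_HEADER.size])
    if ethertype != AOE_ETHERTYPE:
        raise ValueError("not an AoE frame")
    return {
        "dst": dst.hex(":"), "src": src.hex(":"),
        "version": ver_flags >> 4, "flags": ver_flags & 0x0F,
        "error": error, "shelf": shelf, "slot": slot,
        "command": command, "tag": tag,
    }


# Discovery-style frame: Ethernet broadcast, all shelves (0xFFFF), all slots (0xFF).
header = build_aoe_header("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01",
                          shelf=0xFFFF, slot=0xFF,
                          command=CMD_QUERY_CONFIG, tag=1)
fields = parse_aoe_header(header)
```

The whole header is 24 bytes and carries no IP or TCP fields, which is why a target is addressed by MAC address and shelf/slot only, and why the frames cannot cross a router.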
ESG tested the EtherDrive SAN shelves connected to quad core x86 servers running ESX with Linux and Windows guest virtual machines. It created logical units, presented them to the servers for use and configured filesystems in under two minutes and said it found this "shockingly simple".
The ESG tester found that a 12 SAS-drive EtherDrive SAN could support 4,500 Exchange 2007 users, with a 23-drive one supporting just over 9,000 users. The report says: "streaming media performance was excellent, delivering 826 MB/sec from just 6 [64GB] SSD drives and more than 1,200 MB/sec from 24 [1TB] SATA drives."
ESG also looked at acquisition costs for 1PB of networked storage, including the storage network infrastructure, with the EtherDrive SAN costing about $1.25m, an iSCSI SAN around $1.5m and a Fibre Channel SAN heading towards $6.9m.
The report says the "Coraid EtherDrive SAN has the lowest cost of acquisition, by a wide margin. The relative cost of acquisition of alternative technologies ranges from roughly 1.4x for DAS [directly-attached storage] to more than 5x for FC SAN. The FC SAN solution is so much more expensive in part due to the cost of acquiring FC SAN connectivity."
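The multiples can be checked against the rounded 1PB figures quoted above. The short Python sketch below takes those three dollar amounts as given (they are the article's approximations, not exact report data) and expresses each as a multiple of the EtherDrive figure; the function name is illustrative.

```python
# Approximate 1PB acquisition costs quoted above, in millions of US dollars.
costs_musd = {
    "EtherDrive SAN (AoE)": 1.25,
    "iSCSI SAN": 1.5,
    "Fibre Channel SAN": 6.9,
}


def relative_costs(costs, baseline):
    """Return each cost as a multiple of the baseline entry."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items()}


multiples = relative_costs(costs_musd, "EtherDrive SAN (AoE)")
for name, multiple in multiples.items():
    print(f"{name}: {multiple:.2f}x")
```

On these rounded numbers the Fibre Channel SAN comes out at roughly 5.5 times the EtherDrive cost, which is consistent with both the "more than 5x" in the report and the "fifth of the cost" summary.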
ESG confirmed that each EtherDrive SRX3500 can deliver hundreds of MB/sec of throughput for bandwidth-intensive scale-out applications using cost-optimized, high capacity SAS, SATA, and SSD drives, and that users can scale up to a petabyte of high performance capacity in only two racks at a cost of storage and connectivity far below Fibre Channel, iSCSI, or even DAS.
Coraid and VMware
In testing with virtualised servers and Coraid's EMX Mirror appliance ESG "was able to provision storage for virtual machines without the need for a storage administrator to complete the task. Likewise, the entire virtual storage infrastructure and the mappings to Coraid storage devices were visible through the vSphere client."
It found that it could take LUNs offline while virtual machines were running without crashing a server. You can't do this with most, if not all, Fibre Channel and iSCSI SANs. ESG's report states: "The Coraid EMX Mirror appliance was able to synchronously mirror a live volume and provide seamless failover with no interruption in service. The ability to move disks between chassis live and online, while under load, was an eye opener, the support implications of simply relocating disks to a hot spare chassis are profound."
ESG noted there was only a command line interface and that Coraid was aiming to introduce a GUI plus REST support this quarter. The EtherDrive does not have thin provisioning or tiered storage but ESG points out that, firstly, Coraid's storage is much cheaper than arrays with these facilities, reducing the need for them, and, secondly, they are becoming available in hypervisors and filesystems anyway.
More than a modicum of praise
ESG says that the EtherDrive SAN has rock solid reliability, impressive cost-efficiency, consistent throughput levels through hardware faults, and shocking simplicity, making management of petabyte-level storage capacities "a reasonable task". It recommends users consider Coraid EtherDrive SAN storage as the foundation for their virtualised data centres. Praise indeed. ®
Way too many fails
AoE - ATA over Ethernet. Easy, ATA doesn't offer much of a sophisticated command set, it's all master-slave stuff so the transport can be simple. I see resource locking is pretty coarse, hmmm, not good for performance. In essence a point to point topology that's exploiting that most all Ethernets are now layer 2 switched. Not much likelihood of collision there so traffic can assume the network to be deterministic. AoE can't be regarded as a storage network but it's good enough for simple configs (otherwise there'd be no CoRAID customers).
iSCSI - protocol heavy & exploited by LeftHand and Equallogic to build storage networks out of server bricks. So the thinking goes, Joe Admin's comfortable with NAS, let's do something similar but hook up via iSCSI. iSCSI is OK, its weight mitigated by network speeds, but the "brick" concept.... I (personally) don't believe the brick concept, in all failure modes, can assure a write all the way through its appliance software, the underlying OS stack and the off-the-shelf RAID controller to the disks (it's only the last bit that has cache). Of course, Joe's use case might not be sensitive to a little bit of data loss.
FCoE - overcoming the reasons "ethernet" networks are not good for complex storage networks.
Simple, really, it's about choice & how easily that's ill informed. However, with the run for consolidation on virtualised infrastructure, Joe Admin needs to be confident of how he's managing his risks.
Plan9 finally en route to victory?
CoRaid is the one cool company that largely develops on Plan9
All the best to them, and in a way using AoE on a small lan might still be less fail than any enterprises falling for the FCoE hoax, pulling out their working setup to find convergence is when they need to deploy a second LAN on expensive high-specced 10GE switches and still never get full ISL speeds :)
Ok, it's a total puff piece. AoE is still nifty.
Really, AoE is worth a look. The specification is impressively brief.
If I were to pick on something awkward, I would wonder what happens when the ethernet switch becomes saturated. Everyone knows ethernet is fast and cheap under ideal conditions; when it's not too busy. When it's stressed, well...
Some light reading was coughed up by the development of the LHC:
Pity they didn't (couldn't?) name the switches.
A paper on FC behaviour when similarly stressed? Anyone?