How to build a BONKERS 7.5TB, 10GbE test lab for under £60,000
Sysadmin Trevor runs the sums, builds a dream rig
Part Two In part one of The Register's Build A Bonkers Test Lab feature, I showed you how to build a test lab on the cheap; great for a home or SMB setup, but what if we need to test 10GbE? Part two is a wander through my current test lab to see how I've managed to pull together enough testing capability to give enterprise-class equipment a workout on the cheap.
A proper enterprise test lab would be based on the same equipment as is used in production. This allows sysadmins to test esoteric configuration issues that can – and do – arise. Virtualisation has made this less of a pressing concern for fully (or mostly) virtualised shops, but it is still general good practice. That said, we don't all get that luxury; budgets can stand in the way, or even complications arising from corporate mergers. Let's take a look at what I've pulled together.
As with most real world test labs, mine contains equipment that spans various generations. Several of the deployed virtualisation servers populating my client sites are white box systems known internally as "Persephone 3" class units. These servers sport ASUS KFSN5-D motherboards with 2x AMD Opteron 2678 processors each.
Inside a Persephone 3 file server
One of these Persephone 3 servers – sadly, with only 16GB of RAM – serves dual duty as both file server and edge Hyper-V server. It contains an ASUS PIKE card allowing for up to 4 RAID 1 arrays. The file server also has a Supermicro AOC-SAS2LP-MV8 8-port SATA HBA connected to 8 Kingston Hyper-X 3K 240GB SSDs.
The operating system on the file server is Windows Server 2008 R2, though only because I do not have licensing that would allow me to use the vastly superior Server 2012 storage technology. Networking is a pair of Intel-provided NICs: one an Intel 1GbE ET Dual Port, the other an Intel 10GbE X520-2 Dual Port. This gives me the ability to test file and block storage on 1GbE or 10GbE links, configured either as teamed or unteamed so that I can also test the failover and MPIO capabilities of hypervisors, VMs and applications. The onboard NICs serve as management interfaces.
The combination of disk array types gives me some options as well. A Windows RAID 0 of the SSDs will provide enough I/O to saturate a 10 gigabit NIC full duplex. A Windows RAID 5 of the same SSDs can get to about 900MB/sec in one direction or about 750MB/sec full duplex and appears to be bounded entirely by the CPU.
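To put those throughput figures in context, here is a rough sketch of the arithmetic. It assumes decimal units (1 Gbit = 1,000 Mbit, 8 bits per byte), ignores Ethernet, TCP and iSCSI or SMB overhead (so the real ceiling is somewhat lower), and treats the 750MB/sec full-duplex figure as a per-direction number. The function name is illustrative only.

```python
# Rough 10GbE line-rate arithmetic. Protocol overhead is ignored (assumption).

def line_rate_mb_per_sec(gbit_per_sec: float) -> float:
    """Convert a link speed in Gbit/s to MB/s per direction (decimal units)."""
    return gbit_per_sec * 1000 / 8


ten_gbe = line_rate_mb_per_sec(10)  # theoretical ceiling per direction
raid5_one_way = 900                 # MB/s, figure quoted in the article
raid5_duplex = 750                  # MB/s, assumed to be per direction

print(f"10GbE line rate: {ten_gbe:.0f} MB/s per direction")
print(f"RAID 5 one-way:  {raid5_one_way / ten_gbe:.0%} of line rate")
print(f"RAID 5 duplex:   {raid5_duplex / ten_gbe:.0%} of line rate")
```

On these assumptions the software RAID 5 array gets most of the way to the link's ceiling in one direction, which fits the observation that the CPU, not the network, is the limit.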
The PIKE card's RAID 1 is good for mounting lots of VMs on big spinning rust arrays that don't need heavy IO. The file server only runs minor VMs locally, but serves up iSCSI LUNs for the rest of the test lab via Microsoft's free iSCSI Target.
My server is housed in a Chenbro SR-107; this gives me eight 3.5" hotswaps for the RAID 1 arrays. I am planning to install the SSDs in two Icy Dock 4-in-1s. Unfortunately, due to a lack of stock at the local computer store, at time of press I was still reliant on my ghetto "cardboard and duct tape" 8-in-2 mounting solution. The SR-107 allows for an extended-ATX motherboard; this is important for my homebrew file server, as without a motherboard that large I would run out of PCI-E slots rather quickly.
Because this is a recycled server, exact costs for my build aren't possible. Assuming your 3.5" hotswaps only contain a pair of 1TB disks for the OS, a new version of this storage node based on a Supermicro X9DR7-TF+, two Intel Xeon E5-2603 processors and 16GB of RAM will cost roughly $2,000 without the SSD array.
The SSD array cost me $1,400 and the network cards can be had for about $550 if you hunt. Windows Server is $750. This puts the total cost of a fast+slow 10GbE homebrew storage server capable of running a fairly large and diverse test lab at roughly $4,700. I can fill up the rest of the hotswap trays with 2TB drives and still keep costs under $5,500.
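The storage node sums can be checked with a few lines of Python. This is only a sketch using the approximate prices quoted above; the item labels are my own shorthand.

```python
# Storage node bill of materials, using the article's approximate prices (USD).
storage_node = {
    "base server (X9DR7-TF+, 2x Xeon, 16GB RAM, 2x 1TB)": 2000,
    "8x 240GB SSD array": 1400,
    "1GbE + 10GbE network cards": 550,
    "Windows Server licence": 750,
}

storage_total = sum(storage_node.values())
for item, price in storage_node.items():
    print(f"{item:<52} ${price:>6,}")
print(f"{'total':<52} ${storage_total:>6,}")

# Headroom left under the $5,500 ceiling for extra 2TB drives.
drive_budget = 5500 - storage_total
print(f"left for 2TB drives: ${drive_budget:,}")
```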
The net result of this configuration is about 6TB usable of slow storage and about 1.5TB usable of fast storage. I also get two 1GbE management/routing NICs, as well as two 1GbE and two 10GbE test lab-facing NICs, all on a system perfectly capable of hosting a few VMs in its own right.
One of my virtual hosts is a recycled Persephone 3 server with 64GB of RAM and a pair of SATA hard drives in RAID 1. If your production fleet is older, such legacy systems can be useful to test VMs on the same speed system as you have commonly deployed. Unless you have a spare handy, however, it may not be worth your time. The cost of the RAM is brutal – hence why my file server limps along with 16GB – and the $/flop/watt of operation makes continuing to keep the old girl online increasingly questionable. Even for test labs, OpEx is a real consideration.
Switches and virtualisation nodes
I also have a pair of the "Eris 3" hosts discussed in part 1 of this series. While I certainly wouldn't call homebrew vPro-based systems "enterprise class" by any standard, the Intel Core i5-3470 processors and 32GB of RAM are more than fast enough for all but the most demanding VMs. They've proven to be solid testbed nodes even for intense database work, and make great systems to dedicate to a specific high-IOPS VM; typically a database server.
These Eris 3 nodes are basic units costing about $1,000 apiece and sport a 2-port Intel 1GbE NIC. There remains an open PCI-E slot to add 10GbE NICs as required. While wonderfully cheap, their downside is the space they take up: when my test lab is supposed to fit under my desk, they're great. When I need to do enough compute that I require a rack's worth of power and cooling, these are a terrible plan. Another major downside is that they cannot use ECC registered RAM, preventing me from testing the RAM configurations in real world use in my server farms.
Enter the Supermicro Fat Twin: eight compute nodes in a 4U box. If you only need to front a handful of enterprise-class VMs, you can probably get by on some Eris 3-like nodes. If your required test environment has dozens – or even hundreds – of different VMs the Supermicro's widgetry makes far more sense. The Fat Twin has far better compute density, is rack mountable and is notably more power efficient.
My, my, what a beautiful big Supermicro rack you have
Configuring each node with two Intel Xeon E5-2609 (quad-core 2.4GHz) processors and 16 8GB DIMMs (128GB total) costs roughly $21,000, or about $2,625 per node. This configuration would give the Fat Twin about one and a half times the CPU power and four times the RAM for a little over two and a half times the price per node of my $1,000 Eris 3 white box specials.
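As a sanity check on the Fat Twin figures, the sketch below works out the per-node price and the aggregate cores and RAM for a full chassis. It assumes the roughly $21,000 figure covers all eight nodes as configured; the constant names are mine.

```python
# Fat Twin chassis arithmetic from the quoted configuration.
NODES = 8
CHASSIS_PRICE = 21_000   # USD, approximate figure from the article
CPUS_PER_NODE = 2
CORES_PER_CPU = 4        # the Xeon E5-2609 is quad-core
DIMMS_PER_NODE = 16
GB_PER_DIMM = 8

price_per_node = CHASSIS_PRICE / NODES
ram_per_node_gb = DIMMS_PER_NODE * GB_PER_DIMM
total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
total_ram_gb = NODES * ram_per_node_gb

print(f"price per node: ${price_per_node:,.0f}")
print(f"RAM per node:   {ram_per_node_gb}GB")
print(f"chassis total:  {total_cores} cores, {total_ram_gb}GB RAM")
```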
I will get the opportunity to provide more detail on the Fat Twin option soon. Supermicro is shipping me a Fat Twin with a range of different node configurations; expect a thorough review, including comparisons to both the Persephone 3 and Eris 3 nodes for a variety of computing workloads.
Now, the real fun and games starts: Networking
Networking has proven to be the most troublesome aspect of my test lab assembly. Most of my networking nerd friends are thoroughly indoctrinated Cisco believers. If you corner them, they will admit that something called "Juniper" can probably do what needs to be done, but you need the better part of a bottle of scotch to get them to admit that Arista even exists. They aren't much help when price is a consideration.
If your corporate network is run by these types of folks you may end up deploying whatever the company standard switch is into your test lab. Just as likely you wouldn't be allowed to touch the thing if you did buy a company-standard switch as that would fall under the oversight of the networking team. While I can't help you resolve internal corporate politics, I have done some legwork into finding solid 10GbE switches for the rest of us.
First on my list is the D-Link DGS-3420-28TC; with 24 10/100/1000Mbps ports and four SFP+ 10GbE ports it is the entry level to the 10GbE world. With a little bit of hunting, you can find this switch for roughly $1,500, though more commonly about $2,000. It is a solid switch if your 10GbE port needs are four or fewer; it is also a great switch to bridge the 1GbE and 10GbE segments of your network. At the $1,500 price, $375 per 10GbE switch port is almost reasonable, and a solid consideration if you are only putting a few 10GbE compute nodes into play.
More switches and virtualisation nodes
If you need versatility, try the 24 SFP+ 10GbE Dell PowerConnect 8132F. With the lower support option, the Dell switch costs $8,069; $336 per port, and very reasonable. If your organisation requires full enterprise support, even for test lab setups, then the cost jumps to $12,880. At this price the Dell is an eye-watering $536 per 10GbE switch port. For the money you get a switch that supports more enterprise features than the D-Link and has an expansion module. You can fill the expansion module with either four additional 10GBase-T or SFP+ ports, two 40Gbit QSFP+ ports or an additional eight SFP+ ports via the QSFP+ module and breakout cables. The expansion modules cost extra.
The workhorse switch for me is the 24 SFP+ 10GbE Supermicro SSE-X24S. It supports the standards I want in a test lab – or indeed in a production – switch and can be found for as little as $7,500. This puts 24 ports of 10GbE at your disposal for $312.50 per 10GbE switch port. It has no uplink ports, expansion modules or easy connectivity to traditional 1GbE networks; it is best for those looking to keep their entire network segment 10GbE.
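The per-port comparison across the three switches can be reproduced with a short sketch. Prices are the approximate figures quoted above (the D-Link at its $1,500 best case), only 10GbE ports are counted, and expansion modules are ignored.

```python
# Cost per 10GbE port for the switches discussed (article prices, USD).
switches = [
    # (name, price, number of 10GbE ports)
    ("D-Link DGS-3420-28TC", 1500, 4),
    ("Dell PowerConnect 8132F (lower support)", 8069, 24),
    ("Dell PowerConnect 8132F (full support)", 12880, 24),
    ("Supermicro SSE-X24S", 7500, 24),
]

per_port = {name: price / ports for name, price, ports in switches}

# List the cheapest port first.
for name, cost in sorted(per_port.items(), key=lambda kv: kv[1]):
    print(f"{name:<42} ${cost:,.2f} per port")
```

Per-port price alone flatters the D-Link if you need more than four 10GbE ports, which is why the port count is kept alongside the price.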
These are, of course, prices for those who don't have any sort of volume deals with the vendors in question. You can drive down the cost of the network a small amount with a little effort.
If you are not using the Fat Twin for your compute nodes, then a dual-port Intel X520 10GbE card will run you roughly $425. Fat Twin users can snag a Supermicro AOC-CTG-I2S for $430. It is Micro-LP, so depending on the Fat Twin it will leave your other LP PCI-E slot open for additional goodies; an important consideration for a test lab node.
10GbE nose to tail
Our storage node is $5,500 and delivers a mix of slow and fast storage. We can get 8 Fat Twin compute nodes for $21,000, but outfitting them with 10GbE cards and buying the requisite SFP+ direct attach cables will drive the total cost to $25,000. The switch is an additional $7,500. The total bill for our enterprise test lab – without an operating system on our compute nodes, mind you – is $38,000.
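The hardware bill breaks down as follows. This is a sketch using the rounded figures in the text; the split between NICs and cables is my own inference from the $430 card price quoted earlier, not a number the vendors supplied.

```python
# Hardware-only bill for the lab, using the article's rounded prices (USD).
hardware = {
    "storage node": 5_500,
    "Fat Twin with 10GbE cards and SFP+ cables": 25_000,
    "Supermicro SSE-X24S switch": 7_500,
}
hardware_total = sum(hardware.values())

# Implied networking extras on top of the bare $21,000 Fat Twin.
networking_extras = 25_000 - 21_000
nic_cost = 8 * 430                       # one AOC-CTG-I2S per node
cable_budget = networking_extras - nic_cost  # inferred, not quoted

print(f"hardware total:    ${hardware_total:,}")
print(f"10GbE NICs:        ${nic_cost:,}")
print(f"left for cabling:  ${cable_budget:,}")
```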
Of a 24-port switch, this configuration would have 18 ports occupied. That's just enough room in my test lab to hook up my Persephone 3 and two Eris 3 systems. Seeing as how the storage server has to be up 24/7, it is here that I house long-term test lab VMs. The domain controller, DNS/DHCP server, various layers of CentOS router/firewall VMs and an administrative VM used to RDP into the lab all live here.
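The port budget is easy to verify. The sketch assumes every host connects both ports of a dual-port 10GbE card to the switch, which is how I read the figures above; the dictionary names are illustrative.

```python
# 10GbE switch port budget for the lab.
SWITCH_PORTS = 24
PORTS_PER_HOST = 2  # dual-port 10GbE card in every host (assumption)

core_hosts = {"Fat Twin nodes": 8, "storage node": 1}
extra_hosts = {"Persephone 3": 1, "Eris 3": 2}

core_ports = sum(core_hosts.values()) * PORTS_PER_HOST
extra_ports = sum(extra_hosts.values()) * PORTS_PER_HOST
free_ports = SWITCH_PORTS - core_ports - extra_ports

print(f"core lab ports:   {core_ports}")
print(f"legacy host ports: {extra_ports}")
print(f"ports still free: {free_ports}")
```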
This works well for me; the test lab is 10GbE throughout and lives on its own subnet. A CentOS VM running on the file server acts as a router bridging the rest of the network and the test lab. Because the file server sports both 1GbE and 10GbE NICs, I can RDP into the test lab network through one of the file server's 1GbE ports, saving the cost of a 10GbE port on my production network to bridge the two networks. More importantly, it allows me to put all sorts of firewalls and intrusion detection systems between the two to keep any of the creepy crawlies that may leak into the test lab from breaking out into the production network.
There are a lot of tweaks you can make to this configuration. Rather than relying on software RAID in my storage node, an additional $800 would buy an LSI MegaRAID SAS9280-16i4e. Not only does it do all the commonly requested RAID levels, it could use the eight Hyper-X drives as a block-level cache array to front end a whole lot of spinning rust. Your file server instantly becomes bulk automated tiered storage. Once I get my hands on one, I'll review it.
Unless you rely a great deal on open source, the hardware may well be the least of your concerns. You aren't going to cover eight nodes of compute with a TechNet licence. Eight Server Standard licences will run you at least $6,000, with eight Server Datacenter licences costing just over $30,000. If you choose Microsoft's hypervisor and management tools to power your test lab, the System Center Datacenter management tools – and let's face it, you'll be running enough VMs on those kinds of systems to need Datacenter – will run almost another $30,000.
Going the VMware route (using acceleration kits to provide the first 6 licences and the vSphere server, then adding 10 CPU licences and the mandatory minimum support) will range from about $23,700 for Standard to just over $70,000 for Enterprise Plus. That's before you add any Windows licences you may or may not need.
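Adding the licensing figures to the $38,000 hardware bill gives a feel for the all-in cost. The sketch below treats "just over" prices as their round-number floors, so the results are lower bounds; the function and constant names are hypothetical.

```python
# All-in lab cost: hardware plus licences (approximate article figures, USD).
HARDWARE = 38_000
VMWARE = {"Standard": 23_700, "Enterprise Plus": 70_000}  # floors of quoted prices
WINDOWS_DATACENTER_X8 = 30_000  # eight Datacenter licences, "just over" this


def lab_total(vmware_edition: str, windows_datacenter: bool = True) -> int:
    """Return the lower-bound total for the chosen VMware edition."""
    total = HARDWARE + VMWARE[vmware_edition]
    if windows_datacenter:
        total += WINDOWS_DATACENTER_X8
    return total


for edition in VMWARE:
    print(f"VMware {edition} + Windows Datacenter: ${lab_total(edition):,}")
print(f"VMware Standard only: ${lab_total('Standard', windows_datacenter=False):,}")
```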
Bringing it all together, that's 64 Sandy Bridge generation Xeon cores at 2.4GHz with 1TB of RAM, 6TB of slow storage and 1.5TB of fast storage, for $38,000. Add VMware Standard and unlimited Windows Server instances, all interconnected with dual 10GbE networking, for a little over $90,000. Not a bad test lab at all. ®