AMD engineering another Opteron-like leap
A future of APUs and dense-packed servers
VMworld 2012 It is not as much fun to be in the server part of Advanced Micro Devices these days, with Intel surging in the server racket and expanding out to switching and storage with its Xeon processors and Intel more or less countering the substantial innovations that AMD's engineers crafted for the Opterons a decade ago. The good news if you like a good fight is that there is a whole new management and engineering team at AMD now, and they not only understand that AMD has to do some serious innovating, but they are itching for the fight.
Suresh Gopalakrishnan, who was hired to be vice president and general manager of AMD's server business unit back in June, is using the VMworld 2012 virtualization extravaganza in San Francisco to throw down the gauntlet to Chipzilla and to begin the long process of articulating the Opteron processors' current advantages over rival Xeon and future Atom processors for servers, but also what AMD plans to do in the future to address the very different needs that future systems will require.
The first step in fixing a problem is to admit there is one, and Gopalakrishnan is not shy about that. Mainly because he wasn't at AMD when the company took its eye off the server ball ahead of the Great Recession, thereby ceding hard-won ground back to Chipzilla in the server racket. Gopalakrishnan was vice president of engineering at networking upstart Extreme Networks before coming to AMD, and way back in the dawn of time he was the engineer in charge of Hewlett-Packard's PA-RISC workstation chipsets and then an engineering manager for Sun Microsystems' breakthrough "Blackbird" UltraSparc-II processors. He has both networking and server chip chops, and this matters greatly in the modern systems arena.
Gopalakrishnan sees AMD's situation the same way that many of us outsiders do. The company did a slew of innovation with the initial "SledgeHammer" Opteron processors, shooting the gap between Intel's 32-bit Xeons and 64-bit Itaniums and bringing innovations like the HyperTransport point-to-point interconnect and multicore processing out as well. Because of these innovations and compatibility with the x86 instruction set, the Opterons took off in the high performance computing and fat SMP server spaces and ate market share from Intel. But between 2007 and 2011, AMD put less emphasis on server chip engineering and hit some hiccups delivering Opterons, just as Intel woke up and responded to the Opteron threat with the "Nehalem" designs, which essentially took cores for laptop processors and tricked them out for servers while adding Intel's own QuickPath Interconnect point-to-point processor lasher.
AMD admits it has not focused enough on servers in recent years
Many have argued, El Reg included, that Intel has brought bigger guns than AMD to the server chip war in 2012.
"AMD's reduced emphasis on server innovation over the past several years led to a decline in the performance advantage," says Gopalakrishnan, conceding that when this happens, you end up competing on price. "Our previous approach was broad-brushed, but now I want to focus our go-to-market strategy. And we need to execute both on the products and that go-to-market strategy."
Fun with virty benchmarks
So to start with at VMworld 2012, what Gopalakrishnan and his team are focusing on is a simple and precise message that demonstrates the performance and bang-for-the-buck advantages that the current Opteron 6200 processors have over the current Xeon E5-2600s from Intel.
To start with, AMD has gathered up some VMmark performance stats on pairs of Opteron and Xeon servers, respectively from Hewlett-Packard and Dell (pretty much the two main tier one Opteron sellers at this point), and ginned up pricing on the benchmarked machines used in the VMmark test while at the same time carving them up with either 30 or 40 virtual machines. The comparison that AMD is using pits two-socket servers using eight-core 2.9GHz Xeon E5-2690 processors with 256GB of main memory in each node against two-socket servers with 16-core 2.7GHz Opteron 6284SE processors but with only 128GB of memory in each node. Remember that the Opterons don't have simultaneous multithreading (what Intel calls HyperThreading) while the Xeons do, so both sets of machines presented 64 threads to the VMmark test.
Opterons versus Xeons on the VMmark virty benchmark
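The 64-thread figure is easy to check on the back of an envelope. Here is a minimal Python sketch of that arithmetic, assuming each "set" is a pair of two-socket servers as described above; the function name and its parameters are made up for illustration and are not part of any benchmark tooling.

```python
def hardware_threads(servers, sockets, cores_per_socket, threads_per_core):
    """Total hardware threads a set of identical servers presents to the hypervisor."""
    return servers * sockets * cores_per_socket * threads_per_core

# Assumption: two servers per set, two sockets per server.
# Xeon E5-2690: eight cores per socket, two threads per core with HyperThreading.
xeon_threads = hardware_threads(servers=2, sockets=2, cores_per_socket=8, threads_per_core=2)

# Opteron 6284SE: sixteen cores per socket, one thread per core.
opteron_threads = hardware_threads(servers=2, sockets=2, cores_per_socket=16, threads_per_core=1)

print(xeon_threads, opteron_threads)
```

Both sets come out at the same thread count, which is why the comparison is thread-for-thread even though the core counts differ by a factor of two.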
On the HP machines shown above, the two Xeon machines had a VMmark of 11.05 versus an 8.31 score on the pair of Opteron servers, so it is a bit unclear how AMD can say these machines offer equivalent performance running the ESXi 4.1 hypervisor as the above chart implies. It would have been better if HP had tested the machines with the same memory configurations. The Dell machines also had half as much main memory on the boxes, and the PowerEdge R715 machine cited by AMD was not yet online as El Reg went to press, but presumably there is a similar gap in the VMmark ratings between the two machines. The VMmark performance gap--how many tiles you can load up running a mix of applications inside of virtual machines--is about 25 per cent between the Opteron and the Xeons.
AMD says that it can deliver around 30 per cent lower cost per VM and that can be as much as a savings of $130,000 per rack (including the cost of the base hardware and vSphere Enterprise Plus virtualization tools), but those VMs in the comparison have less performance and memory if you really look at the VMmark results carefully. AMD doesn't do the VMmark benchmarks or choose the configurations to compare--HP and Dell did--and this is the data AMD has to work with. But the number of VMs, the amount of main memory, and the VMmark performance should be the same to make a proper comparison between the two server types.
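For anyone who wants to check the 25 per cent figure, here is a small Python sketch using only the two HP scores quoted above. The variable names are illustrative, and the choice of baseline is an assumption: the gap looks different depending on which score you divide by.

```python
# VMmark scores for the HP server pairs, as quoted in the text.
xeon_score = 11.05
opteron_score = 8.31

difference = xeon_score - opteron_score

# Shortfall of the Opteron pair, measured against the Xeon score.
gap_vs_xeon = difference / xeon_score

# Lead of the Xeon pair, measured against the Opteron score.
lead_vs_opteron = difference / opteron_score

print(f"Opteron trails Xeon by {gap_vs_xeon:.1%}")
print(f"Xeon leads Opteron by {lead_vs_opteron:.1%}")
```

Measured against the Xeon score the gap is a shade under 25 per cent, which matches the figure above; measured against the Opteron score, the Xeon lead is closer to a third.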
The VMmark test may not be the best gauge of performance for real-world applications, which is why AMD ran a different set of tests using the DVD Store, an e-commerce application that has been turned into a benchmark to simulate the running of web servers and database servers. This is actually one of the components of the VMmark test suite, in fact.
To show that the Opteron can keep pace with the Xeon, AMD ran the DVD Store test on an HP ProLiant DL385 Gen8 server with two Opteron 6274s running at 2.2GHz and sporting sixteen cores per chip as well as on a ProLiant DL380p Gen8 server equipped with two eight-core Xeon E5-2665 processors running at 2.4GHz. Both machines have 96GB of main memory and a single 500GB 15K RPM disk; both are running Microsoft's Hyper-V hypervisor and Windows inside the guests that support the DVD Store benchmark's web and database servers. Their performance tracks as VMs running the DVD Store test are loaded, and the throughput from each VM falls as VMs are added to both boxes, nearly in lockstep:
Operations per VM for Opteron and Xeon servers
In another test run, which pitted that HP ProLiant DL380p with the same Xeon E5-2665 against a Dell PowerEdge R715 using two 2.2GHz Opteron 6274s, the Opteron machine equipped with eight VMs running the DVD Store benchmark shows about a 12.5 per cent lower cost per VM than the Xeon E5 box.
Cost per VM for Opteron and Xeon servers
AMD has some advantages, but it is a far cry from the gap the Opteron processors enjoyed over Xeons back in 2004 and 2005.
Doing this kind of analysis is what Gopalakrishnan is talking about when he says AMD will have a more targeted approach to marketing the Opterons going forward.
The plan is to focus Opterons on rapidly growing segments of the server biz, including virtual server stacks, public and private clouds (which are virty machines with automation and utility pricing), big data, and supercomputing.
Looking ahead to SledgeHammer, the sequel
If you want to see the future of AMD's server business – or at least a part of it – look no further than the accelerated processing unit, or APU, hybrids that the company has created for laptops and now desktops. With APUs, AMD is putting a low-power CPU and a modestly powerful GPU on a single die. For many workloads, you can make the GPU not just run a display, but do computational work alongside of the CPU. In many cases, as AMD tried to demonstrate with its FireStream GPU coprocessors, which have been discontinued, and as Nvidia and Intel continue to show with their respective Tesla GPU and Xeon Phi x86 coprocessors, the coprocessor is better designed to do parallel work at a much lower power draw than the CPU. So it makes sense to get ceepie-geepie server chips into the field for AMD.
"Servers, as I look forward, become dense, heterogeneous compute clusters," says Gopalakrishnan. "They will have many, many cores put inside of a chassis. And we will deliver heterogeneous compute clusters with different types of processing elements, depending on the workload."
So that means some chips will be a combination of an Opteron CPU and a GPU, while others will be more traditional CPUs, and still others will be server variants of the Fusion or FirePro APUs used inside of PCs and maybe tablets at some point. All of these parts will be branded Opteron, and they will be certified against the Windows and Linux software stacks commonly used in enterprises these days.
It also means making use of the "Freedom" 3D torus/mesh interconnect that AMD got its hands on when it acquired SeaMicro earlier this year, which the company will be enhancing with an Opteron processor later this year.
AMD is not providing any roadmaps for Opteron APUs just yet, but Gopalakrishnan tells El Reg that the first thing the company is doing is getting server software certified to run on selected members of its notebook APU stack right now and that the Opteron roadmap will be updated perhaps early next year once AMD further fleshes its plans out.
It is quite possible that a Fusion or FirePro APU tweaked for server workloads will be the first Opteron processor certified to run inside of the SeaMicro chassis, but it seems far more likely that AMD will plink the Opteron 3200 processor for single-socket machines onto the SeaMicro mobos, replacing Intel's Xeon E3s.
Don't get the wrong idea and think that AMD is getting out of the general-purpose server racket, because it most certainly is not.
"Just because you have dense servers does not mean that 1P, 2P, and 4P die off tomorrow," says Gopalakrishnan. "Once we make a part, it is up to our partners to put them into form factors."
Later this year, AMD will begin the rollout of its "Piledriver" cores for the Opteron family of server chips, and as El Reg has previously reported, the future "Abu Dhabi" Opteron 6300 parts are only expected to offer a 200MHz performance bump over the current Opteron 6200s. The Piledriver cores will have all kinds of architectural tweaks to squeeze more performance out of the same thermal envelopes, of course, so there is no telling yet what the overall performance boost will be.
What Gopalakrishnan could confirm is that the future Abu Dhabi Opteron 6300s for two-socket and four-socket machines, the "Seoul" Opteron 4300s for two-socket and single-socket machines, and the "Delhi" Opteron 3300s for single-socket boxes would have a staggered launch as in Opteron days gone by, with launches in 2012 and 2013 rather than all at once as the Opteron 3200s, 4200s, and 6200s came out last November. ®