Original URL: https://www.theregister.com/2009/08/24/sun_3leaf_specjbb_tests/

Sun touts Sparc T Java bang

3Leaf Systems blows Sun out of the water

By Timothy Prickett Morgan

Posted in Channel, 24th August 2009 22:00 GMT

Sun Microsystems might be as quiet as a church mouse these days as it awaits the final OK on the looming $5.6bn Oracle acquisition, but the company is still benchmarking its latest Sparc T iron as if everything were normal - it's just not bragging about the results as per usual.

Sun has just tested the new 1.6GHz Sparc T2 and T2+ processors, which it quietly announced in late July. And when I say quietly, I mean Sun didn't tell anyone and didn't talk publicly about the chips. At all.

Which probably isn't helping server sales these days.

But what do I know? I don't have my own fleet of yachts and jets ... and we all know that Oracle CEO Larry Ellison has said the company is committed to the hardware business. Some days, however, it sounds more like Oracle plans to have the hardware business committed.

Anyway, Sun and its server partner, Fujitsu, loaded up the SPECjbb2005 Java server benchmark suite onto a four-socket T5440 server, the current top-end box in the Niagara family of multicore Sparc T servers. The SPECjbb2005 test is put together by the Standard Performance Evaluation Corporation and is essentially a Java implementation of the Transaction Processing Performance Council's TPC-C transaction processing test (minus some of the disk I/O requirements).

Using the 1.6GHz Sparc T2+ chips (that's 32 cores and 256 threads a-processing) and 256GB of main memory, the T5440 was able to crank through 841,380 business operations per second (BOPS for short). This machine was configured with the Solaris 10 operating system and Sun's own HotSpot 32-bit JVM.

By itself, Sun also tested a single-socket Sun Blade T6320 blade server using the 1.6GHz T2 (that's 8 cores and 64 threads), which was able to deliver 229,576 BOPS of performance on a blade with 64GB of main memory. The company was also touting how a single-socket Sparc T5220 server using the 1.6GHz chips "posted a single-chip world record," but these results are not available at the SPEC site. (Sun's PR people might have meant the blade server above?)

These are perfectly respectable performance figures for a four-socket or single-socket server, but nothing extraordinary. And while Sun always wants to aim its marketing gun at IBM's Power Systems machine running AIX, the most recent and most interesting machine to be put through the SPECjbb2005 paces is a shared memory cluster from little-known server maker 3Leaf Systems.

That system uses a clustering technology called Voyager that rides atop an InfiniBand interconnect, and which creates a cache-coherent NUMA cluster out of x64 server nodes. In this case, the Voyager server setup from 3Leaf included 16 three-socket Opteron server boards. Two of the sockets had 2.7GHz quad-core Shanghai Opteron 8384 processors, and the third socket had the Voyager ASIC for corralling all those processors into a shared memory system.

The 3Leaf box had 128 cores (and 128 threads since Advanced Micro Devices doesn't do simultaneous multithreading) and 488GB of main memory, and was set up with 3Leaf's own DVVM hypervisor, Red Hat Enterprise Linux 5.2, and Oracle's JRockit JVM. It chewed through a stunning 5.5 million BOPS, setting the all-time record for SPECjbb2005 performance on a single system.

The prior record was 5.18 million BOPS on a Silicon Graphics Altix 4700 Itanium-Linux shared memory system tested back in the fall of 2007. To my knowledge, however, the Altix 4700 has never been bought to run a commercial Java workload, making this feat basically irrelevant.

The largest x64 system prior to the 3Leaf test was an Express5800/A116 that was tested this January by its maker, NEC. This sixteen-socket SMP box used Intel's X7460 six-core Dunnington processors and packed a total of 96 cores running at 2.67GHz into a single system image with 256GB of main memory. Running RHEL 5.3 and JRockit, this box pushed 2.15 million BOPS.

Just because Sun and IBM like to pick on each other, let's throw in some IBM SPECjbb2005 test results for Big Blue's Power Systems iron.

A 32-core, 64-thread Power6+ Power 570 machine revving at 4.2GHz and with 128GB of main memory was tested last September (ahead of the October launch of this machine) and delivered 1.24 million BOPS running RHEL 5.2 and IBM's J9 variant of the JVM.

IBM hasn't tested a single-socket variant of any of the most recent Power6+ machines, as far as I know. IBM did test a single-socket, two-core Power6 box using 4.7GHz processors back in early 2007, which handled a mere 88,089 BOPS on the SPECjbb2005 test with 8GB of memory. Sun was very proud that its single-socket machine was able to best it.

But the comparisons that matter are based not on sockets, but on cores and threads and how much money it takes to bring them to bear.
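To put the results quoted so far on a per-core and per-thread footing, here is a rough Python sketch. The BOPS, core, and thread figures are the ones cited above (the 3Leaf, NEC, and Power 570 scores are rounded as quoted). The `Result` record and the `per_unit` and `summarize` helpers are names made up for illustration, and the NEC thread count is left blank because it is not stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    system: str
    bops: float
    cores: int
    threads: Optional[int] = None  # None where no thread count is quoted


# Figures as quoted in the article; labels are informal descriptions.
RESULTS = [
    Result("Sun four-socket Sparc T2+ (1.6GHz)", 841_380, 32, 256),
    Result("Sun Blade T6320 (1.6GHz T2)", 229_576, 8, 64),
    Result("3Leaf Voyager (Opteron 8384)", 5_500_000, 128, 128),
    Result("NEC Express5800/A116 (Xeon X7460)", 2_150_000, 96),
    Result("IBM Power 570 (4.2GHz Power6+)", 1_240_000, 32, 64),
]


def per_unit(bops: float, units: Optional[int]) -> Optional[float]:
    """BOPS divided by a core or thread count, or None if the count is unknown."""
    return bops / units if units else None


def summarize(results):
    """Return (system, BOPS per core, BOPS per thread) for each result."""
    return [
        (r.system, per_unit(r.bops, r.cores), per_unit(r.bops, r.threads))
        for r in results
    ]


if __name__ == "__main__":
    for system, per_core, per_thread in summarize(RESULTS):
        thread_text = f"{per_thread:,.0f}" if per_thread is not None else "n/a"
        print(f"{system:38s} {per_core:>8,.0f} per core  {thread_text:>8s} per thread")
```

On these numbers the 3Leaf cluster lands at roughly 43,000 BOPS per core against roughly 26,000 for the four-socket Sparc T2+ box, though none of this says anything about what each BOPS costs.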

Bang for buck: unknown

The SPECjbb2005 test doesn't include price metrics for configurations (but it damned sure ought to), so there is no easy way to do price/performance analysis without spending a lot of time. Neither Sun nor IBM is eager to talk about pricing for the Unix iron, by the way.

Considering the whole low-power, high-performance angle that Sun has been pushing for years with its Niagara series of servers using the Sparc T processors, it seems a bit odd that Sun has not opted to run the SPECpower_ssj2008 benchmark on the machines, which gauges not only performance, but performance per watt as a machine is loaded up with work.

In May, Sun did test its Netra X4250 servers based on the old Harpertown L5408 quad-core Xeons using the SPECpower_ssj2008 benchmark, which is an odd choice given that the much more powerful Nehalem EP Xeon 5500s were already out and available in other Sun boxes.

And you would think that Sun would definitely want to show off the new 1.6GHz Sparc T2 and T2+ processors on this power/performance metric, as well. But no one answers questions at Sun these days, so it remains a mystery why Sun does what it does and doesn't do what it doesn't do.

The useful part about Sun's recent SPECjbb2005 benchmark tests is that they show that slightly higher clock speeds, when coupled with a main-memory bump, can deliver a significant goose in Java performance within the Niagara server line.

Last fall, Sun tested a Sparc T5440 using the 1.4GHz T2+ chips with 128GB of memory, and it delivered 692,736 BOPS. The faster 1.6GHz machine with double the main memory was able to do 841,380 BOPS, as we pointed out above, which is 21.5 per cent more BOPS for 14 per cent more clocks. Not a bad trade, if you can afford the extra memory.

Similarly, a single-socket Sun Blade T6300 blade server tested in May 2007 with the 1.4GHz T1 processor and configured with 32GB of main memory was able to crank through 96,523 BOPS on the SPECjbb2005 test, but the current Sun Blade T6320 configuration using 1.6GHz chips (that's a jump of two generations, not one) handles 229,576 BOPS. That is a factor of 2.4X more oomph, which is nothing to be ashamed of in a two-year time frame.
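The two generational comparisons above are simple ratios, and a few lines of Python make them easy to recheck. This is only a sketch using the BOPS and clock figures quoted above; the `gain` helper and the variable names are illustrative, not anything from SPEC or Sun.

```python
def gain(new: float, old: float) -> float:
    """Fractional increase of new over old (0.25 means 25 per cent more)."""
    return new / old - 1.0


# Four-socket T2+ box: 1.4GHz with 128GB versus 1.6GHz with 256GB.
t2plus_bops_gain = gain(841_380, 692_736)
t2plus_clock_gain = gain(1.6, 1.4)

# Single-socket blades: T6300 (tested May 2007) versus the current T6320.
blade_factor = 229_576 / 96_523

if __name__ == "__main__":
    print(f"T2+ BOPS gain:   {t2plus_bops_gain:.1%}")
    print(f"T2+ clock gain:  {t2plus_clock_gain:.1%}")
    print(f"Blade speed-up:  {blade_factor:.2f}x")
```

Going by the quoted clock speeds, the step from 1.4GHz to 1.6GHz is a bit over 14 per cent, so the BOPS gain outruns the clock gain by roughly half again. The memory doubling is not modelled here, so the sketch cannot say how much of the gain it accounts for.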

Clearly Sun needs to keep making leaps like that, and this was apparently the plan with the Rainbow Falls Sparc T3 chips, which are expected to have 16 cores per chip and 16 threads per core as well as expanding to eight-socket systems. Should Sun keep the processing speeds more or less the same - oh, let's be optimistic and say that Sun can get the Sparc T3 chips up to 2GHz if they survive the budget axe at Oracle - then that is twice as many threads per core, twice as many cores per socket, twice as many sockets per system, and a 25 per cent clock speed goose.

If you assume the threads and cores get you maybe 50 per cent more oomph, and the SMP expandability from four to eight sockets gets you maybe another 50 per cent, then a top-end Sparc T3 server might be able to do close to three times as much work as the current four-socket T2+ box. Or, in the case of the SPECjbb2005 test, probably somewhere around 2.4 million BOPS. You would expect a 2,048-thread box to do well on multithreaded code like Java.
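The back-of-the-envelope projection above can be written out as a short Python sketch. Every multiplier here is a guess taken from the text (50 per cent from the extra threads and cores, 50 per cent from doubling the sockets, 25 per cent from a hoped-for 2GHz clock), not a measured or announced figure. The per-socket and per-core counts for the current box are inferred from the 32 cores and 256 threads quoted for the four-socket machine, and all names are purely illustrative.

```python
BASELINE_BOPS = 841_380          # current four-socket 1.6GHz T2+ result
THREADS_AND_CORES_UPLIFT = 1.5   # assumed gain from 2x threads and 2x cores
SOCKET_UPLIFT = 1.5              # assumed gain from going from 4 to 8 sockets
CLOCK_UPLIFT = 2.0 / 1.6         # optimistic 2GHz versus today's 1.6GHz


def project_bops(baseline: float, *uplifts: float) -> float:
    """Multiply a baseline score by a series of assumed uplift factors."""
    result = baseline
    for factor in uplifts:
        result *= factor
    return result


def total_threads(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Hardware threads in a system of the given shape."""
    return sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    combined = THREADS_AND_CORES_UPLIFT * SOCKET_UPLIFT * CLOCK_UPLIFT
    projected = project_bops(
        BASELINE_BOPS, THREADS_AND_CORES_UPLIFT, SOCKET_UPLIFT, CLOCK_UPLIFT
    )
    print(f"Combined uplift:   {combined:.2f}x")
    print(f"Projected BOPS:    {projected:,.0f}")
    print(f"Current threads:   {total_threads(4, 8, 8)}")
    print(f"Projected threads: {total_threads(8, 16, 16)}")
```

The three factors multiply out to a little over 2.8, which is where "close to three times" and the roughly 2.4 million BOPS figure come from. Change any one of the guesses and the projection moves with it.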

IBM's top-end Power 595, using 5GHz Power6 processors and deploying 64 cores and 128 threads, was tested in March 2008 on the SPECjbb2005 test and was able to deliver 3.44 million BOPS. And while IBM has some very big Power7 iron on the way, Oracle will apparently be taking its midrange box up into enterprise territory with the T3 systems, especially now that the Rock UltraSparc-RK chips and their servers seem to be dead in the water.

And the real competition might be something clever, like the box coming out of 3Leaf. ®