Original URL: http://www.theregister.co.uk/2013/03/26/oracle_sparc_t5_m5_server/

Oracle's new T5 Sparcs boost scalability in chip and chassis

Also aims brawny M4 – scratch that – M5 CPU at big-iron workloads

By Timothy Prickett Morgan

Posted in Servers, 26th March 2013 20:00 GMT

Oracle is launching its much-awaited Sparc T5 processors for entry and midrange servers, along with Sparc M5 processors to effectively replace the iron it currently resells from server and chip partner Fujitsu.

That Japanese supplier furnished Sun's and then Oracle's brawny-core Sparc Enterprise M midrange and big iron systems for the past six years – and the one thing that Oracle was not willing to talk about in a prebriefing with El Reg was the F-Word.

You might think that you slipped a few cogs here, because as far as we knew in the wake of the OpenWorld extravaganza hosted by Oracle last fall, and judging by the company's processor roadmaps, what Oracle had been working on was the 16-core Sparc T5 processor, which we have covered in great detail. There was also a beefier chip called the Sparc M4 that we told you about back in November after some feeds and speeds slipped out.

What is this Sparc M5 then? Well, it's the exact same chip as the M4, except its naming convention is now being brought into phase with the T5.

If you study Oracle's processor roadmaps from around the end of 2010, when the company was laying out its systems plans, the M4 chip was expected in late 2012 or so, with the Sparc T5 coming in early 2013.

Then last year, Oracle stopped talking so much about the M4 and said it could bring the T5 to market a little early, perhaps by last fall. And then OpenWorld came around, and Oracle pushed the T5 back to its original early 2013 timeframe. The M5 proper was supposed to come out early in 2014. Oracle will either push that original M5 out and call it an M6, or jump straight to the thing that was expected to be an M6. Hopefully Oracle will update its roadmap soon and explain.

Oracle is also not interested in talking about the "Athena" Sparc64 X processors from Fujitsu and the M Series of servers that Oracle's partner quietly launched back in January.

Fujitsu has launched a Sparc M10-1 machine with one of the sixteen-core Sparc64-X processors, a Sparc M10-4 with four of the chips in a single system image, plus a cluster called the Sparc M10-4S that glues sixteen of the four-socket Sparc64-X machines together using a distributed crossbar interconnect. The M10-4 is a replacement for the midrange M4000 and M5000 machines, while the M10-4S is a replacement for the high-end M8000 and M9000 servers. In Japan at least, the new Sparc64-X machines also bore the Fujitsu and Oracle brands, as the past several generations of Sparc Enterprise M servers did.

Marshall Choy, director of systems solutions and business planning at Oracle, would not comment on what plans Oracle might have to rebadge these boxes and resell them in the United States or Europe. With the T5 scaling up to eight sockets and the M5 scaling up to 32 sockets, it doesn't sound much like Oracle is interested in peddling Fujitsu Athena machines.

"This is an Oracle announcement with Oracle IP," explained Choy. "They have kind of done their own thing in Japan."

The Sparc T5 chip is not socket-compatible with the Sparc T4

The Sparc T4, T5, and M5 processors are all based on the "S3" core design, and the differences between the processors come down to core count, L3 cache size, and clock speed.

The S3 core, you will recall, has much more balanced performance on single-threaded and multithreaded workloads than did the cores in the predecessor T1, T2, and T3 chips, making chips built on it suitable for a broader range of workloads. The clock speeds are a lot higher, as well, which helps on single-threaded work. The T4 chips ran at 3GHz, and the T5 and M5 chips run at 3.6GHz.

All three chips are fabbed by Taiwan Semiconductor Manufacturing Corp, with the T4 using its 40-nanometer processes and the T5 and M5 baked using its 28-nanometer processes. You have to get in line to get your hands on 28nm capacity, as Nvidia and AMD learned at the beginning of the ramp. Choy says that TSMC has ramped up production on the new T5 and M5 processors and Oracle has even gotten a few machines to beta customers in the past several months. The servers using the new chips are generally available starting Tuesday.

Meet the T5 servers

Even though the new Sparc T5 series of servers is very similar to the Sparc T4 boxes it now replaces in the Oracle lineup, you can't take a Sparc T5 chip and put it into a Sparc T4 machine because they do not use the same processor sockets or interconnect.

To recap, the T5 chip has 16KB of L1 data cache, 16KB of L1 instruction cache, and 128KB of L2 cache for each of its sixteen cores, plus an 8MB L3 cache that all of the cores share. Each core has eight processing threads, for a total of 128 threads per socket.

The chip has two PCI-Express 3.0 controllers on the die plus four memory controllers that can drive up to sixteen DDR3 memory sticks running at 1.07GHz. There are two out-of-order integer execution pipelines and one floating point unit on the T5's S3 core. There are also cryptographic and encryption accelerators on the T5.

Generally speaking, Choy says that a T5 has about 2.3 times the throughput of a T4, which is the result of the doubling up of the number of cores on the T5 and goosing the clock speed by 20 per cent. Single-threaded app performance will be goosed by around 20 per cent because of those extra clock cycles.
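As a rough sanity check on that 2.3X figure, here is a minimal sketch of the arithmetic. The core counts and clock speeds are the ones quoted in this article; the idea that throughput scales with cores times clock is our own simplifying assumption, not Oracle's performance model, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the quoted T4 -> T5 throughput gain.
# Core counts and clock speeds are the figures given in the article.
# ASSUMPTION: throughput scales ideally with (cores x clock), which
# ignores cache, memory bandwidth, and interconnect effects.

T4_CORES, T4_GHZ = 8, 3.0
T5_CORES, T5_GHZ = 16, 3.6
QUOTED_GAIN = 2.3  # Choy's "about 2.3 times" figure


def ideal_speedup(old_cores, old_ghz, new_cores, new_ghz):
    """Upper-bound throughput ratio under perfect core and clock scaling."""
    return (new_cores / old_cores) * (new_ghz / old_ghz)


clock_gain = T5_GHZ / T4_GHZ
throughput_bound = ideal_speedup(T4_CORES, T4_GHZ, T5_CORES, T5_GHZ)
efficiency = QUOTED_GAIN / throughput_bound

print(f"clock gain:            {clock_gain:.2f}x")
print(f"ideal throughput gain: {throughput_bound:.2f}x")
print(f"quoted gain vs ideal:  {efficiency:.0%}")
```

Under that assumption the ceiling works out to roughly 2.4X, so the quoted 2.3X implies the T5 gets close to, but not quite, linear scaling.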

As El Reg explained in detail last summer, the T5 has an on-chip NUMA interconnect that can be used to gluelessly create machines with two, four, or eight sockets. The 8x9 crossbar switch that implements this NUMA has a bisection bandwidth of 1TB/sec, twice that of the T4 interconnect. (It scales across twice as many sockets, so that stands to reason.)

The Sparc T5-1B blade server for the 6000 series chassis

The first T5 machine is the Sparc T5-1B blade server, and it is the only single-socket box that Oracle is putting out (at least today) using its new chip for entry and midrange systems. The T5-1B server slides into the same Sun Blade 6000 chassis that Sun Microsystems got many years ago when Sun cofounder Andy Bechtolsheim sold his company, Kealia, to Sun to try to give it a better server story. That chassis holds up to ten blades and virtualizes the I/O to the network and storage from the blade itself.

The Sparc T4-1B blade had an eight-core chip running at 2.85GHz, so the move to sixteen cores running at 3.6GHz is going to provide a big jump in performance. The T5-1B blade server has an integrated 10 Gigabit Ethernet interface and sixteen memory slots for a maximum capacity of 256GB using 16GB memory sticks – the T5-1B also supports 8GB sticks if you want to go cheaper and less dense on the memory. There are also two 2.5-inch disk bays in the blade for local disk or SSD storage.

You can't have just one in a racker: The two-socket Sparc T5-2

Oracle does not have a single-socket Sparc T5 rack machine, but it does have the Sparc T5-2 two-socketeer with a chassis that is essentially the same as the T4-2. It has two compute-memory units, or CMUs in the Oracle lingo and what Sun used to call a uniboard back in the day. Each card holds one T5 processor and sixteen memory slots, and plugs into the system board to link them to each other using the NUMA interconnect and to the peripherals and ports in the chassis.

Memory in the T5-2 tops out at 512GB using 16GB sticks, and the server comes with four 10GE ports standard. There's room for six 2.5-inch disk drives on the right side of the chassis, and Oracle has 300GB or 600GB 2.5-inch SAS drives or 100GB or 300GB SSDs as options. The box has eight PCI-Express 3.0 x8 slots and room for two 2,060 watt power supplies.

The Sparc T5-4 has four of the new T5 chips under the skin

The Sparc T5-4 is a 5U rack server that has four of the CMUs crammed into its chassis. Each CMU slides in from the front like a blade server, albeit horizontally instead of vertically, and a disk shelf below the CMUs has room for eight hot-plug disk drives.

The box tops out at 64 cores, 512 threads, and 2TB of main memory, with four 10GE network ports and sixteen PCI-Express 3.0 slots, and has redundant 3,000 watt power supplies. This T5-4 box has enough oomph, says Choy, to be a hefty application server or a database server for midrange customers.

The Sparc T5-8 has twice the sockets and four times the cores of the top-end Sparc T4-4 system

The T5-8 server basically adds 3U of space at the top of the T5-4 so another four CMUs can slide into that NUMA interconnect and create an eight-socket box with doubled-up capacities. The T5-8 maxes out at 128 cores, 1,024 threads, and 4TB of memory, which would have been a refrigerator-sized big-iron box only a few years ago. The machine has the same disk slot, network port, and PCI-Express slot count as the T5-4 because the base of the box hasn't changed.
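The headline capacities of the whole T5 line fall straight out of the per-socket figures, as this small sketch shows. The sixteen cores, eight threads per core, and sixteen memory slots per socket are from the article, as are the 16GB sticks in the T5-1B and T5-2. The 32GB stick size for the T5-4 and T5-8 is our inference from the quoted 2TB and 4TB maxima, not something Oracle stated, and the names used here are purely illustrative.

```python
# Derive the T5 family's headline capacities from per-socket figures.
# Per-socket numbers come from the article. DIMM sizes for the T5-4 and
# T5-8 are ASSUMPTIONS inferred from the quoted memory maxima.

CORES_PER_SOCKET = 16
THREADS_PER_CORE = 8
SLOTS_PER_SOCKET = 16


def capacity(sockets, dimm_gb):
    """Return total cores, threads, and memory (GB) for a T5 system."""
    cores = sockets * CORES_PER_SOCKET
    return {
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "memory_gb": sockets * SLOTS_PER_SOCKET * dimm_gb,
    }


# model -> (sockets, DIMM size in GB)
T5_MODELS = {
    "T5-1B": (1, 16),  # 16GB sticks stated in the article
    "T5-2": (2, 16),   # 16GB sticks stated in the article
    "T5-4": (4, 32),   # inferred from the 2TB maximum
    "T5-8": (8, 32),   # inferred from the 4TB maximum
}

for model, (sockets, dimm_gb) in T5_MODELS.items():
    c = capacity(sockets, dimm_gb)
    print(f"{model}: {c['cores']} cores, {c['threads']} threads, "
          f"{c['memory_gb']} GB")
```

If the bigger boxes were limited to 16GB sticks, the same sums would give 1TB and 2TB instead, which is why the larger stick size is the natural reading of Oracle's numbers.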

The M5 servers flash back to Enterprise 15K

The Sparc M5 processor is perhaps the one thing that might make it possible for Oracle to wean itself off Fujitsu iron and have a big memory system with lots of cores for heavy-duty in-memory processing and big SMP/NUMA jobs. We'll see once some benchmark tests are run and customers get a feel for the box. It's precisely the kind of chip that Sun should have been able to field for itself a decade ago.

The Sparc M5 is a cache-heavy, core-light variant of the Sparc T5

The Sparc M5 processor has six of the S3 cores on a single die, with exactly the same eight threads per core and exactly the same cache hierarchy as the Sparc T5 chip. One big difference is that Oracle has removed ten of the cores and plunked down another 40MB of L3 cache memory, for a total of 48MB, in the M5 chip.
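To put a number on "cache-heavy, core-light", the sketch below divides each chip's shared L3 cache by its core and thread counts. The cache sizes and core counts are the ones given above; the even split is only an illustration, since the L3 is shared and a real workload will not carve it up equally.

```python
# Compare shared L3 cache available per core and per thread on T5 and M5.
# Cache sizes and core counts are from the article.
# ASSUMPTION: an even split of the shared L3, used purely for comparison.

THREADS_PER_CORE = 8

CHIPS = {
    "T5": {"cores": 16, "l3_mb": 8},
    "M5": {"cores": 6, "l3_mb": 48},
}


def l3_per_core_mb(chip):
    """Shared L3 divided evenly across cores, in MB."""
    return chip["l3_mb"] / chip["cores"]


def l3_per_thread_kb(chip):
    """Shared L3 divided evenly across hardware threads, in KB."""
    return l3_per_core_mb(chip) * 1024 / THREADS_PER_CORE


for name, chip in CHIPS.items():
    print(f"{name}: {l3_per_core_mb(chip):.1f} MB/core, "
          f"{l3_per_thread_kb(chip):.0f} KB/thread")

ratio = l3_per_core_mb(CHIPS["M5"]) / l3_per_core_mb(CHIPS["T5"])
print(f"M5 has {ratio:.0f}x the L3 per core of the T5")
```

On these figures each M5 core has sixteen times the L3 cache of a T5 core to play with, which is the whole point of trading away ten cores.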

Another big difference between the T5 and the M5 is that the on-chip coherency-control unit and scalability links in the M5 allow for as many as 96 processors to be lashed together as a shared-memory system. And yet another difference is that the M5 can have 32 memory sticks hanging off of each socket, twice that of the T5 chips.

At the moment, Oracle is shipping only one box based on the Sparc M5, with 32 sockets, called the Sparc M5-32 (obviously). Fully configured, this big-iron box weighs in at 192 cores, 1,536 threads, and 32TB of main memory. No one has as much memory in a single image today – not IBM, not Silicon Graphics, not HP, and not Fujitsu.
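The same kind of arithmetic checks out for the M5-32, and it also tells us what size memory stick Oracle must be using. Socket, core, thread, and slot counts are from the article; the 32GB stick size that falls out at the end is our inference, since Oracle did not spell it out here.

```python
# Check the M5-32 totals and infer the memory stick size.
# Socket, core, thread, and slot counts are from the article.
# The resulting DIMM size is an INFERENCE, not a figure Oracle quoted.

SOCKETS = 32
CORES_PER_SOCKET = 6
THREADS_PER_CORE = 8
SLOTS_PER_SOCKET = 32
MAX_MEMORY_TB = 32

cores = SOCKETS * CORES_PER_SOCKET
threads = cores * THREADS_PER_CORE
total_slots = SOCKETS * SLOTS_PER_SOCKET

# Using 1TB = 1,024GB; the division is exact for these figures.
implied_dimm_gb = MAX_MEMORY_TB * 1024 // total_slots

print(f"cores:        {cores}")
print(f"threads:      {threads}")
print(f"memory slots: {total_slots}")
print(f"implied DIMM: {implied_dimm_gb} GB")
```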

The Sparc M5-32 box puts Oracle/Sun back into big iron

"We look at M5 as being a major leap forward," says Choy. "This is the thing for really big workloads, with 1.4TB/sec of memory bandwidth, 3TB/sec of system bandwidth, and over 1TB/sec of I/O bandwidth." Financial services and telecommunications companies are among the early adopters of the Sparc M5-32 system, according to Choy.

The Sparc M5-32 system fills a rack and has 32 drive bays; you can use 600GB disks or 300GB SSDs. The system has 64 PCI-Express 3.0 x8 slots, up to 32 10GE ports, and a dozen 7,000 watt power supplies. Generally speaking, Oracle says that this M5-32 machine will deliver 1.5X better single-threaded performance than the Fujitsu-based Sparc Enterprise M series machines, and as much as 6X throughput on jobs that just love, love, love threads.

All of the Sparc T5 and M5 machines come configured with Solaris 11.1 Unix, which includes the VM Server for Sparc 3.0 hypervisor that most of us still call logical domains, or LDoms for short.

The earlier Solaris 11 Unix is also supported on the boxes, and you can use older versions of Solaris (8, 9, and 10) in Solaris zones. These are virtual private servers that can run on bare metal atop the Solaris operating system or on a Solaris instance running inside of an LDom partition. The old-style electrically isolated dynamic domains, which allow you to isolate CMUs from each other with hardware partitioning, are options on multi-socket M5 machines as well.

Pricing for the Sparc T5 and M5 machines was not available at press time, but we will circle back and see what these all cost and do the usual price/performance analysis once the data is available. ®