Sun goes over Rainbow Falls
At Hot Chips, not in a barrel
Sun Microsystems, muzzled by Oracle's impending $5.6bn takeover, somewhat surprisingly showed up at the 21st annual Hot Chips conference sponsored by the IEEE and hosted at Stanford University - the birthplace of Sun - where Sun's chip techies talked about the future Rainbow Falls Sparc T processors and their integrated encryption engines.
The Rainbow Falls chips, presumably to be called the Sparc T3s if Oracle decides to keep them alive, are basically what El Reg told you they would be back in June 2008 when the chip was known by another code name, K2. As expected, they'll sport 16 cores, each with 16 threads, which is double the number of cores per chip and double the number of threads per core of the current Sparc T2+ chips.
According to a presentation by Sanjay Patel, a Sun senior chip architect, the cores are the relatively easy part, while the supporting electronics for SMP systems have gotten messier as Sun has scaled up the core and thread counts for its Sparc T series chips.
As more cores are added to chips, and as more chips are ganged up to make more powerful systems, the electronics needed to keep cache memories coherent (which is what allows a single copy of an operating system, in this case Solaris, to run on a machine) get progressively hairier.
Patel's presentation was very technical, and talked only about the challenges that Sun faced when coming up with the Rainbow Falls design.
As was the case with the Sparc T2 and T2+ chips, Sun wanted to be able to connect processors together gluelessly, meaning without external chipsets that complicate the system and add costs to the box.
The original Niagara T1 chips could not be glued together, and the T2 chip was also only available in single-socket machines. To make the interconnections for SMP boxes, Sun ripped out the integrated networking on the T2 and created the Victoria Falls T2+ chip, which appeared in two-socket servers in April 2008 and in four-socket boxes in October.
Sun also wants to be able to make bigger Sparc boxes, and each Rainbow Falls processor will be able to deliver 256 threads, the same number that are currently available in the top-end four-socket Victoria Falls box, the T5440.
There are a lot of different ways that Sun can link Rainbow Falls chips together, but if it deploys them in eight-socket machines as expected, then the largest Sparc T3 machines would have 2,048 threads.
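For anyone who wants to check the thread math, here is a quick back-of-the-envelope sketch. The core and thread counts are the ones cited in this story (the T2+ figures follow from Rainbow Falls doubling both), the eight-socket configuration is the expected one rather than anything Sun has confirmed, and the function and variable names are made up purely for illustration.

```python
# Core and thread counts as cited in the article; the T2+ numbers are
# implied by Rainbow Falls doubling both cores and threads per core.
T2_PLUS = {"cores": 8, "threads_per_core": 8}
RAINBOW_FALLS = {"cores": 16, "threads_per_core": 16}


def threads_per_chip(chip):
    """Hardware threads on one processor socket."""
    return chip["cores"] * chip["threads_per_core"]


def threads_per_system(chip, sockets):
    """Hardware threads in an SMP box with the given socket count."""
    return threads_per_chip(chip) * sockets


# Four-socket T5440 today versus one Rainbow Falls chip.
print("T5440 (4 x T2+):", threads_per_system(T2_PLUS, 4))
print("One Rainbow Falls chip:", threads_per_chip(RAINBOW_FALLS))

# Eight sockets is the expected top-end configuration, not a confirmed one.
print("8-socket Rainbow Falls box:", threads_per_system(RAINBOW_FALLS, 8))
```

Raw thread counts say nothing about per-thread throughput, so these figures should not be read as a performance comparison.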
If Sun could get clock speeds up to around 2GHz for the cores, and assuming a certain amount of SMP overhead, then it's reasonable to expect that the future top-end Rainbow Falls box would deliver about three times the performance of a T5440 using the latest 1.6GHz T2+ chips.
That's a very respectable leap in aggregate performance, and one that Oracle would be pretty dumb to walk away from - but those are my performance estimates, not Patel's.
Rainbow Falls' 16 cores interface to 16 banks of L2 cache memory. As with prior Niagara-family chips, the Rainbow Falls chips don't have L3 cache memory - although Sun could park one off the chip in a ceramic package, as IBM did with its Power4 and Power5 chips.
The Rainbow Falls chips have a total of four cache-coherence units, with two interfacing with three high-speed links used to glue multiple chips together. These coherence units manage local and remote memory access, and each coherence plane (there are two per chip, each comprising L2 memory banks, coherency units, and the links) can be mirrored in adjacent chips to cut down on SMP traffic. Patel explained that these links can be configured on the fly, allowing optimizations for specific server board configurations and applications.
While Sun did not specify either the dimensions of the Rainbow Falls chip or its pin count, Patel did say that the forthcoming chip would have a larger pin count than its predecessors to support more memory and a larger number of coherence links.
According to a report in EE Times, the Rainbow Falls chip will be implemented in a 40-nanometer process being cooked up by Sun's new wafer-baking partner, Taiwan Semiconductor Manufacturing Co.
The report also said that the T3 chip will be about the same size as the current T2+ chip, but will throw off about 30 percent more heat. That seems to suggest that Sun will try to push the clock speeds up on the T3s compared to the T2+ chips.
In addition to a revamped floating point unit, the Rainbow Falls chips will get a third-generation on-chip security coprocessor. The original T1 processor had a cryptographic coprocessor for each of its eight cores, which could be used to speed up public key cryptography such as the RSA algorithm.
With the T2 and T2+ chips, Sun added support for bulk encryption, secure hash, and elliptic curve cryptography. With the Rainbow Falls chip, Sun keeps one cryptographic coprocessor on each of the 16 cores, adds support for the Kasumi bulk cipher, and rounds out its support for SHA-2 secure hashes.
The chips will also have non-privileged fast paths into the accelerators, accessed through special instructions, so that encryption and decryption can happen without wasting cycles in the application, operating system, and hypervisor layers of software, all of which meddle with security code. Sun reckons that the fast paths can take an encryption operation that might take tens of thousands of cycles down to maybe 500. That's a huge time-saver.
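To put rough numbers on that saving, here is a minimal sketch of the cycle arithmetic. The 500-cycle figure is Sun's "maybe 500"; the 10,000, 20,000, and 50,000 cycle starting points are only assumed readings of "tens of thousands", not figures from Sun, and the function name is hypothetical.

```python
# Sun's rough figure for an operation that goes through the fast path.
FAST_PATH_CYCLES = 500


def speedup(software_path_cycles, fast_path_cycles=FAST_PATH_CYCLES):
    """Ratio of cycles spent without the fast path to cycles spent with it."""
    return software_path_cycles / fast_path_cycles


# Assumed sample values for "tens of thousands of cycles" (illustrative only).
for cycles in (10_000, 20_000, 50_000):
    print(f"{cycles} cycles -> {FAST_PATH_CYCLES} cycles: "
          f"about {speedup(cycles):.0f}x fewer cycles")
```

On those assumptions, the reduction works out to somewhere between 20 and 100 times fewer cycles per operation.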
If Oracle doesn't want to make and sell Sparc iron using the Rainbow Falls chip, let's hope that Fujitsu does. Sun customers need these improvements. ®