Original URL: http://www.theregister.co.uk/2006/11/20/sicortex_sc06/
Startup takes Reg's coveted 'Top FLOP' award
SiCortex's supercomputing sauce
SC06 SiCortex has bucked one of the more disturbing supercomputing trends - the disconnect between form and function.
The Massachusetts server start-up last week unveiled a system that would moisten the eyes of both Seymour Cray and John De Lorean. Its SC5832 crams 5,832 processor cores into a "Pimp My Cluster" chassis with lighted cabinet doors that rise like wings. The elegant SiCortex box stood out with ease among the mishmash of "industry standard" cluster jobbies at the Supercomputing '06 conference.
As mentioned, however, the SC5832 garnered attention as much for its function as its form.
"There is room in this industry for nice design, but the systems won't make it if they cost more just to look good," SiCortex director of software Larry Stewart told us.
The high performance computing (HPC) world has gone through a dramatic change over the past few years. It has shifted from pricey, proprietary servers tightly glued together to cheaper systems daisy-chained to each other by the thousands. As a result, HPC customers have enjoyed a sharp fall in supercomputer prices, although that dip has come with tradeoffs.
Most of the cheap systems that make up HPC clusters run on chips from Intel and AMD. So, the clusters tend to show ever-increasing processing performance - especially with regard to floating point operations - as Intel and AMD produce ever-speedier chips.
Unfortunately, a number of issues have cropped up as vendors and customers charged after the faster chips at all costs.
For example, the software running on today's clusters spends an awful lot of time waiting for data from memory, since memory technology has not improved at the same rate as chips. In addition, clusters often waste a lot of processor cycles on communications between the thousands of chips and also suffer from myriad hardware failures as a result of both the quantity of servers in a cluster and the hot chips inside of them.
The SiCortex team has tackled all of these major issues affecting HPC clusters at once by building an unorthodox box from scratch.
The most basic component of the SiCortex systems is a six-core chip. The TSMC-manufactured 64-bit MIPS cores run at 500MHz and can handle two instructions per cycle, while chewing up just 0.6 watts.
SiCortex then packs 27 of these chips - with two DDR DIMMs each - on a single server board. Its large SC5832 holds - wait for it - 36 of the server boards to give you the 5,832 cores. The half-rack SC648 can hold 108 chips or 648 cores. (You can see the node layout here.)
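For those who like to check the sums, here is a minimal Python sketch of the core-count arithmetic. The per-chip, per-board and per-system figures are the ones quoted above; the constant and function names are our own inventions for illustration, not anything from SiCortex.

```python
# Figures quoted in the article; all names here are illustrative only.
CORES_PER_CHIP = 6
CHIPS_PER_BOARD = 27
BOARDS_SC5832 = 36
CHIPS_SC648 = 108


def total_cores(chips: int, cores_per_chip: int = CORES_PER_CHIP) -> int:
    """Return the number of cores for a given number of node chips."""
    return chips * cores_per_chip


# Large system: 36 boards of 27 chips each.
chips_sc5832 = CHIPS_PER_BOARD * BOARDS_SC5832
cores_sc5832 = total_cores(chips_sc5832)

# Half-rack system: 108 chips, which is four boards' worth.
boards_sc648 = CHIPS_SC648 // CHIPS_PER_BOARD
cores_sc648 = total_cores(CHIPS_SC648)

print(f"SC5832: {chips_sc5832} chips, {cores_sc5832} cores")
print(f"SC648:  {CHIPS_SC648} chips on {boards_sc648} boards, {cores_sc648} cores")
```

In both cases the core count works out to the number in the model name.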
Before getting to more of the technical details on the hardware, we'll note that the SiCortex boxes run Linux (a modified version of Gentoo), SiCortex MPI and the Lustre file system. So, despite the eccentricities of the hardware, the software remains standard for HPC customers.
Like us, many of you are probably wondering how well those MIPS cores stack up against much more powerful x86 chips. Sure, they save on power, and there are more of them, but at what cost to overall horsepower?
The SiCortex engineers - and we talked to just about all of them - argue that their system design creates a type of virtuous performance circle. By using low power cores, they can stick the cores closer together, which improves interprocessor communications. Meanwhile, each core is surrounded by plenty of memory and plentiful I/O. In addition, the chips produce less heat and consequently fewer system failures.
Now, there's plenty of detail on exactly how all this happens, but we've plucked a few of the choice descriptions from SiCortex's literature to give you a flavor.
First to the chip design.
The node chip contains six 64-bit processors, their L1 and L2 caches, two interleaved memory controllers (one for each DIMM), the interconnect fabric links and switch, a DMA Engine, and a PCI Express (PCIe) interface. The PCIe is used for external I/O devices, and is only enabled on some nodes.
And then to the board design.
Physically, 27 node chips and their associated memory DIMMs are packaged on a single board, called a module. Of the 27 nodes on a module, three have their PCIe busses connected to PCI Express module slots, and a fourth is attached to an on-board PCIe dual gigabit-Ethernet controller. The PCIe interfaces are disabled on the other nodes.
With nodes close together, we could build interconnect links that use electrical signals on copper PC board traces, driven by on-chip transistors instead of expensive external components. With short links, we could reduce electrical skew and use parallel links, giving higher bandwidth. And with a small, single-cabinet system we were able to use a single master clock, resulting in reduced synchronization delays.
Our low-power design also has cascading benefits in reducing infrastructure costs such as building and air conditioning, and in reducing operational costs for electricity.
SiCortex also relies on a Kautz topology networking fabric that provides multiple 2-gigabyte-per-second direct connections between each chip while requiring no external cables or switches. All told, each of the 972 nodes in the large server can get across the entire system network in at most six hops. (You can see the network diagram here, and there's more on SRI's Bill Kautz here.)
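The 972-node, six-hop claim can be checked against the textbook definition of a Kautz graph. The sketch below is ours, not SiCortex's: it assumes a degree-3 graph whose nodes are labelled by six-symbol strings over a four-letter alphabet with no two adjacent symbols equal, and whose links shift the label left by one symbol. The article does not state these parameters; we picked them because they yield the node count quoted above. All function names are hypothetical.

```python
from collections import deque
from itertools import product


def kautz_nodes(degree: int, length: int) -> list:
    """All labels of the given length over degree+1 symbols, no equal neighbours."""
    alphabet = range(degree + 1)
    return [
        word
        for word in product(alphabet, repeat=length)
        if all(a != b for a, b in zip(word, word[1:]))
    ]


def successors(node: tuple, degree: int) -> list:
    """Shift the label left and append any symbol that differs from the last one."""
    return [node[1:] + (x,) for x in range(degree + 1) if x != node[-1]]


def hops_from(start: tuple, degree: int) -> dict:
    """Breadth-first search: minimum hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in successors(current, degree):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def node_count_and_diameter(degree: int, length: int) -> tuple:
    """Return (number of nodes, worst-case hop count over all ordered pairs)."""
    nodes = kautz_nodes(degree, length)
    worst = 0
    for node in nodes:
        dist = hops_from(node, degree)
        assert len(dist) == len(nodes)  # every node must be reachable
        worst = max(worst, max(dist.values()))
    return len(nodes), worst


# Assumed parameters: three outgoing links per node, six-symbol labels.
node_count, max_hops = node_count_and_diameter(degree=3, length=6)
print(node_count, max_hops)  # prints: 972 6
```

Under those assumptions the graph has 4 x 3^5 = 972 nodes and no pair is more than six hops apart, which squares with the figures in the text.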
The end result of all this engineering is a 6-teraflop system with 8TB of memory, 6TB per second of interconnect and 250GB per second of I/O that consumes 18 kilowatts of power. And, while much of the hardware strays from the industry standard path, customers do not need to rewrite their software to run on this box.
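The headline teraflop figure can be roughly reconstructed from the numbers given earlier. This back-of-the-envelope sketch assumes that both instructions a core issues per cycle are floating point operations, which is our assumption for a theoretical peak and not something SiCortex told us. The variable names are also ours.

```python
# Figures from the article; names are illustrative only.
CORES = 5832
CLOCK_HZ = 500e6          # 500MHz per core
OPS_PER_CYCLE = 2         # ASSUMPTION: both issued instructions are flops
SYSTEM_WATTS = 18_000     # 18 kilowatts for the whole system

# Theoretical peak, ignoring memory stalls and communication overhead.
peak_flops = CORES * CLOCK_HZ * OPS_PER_CYCLE
peak_tflops = peak_flops / 1e12

# Power efficiency at that theoretical peak.
mflops_per_watt = peak_flops / 1e6 / SYSTEM_WATTS

print(f"Peak: {peak_tflops:.3f} TFLOPS")
print(f"Efficiency: {mflops_per_watt:.0f} MFLOPS per watt")
```

On those assumptions the peak comes to about 5.8 teraflops, which rounds to the 6 quoted, or roughly 324 megaflops per watt at peak.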
Now here's the bad news.
SiCortex has made the classic start-up mistake of announcing a product before it's ready. We assume that the company wanted to make its big splash at the Supercomputing conference or that its investors forced management to rush the kit out the door, as so often happens when the venture vultures are involved. So, you can look at the eye candy but not receive a demo until the Spring of 2007 nor buy a production unit until the Summer.
And the bit you've been cringing for?
Well, the big papa will cost around $1.5m with a "typical memory" configuration and stretch much higher if you're an 8TB kind of animal. The smaller system will start out at $200,000.
"Our goal is to pump data centers full of a lot of 6TFlop machines," one of the SiCortex family told us.
It's a noble goal, but one the company will struggle to achieve. How big is the market for $1.5m boxes and how much smaller does that market get when SiCortex rather than IBM, HP, Sun or Dell is selling the system? Who wants to go MIPS and proprietary everything in an age of x86 even if the software runs just the same?
It's while looking at those very serious questions that we're saddened by the state of the technology industry. Here we have 40 very bright engineers who have created a true marvel - a beautiful system that captures and improves upon all of the major trends in the server industry over the past four years. These guys will have the toughest of times cracking the hardware market, and, even if they do, could be confined to a world of low margins. In the meantime, some nimrod who just polished off a Java class at community college can crank out a Google toolbar add-on for the Web 2.0 sophists and retire early.
But pity not the SiCortex crew because they have venture funds of their own and a heck of a lot of talent to tap. Most tellingly, Chevron's venture capital arm has pumped a few million into the start-up, paving the way for a possible commercial supercomputing play.
We tend to abhor start-ups and rarely dish out anywhere close to this amount of ink on them. SiCortex, however, has separated itself from the start-up plebs with a system that's the clear winner of our coveted and inaugural Supercomputing 'Top FLOP' 2006 award.
The company may just have enough brass pairs to succeed. ®