ClearSpeed commits to 5x floating point boost
Fresh silicon cometh
SC06 Floating point specialist ClearSpeed faces an enormous challenge. Its products must run at least 5 times faster than the best general purpose chips being pumped out by Intel and AMD, while at the same time staying faster, cheaper and more energy efficient than a coming onslaught of rival, nichey chips.
According to Intel import and new ClearSpeed COO Stephen McKinnon, the company is up to this daunting battle.
For the past year, ClearSpeed has been pumping out the 96-core, 250MHz CSX600 co-processor. Close to 50 per cent of the IBM-manufactured chip's logic is dedicated to floating point units, which today allows ClearSpeed to show dramatic speed-ups to the mathematical and scientific codes that run on high performance computing (HPC) customers' server clusters. All told, the CSX600 eats up just 10 watts.
ClearSpeed puts two of the CSX600 chips on its current PCI-X board. Early next year, the company will ship a new PCI Express board and then in the second half of 2007 it will revamp the silicon behind its flagship product.
The company's strengths revolve around pure floating point performance, performance per watt (4 times that of general purpose chips) and solid 64-bit, double precision floating point calculations. These attributes helped ClearSpeed last year secure a major win by slotting into half of the Sun servers currently powering the new Tokyo Institute of Technology cluster that's the ninth fastest supercomputer on the planet. Were TiTech willing to add a few more CSX600 boards, it could vault with ease to the sixth slot.
ClearSpeed wants to extend this lab success to the growing world of commercial HPC customers in segments such as financial services and the life sciences. That process will require some software tuning.
"We don't have to enable the whole world of software, but we do have to help out those markets that can benefit from our technology," McKinnon said.
To help speed up the software work around its technology, ClearSpeed ships a software development kit with an ANSI C compiler, gdb/ddd-based debugger and drivers for both Linux and Windows. It has also formed partnerships with Intel, AMD and ISVs working in the HPC market.
The company faces oncoming challenges from the single precision floating point dynamo Cell chip and accelerators based on FPGA and GPGPU (general purpose GPU) designs.
McKinnon attacked the hefty power consumption costs tied to the Cell chip (about 200 watts per accelerator board) and its double precision floating point limitations. The Cell chip won't see a major double precision speedup until about the middle of 2008. The ClearSpeed COO also knocked FPGAs and GPGPUs, saying developers are underestimating the work and transistor budget necessary to support double precision floating point.
"How do you do your rounding, handle infinities and other complex operations?" McKinnon asked. "You can't do that overnight. These are pretty hard devices."
To McKinnon's point, there's hardly a surplus of chip and software wizards able to tackle these problems.
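For readers wondering what the rounding and infinity cases look like in practice, here is a minimal sketch in Python. It runs on an ordinary host CPU, not on ClearSpeed hardware, and the helper name to_single is our own invention for illustration. It only shows the sort of IEEE 754 behaviour any double precision unit has to get right.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single precision value.

    Hypothetical helper for illustration only.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# Rounding: 0.1 has no exact binary representation, so the two widths
# settle on slightly different nearest values.
double_tenth = 0.1
single_tenth = to_single(0.1)
rounding_gap = abs(double_tenth - single_tenth)

# A small addend survives in double precision but is rounded away in single.
kept_in_double = (1.0 + 1e-10) != 1.0
lost_in_single = to_single(1.0 + 1e-10) == 1.0

# Infinities and NaN: the arithmetic must define a result for every special case.
inf = math.inf
inf_minus_inf = inf - inf      # not a number (NaN) under IEEE 754 rules
overflow = 1e308 * 10.0        # exceeds the double range and becomes +infinity

print("gap between single and double 0.1:", rounding_gap)
print("1e-10 kept in double:", kept_in_double, "| lost in single:", lost_in_single)
print("inf - inf is NaN:", math.isnan(inf_minus_inf))
print("1e308 * 10 is infinite:", math.isinf(overflow))
```

Each of those special cases needs dedicated logic in hardware, which is the transistor budget McKinnon is talking about.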
All told, McKinnon thinks the company can maintain a 5x performance boost over rivals and that this edge will keep ClearSpeed relevant.
ClearSpeed also contends that its TiTech result has put server accelerators on the map for good, after years of unfulfilled hype around similar technology.
At $8,000 per board, ClearSpeed will need to keep a close eye on how it stacks up from a price/performance perspective moving forward. It should be noted though that the company claims to offer large discounts on volume purchases.
On a more visceral level, customers with an intimate knowledge of ClearSpeed's product tend, in our experience, to adore the gear. We've been reprimanded by customers and ClearSpeed staff for ignoring the company in articles about large clusters and accelerators. The ClearSpeed letters tend to stand out for their aggression and condescending tone.
It's that type of heated support that you want to see in a start-up, and the ClearSpeed vitriol has proved impressive.
Customers can, of course, buy pricey gear from SGI, Cray or IBM today that shows rather dramatic floating point and scientific code gains. Or they can wait for x86 server-ready FPGAs and GPGPUs.
Between those two options, ClearSpeed seems the most practical choice for customers looking to take advantage of cheaper x86 servers while showing dramatic floating point and performance per watt improvements. ®