Original URL: http://www.theregister.co.uk/2010/02/22/cray_q4_2009_numbers/

Cray swings profit on Q4 revenue dive

Thanks, Uncle Sam

By Timothy Prickett Morgan

Posted in HPC, 22nd February 2010 06:14 GMT

Supercomputer maker Cray finished out 2009 better than many might have expected, swinging to a modest $3m profit on a 43 per cent revenue decline to $88.3m in the fourth quarter ended in December.

It is hard to blame your biggest customer for cutting into your sales and profits, and Cray came as close as it dared to pointing the finger at Uncle Sam for its diminished numbers in the fourth quarter.

As Cray said back in January, the Defense Advanced Research Projects Agency, the R&D arm of the U.S. military, has revamped its contract for a prototype of the "Cascade" massively parallel super, which was originally conceived as a hybrid system mixing X64 and unspecified accelerator nodes with a new generation of interconnect called Aries.

Cascade machines are intended to span up to hundreds of petaflops, compared to the single-petaflops XT5 systems Cray can build today. While neither Cray nor DARPA is being clear about what still remains in the Cascade system, what is known is that DARPA chopped $60m from the Cascade prototype project, which was originally awarded a total of $293.1m in two phases of funding. Cray had $152.5m in money that was still expected for the Cascade project, and DARPA busted it down to $92.5m.
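For those who like to check the sums, here is a minimal sketch of the Cascade funding arithmetic. The function and field names are invented for illustration, and the "implied already received" figure is derived here from the numbers above rather than stated by Cray or DARPA.

```python
def cascade_funding(total_award, remaining_before_cut, cut):
    """Work through the Cascade funding figures, all in $m.

    Names are hypothetical; only the three inputs come from the article.
    """
    return {
        # What Cray still expects after DARPA's reduction.
        "remaining_after_cut": round(remaining_before_cut - cut, 1),
        # Size of the whole award once the cut is applied.
        "revised_total": round(total_award - cut, 1),
        # Derived, not stated in the article: the portion of the award
        # that was no longer outstanding before the cut.
        "implied_already_received": round(total_award - remaining_before_cut, 1),
    }


figures = cascade_funding(total_award=293.1, remaining_before_cut=152.5, cut=60.0)
for name, value in figures.items():
    print(f"{name}: ${value}m")
```

The remaining-funding figure matches the $92.5m quoted above; the other two values are simple consequences of the same three inputs.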

This money from DARPA is not booked as revenue, but rather as reimbursement for research and development expenses, explained Peter Ungaro, Cray's president and chief executive officer, in a conference call with Wall Street analysts going over the fourth quarter numbers. And Ungaro said further that DARPA is expected to get its Cascade prototype at the end of 2012 and that Cray will commercialize the product and sell it to others "shortly thereafter."

This is more or less how the XT line of massively parallel supers came into being, with Uncle Sam's Sandia National Laboratories ponying up $90m to build the "Red Storm" parallel Opteron-Linux super that Cray eventually commercialized as the XT3 and improved as the XT4 and XT5.

Despite the wrangling with DARPA over the Cascade project, things have looked far worse for Cray than they did as 2009 came to a close. Given the lumpiness of the supercomputer racket, where vendors have only dozens of key customers for a system and research and development costs kick in way ahead of product deliveries, it is not surprising that Cray's sales fell in the fourth quarter.

While Cray's product revenues were cut by more than half to $65.2m, this was to be expected given that Advanced Micro Devices is getting ready to put new twelve-core Opteron 6000 processors into the field and Cray is getting ready to plunk these chips into its XT6 supers and their midrange brethren, the XT6m supers. The XT6 and XT6m compute blades will offer roughly twice the number-crunching oomph as the current XT5 and XT5m blades, which are based on six-core Opteron 2400 series chips.

That extra compute power is worth waiting for, particularly if it doesn't cost that much extra. For the year, Cray's product sales fell by 9.1 per cent, to $199.1m.

One of the saving graces in the quarter and for all of 2009 was Cray's custom engineering business, where it does bespoke systems and data center design for a fee. This business, which accounted for more than $30m in revenues in 2009 according to Ungaro, grew at more than a 400 per cent rate last year and was a key reason why services revenues were up in Q4 and for the entire 2009 year. In the fourth quarter, Cray's overall services revenues were up by 27.9 per cent, to $23.1m.

For the full year, services revenues rose by 33 per cent to $84.9m. If you do some math on what Ungaro said, then the custom engineering biz had to be minuscule in 2008, and without it in 2009, services revenues would have been off by around 14.5 per cent, to about $54.5m.
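The back-of-envelope math behind those figures can be sketched as follows. This is a rough illustration, not Cray's accounting: the function and variable names are made up here, "more than $30m" is treated as a floor of exactly $30m, and "more than a 400 per cent rate" is read as a fivefold increase. Under those assumptions the decline comes out at about 14 per cent; the 14.5 per cent figure above implies a custom engineering number a little over $30m.

```python
def services_breakdown(services_2009, services_growth, custom_2009, custom_growth):
    """Back out 2008 figures from 2009 figures and growth rates, all in $m.

    Growth rates are fractions: 0.33 means a 33 per cent rise.
    """
    # 2008 services revenue implied by the 2009 total and its growth rate.
    services_2008 = services_2009 / (1 + services_growth)
    # 2009 services revenue with custom engineering stripped out.
    services_2009_ex_custom = services_2009 - custom_2009
    # Fractional decline of that stripped-out figure against all of 2008.
    decline = 1 - services_2009_ex_custom / services_2008
    # 2008 custom engineering revenue implied by its growth rate.
    custom_2008 = custom_2009 / (1 + custom_growth)
    return services_2008, services_2009_ex_custom, decline, custom_2008


# Assumed inputs: $30m floor for custom engineering, 400 per cent growth.
s2008, s2009_ex, decline, c2008 = services_breakdown(84.9, 0.33, 30.0, 4.0)
print(f"2008 services revenue:       ${s2008:.1f}m")
print(f"2009 services ex-custom:     ${s2009_ex:.1f}m")
print(f"Decline without custom work: {decline:.1%}")
print(f"2008 custom engineering:     ${c2008:.1f}m")
```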

Ungaro said that the three growth areas that Cray initiated in 2009 to broaden its product portfolio and to expand its addressable market in the $10bn supercomputing market - that would be custom engineering, the XT5m mini-MPP, and the CX1 personal supercomputer - accounted for more than $40m in revenues in 2009. That means the XT5m and the CX1 accounted for around $10m, and it is our guess that the CX1 was not very much of that.

Dell 'wildcard'

Ungaro said that Cray has inked deals with 40 reseller partners, including server and PC maker Dell, which has its very own Windows 7-based baby cluster, called the Cray CX1-iWS. Ungaro did not provide any shipment or revenue figures for the CX1s, but said merely that Cray needed to get resellers out there selling and that it was seeing a pipeline building for the Dell product.

He conceded it was a "wildcard" as to how much money the CX1 line could generate, but added that this was also the case with the custom engineering business a few years back. Cray hopes, Ungaro said in the call, that both the XT5m (and soon the XT6m) mini-MPP and the CX1 lines would deliver "significant results" that Cray could report on in the future.

As Cray has said in past quarters, the company is looking to its XT6 compute nodes, due early in the second quarter on the tail of the Opteron 6000 launch from AMD, and to the next-generation "Baker" supers and their "Gemini" interconnect. Gemini is the kicker to the SeaStar2+ interconnect in the XT5 supers, which implements a 3D torus interconnect between Opteron server nodes.

Ungaro said that the Baker systems, which will pair the XT6 blades with the Gemini interconnect as well as making unspecified improvements in the system software stack on the machines to boost scalability and performance, are on track for initial development and test systems to be delivered to customers in the third quarter.

An early version of the Gemini interconnect is being tested now, and what Cray hopes to be the final version of the chip (which has been tweaked as chips always are as they are stepped through the development and manufacturing process) is out being manufactured now. "Things are going well, and I am knocking on wood," Ungaro said, and you could hear him knock three times on his desk.

Because of the improvements in scalability and performance in the Baker systems, Cray is warning investors that "a significant majority" of revenues in 2010 will be booked in the fourth quarter. That is the inverse of 2009, which will mean that the compares for Q4 2010 will be sweet and that investors should, as Ungaro said in the call, always look at Cray on an annual basis, not quarterly.

Assuming that the Baker machines come off without a hitch in Q3, Cray is anticipating overall revenues of between $305m and $325m for 2010, with about $110m of that coming from services. The company exited 2009 with $113.2m in cash and short-term investments (more than double the level at the end of 2008) and will have to burn cash to build up inventories of parts and to pay for development of the Baker machines between now and the third quarter when they start shipping.

To help push more products before the end of 2010, Ungaro hinted that Cray would be introducing a new class of super that sits between the CX1 personal supercomputer and the XT6m mini-MPP boxes. He did not elaborate on what it might be.

But if you cut a base XT6m in pieces and choose one piece, you will probably get close to what Cray has in mind rather than building up from a CX1.

The XT5 and XT6 machines implement the SeaStar2+ interconnect in a 3D torus (with six ports on each SeaStar2+ chip), while the XT5m and XT6m minis implement it in a much shallower and less scalable 2D torus that has four ports on each SeaStar2+ chip activated. One can envision a SeaStar2+ chip with maybe two or three ports active being connected in a flatter architecture but still being a cluster running the same Linux stack as the other XT machines from Cray.

There are a number of possible tree, ring, or star topologies that could be used in such a baby super. If you wanted to, with three SeaStar2+ ports turned on in each node, you could link any number of XT6m nodes into a ring (using two ports on each chip) and use the third link to lash multiple rings together. (This would be weird, obviously.)
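To make the ring-of-rings idea concrete, here is a toy sketch of one way such a wiring could work. It is pure speculation in the same spirit as the paragraph above and has nothing to do with any real Cray design: the function names and the cross-link rule (each even-numbered node uses its third port to reach the next odd-numbered slot in the neighbouring ring) are assumptions made purely for illustration.

```python
from collections import deque


def ring_of_rings(num_rings, ring_size):
    """Build a hypothetical three-port topology as an adjacency map.

    Nodes are (ring, position) pairs. Two ports form the ring; the third
    port links an even-position node to position + 1 in the next ring.
    """
    if num_rings < 2 or ring_size < 4 or ring_size % 2:
        raise ValueError("need at least 2 rings and an even ring size of 4 or more")
    links = {(r, i): set() for r in range(num_rings) for i in range(ring_size)}

    def connect(a, b):
        links[a].add(b)
        links[b].add(a)

    for r in range(num_rings):
        for i in range(ring_size):
            # Ports one and two: neighbours within the same ring.
            connect((r, i), (r, (i + 1) % ring_size))
            # Port three: even nodes reach into the next ring.
            if i % 2 == 0:
                connect((r, i), ((r + 1) % num_rings, i + 1))
    return links


def is_connected(links):
    """Breadth-first search to confirm every node is reachable from the first."""
    start = next(iter(links))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbour in links[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(links)


topology = ring_of_rings(num_rings=3, ring_size=4)
print(max(len(peers) for peers in topology.values()))  # ports used per node
print(is_connected(topology))
```

With this rule every node ends up using exactly three ports, and the whole thing hangs together as a single connected machine, which is about all that can be said in its favor.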

You could also lash together four two-socket XT6 blades into something that smelled like a NUMA cluster with only two ports per SeaStar2+ chip turned on. Why this would be better than an eight-socket Opteron 6000 box is immediately obvious. AMD is not making an eight-socket Opteron 6000 box. Its SR56X0/SP5100 chipsets will top out at four sockets. Intel's Xeon 5500 and future Westmere-EP chips are only good for two-socket configurations.

If you want NUMA-oid clustering and don't want tree clustering using InfiniBand or 10 Gigabit Ethernet, Xeons don't really provide options. Unless Cray builds its first Intel-based super on the upcoming eight-core Nehalem-EX machine, which can scale well beyond two sockets, perhaps as high as 16, 32, or 64 sockets using SeaStar2+ chips as glue. This is all idle speculation, of course, for the sake of amusement. ®