Sun measures HPC backorders in petaflops
Layoffs? Let's talk new iron
SC08 Hot on the heels of job cuts that will see some 5,000 to 6,000 company employees given pink slips, John Fowler, the executive vice president in charge of the newly constituted Systems Platforms group at Sun Microsystems, was on hand at the SC08 supercomputing trade show to give a preview of products that Sun will be rolling out in the coming months.
While HPC server clusters have not traditionally been a strong area for sales for Sun, the company certainly sold a lot of workstations and servers in its day to government labs and academia in the 1980s and 1990s, and the advent of the "Starfire" 64-processor E10K servers in the mid-1990s put the company on the map in the data center and in a large number of HPC centers around the world. The lateness of the UltraSparc-III systems, the disappointing performance of its "Wildfire" interconnect, and the rapid adoption of commodity clusters running Linux took some of the wind out of Sun's HPC business, which Fowler estimates (very roughly) to be around $1bn today.
"Sales are definitely going up, but it is still a small part of Sun," Fowler says. But the good news, if you are a Sun executive or shareholder, is that Sun's backlog in HPC orders is now measured in petaflops.
With its substantial investment in hardware engineering for InfiniBand switches, blade servers, dense storage arrays, and sophisticated software, such as ZFS and the Lustre file system for HPC workloads, and its embracing of Linux, Sun is hoping it can catch hold of some of the growth in the HPC market, which is outpacing the overall server market and is, in some ways, immune to the current downturn because funding for HPC systems is already in place and budgeted. (Two years from now may be an entirely different story for all HPC system vendors if the economy doesn't improve). Part of winning future deals is to talk about future products, something that Sun needs to do not just because of HPC budget horizons, but because of its worrisome financial circumstances.
First up in the Sun preview at the SC08 trade show is a forthcoming two-socket blade server that packs two whole servers onto a single blade as well as native quad data rate InfiniBand links coming right off the board (QDR for short, and that's 40 Gb/sec). The current Sun blades used in its "Constellation" InfiniBand blade clusters rely on PCI-Express cards to plug into the blades and interface with the "Magnum" InfiniBand switch that is at the heart of the Constellation setup.
Speaking of which: The Magnum switch is being given a rev and will soon have 648 ports running at QDR. The current Magnum switch has 3,456 ports running at the double data rate (DDR). Fowler says that despite the advances of 10 Gigabit Ethernet compared to Gigabit Ethernet, Sun picked InfiniBand for HPC workloads and is sticking with it. "InfiniBand will be the performance leader," Fowler says. "If you want the best numbers - lowest latency, lowest watts per byte, what have you - you are going with InfiniBand."
Moreover, InfiniBand is available at 40 Gb/sec today and will be at 80 Gb/sec soon enough, and customers can gang up InfiniBand lanes to double up bandwidth as well. InfiniBand also doesn't drop packets, as Ethernet does, says Fowler. "Still, we're not going to position this as an InfiniBand versus Ethernet thing."
No, but Sun is certainly expecting customers to do that, particularly when the higher bandwidth of flash-assisted storage arrays starts to really put pressure on interconnection networks in server clusters.
Which brings us to the second product Sun was previewing at the show: yet another variant in its so-called open storage lineup, which is a rack of storage arrays with hundreds of hard disks and flash disks delivering very high bandwidth. More details of this product, which is apparently code-named "Genesis," were not available. (And if you peek into the cabinet, you can't see anything).
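For a rough sense of what the Magnum port counts and link rates quoted earlier add up to, here is a back-of-the-envelope sketch in Python. It is only an illustration under stated assumptions: every port is treated as a 4x link (20 Gb/sec signalling at DDR, 40 Gb/sec at QDR), payload rates assume the 8b/10b encoding those generations use, and the function name `aggregate_gbps` is made up for this example. Nothing here comes from Sun's own spec sheets.

```python
# Signalling rate per port in Gb/sec.
# Assumption: every switch port is a 4x InfiniBand link.
SIGNAL_GBPS = {"DDR": 20, "QDR": 40}

# DDR and QDR links use 8b/10b encoding, so 80 per cent of the
# signalling rate is available for data.
ENCODING_EFFICIENCY = 0.8


def aggregate_gbps(ports, rate, payload=False):
    """Sum of per-port rates across a switch, in Gb/sec (hypothetical helper).

    With payload=True the 8b/10b encoding overhead is subtracted.
    This is a simple port-count multiplication, not a measure of
    bisection bandwidth or of real application throughput.
    """
    per_port = SIGNAL_GBPS[rate]
    if payload:
        per_port = per_port * ENCODING_EFFICIENCY
    return ports * per_port


if __name__ == "__main__":
    for label, ports, rate in [
        ("current Magnum", 3456, "DDR"),
        ("revised Magnum", 648, "QDR"),
    ]:
        raw = aggregate_gbps(ports, rate)
        data = aggregate_gbps(ports, rate, payload=True)
        print(f"{label}: {ports} ports x {SIGNAL_GBPS[rate]} Gb/sec "
              f"= {raw} Gb/sec signalling, {data:.0f} Gb/sec payload")
```

On these assumptions each QDR port carries twice what a DDR port does, while the port counts themselves are the ones Fowler gave; the sums say nothing about topology or how many switches a given cluster would need.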
Blades and water jackets
Sun is also previewing its impending Opteron blade server, an upgrade to the four-socket blade server it currently sells with a new southbridge part of the chipset delivering the native QDR InfiniBand and employing the latest "Shanghai" quad-core Opterons from Advanced Micro Devices. Sun says that this upgraded X6440 blade server will be available by the end of the year.
The company was also showing off its new water jackets for the back of server racks, code-named "Glacier," that will be shipping by the end of the year along with the new AMD blade server.
Sun also announced a preconfigured HPC cluster offering, called the Sun Compute Cluster, which is a completely integrated rack of servers, networking, and software that comes pre-configured from Sun - ready to run. (This is similar to preconfigured racks for HPC that IBM, Hewlett-Packard, and others sell). The offering scales from one to eight racks of servers, and the Sun integration can reduce the time to deploy an HPC cluster by 90 per cent.
The racks come with 32 X2250 two-socket rack servers or up to 30 Sun Blade blade servers, whichever customers want. Sun is offering a basic compute cluster setup based on this hardware as well as two others, one designed to support structural analysis applications and another aimed at supporting modeling applications in the financial services sector.
On the storage side, Sun is offering the Sun Storage Cluster, which is a rack of preconfigured storage servers and adjunct disk arrays equipped with the Lustre file system. The setups scale to over 100 gigabits of bandwidth between the servers and the storage and to several petabytes of disk capacity. This Storage Cluster bundle uses a mix of X4250 servers running Linux and the Lustre file system.
In both cases, the idea is to not only make HPC clusters easy to buy, but easier for Sun and its channel partners to sell. And because they save customers time and presumably a little bit of money, too, that will also help make Sun's case in the market. "We're actually a good way to save money on storage," Fowler explains. "We're a great economic solution, but people don't think of Sun that way. We have to get people to look at us and to understand that, which is why we have the Try and Buy program."
Somewhere between 70 and 80 per cent of the customers who do the trial server and storage program convert to a Sun sale either through Sun or a channel partner (the hedging in that number is because some customers try one product and buy a different one in the Sun lineup).
On the HPC software front, Sun upgraded a bunch of its tools, including Lustre 1.8, HPC ClusterTools 8.1, HPC Software (now with a Linux Edition 1.1 that allows deployment of the software stack on Red Hat Enterprise Linux 5.2), and the Studio Express 11/08 compilers.
The one thing that Fowler was not yet ready to talk about was the effect of the just-announced layoffs on the systems software, server, and storage business that Fowler now controls - and which constitutes the vast bulk of Sun's sales. As far as anyone knows, all Sun server and storage lines are still on track, full steam ahead. With so many job cuts, it is hard to believe that there won't be at least some product changes. But who can tell? Only Sun knows where it actually gets sales and where its real costs are. Sun is under tremendous pressure to clarify its plans, and it will surely have to do so soon. But Fowler made it clear that this week was not going to be that time. ®