Sun and Fujitsu hint at Sparc futures
Better roadmaps needed for safe speeding
As part of the launch of the Sparc T5440 midrange server this week in San Francisco, top brass from both Sun Microsystems and Fujitsu spent some time assuring customers that the companies' chip and systems partnership was going strong and that both were working away on Sparc processors that would end up in future systems.
The details, however, were vague, and for many, they didn't inspire the kind of confidence that a three-year roadmap with lots of details would have. But you get what they give in IT. That's how you know you are the customer and not the vendor.
Tatsuo Tomita, corporate senior executive vice president (and board member) at Fujitsu, said the company has enjoyed being a Sun partner for 20 years, liked being able to sell both its Sparc64-based M series machines and Sun's Sparc T-based entry and midrange boxes, and looked forward to a partnership that would last for 20 years or longer.
Tomita said that another Sparc Enterprise M server would be launched by the end of the month - presumably the 2U "Ikkaku" single-socket machine that will be sold as the Sparc Enterprise M3000. The server will have a single "Jupiter" Sparc64 VII processor with four cores running at 2.52 GHz and 5 MB of L2 cache on the chip. It is expected to have eight DDR2 memory slots (for a maximum of 64 GB), four PCI-Express slots, four on-board Gigabit Ethernet ports, an on-board RAID disk controller, and room for four 2.5-inch SAS drives in the chassis.
Tomita also lifted the curtain on the future Sparc64 chips under development, kickers to the "Jupiter" quad-core Sparc64 VII processors announced in July. The Jupiter chips use a 65 nanometer process and basically have twice the core count and more on-chip cache because Fujitsu moved from the 90 nanometer process used with the dual-core Sparc64 VI chips. According to past roadmaps, the Jupiter chips were designed to scale to 2.7 GHz, but so far they have been delivered only at 2.15 GHz, 2.4 GHz, and 2.52 GHz speeds.
The most recent Sun roadmap I can find (from January 2008) shows its kicker, the Jupiter+ or Sparc64 VII+ chip, coming out in mid-2009 at 3 GHz with the same core and thread count (four cores, eight threads total per chip) and using the same 65 nanometer process. But Fujitsu is apparently backing off on this speed a little and maybe pushing the delivery time out too.
"In a year, we will be introducing the Sparc64 VII+ processor running at 2.8 GHz to our customers," explained Tomita. "After this release, we plan to deliver another enhancement to the Sparc64 product line."
Presumably that means a shift to 45 nanometer or 40 nanometer processes and maybe more cores. Fujitsu has not yet said.
One more interesting thing that Tomita did not say while he was up on stage: That Fujitsu would sell servers based on next year's "Rock" UltraSparc-RK multicore processors.
As for the "Victoria Falls" Sparc T2+ chips used in the latest entry and midrange T-class gear designed by Sun, kickers are coming for those too, and Rock, which should have been here by now, is still on the way for a second-half 2009 delivery.
Rick Hetherington, chief technology officer at Sun's Microelectronics group, said that Rock was in "post-silicon analysis." That may sound like therapy for servers, but it's one of many steps on the way to a final chip. The Rock chip will have 16 cores and will support up to 256 TB of main memory (in theory, that is) and will be used in a line of servers code-named "Supernova" that were supposed to be on the market by now, if not earlier. The delay in the Rock chip, one of many that Sun has suffered through, has given new life to the Sparc64 VI and Sparc64 VII products, which were themselves delayed.
Sun's January 2008 roadmap showed Rock chips being available in the late third quarter or early fourth quarter of 2009, running at 2.1 GHz. Heaven knows how accurate this is today, particularly since the scout threading and transactional memory technologies that Rock uses are so far untested.
Hetherington said at the T5440 event that Sun was approaching the tape out moment on the kickers to the Sparc T2+ chips and that the company will be adding more cores and more threads with the next generation.
Back in June, we reported on the third generation of Niagara chips, code-named KT but generally called Niagara-3 and probably sold as the Sparc T3 when it gets here in late 2009. The Niagara-3 chips are expected to come out with 16 cores with 16 threads each, and servers are expected to scale to eight sockets in a single system image for a total of 2,048 threads. That is a lot of logical domain partitions, at one per thread. Service providers, long fans of Sparc iron and Solaris, might eat that right up. Anyway, tape out was expected by the end of the year or so for the Niagara-3. ®
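The thread arithmetic above is easy to check. Here is a minimal sketch in Python, assuming the rumored Niagara-3 figures (16 cores, 16 threads per core, eight sockets) hold; the function name is ours and purely illustrative.

```python
def total_threads(cores_per_chip, threads_per_core, sockets=1):
    """Hardware threads visible in a single system image."""
    return cores_per_chip * threads_per_core * sockets

# Rumored Niagara-3 box: 16 cores x 16 threads x 8 sockets (assumed figures).
niagara3_threads = total_threads(16, 16, sockets=8)
print(niagara3_threads)  # 2048

# At one logical domain per thread, that is also the partition ceiling.
max_logical_domains = niagara3_threads
```

The same function gives eight threads for a single quad-core, two-threads-per-core Sparc64 VII chip.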
AC - thanks
AC, thanks for the info. It will help investors to make better decisions for 2009. I hope JS reads your post and considers your "lean" and "kaizen" recommendations. Best of luck for the future.
No FUD, ex-employee chooses to speak
First, let me tell you why I post anonymously. Too many people in the hardware areas know me, and some of them will be able to deduce who I am just from what I write below. Don't get me wrong, I loved working at SUN, but there are huge problems that layoffs and yet another reorganization aren't going to fix until the people in charge recognize that they are the actual problem.
Let's review, shall we? A bit of history first.
UltraSPARC-V (code name - Millennium) was indeed too little, too late. The 1.0 tapeout mask set was indeed sent to TI, when the cancellation came down from on-high, and the layoffs began.
Gutted the processor hardware engineering department.
Let's buy Afara (good move), as the Afara people mainly stayed, and out popped Niagara I and Niagara II and VF on time.
Moving to 2008:
UltraSPARC-T2+ (Victoria Falls) - There is still a sneaky little bug with a CSR (control status register) i.e. the NCX timeout register, but luckily another CSR has a bit that when turned on actually managed to correct the problem. Slight performance degradation (See earlier reg article, now you know why). The Zambezi chip was late (used in the 4-way systems), or this problem might have been caught during pre-silicon verification.
New philosophies began to creep into the picture (verification began to be performed by pure random coverage, and directed functional diagnostics were left to rot, even though they had huge numbers of pre-canned diagnostics already written)
UltraSPARC-T3 (KT) - Tapeout was scheduled for early September, and the project was nowhere close to being on track. Yes, 16 cores with 16 threads per core, with provisions to make an 8-way system. Reasons for delay: The 0-in coverage and assertions weren't written, and 16 core model builds weren't working two months before the supposed tape-out date.
Now, if SUN had a lick of sense, they would have moved the project manager from UltraSPARC-T2+ to UltraSPARC-T3. (UltraSPARC-T2+ did tapeout on time.) I won't go into too many details at this time, but it was more office politics, clashing egos, and the SUN culture than anything else.
Which only leaves ROCK. Rick Hetherington states "post-silicon analysis". This is normally known as the validation phase, although it's highly questionable if SUN knows the difference between verification and validation. Post-silicon analysis in this case means looking at actual silicon. Tapeout 1.0 yielded actual silicon. Transactional memory did not work. They even had a problem with a legacy register, the "Y" register. There were a huge number of 1.0 bugs. They had new silicon about the time the July 2008 layoffs began. So let's do a bit of post-mortem on ROCK. First, ROCK underwent pre-silicon verification using only random test generation. (This would be OK as long as your random code generators are top-notch.) SUN's simply aren't, and many of them are still being tweaked. There are only a few people that really know how to use them to their maximum potential, and the documentation on how to use them is horrible. Generally speaking, SUN's technical documentation for their processors is horrible. Anyone who has actually sat down and read a Programmer's Reference Manual for SUN knows this.
My post-mortem fixes:
Hire a large number of technical writers. All employees must document and formalize procedures fully so anyone can do your job. Demonstrate the correctness of your documentation through independent verification. Switching to fully-randomized testing has already proven itself to be a failure. Hire people that can actually write SPARC assembly code. Create a huge bank of directed code diagnostics. (To test the chips on the actual chip tester) Random testing is great for catching a lot of bugs in the pre-silicon verification phase, but it is horrible to take these same diags and use them on chip testers to screen actual parts. This is where well thought out directed diags can make a world of difference in terms of coverage. Organize a team that is truly in charge of the tester diagnostics. Switch to industry-standard verification procedures. Send the design and verification engineers back to school on the company dime.
Finally, management must change its thinking, and using the principles of Lean and Kaizen wouldn't hurt SUN in the least, starting with open and honest communication.
I think a quick spot check on posts will show that for this and most of the Sun articles there are more anti-Sun views posted than pro-Sun (and they even post real names!). Are you denying Sun dragged out UltraSPANKED V until it was patently obvious it was just too little too late, then killed it, and that sounds spookily like what is happening with Rock? I remember at least one Sun exec telling me it was "taped out" only weeks before they canned it, but then I'm not sure where that comes in the Sun design process in relation to "in silicon analysis" (probably somewhere between "fantasy vapourware design phase" and "create cancellation excuses for the market analysts").
It looks like SPARC64 has a healthy future and will be a lot more interesting, so I struggle to see why Sun just don't can Rock now and go with the SPARC64. At least it would let them put out a credible roadmap.