Feeds

Ex-Cray supercomputer interconnect guru Scott leaves Nvidia for Google

Working on new systems at the Chocolate Factory

Steve Scott, the system interconnect expert who was the lead designer for the three most recent generations of node-lashing routers and server interconnect interfaces for Cray supercomputers, has a new gig at hyperscale data center operator Google. For the past two years, Scott had been the chief technology officer for Nvidia's Tesla GPU coprocessor business.

A spokesperson at Google confirmed that Scott has indeed joined the Chocolate Factory "team", but said that Google is not able to comment further. Contacted by El Reg about what he would be doing in Mountain View, Scott replied thus:

"I'll be working on new Google systems. Great work, but not so interesting to the outside world."

To which your systems desk hack replied:

"Don't kid yourself, man. People love this stuff. What do you mean, 'working on new Google systems'? Can you be even a little more precise?"

And then the Google static cut in and put an end to information exchange.

Scott spent 19 years designing systems and interconnects at three different incarnations of Cray after getting his BS in electrical and computer engineering, his MS in computer science, and his PhD in computer architecture at the University of Wisconsin.

At the time he left Cray, Scott held 27 US patents in the areas of interconnection networks, processor microarchitecture, cache coherence, synchronization mechanisms, and scalable parallel architectures. He was the lead designer on Cray's X1 parallel vector machine, and was one of the key designers for the "SeaStar" interconnect used in the "Red Storm" super created by Cray for Sandia National Laboratories and commercialized in the XT line of machines. He was also one of the key designers for the follow-on "Gemini" and "Aries" interconnects funded by the US Defense Advanced Research Projects Agency (DARPA) and used in the XE and XC series of machines, respectively.

Scott left Cray in early August 2011, and with the hindsight of history, we know why: Cray was going to get out of the interconnect business.

In April 2012, Intel bought the intellectual property for the Gemini and Aries interconnects from Cray for $140m, and brought on board the people who worked on them as well. (Excepting Scott, who had already left the building.) At the moment, the plan is for Intel to fund the development of the components used in a follow-on system, code-named "Shasta", which is set to employ a kicker interconnect called "Pisces", slated for delivery in 2016 or so. (Cray originally thought it could get the Pisces interconnect into the field in 2015.)

A spokesperson for Nvidia said that Scott left the GPU chip maker two weeks ago, but was unsure when he started at Google. Nvidia has started a search for a new CTO for the Tesla GPU coprocessor line, and in the meantime has Bill Dally, the guy from Stanford University who literally wrote the book on networking, backfilling alongside his role as chief scientist at Nvidia. Jonah Alben, the GPU architect at Nvidia; Ian Buck, general manager of GPU computing software; and Sumit Gupta, general manager of the Tesla Accelerated Computing business unit, will all be kicking in to keep the Tesla roadmap on track.

By the way, the other architect of the Aries interconnect, Mike Parker, is a senior research scientist at Nvidia, and Dally had a hand in the Aries design even though he didn't work for Cray at the time and was a professor at Stanford. And El Reg has contended that a market wanting cheap floating point computing systems might push Nvidia into creating its own dragonfly-style interconnect.

So what does Google want with Scott?

In an interview with El Reg published back in April, we chatted with Scott about Nvidia's future Project Denver ARM processors and how they might be used in future supercomputers or hyperscale data centers. Such machines would pair those processors with a dragonfly interconnect much like the Aries that Intel now controls, or like the "Echelon" interconnect that was to be part of a DARPA effort to create exascale systems.

Echelon was interesting in that it had a dragonfly point-to-point topology to link all server nodes directly to all other server nodes, but it also added a global address space that kept data synchronized as it moved between cores inside one processor or across multiple processors in the system.
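For a rough sense of how a dragonfly gets that "everything talks directly to everything" property, here is a minimal back-of-the-envelope sketch of the standard dragonfly sizing arithmetic laid out in Kim and Dally's work. The helper function and the parameter values are illustrative assumptions for this article, not Echelon's or Aries' actual radix and channel counts.

```python
# Back-of-the-envelope dragonfly sizing (illustrative parameters, not Echelon's).
# In a canonical dragonfly: each router has p terminals (nodes), a-1 local links
# to the other routers in its group, and h global links to other groups. Groups
# are wired all-to-all, so a maximum-size network has g = a*h + 1 groups.

def dragonfly_size(p: int, a: int, h: int) -> dict:
    groups = a * h + 1          # every group has a direct link to every other group
    routers = a * groups
    terminals = p * routers
    return {"groups": groups, "routers": routers, "terminals": terminals}

if __name__ == "__main__":
    # Example: a balanced configuration with p = h = a // 2, a common rule of thumb.
    for a in (8, 16, 32):
        p = h = a // 2
        print(a, dragonfly_size(p, a, h))
```

The point of the arithmetic is that terminal count grows rapidly with router radix while, under minimal routing, a packet still gets from any node to any other in at most three router-to-router hops (local, global, local), which is what keeps latency low at the scales Scott talks about below.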

When asked how applicable future HPC systems would be to general purpose data centers, this is what Scott had to say, and it is telling. In fact, it lays out the case for why Scott ended up at Google:

I would be delighted if the stuff that we are creating for HPC will be useful for general purpose data centers. We definitely think about such things. And from a networking perspective, a lot of things that are good for HPC are also good for scale-out data centers.

The systems that Google, Amazon, Facebook, and others are fielding are bigger than HPC supercomputers. And as they get rid of disks and start trying to run everything in memory, all of a sudden the network latency is starting to matter in a way that it didn't.

They care about global congestion and communication, with jobs like MapReduce, and the amount of bandwidth per node is small compared to HPC. But if you build a network right, it is sliced anyway. Both networks will have good global congestion control, good separation for different types of jobs, and good global adaptive routing – all with low latency.
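To make the latency point concrete, here is a rough sketch using commonly cited order-of-magnitude figures; the numbers below are illustrative assumptions, not measurements from Google's or Cray's systems.

```python
# Rough, commonly cited order-of-magnitude latencies (illustrative assumptions).
DISK_SEEK_S   = 10e-3    # ~10 ms for a random seek on spinning rust
DRAM_ACCESS_S = 100e-9   # ~100 ns for a local DRAM access
NET_RTT_S     = 10e-6    # ~10 us round trip across a well-built data center fabric

# With a disk in the path, the network round trip is noise.
print(NET_RTT_S / (DISK_SEEK_S + NET_RTT_S))    # ~0.001: network is ~0.1% of the fetch

# With everything held in remote DRAM, the network round trip *is* the fetch.
print(NET_RTT_S / (DRAM_ACCESS_S + NET_RTT_S))  # ~0.99: network is ~99% of the fetch
```

Which is exactly the shift Scott describes: once the disk drops out of the critical path, the interconnect becomes the critical path.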

Google already builds its own servers and plain vanilla Ethernet switches. Maybe Mountain View has its eye on a homegrown interconnect that, in the long run, might end up being commercialized by vendors such as Nvidia as a competitive alternative to the Aries and Pisces interconnects from Intel.

That's a big maybe, of course, but Google is not averse to investing in its own hardware designs when doing so gets it better data centers. And it has a guaranteed customer – and possibly more – if it does it right.

It is a very intriguing thought. ®
