Texan researchers cheer tera-op chip endurance test

'We'll rock 2012'

The University of Texas plans next week to wow processor aficionados with a new chip that can chew through software at an unprecedented clip. The easily excited, however, will want to temper their enthusiasm, since the so-called TRIPS (Tera-op Reliable Intelligently adaptive Processing System) project seems to move at an un-lubed snail's pace.*

So far, UT researchers have crafted a two-core chip in which each core can handle 16 out-of-order integer or floating-point operations. All told, the TRIPS chip can stomach 1,024 instructions at the same time. Such a chip should speed up consumer, business and high-performance computing workloads with no changes needed to current software, according to the researchers.

The UT crowd has been hammering away at the technology for seven years and once predicted a product capable of one trillion calculations per second (a tera-op) would arrive by 2010. Now, the group has pegged 2012 for its tera-op part.

Seven years ago it looked like Intel and AMD would just keep plugging away at high GHz chips that handled single software threads well, while consuming tons of energy. Over the past three years, however, all of the major chipmakers have shifted strategies to focus on multi-core designs that combine slower individual processor cores together to make overall chips able to push through multi-threaded software at a solid rate.

Intel is one company hoping to take this technology to the extreme. It's already demonstrating an 80-core processor that has reached 2 teraflops of performance. The company plans to start showing off a similar processor that uses its popular x86 instruction set next year – obviously well ahead of the UT crowd.

But the TRIPS researchers claim Intel and other mainstream chip makers may have overshot the mainstream market with their multi-core designs.

“They have made a big gamble that people writing software will figure out a way to write software that can use those processors with parallel programming,” said Steve Keckler, an associate professor at UT who has worked on TRIPS in conjunction with Doug Burger and Kathryn McKinley. “I think we will see a big wall as they try to go from 8 cores to 16 cores.”

To Keckler's point, software makers have already started to gripe about the multi-core chips from Intel and AMD. Such products require coders to embrace multi-threaded programming, which is quite different from what they're used to in the single-threaded world. Start-ups such as PeakStream and RapidMind have stepped in to solve this problem with code that allows single-threaded software to run very fast on multi-core processors, but it remains unclear whether the software industry as a whole will keep pace with the new designs.

“So, while we recognize that there is a need for parallel programming, we would like to build the most powerful uniprocessor that we can,” Keckler said.

IBM, which has been helping out with the TRIPS project, reckons that the big technology breakthrough here revolves around “block-oriented execution.”

“Instead of operating on only a few computations at a time, the TRIPS processor operates on large blocks of computations mapped to an array of execution units on the chip,” IBM said in a 2003 statement. “This approach allows many more instructions to execute in parallel, thus offering higher performance.”

The prototype motherboard to be shown next week contains four 366MHz TRIPS chips, along with 8GB of memory. (The architecture can support up to 32 chips and 64GB of memory.) This test system should reach 45 gigaflops.

“The processor core is composed of multiple copies of five different types of tiles interconnected via microarchitectural networks,” UT says on its website. “Each core may be configured in a single threaded mode or in a 4-thread multithreaded mode in which instructions from multiple threads may execute simultaneously. A TRIPS processor core is fundamentally distributed for technology scalability and to provide high bandwidth to the instruction cache, data cache, and register file through partitioning and replication.”

The researchers plan to follow the motherboard release by filling a rack with 8 systems linked together - a setup that should reach 375 gigaflops.
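As a back-of-the-envelope check, the quoted gigaflop figures can be roughly reproduced from the numbers above. This is only a sketch: it assumes every core issues its full 16 operations on every clock cycle, and the function and variable names are illustrative rather than anything from the TRIPS project.

```python
def peak_gigaops(chips, cores_per_chip, ops_per_cycle, clock_ghz):
    """Theoretical peak rate in billions of operations per second.

    Assumes every core issues its full complement of operations each cycle,
    which real workloads will not sustain.
    """
    return chips * cores_per_chip * ops_per_cycle * clock_ghz


# Figures quoted in the article: four 366MHz chips per board,
# two cores per chip, 16 operations per core per cycle.
board = peak_gigaops(chips=4, cores_per_chip=2, ops_per_cycle=16, clock_ghz=0.366)

# The planned rack links 8 such systems.
rack = 8 * board

print(f"board: {board:.1f} gigaops peak")
print(f"rack:  {rack:.1f} gigaops peak")
```

Under these assumptions the board comes out a little under 47 billion operations per second and the rack at about 375, in line with the 45 and 375 gigaflop figures quoted.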

One always needs to approach these research projects with a healthy amount of caution. Academia – even when backed by $11m in Defense Department funding and IBM – tends to move very, very slowly, and the bright ideas of researchers often fail to pan out.

Less cynical types who really want to see the future now can travel over to the TRIPS website.

“I believe there is a strong need for very capable, high performance uniprocessor cores,” Keckler said. “Will they look exactly like TRIPS? That's a good question. I think we have a very credible case.”

In the coming years, the TRIPS group hopes to convince a commercial partner to pull the technology out of the labs.®



I enjoyed our discussion today. However, I'm not sure how you came up with this statement:

"The easily excited, however, will want to temper their enthusiasm, since the so-called TRIPS (Tera-op Reliable Intelligently adaptive Processing System) project seems to move at an un-lubed snail's pace."

Those familiar with the semiconductor industry would recognize that the industrial design cycle for a leading-edge microprocessor is 3-4 years, and that's given the fact that the instruction set architecture is already in place and that the company already has substantial experience building several previous generations. The TRIPS research cycle has been: 3 years for research concepts and early proof of concept (not part of industry design cycle), 3 years of implementation/fabrication of a never-before-built processor with a newly invented instruction set, and 6 months of silicon bringup and system implementation.

Also along the way, we developed a new compiler for the new ISA. Thus the time from research concept to prototype has actually been quite un-snail-like given what we set out to accomplish.

The early projections of a Tera-OP by 2010 made some assumptions about clock rate scaling (10GHz) and better device scaling than actually came to pass. That said, 8 slightly modified TRIPS cores could easily fit on a current generation 65nm chip, which running at 3GHz would achieve a peak performance of 384 billion instructions per second. Thus it is not unreasonable to expect the technology to support a trillion instructions per second in 2012.


Steve Keckler

Associate Professor

Computer Architecture and Technology Lab

The University of Texas at Austin
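The peak-rate arithmetic in the letter can be sketched as cores × operations per cycle × clock rate. The snippet below is a rough illustration only: it assumes each core issues the 16 operations per cycle described earlier in the article, and the names are hypothetical.

```python
import math

OPS_PER_CYCLE = 16  # assumption: per-core figure quoted earlier in the article


def peak_ops_per_second(cores, clock_hz, ops_per_cycle=OPS_PER_CYCLE):
    """Theoretical peak operations per second for a multi-core chip."""
    return cores * ops_per_cycle * clock_hz


CLOCK_HZ = 3_000_000_000  # the 3GHz clock mentioned in the letter

# Eight cores on one chip, as in the letter's 65nm example.
eight_core_peak = peak_ops_per_second(cores=8, clock_hz=CLOCK_HZ)

# Number of such cores needed to reach a trillion operations per second.
TERA = 1_000_000_000_000
cores_for_teraop = math.ceil(TERA / peak_ops_per_second(cores=1, clock_hz=CLOCK_HZ))

print(f"8 cores at 3GHz: {eight_core_peak / 1e9:.0f} billion ops/s peak")
print(f"cores needed for a tera-op at 3GHz: {cores_for_teraop}")
```

On these assumptions, eight cores at 3GHz peak at 384 billion operations per second, and a tera-op part at the same clock would need about 21 cores.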

