Texan researchers cheer tera-op chip endurance test

'We'll rock 2012'

The University of Texas plans next week to wow processor aficionados with a new chip that can chew through software at an unprecedented clip. The easily excited, however, will want to temper their enthusiasm, since the so-called TRIPS (Tera-op Reliable Intelligently adaptive Processing System) project seems to move at an un-lubed snail's pace.*

So far, UT researchers have crafted a two-core chip where each core can handle 16 out-of-order integer or floating-point operations per cycle. All told, the TRIPS chip can stomach 1,024 in-flight instructions at the same time. Such a chip should speed up consumer, business and high-performance computing workloads with no changes needed to current software, according to the researchers.

The UT crowd has been hammering away at the technology for seven years and once predicted a product capable of one trillion calculations per second (a tera-op) would arrive by 2010. Now, the group has pegged 2012 for its tera-op part.

Seven years ago it looked like Intel and AMD would just keep plugging away at high-GHz chips that handled single software threads well while consuming tons of energy. Over the past three years, however, all of the major chipmakers have shifted strategies to focus on multi-core designs that combine slower individual cores into chips able to push through multi-threaded software at a solid rate.

Intel is one company hoping to take this technology to the extreme. It's already demonstrating an 80-core processor that has reached 2 teraflops of performance. The company plans to start showing off a similar processor that uses its popular x86 instruction set next year – obviously well ahead of the UT crowd.

But the TRIPS researchers claim Intel and the other major chipmakers may have overshot the mainstream market with their multi-core designs.

“They have made a big gamble that people writing software will figure out a way to write software that can use those processors with parallel programming,” said Steve Keckler, an associate professor at UT who has worked on TRIPS in conjunction with Doug Burger and Kathryn McKinley. “I think we will see a big wall as they try to go from 8 cores to 16 cores.”

To Keckler's point, software makers have already started to gripe about the multi-core chips from Intel and AMD. Such products require coders to embrace multi-threaded programming, which is quite different from what they're used to in the single-threaded world. Start-ups such as PeakStream and RapidMind have stepped in to solve this problem with code that allows single-threaded software to run very fast on multi-core processors, but it remains unclear whether the software industry as a whole will keep pace with the new designs.

“So, while we recognize that there is a need for parallel programming, we would like to build the most powerful uniprocessor that we can,” Keckler said.

IBM, which has been helping out with the TRIPS project, reckons that the big technology breakthrough here revolves around “block-oriented execution.”

“Instead of operating on only a few computations at a time, the TRIPS processor operates on large blocks of computations mapped to an array of execution units on the chip,” IBM said in a 2003 statement. “This approach allows many more instructions to execute in parallel, thus offering higher performance.”
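The idea in IBM's statement can be illustrated with a toy model. The sketch below is not the TRIPS instruction set or UT's design: the block contents, the `run_block` function and the `width` parameter are all hypothetical, chosen only to show how instructions in a block can fire as soon as their operands are ready rather than one after another.

```python
import operator

# A hypothetical "block": instruction name -> (operation, operand names).
# Operands are either block inputs (a, b, c, d) or results of other
# instructions in the same block.
BLOCK = {
    "t1": (operator.add, ("a", "b")),
    "t2": (operator.mul, ("c", "d")),
    "t3": (operator.sub, ("a", "d")),
    "t4": (operator.mul, ("t1", "t2")),
    "t5": (operator.add, ("t4", "t3")),
}


def run_block(block, inputs, width=16):
    """Fire every instruction whose operands are ready, up to `width` per step."""
    values = dict(inputs)
    pending = dict(block)
    steps = []
    while pending:
        ready = [name for name, (_, srcs) in pending.items()
                 if all(src in values for src in srcs)][:width]
        if not ready:
            raise ValueError("block has an operand that is never produced")
        # Compute this step's results from values available before the step.
        results = {name: pending[name][0](*(values[s] for s in pending[name][1]))
                   for name in ready}
        values.update(results)
        for name in ready:
            del pending[name]
        steps.append(ready)
    return values, steps


values, steps = run_block(BLOCK, {"a": 1, "b": 2, "c": 3, "d": 4})
print(steps)         # which instructions fired together in each step
print(values["t5"])  # final result of the block
```

With a wide array the five instructions finish in three steps, because the three independent ones fire together. Calling `run_block` with `width=1` forces one instruction per step and takes five.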

The prototype motherboard to be shown next week contains four 366MHz TRIPS chips, along with 8GB of memory. (The architecture can support up to 32 chips and 64GB of memory.) This test system should reach 45 gigaflops.

“The processor core is composed of multiple copies of five different types of tiles interconnected via microarchitectural networks,” UT says on its website. “Each core may be configured in a single threaded mode or in a 4-thread multithreaded mode in which instructions from multiple threads may execute simultaneously. A TRIPS processor core is fundamentally distributed for technology scalability and to provide high bandwidth to the instruction cache, data cache, and register file through partitioning and replication.”

The researchers plan to follow the motherboard release by filling a rack with eight systems linked together – a setup that should reach 375 gigaflops.
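The board and rack figures can be roughly reproduced from the numbers quoted earlier in the piece. The sketch below is a back-of-envelope estimate, not UT's own method: it assumes each of the 16 operations a core handles completes every clock cycle, and that all of them count as floating point. The function name `peak_gops` is made up for illustration.

```python
def peak_gops(chips, cores_per_chip, ops_per_core_per_cycle, clock_hz):
    """Peak operations per second, in billions, assuming every slot is busy every cycle."""
    return chips * cores_per_chip * ops_per_core_per_cycle * clock_hz / 1e9


# Prototype board: four two-core chips at 366MHz, 16 operations per core (assumed per cycle).
board = peak_gops(chips=4, cores_per_chip=2, ops_per_core_per_cycle=16, clock_hz=366e6)

# Planned rack: eight such boards linked together.
rack = 8 * board

print(f"board: {board:.1f} billion ops/s, rack of eight: {rack:.1f} billion ops/s")
```

Under those assumptions the board comes out near 46.8 billion operations per second, a shade above the 45 gigaflops quoted, and eight boards land at about 375.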

One always needs to approach these research projects with a healthy amount of caution. Academia – even when backed by IBM and $11m in Defense Department funding – tends to move very, very slowly, and the bright ideas of researchers often fail to pan out.

Less cynical types who really want to see the future now can travel over to the TRIPS website.

“I believe there is a strong need for very capable, high performance uniprocessor cores,” Keckler said. “Will they look exactly like TRIPS? That's a good question. I think we have a very credible case.”

In the coming years, the TRIPS group hopes to convince a commercial partner to pull the technology out of the labs.®



I enjoyed our discussion today. However, I'm not sure how you came up with this statement:

"The easily excited, however, will want to temper their enthusiasm, since the so-called TRIPS (Tera-op Reliable Intelligently adaptive Processing System) project seems to move at an un-lubed snail's pace."

Those familiar with the semiconductor industry would recognize that the industrial design cycle for a leading-edge microprocessor is 3-4 years, and that's given the fact that the instruction set architecture is already in place and that the company already has substantial experience building several previous generations. The TRIPS research cycle has been: 3 years for research concepts and early proof of concept (not part of industry design cycle), 3 years of implementation/fabrication of a never-before-built processor with a newly invented instruction set, and 6 months of silicon bringup and system implementation.

Also along the way, we developed a new compiler for the new ISA. Thus the time from research concept to prototype has actually been quite un-snail-like given what we set out to accomplish.

The early projections of a Tera-OP by 2010 made some assumptions about clock rate scaling (10GHz) and better device scaling than actually came to pass. That said, 8 slightly modified TRIPS cores could easily fit on a current generation 65nm chip, which running at 3GHz would achieve a peak performance of 768 billion instructions per second. Thus it is not unreasonable to expect the technology to support a trillion instructions per second in 2010.


Steve Keckler

Associate Professor

Computer Architecture and Technology Lab

The University of Texas at Austin
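Keckler's 768 billion figure can be unpacked with simple arithmetic. The sketch below is a reader's reconstruction, not something stated in the letter: it works backwards from the quoted total to the per-core rate it implies, then asks how many such cores would be needed for a trillion. The function name and the derived per-core rate are assumptions for illustration.

```python
import math


def implied_ops_per_core_per_cycle(total_ops_per_sec, cores, clock_hz):
    """Per-core, per-cycle rate needed to hit a quoted peak total."""
    return total_ops_per_sec / (cores * clock_hz)


# Figures from the letter: 8 cores at 3GHz, 768 billion instructions per second.
per_core = implied_ops_per_core_per_cycle(768e9, cores=8, clock_hz=3e9)

# Cores needed at that rate and clock to reach a trillion per second (assumed target).
cores_for_tera_op = math.ceil(1e12 / (per_core * 3e9))

print(per_core, cores_for_tera_op)
```

The quoted total works out to 32 instructions per core per cycle, double the 16 cited for the prototype's cores, which is presumably part of what "slightly modified" covers. At that rate, 11 such cores at 3GHz would pass the trillion mark.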

