Tilera throws 64-core meshy chip at video and security tasks
Faster than a Xeon, smaller than a bread box
Hot Chips The multi-core chip revolution advanced this week with the emergence of Tilera - a start-up using so-called mesh processor designs to go after the networking and multimedia markets.
The Silicon Valley-based start-up's first product links together 64 RISC-like cores running at up to 1.0GHz. The real magic, however, stems from the five-lane switches used to link each core in an 8X8 grid that provides up to 32 terabits per second of data bandwidth across the whole chip. You end up with a product - Tile64 - that can tear through software threads.
Tilera looks to go after the FPGAs and DSPs used today in embedded devices, claiming performance, performance per watt and programming edges over rivals. For example, Tilera expects its chip to appear in security appliances that need to perform more detailed packet analysis, as well as in routers, surveillance DVRs (digital video recorders), video conferencing systems and boxes for encoding high definition video.
Unlike most start-ups that start telling the world how great they are before having product or customers, Tilera already has a solid story. Using a 90nm manufacturing process, TSMC is pumping out product for the company, which is then going into the hands of customers such as 3Com, GoBackTV and Codian. Tilera has ten customers in all, and products based on its chip should appear next year.
The mesh concept serves as a replacement for some of today's processors that require a central bus to manage data traffic. Some companies have moved past the bus concept, in AMD's case by creating its own high-speed interconnect called Hypertransport. Tilera extends that work by giving each core - or tile - five, independent networking lanes.
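To make the mesh idea concrete, here is a minimal sketch of how a message might hop between tiles on an 8x8 grid. The article does not say how Tile64 actually routes traffic, so the dimension-ordered (X first, then Y) routing used here is an assumption for illustration only, and the function names are hypothetical.

```python
GRID = 8  # 8x8 grid of tiles, as described in the article


def tile_coords(tile_id, grid=GRID):
    """Map a tile number (0..63) to its (row, col) position on the grid."""
    return divmod(tile_id, grid)


def xy_route(src, dst, grid=GRID):
    """List the tiles visited going from src to dst.

    Assumed routing policy (not confirmed by the article):
    move along the row to the target column first, then along the column.
    """
    row, col = tile_coords(src, grid)
    dst_row, dst_col = tile_coords(dst, grid)
    path = [src]
    while col != dst_col:
        col += 1 if dst_col > col else -1
        path.append(row * grid + col)
    while row != dst_row:
        row += 1 if dst_row > row else -1
        path.append(row * grid + col)
    return path


def hops(src, dst, grid=GRID):
    """Number of tile-to-tile links crossed between src and dst."""
    return len(xy_route(src, dst, grid)) - 1


if __name__ == "__main__":
    print(xy_route(0, 9))  # one step across, one step down
    print(hops(0, 63))     # corner to opposite corner
```

Under this assumed policy no message needs a shared central bus: each hop only uses the link between two neighbouring tiles, and the worst case on an 8x8 grid is the corner-to-corner trip.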
Intel has talked up a similar product when showing an 80-core demo unit over the past year. The chipmaker, however, does not expect to get out a trial product with x86 cores until next year, and who knows when it will actually ship commercial product.
In the meantime, Tilera claims that the Tile64 chip shows a 30X performance per watt and a 10X performance per sq. inch edge over dual-core Xeons. More importantly for Tilera's target markets, Tile64 has a 10X performance per sq. inch edge over TI's DM648 DSP, according to the start-up.
Tilera founder and CTO Anant Agarwal has assured us that programming for the Tile64 unit is within the reach of the average customer and notes that existing code written in C will run on the 32-bit Tile64 chip.
"If you have an application written for any multi-core or single processor architecture that's written to work with Linux, you can take it, compile it and have it running on our chip in minutes," he said. "Now, if you want to ratchet up the performance, we provide libraries and interface mechanisms that customers can use to tune code."
This process should prove easier than throwing out existing single-threaded code in favor of a parallelized rewrite or writing software for complex FPGAs, according to Agarwal.
Members of Tilera's team have roots that stretch to the first MIPS chip, MIT and Sun Microsystems' Sparcle chip and, of course, DEC's Alpha chip.
The 64-person company sells its chip to customers along with a TILExpress appliance card that plugs into PCI Express slots. You can see customers using this card as an accelerator in existing systems.
In the coming years, Tilera expects to ship products with hundreds and even thousands of cores if the market shows demand for such kit.
You can have a peek at the processor design here. ®
People ignorant of history are bound to repeat it
"Some companies have moved past the bus concept, ..."
As others already pointed out, INMOS did this nearly 30 years ago. It is possible that the author was not interested in IT in that era, but they surely should have noticed the internal architecture of IBM Power4/5/6 Multi-Chip Modules (MCMs)! All right, I do agree that a 64-core p590/p595 is a lot bigger than a single chip, but that is evolution, not revolution. OTOH there is no info on what one should do if H.M. The Customer wants a (65+)-core box, while with transputers much bigger scalability was achieved (I repeat, 30 years ago).
What's with this trans-puter obsession? Y'all are serious perverts!
Cap, Trenchcoat, Door...
you can calculate from the data they provide:
-5 MB of on-chip cache means 80 KB of cache per CPU (might be 8K/8K/64K L1 code/L1 data/L2)
-900 MHz, 3 pipelines and 64 CPUs mean 172800 MIPS peak performance (they write 192000 on the site, but that would need a 1 GHz core clock frequency)
This is not a transputer design, they just used a routed cellular mesh for memory I/O requests instead of a full crossbar one. (It also allows message-based nonblocking memory I/O, very similar to what AMD uses for its CPU interconnect.) This is a very nice multicore design, but nothing new. It's just a bit more integrated than Intel's multicore chips, but this was possible due to the simplicity of the CPU cores. As a comparison, a gf8800 running at 500 MHz with 128 cores is just around 128000 MIPS, but uses a full crossbar memory controller (with 6 channels), 16 channels of PCIe and 3 network interconnect ports.
If we can get this 64-core chip for a reasonable price as a standalone system with memory, PCIe slots and a SATA controller added, then this would make a pretty nice 64-core Linux box.
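The back-of-envelope figures in the comment above can be checked in a few lines. This is only a sketch of the commenter's arithmetic: the three instructions per clock per core is the commenter's reading of "3 pipelines", not a figure confirmed by the article, and the function names are made up for illustration.

```python
CORES = 64                   # cores on the chip
ISSUE_WIDTH = 3              # instructions per clock per core (commenter's assumption)
CACHE_TOTAL_KB = 5 * 1024    # 5 MB of on-chip cache, expressed in KB


def peak_mips(clock_mhz, cores=CORES, issue_width=ISSUE_WIDTH):
    """Theoretical peak: every pipeline on every core retires one instruction per clock."""
    return clock_mhz * issue_width * cores


def cache_per_core_kb(total_kb=CACHE_TOTAL_KB, cores=CORES):
    """Even split of the on-chip cache across all cores."""
    return total_kb / cores


if __name__ == "__main__":
    print(peak_mips(900))       # 172800, the commenter's figure at 900 MHz
    print(peak_mips(1000))      # 192000, the figure quoted from the site at 1 GHz
    print(cache_per_core_kb())  # 80.0, which matches the guessed 8K + 8K + 64K split
```

These are peak numbers only; sustained throughput depends on how well the workload keeps all three pipelines on all 64 cores busy.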
No Architecture Type
I notice that their website doesn't specify the instruction set. Probably an unlicensed MIPS derivative, given the number of cores they were able to squeeze on one die. In particular, they neglect to mention if it has a floating-point processor (I'm assuming it does not). Otherwise, it's very interesting since it already runs Linux and has on board PCI-e as well as GbE controllers.
Didn't they write a C compiler for it in Occam, I seem to recall?