Tilera preps many-cored Gx chips for March launch
Taking on all x86 comers and ARM challengers
Updated Upstart multicore RISC chip maker Tilera is timing the launch of its third generation of Tile processors to rain a little on Intel's forthcoming parade, and to try to blunt all of the excitement that is building for ARM-based alternatives for servers.
Tilera will today begin sampling its Tile-Gx series of processors. As El Reg suspected back in June 2011 - when Tilera announced it was actually launching three different lines of Tile-Gx chips: Gx3000s for servers, Gx5000s for heavy media processing, and Gx8000s for network equipment makers - all three lines are based on Gx8000 chips with certain features deactivated and different pricing.
That means Tilera can offer variants of the chips with 16, 36, 64, and 100 cores and only have to do four chip layouts instead of as many as a dozen. It is the full-on Tile-Gx 8000 chips with 16 and 36 cores that are in fact sampling now at 1.2GHz, Bob Doud, director of marketing at the upstart chippery, tells El Reg.
A Tilera Gx chip wafer, etched by TSMC in 40 nanometers
All three generations of Tilera processors have the same idea behind them: use simple RISC cores tuned for Linux infrastructure workloads, put lots of them on a chip, and link them together using an on-chip mesh network that makes all of those cores look like a single, monster, multithreaded processor to the Linux kernel.
The big change with the Tile-Gx family is that Tilera is moving from 90 nanometer wafer baking processes from foundry partner Taiwan Semiconductor Manufacturing Corp (used with the two prior Tile generations) to the same 40 nanometer processes used by AMD and Nvidia for their GPUs and by Oracle for its Sparc T4 RISC processors.
As part of the process shrink, Tilera is stepping up to 64-bit processing and 40-bit physical memory addressing, which means it can put 1TB of memory on the single-socket processor. A few of the models have 39-bit addressing, which means they can only address 512GB. The Tile-Gx chips include on-chip DDR3 main memory controllers as well as virtual networking and encryption acceleration units (the latter only in the Gx8000 series).
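The 1TB and 512GB ceilings fall straight out of the address widths. A minimal Python sketch of the arithmetic, assuming byte-addressable memory and binary units (1GB counted as 2^30 bytes); the function name is made up for illustration and is not anything from Tilera:

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with the given number of physical address bits,
    assuming one byte per address (an assumption, not a Tilera spec)."""
    return 2 ** address_bits


GIB = 2 ** 30  # bytes in 1GB, binary convention
TIB = 2 ** 40  # bytes in 1TB, binary convention

for bits in (39, 40):
    size = addressable_bytes(bits)
    print(f"{bits}-bit addressing: {size // GIB}GB ({size / TIB:g}TB)")
```

Each extra address bit doubles the reach, which is why dropping from 40 bits to 39 halves the ceiling from 1TB to 512GB.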
Here's how the Tile-Gx 3000 server processors stack up against the Tile-Gx 8000 network processors:
Tilera's Gx3000 and Gx8000 processors
There have been some changes to the expected chip lineup since last June. The Gx3000s will now be available in 1GHz and 1.2GHz clock speeds, not the faster 1.5GHz speeds, to keep the thermal envelope down. The two top bin Gx3000s will have three PCI-Express 2.0 slots running at x8 lanes, not a mix of x8 and x4 lanes. Tilera is also adding 1.2GHz options for the top-bin Gx8000 chips.
Tilera does not do SMP to increase the performance of a server node, but rather uses the on-chip mesh to build a bigger socket image with more physical threads.
Each core on the new Tile-Gx chip has three instruction threads and has 32KB of L1 data cache and 32KB of L1 instruction cache, and also has a 256KB L2 cache; the mesh network is used to link those L1 and L2 caches into a single, coherent L3 cache shared by all the cores on the chip - so the top-end, 100-core variant of the Tile-Gx chip has 32MB of total cache.
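The 32MB figure can be checked from the per-core numbers. A rough Python sketch, assuming the total is simply the sum of every core's L1 data, L1 instruction, and L2 caches (the constant and function names are illustrative, not Tilera's):

```python
# Per-core cache sizes quoted in the article, in KB.
L1_DATA_KB = 32
L1_INSTR_KB = 32
L2_KB = 256


def total_cache_kb(cores: int) -> int:
    """Aggregate on-chip cache in KB, assuming every core contributes
    its full L1 data, L1 instruction, and L2 capacity to the total."""
    per_core_kb = L1_DATA_KB + L1_INSTR_KB + L2_KB
    return cores * per_core_kb


for cores in (16, 36, 64, 100):
    kb = total_cache_kb(cores)
    print(f"{cores} cores: {kb}KB total, {kb / 1024:.2f}MB")
```

At 320KB per core, 100 cores come to 32,000KB, or 31.25MB at 1,024KB per megabyte, which rounds to the 32MB quoted.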
Tilera Tile-Gx block diagram
The Tile-Gx also has math instructions that allow a floating point operation to be done in five cycles instead of the hundreds of cycles it takes in software, and believe it or not, this is important for some hyperscale Web applications built using PHP.
Doud says that the ramp for the Tilera chips has been pretty steep, with over 80 engagements with system and network equipment vendors of all colors and stripes, and 20 design wins where the customer has committed to using a Tile processor. Embedded system maker Mercury Computer and video streaming equipment maker Harmonic have gone public admitting that they are using Tile chips in their gear.
Going into hyper drive
Ihab Bishara, director of cloud computing applications at Tilera, says that three of the largest hyperscale data centers in the world have deployed Tile-based servers. With the Tile-Gx line, the 64-bitness and floating point instructions are attracting more interest, with a number of OEMs and ODMs placing orders for the chips even before they were sampling - even though the Tile chips have their own proprietary instruction set.
"Our view is, it is our ISA, get over it," says Doud, and for the Linux crowd that compiles its own applications anyway, he makes a good point. (Jumping to ARM chips will require a recompile, too, after all.) "Once you have a chip that is supporting C, C++, Java, and PHP and you're running Linux, it doesn't matter. People are not writing assembler programs."
Well, there are probably a few card-wallopers out there who are in mainframeland.
Tilera is putting the finishing touches on a Java JIT compiler, which should be done by the end of the first quarter, according to Bishara – and just in time to take on big Java workloads like Hadoop. The Tilera Linux stack is based on a derivative of CentOS that has around 2,000 packages ported over to run natively on the chips.
Tilera doesn't just expect to sell Tile-Gx processors as the main engines inside of systems. In some cases, customers will want to use them as offload engines. To that end, the company has cooked up an evaluation adapter card that slips into a PCI-Express 1.0 or 2.0 slot and runs the Tilera Linux environment.
The Tilencore-Gx36 development card comes with a 36-core Gx8036 processor with two DDR3 SODIMM memory slots that can support 1GB to 4GB of memory each. The card has four external SFP/SFP+ ports that can run at either Gigabit or 10 Gigabit Ethernet speeds. There's a big fat PHY that converts the virtual I/O on the Tile-Gx chip so it can talk to the physical Ethernet ports on the card, and it has a USB 2.0 port and a SATA interface. The Gx8036 chip runs at 1.2GHz on this adapter.
If you want to go all out and play around with a full development system instead of a coprocessor for an x86 or RISC server of your choosing, then Tilera is sampling the chips inside of the Tilempower-Gx development platform, seen here:
This development server puts a single Tile-Gx 8036 running at either 1.2GHz or 1.5GHz into a system board with four memory slots - this fits Intel's definition of a microserver, by the way. The server is equipped with low-profile DDR3 main memory and has the same four Ethernet ports, plus a dual-port SATA disk controller and a single PCI-Express 2.0 x8 slot coming off the virtual I/O on the Tile-Gx chip.
If you are really serious about putting the Tile-Gx chips through the server paces, Tilera will get you what it calls its Liberty-Gx platform, which crams four of these microserver boards into a single 1U rack machine.
The Tile-Gx processors sampled last July in limited quantities to selected partners, and alpha evaluation boards shipped in September. The company racked up ten design wins for the chip by November and has decided to "open up the flood gates" and do much more sampling in February with volume shipments to begin in March. The full-on Gx8016 is expected to cost around $450, with the Gx8036 at around $650. Presumably the parts aimed at servers will cost less, since some features are deactivated.
The 64-core and 100-core variants of the Tile-Gx chips will sample in late 2012 and ship sometime in the first half of 2013, according to Doud, and the company is on track with its 200-core "Stratton" chips with a shrink to 28 nanometers.
Bishara says that Tilera is not threatened by ARM contenders in the server racket, such as Calxeda with its 32-bit ARMv7 variant, called EnergyCore, or Applied Micro Circuits with its 64-bit ARMv8 variant, called X-Gene.
"We're here today shipping a 64-bit processor core and we are what looks like two years ahead of ARM," says Bishara. "The architecture of the Tile-Gx is aligned to the workload and gives one server node per chip rather than a sea of wimpy nodes not acting in a cache coherent manner. We have been in this market for two years now and we know what hurts in data centers and what works. And 32-bit ARM just is not going to cut it. Applied Micro is doing their own core, and that adds a lot of risks."
Tilera should know a thing or two about that. It didn't just do its own cores, but its own instruction set and what really is a system on a chip.
No one knows how this will turn out, with server makers just trying to make a buck and take as few risks as possible. But one thing is for sure. Intel and AMD have a lot more problems than just each other from here forward. ®