Adapteva ships Kickstarted baby supercomputer boards

Forget GPUs – use ARMs plus FPGAs plus Epiphany RISCies

Upstart RISC processor and coprocessor designer Adapteva is shipping the first of its Parallella system boards, which pair its Epiphany multicore processors with ARM processors to create a spunky and reasonably peppy hybrid compute engine that doesn't cost much and is very energy efficient for certain kinds of processing.

It is not cheap to design and fab coprocessors or to make system boards that make use of them, so Adapteva's cofounder and CEO Andreas Olofsson fired up a project on fund-raising site Kickstarter last fall to raise the money to fab the chips, instead of going the traditional route of raising venture funding and trying to get design wins.

While Adapteva did not meet its pie-in-the-sky dream of raising $3m to fully fund a set of multi-core Epiphany RISC coprocessors and Parallella system boards that make use of them, the company does have 4,965 backers who pledged $898,921 and have ordered over 6,300 boards using various Epiphany processors matched up with Zynq dual-core ARM Cortex A9 processors from Xilinx, which peddles those ARM chips mashed up with its field programmable gate arrays (FPGAs).

The Epiphany core embodies the essence of Reduced Instruction Set Computing, with a mere 35 instructions, and has a dual-issue core with 64 registers. It has one arithmetic-logic unit (ALU) and one floating-point unit, and a 32KB static RAM on the other side of those registers. Each core also has a router that has four ports that can be extended out to a 64x64 array of cores for a total of 4,096 cores.
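To make the grid arithmetic concrete, here is a minimal Python sketch of a 64x64 array of cores. The article only says each router has four ports; wiring those ports to the north, south, east, and west neighbors is an assumption made for illustration, and the function names are hypothetical.

```python
# Illustrative model of the core array described above.
# Assumption: the four router ports link to the N, S, E and W neighbors.
GRID_ROWS = 64
GRID_COLS = 64


def total_cores(rows=GRID_ROWS, cols=GRID_COLS):
    """Number of cores in a rows x cols array."""
    return rows * cols


def mesh_neighbors(row, col, rows=GRID_ROWS, cols=GRID_COLS):
    """Return the (row, col) of every core directly linked to this one."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def hop_count(src, dst):
    """Minimum router hops between two cores in a mesh (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


print(total_cores())                   # 4096
print(len(mesh_neighbors(0, 0)))       # 2 (a corner core has two on-array neighbors)
print(hop_count((0, 0), (63, 63)))     # 126
```

Under this assumed wiring, the worst-case trip across a full 4,096-core array is 126 hops, corner to corner.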

Block diagram of the Epiphany RISC chip

The Epiphany-III chip is implemented in a 65 nanometer process and sports 16 cores, and the Epiphany-IV is implemented in a 28nm process and offers 64 cores. This latter chip delivers about 102 gigaflops of performance at 2 watts, or 51 gigaflops per watt. (Adapteva has chosen GlobalFoundries as its wafer baker, by the way.)
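The efficiency figure is simple division, and a short Python sketch shows where it comes from. The per-core peak check assumes two floating-point operations per core per clock cycle, which is an assumption for illustration and not a figure from Adapteva; the 800MHz clock is the speed the article gives for the current chip.

```python
def gflops_per_watt(gflops, watts):
    """Energy efficiency as gigaflops delivered per watt burned."""
    return gflops / watts


def peak_gflops(cores, clock_ghz, flops_per_cycle=2):
    """Peak throughput; flops_per_cycle=2 is an assumed value."""
    return cores * clock_ghz * flops_per_cycle


print(gflops_per_watt(102, 2))   # 51.0
print(peak_gflops(64, 0.8))      # roughly 102.4 under the assumption above
```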

The Epiphany memory architecture allows any core to access the SRAM of any other core on the die because the SRAM is mapped as a single address space across the cores. This greatly simplifies memory management. The chip also has a direct memory access (DMA) unit that can prefetch data from external flash memory.
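One way to picture a single address space spread across cores is to pack a core's grid position and a local offset into one address. The Python sketch below does that. The field widths (6 bits of row, 6 bits of column, 20 bits of offset in a 32-bit address) are assumptions chosen so that a 64x64 array fits; the article does not give the actual layout, and the function names are hypothetical.

```python
# Assumed field widths, for illustration only.
ROW_BITS = 6
COL_BITS = 6
OFFSET_BITS = 20
LOCAL_SRAM_BYTES = 32 * 1024  # 32KB of SRAM per core, per the article


def global_address(row, col, offset):
    """Pack (row, col, offset) into one flat address."""
    if not (0 <= row < (1 << ROW_BITS) and 0 <= col < (1 << COL_BITS)):
        raise ValueError("core coordinates outside the 64x64 array")
    if not 0 <= offset < LOCAL_SRAM_BYTES:
        raise ValueError("offset outside the core's local SRAM")
    core_id = (row << COL_BITS) | col
    return (core_id << OFFSET_BITS) | offset


def decode_address(addr):
    """Split a flat address back into (row, col, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    core_id = addr >> OFFSET_BITS
    return core_id >> COL_BITS, core_id & ((1 << COL_BITS) - 1), offset


addr = global_address(32, 8, 0x100)
print(hex(addr))              # 0x80800100
print(decode_address(addr))   # (32, 8, 256)
```

With a scheme like this, a core reads another core's memory by issuing an ordinary load to an address whose upper bits name the target core, so no message-passing layer is needed.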

How the computing elements of the Parallella board come together

At the moment, this DMA support is not extended to InfiniBand or Ethernet network adapters with Remote Direct Memory Access (RDMA) on top of those two network protocols, but Olofsson concedes to El Reg that this presents an interesting set of possibilities to link multiple coprocessors in a Parallella cluster together and have the Epiphany coprocessors share data directly over the network as they chew on data. (You would use RDMA over Converged Ethernet, or RoCE, on the Ethernet links.)

The board does not have a SATA port or a fast InfiniBand or Ethernet link, but three of the four 10Gb/sec expansion ports can be ganged up together for a maximum of 30Gb/sec of bandwidth for attaching other kinds of ports to the Parallella board. You would have to create the daughter card to do this and write its drivers.

The Parallella-16 ARM-FPGA-Epiphany triple hybrid board

The Epiphany-IV design is meant to scale to 64 cores at 1GHz and burn about 25 milliwatts per core. The current chip runs at 800MHz and delivers that 51 gigaflops per watt performance on the number-crunching work mentioned above. At 1GHz, the Epiphany-IV can do an estimated 70 gigaflops per watt.

If you participated in the Kickstarter program, you will get a Parallella-16 board with a Zynq-7020 processor from Xilinx, which has two Cortex-A9 cores that run at 800MHz and an FPGA on the same package with 85,000 logic cells and 220 programmable digital signal processing slices. This board has one of the 16-core Epiphany-III processors on it as well, and sports 1GB of SDRAM main memory, a MicroSD card slot, four expansion connectors, a Gigabit Ethernet network interface card, and an HDMI connector.

If you want to buy a Parallella-16 board and you did not participate in the Kickstarter program, you can get one from the online store that Adapteva has set up, but you will get a Zynq-7010 processor instead, which has only 29,000 logic cells and 80 DSP slices on the FPGA side of the Xilinx chip.

It will take about twelve weeks to fulfill those orders because Adapteva is not pre-manufacturing boards. The board will cost you $99, just like the base level of the Kickstarter support did. You will eventually be able to order the Zynq chip with the fatter FPGA, but pricing is not yet set for this upgrade.

A 42-node cluster of Parallella-16 boards from Adapteva

If you don't want to do much work at all and want to start playing with a baby cluster of these Parallella-16 system boards, Adapteva is selling those as well for $575. That includes four of the Parallella-16 cards with connectors, four 16GB SD cards loaded up with Canonical's Ubuntu Server 12.04, a power supply, and 20 metal standoff legs to screw the boards into a tower of computing power. The Parallella-16 card is a mere 3.4 inches by 2.1 inches.

The Parallella design required the Epiphany chip packaging to be redesigned, Olofsson tells El Reg, and the software drivers and SDK were also improved and made to work better with the FPGAs on the Xilinx chips. That stack includes a C compiler, a multicore debugger, the Eclipse IDE, an OpenCL SDK and compiler set, and runtime libraries.

Just for fun, Olofsson grabbed two 24-port Gigabit Ethernet switches and 42 of the Parallella boards to create a 42-node cluster that is about the size of a tower PC. It will cost around $5,000 and burn less than 500 watts (all in, including the three kinds of processing, memory, flash storage, and Ethernet ports).

Such a machine delivers around 1.1 teraflops of oomph, and shifting to the 64-core Epiphany-IV would push that up to 4.3 teraflops. That's not a lot of teraflops, and a bunch of GPU coprocessors can match that in a much smaller form factor to be sure. But the Epiphany RISC coprocessor is more than twice as energy efficient, according to Adapteva.
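The cluster numbers can be checked with back-of-the-envelope arithmetic, sketched here in Python. The sketch counts only the Epiphany coprocessors and assumes every board hits the quoted rate; the per-board figure for the Epiphany-III is not stated in the article, so it is backed out from the 1.1 teraflops total.

```python
NODES = 42  # boards in the demonstration cluster


def cluster_teraflops(nodes, gflops_per_board):
    """Aggregate throughput, assuming perfect scaling across boards."""
    return nodes * gflops_per_board / 1000.0


def implied_gflops_per_board(cluster_tflops, nodes):
    """Per-board rate implied by a quoted cluster total."""
    return cluster_tflops * 1000.0 / nodes


# Epiphany-IV at the article's 102 gigaflops per chip:
print(round(cluster_teraflops(NODES, 102), 1))          # 4.3

# Per-board rate implied by the 1.1 teraflops figure for the 16-core boards:
print(round(implied_gflops_per_board(1.1, NODES), 1))   # 26.2
```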

Adapteva still wants to be an exascale player in the high performance computing arena, and as El Reg has previously reported, has set its sights on creating two chips by 2018 to reach its exascale aspirations. One future Epiphany chip is an entry coprocessor with 1,000 cores on a die that delivers 2 teraflops of performance in a 2 watt thermal envelope. The second is a massive chip with 64,000 cores with 1MB of SRAM per core that can deliver 100 teraflops of floating point coprocessing at 100 watts. The plan is to have both chips deliver 1 teraflops per watt using the 7 nanometer wafer baking processes that are expected to be generally available by 2018.
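To see what those targets would mean at exascale, the following Python sketch counts how many of each future chip it would take to reach one exaflops and what the coprocessors alone would draw. It uses only the article's figures and ignores host processors, memory, networking, and cooling, so treat the result as a floor rather than a system estimate.

```python
import math

EXAFLOPS_IN_TERAFLOPS = 1_000_000  # 1 exaflops = 10^6 teraflops


def chips_for_exaflops(chip_teraflops):
    """Chips needed to reach one exaflops, assuming perfect scaling."""
    return math.ceil(EXAFLOPS_IN_TERAFLOPS / chip_teraflops)


def coprocessor_megawatts(chip_teraflops, chip_watts):
    """Power drawn by the coprocessors alone, in megawatts."""
    return chips_for_exaflops(chip_teraflops) * chip_watts / 1_000_000


print(chips_for_exaflops(100))          # 10000 of the 64,000-core chips
print(coprocessor_megawatts(100, 100))  # 1.0
print(chips_for_exaflops(2))            # 500000 of the 1,000-core chips
print(coprocessor_megawatts(2, 2))      # 1.0
```

Both chips land on the same 1 megawatt because both are pegged at 1 teraflops per watt.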

The Kickstarter program for these future Epiphany chips will probably require some support from big government agencies. And with those kinds of performance and thermal numbers, the US Defense Advanced Research Projects Agency is probably sniffing around, and maybe the Department of Energy, too. ®
