Core Wars: Inside Intel's power struggle with NVIDIA

Kepler takes Knights Corner?

GPU Technology Conference: Intel and NVIDIA are battling for the hearts and minds of developers in massively parallel computing.

Intel has been saying for years that concurrency rather than clock speed is the future of high performance computing, yet it has been slow to provide the mass of low-power, high-efficiency CPU cores needed to take full advantage of that insight.

Another angle is that GPUs are already designed for power-efficient, massively parallel computing. Back in 2006, NVIDIA exploited that potential for general-purpose computing with its CUDA architecture, adding shared memory and other features to the GPU and providing supporting libraries and the CUDA SDK. CUDA is primarily a set of extensions to C, though there are wrappers for other languages.

Huang's Tesla K20 will serve intense computing

At NVIDIA’s GPU Technology Conference in San Jose, California, last week, the company announced new editions of its Tesla GPU accelerator boards based on its “Kepler” architecture. These boards are designed for accelerating general-purpose computing rather than for driving displays. The Tesla K10, available now, has two Kepler GK104 GPUs, 3,072 cores in total, and performs at up to 4,577 gigaflops (2,288 gigaflops per GPU).

The Tesla K20, expected in the fourth quarter of 2012, uses two of the forthcoming Kepler GK110 GPUs, which promise over 1,000 gigaflops of double-precision performance. “It’s intended for applications like computational fluid dynamics, finite element analysis, computational finance, physics, quantum chemistry, and so on,” explained chief executive Jen-Hsun Huang in his keynote speech.

Power efficiency, which is the true limitation on supercomputer performance, has also been a focus, and NVIDIA claims a threefold improvement in performance per watt compared with the previous “Fermi” generation.

The not-yet-available K20 is really the one you want, and not only because of its better performance. Although both the GK104 and the GK110 are called Kepler, there are several key advances that only appear in the GK110. A Grid Management Unit in the GK110 enables a feature called Dynamic Parallelism, which means that the GPU can schedule its own work. Previously only the CPU could schedule work on the GPU. Dynamic Parallelism means that more code can run entirely on the GPU, for greater efficiency and simplified code.

Another GK110 advance is Hyper-Q, which provides 32 simultaneous connections between CPU and GPU, compared to just one in Fermi. The result is that multiple CPU cores can launch work on the GPU simultaneously, greatly improving utilisation.

NVIDIA now projects that by 2014, 75 per cent of HPC customers will use GPUs for general purpose computing.

The rise of GPU computing must be troubling to Intel, especially as the focus on power efficiency raises interest in combining ARM CPUs with GPUs, though implementation is unlikely until we have 64-bit ARM on the market. Intel’s response is an initiative called Many Integrated Core (MIC, pronounced Mike). It has similarities with GPU computing, in that MIC boards are accelerator boards with their own memory, and developers need to understand that parts of an application will execute on the CPU, parts on MIC, and that data has to be copied between them.
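The split described above, where part of an application runs on the CPU, part on the accelerator, and data is copied between them, can be sketched in plain C++. This is only a host-side simulation under stated assumptions: the `Accelerator` struct and the `scale_on_accelerator` function are hypothetical names invented for illustration, and they are not part of any Intel or NVIDIA API. The point is the three-step shape (copy in, compute, copy out) that both MIC and GPU code share.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an accelerator board with its own memory.
// Nothing here talks to real hardware; it only models the data flow.
struct Accelerator {
    std::vector<float> memory;  // models on-board memory, separate from host RAM

    // Step 1: copy input data from host memory to the board.
    void copy_in(const std::vector<float>& host) { memory = host; }

    // Step 2: run the compute step on data that lives on the board.
    void scale(float factor) {
        for (float& x : memory) {
            x *= factor;
        }
    }

    // Step 3: copy the results back to host memory.
    void copy_out(std::vector<float>& host) const { host = memory; }
};

// Host-side driver: everything outside the three calls runs "on the CPU".
std::vector<float> scale_on_accelerator(const std::vector<float>& input,
                                        float factor) {
    Accelerator board;
    board.copy_in(input);
    board.scale(factor);

    std::vector<float> result;
    board.copy_out(result);
    assert(result.size() == input.size());
    return result;
}
```

On real hardware the two copies cross the bus between host and board, which is why developers try to keep data on the accelerator for as long as possible between transfers.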

Prototype Knights

Knights Ferry is the MIC prototype, available now to some Intel partners, and has 32 cores and up to 128 threads (four Hyper Threads per core). Knights Corner will be the production MIC and has more than 50 cores and over 200 threads. The processor in Knights Ferry, codenamed Aubrey Isle, is based on an older Pentium design for power efficiency, but adds a Vector Processing Unit, important for many HPC applications, along with over 100 additional x86 instructions. Knights Corner is expected in late 2012 or early 2013.

Intel is supporting MIC with its existing suite of tools for concurrent programming: Parallel Studio XE and Cluster Studio XE. Key components are Threading Building Blocks (TBB), a C++ template library, and Cilk Plus, which extends C/C++ with keywords for task parallelism. Intel is also supporting OpenMP, a standardised set of directives for parallel programming, on MIC, though in doing so it is getting ahead of the standard since OpenMP does not yet support accelerators. Intel’s Math Kernel Library (MKL) will also be available for C and Fortran. OpenCL, a standard language for programming accelerators, will also be supported on MIC.
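To give a flavour of what a "directive" means in the OpenMP sense, here is a minimal sketch of a parallel loop in C++. It uses only the standard `parallel for` directive and `reduction` clause, running on ordinary host threads; it is not MIC-specific and says nothing about how Intel's tools offload work to the board. The function name `dot_product` is chosen for illustration.

```cpp
#include <cassert>
#include <vector>

// Dot product of two equally sized vectors.
// The pragma asks an OpenMP-enabled compiler to split the loop iterations
// across threads and combine the per-thread partial sums at the end.
// A compiler built without OpenMP support ignores the pragma and runs
// the same loop serially, producing the same result.
double dot_product(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With GCC or Clang, the directive takes effect when the file is compiled with `-fopenmp`. The appeal of this style is that the serial code stays intact, and the parallelism is layered on as an annotation rather than a rewrite.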
