Feeds

Nvidia details GF100 graphics beastie

Minus the price - and the speed

Choosing a cloud hosting partner with confidence

Nvidia has released additional details on its upcoming GF100 graphics processor, and if the GPU performs as well in reality as it does on paper, AMD/ATI's Radeon HD 5000 series may have a worthy competitor.

The GF100 will be Nvidia's first to be based on the company's muscular Fermi architecture, which features such niceties as scores of CUDA (compute unified device architecture) cores and ECC (error-correcting code) support. Fermi will find its way into a variety of products destined for both desktops and HPC rigs. The GF100 will be the first game-centric part.

According to Nvidia, the GF100 is "designed for gaming performance leadership." To help accomplish this goal, the GF100 implements all of Windows 7's DirectX 11 hardware APIs. Nvidia is especially proud of the GF100's support for DirectX 11's tessellation capabilities, which it asserts will allow for more-complex geometry and animation, including enhanced fluid effects and more-realistic hair effects.

In contrast to Nvidia's earlier GT200 architecture, the GF100 takes a more-distributed approach to tessellation. This improved distribution and parallelization results in a 8X improvement in tessellation performance than the GT200, according to the company's internal benchmarks.

Also supported will be DirectX 11's DirectCompute APIs, which developers can use to offload such highly parallelized tasks such as media processing from a system's CPU to the GF100.

Although GF100 technology will find eventually find its way into less-ambitious parts, the full-bore spec released this Sunday includes 512 CUDA cores arrayed in four graphics processing clusters (GPCs), each of which contain four streaming multiprocessors (SMs).

Nvidia GF100 - full die

Each of those wee green squares is a processing core - there are 512 of them

Each SM contains 32 CUDA processors, four times more than the company's previous SM designs. Each CUDA processor has both an arithmetic logic unit (ALU) and a floating point unit (FPU). The FPUs are based on the IEEE 754-2008 floating-point standard using the fused multiply-add (FMA) instruction, which Nvidia claims provides improved precision over the older multiply-add (MAD) instruction, minimizing rendering errors in closely overlapping triangles.

Nvidia GF100 - graphics processing cluster

Four GPCs each have four SMs communicating with a with a unified raster engine

Each SM also includes four special function units (SFUs), which Nvidia says are used for such functions as sine, cosine, reciprocal, square root, and graphics interpolation. All the SFUs' math mojo, according to Nvidia, is especially helpful for complex procedural shaders.

Nvidia GF100 - streaming multiprocessor

Each SM has 32 CUDA cores - that's 4X the cores of its previous generation

Also inside those 16 SMs is what Nvidia call its PolyMorph Engine, which includes, among other items, the GF100's tesselators. Placing a tesselator in each SM allows the bandwidth of the tessellation to be greatly increased - which accounts for much of that aforementioned 8X bump over the tesselation performance of the GT200.

Each SM also has its own 64KB of L1 cache, plus the GF100 as a whole has 768KB of fully coherent, read/write L2 cache - a step up from the GT200, where the 256KB L2 was read-only for the texture engine. According to Nvidia, this improved cache architecture will not only help texture coverage, but will also boost the GF100's compute performance.

Word on the street is that the GF100 will be available in late March. Unfortunately, Nvidia has remained silent about how much the part will cost and how much power it will consume - meaning how much of a power-supply and cooling-system upgrade you may be facing. Even the part's clock rate remains under wraps.

For more detail on the GF100, check out HardOCP's excellent "Deep Dive," or download Nvidia's own white papers detailing the GF100 and the Fermi compute architecture. ®

Beginner's guide to SSL certificates

More from The Register

next story
Ex-US Navy fighter pilot MIT prof: Drones beat humans - I should know
'Missy' Cummings on UAVs, smartcars and dying from boredom
Don't wait for that big iPad, order a NEXUS 9 instead, industry little bird says
Google said to debut next big slab, Android L ahead of Apple event
Xperia Z3: Crikey, Sony – ANOTHER flagship phondleslab?
The Fourth Amendment... and it IS better
Netscape Navigator - the browser that started it all - turns 20
It was 20 years ago today, Marc Andreeesen taught the band to play
A drone of one's own: Reg buyers' guide for UAV fanciers
Hardware: Check. Software: Huh? Licence: Licence...?
The Apple launch AS IT HAPPENED: Totally SERIOUS coverage, not for haters
Fandroids, Windows Phone fringe-oids – you wouldn't understand
Apple SILENCES Bose, YANKS headphones from stores
The, er, Beats go on after noise-cancelling spat
Here's your chance to buy an ancient, working APPLE ONE
Warning: Likely to cost a lot even for a Mac
prev story

Whitepapers

Forging a new future with identity relationship management
Learn about ForgeRock's next generation IRM platform and how it is designed to empower CEOS's and enterprises to engage with consumers.
Cloud and hybrid-cloud data protection for VMware
Learn how quick and easy it is to configure backups and perform restores for VMware environments.
Three 1TB solid state scorchers up for grabs
Big SSDs can be expensive but think big and think free because you could be the lucky winner of one of three 1TB Samsung SSD 840 EVO drives that we’re giving away worth over £300 apiece.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.
Security for virtualized datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.