Nvidia details GF100 graphics beastie

Minus the price - and the speed

Nvidia has released additional details on its upcoming GF100 graphics processor, and if the GPU performs as well in reality as it does on paper, AMD/ATI's Radeon HD 5000 series may have a worthy competitor.

The GF100 will be Nvidia's first to be based on the company's muscular Fermi architecture, which features such niceties as scores of CUDA (compute unified device architecture) cores and ECC (error-correcting code) support. Fermi will find its way into a variety of products destined for both desktops and HPC rigs. The GF100 will be the first game-centric part.

According to Nvidia, the GF100 is "designed for gaming performance leadership." To help accomplish this goal, the GF100 implements all of Windows 7's DirectX 11 hardware APIs. Nvidia is especially proud of the GF100's support for DirectX 11's tessellation capabilities, which it asserts will allow for more-complex geometry and animation, including enhanced fluid effects and more-realistic hair effects.

In contrast to Nvidia's earlier GT200 architecture, the GF100 takes a more-distributed approach to tessellation. This improved distribution and parallelization results in an 8X improvement in tessellation performance over the GT200, according to the company's internal benchmarks.

Also supported will be DirectX 11's DirectCompute APIs, which developers can use to offload highly parallelized tasks such as media processing from a system's CPU to the GF100.

Although GF100 technology will eventually find its way into less-ambitious parts, the full-bore spec released this Sunday includes 512 CUDA cores arrayed in four graphics processing clusters (GPCs), each of which contains four streaming multiprocessors (SMs).
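For readers who like to check the sums, here is a minimal sketch of the arithmetic behind that layout. It uses only the figures quoted above; the variable names are our own and have nothing to do with any Nvidia API.

```python
# Figures quoted in the article for the full-spec GF100.
TOTAL_CUDA_CORES = 512
GPCS = 4          # graphics processing clusters
SMS_PER_GPC = 4   # streaming multiprocessors per cluster

# Derived values: how many SMs in total, and how many cores each must hold.
total_sms = GPCS * SMS_PER_GPC
cores_per_sm = TOTAL_CUDA_CORES // total_sms

print(f"{total_sms} SMs, {cores_per_sm} CUDA cores per SM")
```

Four clusters of four SMs gives 16 SMs, and 512 cores spread across them works out to 32 apiece.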

Nvidia GF100 - full die

Each of those wee green squares is a processing core - there are 512 of them

Each SM contains 32 CUDA processors, four times more than the company's previous SM designs. Each CUDA processor has both an arithmetic logic unit (ALU) and a floating point unit (FPU). The FPUs are based on the IEEE 754-2008 floating-point standard using the fused multiply-add (FMA) instruction, which Nvidia claims provides improved precision over the older multiply-add (MAD) instruction, minimizing rendering errors in closely overlapping triangles.
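To see why a fused multiply-add can be more precise, consider the rough sketch below. It is an illustration in ordinary double-precision Python, not a model of the GF100's hardware: the function names `mad` and `fma_exact` are our own, and the fused operation is emulated with exact rational arithmetic on the assumption that a true FMA computes a*b+c exactly and rounds only once.

```python
from fractions import Fraction

def mad(a, b, c):
    """Multiply-add with two roundings: the product is rounded, then the sum."""
    return a * b + c

def fma_exact(a, b, c):
    """Emulated fused multiply-add: exact a*b+c, rounded once at the end."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Inputs chosen so the exact product, 1 - 2**-60, is too close to 1.0
# for a double to tell apart once it has been rounded.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

mad_result = mad(a, b, c)        # the rounded product is 1.0, so the sum is 0.0
fma_result = fma_exact(a, b, c)  # the tiny difference survives the single rounding

print(mad_result, fma_result)
```

In the two-step version the small residue is lost when the product is rounded, while the single-rounding version keeps it. Errors of that sort are what the article means by rendering problems in closely overlapping triangles.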

Nvidia GF100 - graphics processing cluster

Four GPCs each have four SMs communicating with a unified raster engine

Each SM also includes four special function units (SFUs), which Nvidia says are used for such functions as sine, cosine, reciprocal, square root, and graphics interpolation. All the SFUs' math mojo, according to Nvidia, is especially helpful for complex procedural shaders.

Nvidia GF100 - streaming multiprocessor

Each SM has 32 CUDA cores - that's 4X the cores of its previous generation

Also inside those 16 SMs is what Nvidia calls its PolyMorph Engine, which includes, among other items, the GF100's tessellators. Placing a tessellator in each SM allows the bandwidth of the tessellation to be greatly increased - which accounts for much of that aforementioned 8X bump over the tessellation performance of the GT200.

Each SM also has its own 64KB of L1 cache, plus the GF100 as a whole has 768KB of fully coherent, read/write L2 cache - a step up from the GT200, where the 256KB L2 was read-only for the texture engine. According to Nvidia, this improved cache architecture will not only help texture coverage, but will also boost the GF100's compute performance.
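As a back-of-the-envelope check on those cache figures, the sketch below totals them up using only numbers quoted in this article. The variable names are ours.

```python
# Cache figures quoted in the article.
SMS = 16             # streaming multiprocessors on the full GF100
L1_PER_SM_KB = 64    # L1 cache per SM
GF100_L2_KB = 768    # unified, read/write L2
GT200_L2_KB = 256    # read-only texture L2 on the previous part

total_l1_kb = SMS * L1_PER_SM_KB
l2_ratio = GF100_L2_KB / GT200_L2_KB

print(f"Total L1: {total_l1_kb}KB, L2 is {l2_ratio:.0f}X the GT200's")
```

That comes to 1,024KB of L1 across the chip, and an L2 three times the size of the GT200's.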

Word on the street is that the GF100 will be available in late March. Unfortunately, Nvidia has remained silent about how much the part will cost and how much power it will consume - meaning how much of a power-supply and cooling-system upgrade you may be facing. Even the part's clock rate remains under wraps.

For more detail on the GF100, check out HardOCP's excellent "Deep Dive," or download Nvidia's own white papers detailing the GF100 and the Fermi compute architecture. ®
