Feeds

Nvidia details GF100 graphics beastie

Minus the price - and the speed

Build a business case: developing custom apps

Nvidia has released additional details on its upcoming GF100 graphics processor, and if the GPU performs as well in reality as it does on paper, AMD/ATI's Radeon HD 5000 series may have a worthy competitor.

The GF100 will be Nvidia's first to be based on the company's muscular Fermi architecture, which features such niceties as scores of CUDA (compute unified device architecture) cores and ECC (error-correcting code) support. Fermi will find its way into a variety of products destined for both desktops and HPC rigs. The GF100 will be the first game-centric part.

According to Nvidia, the GF100 is "designed for gaming performance leadership." To help accomplish this goal, the GF100 implements all of Windows 7's DirectX 11 hardware APIs. Nvidia is especially proud of the GF100's support for DirectX 11's tessellation capabilities, which it asserts will allow for more-complex geometry and animation, including enhanced fluid effects and more-realistic hair effects.

In contrast to Nvidia's earlier GT200 architecture, the GF100 takes a more-distributed approach to tessellation. This improved distribution and parallelization results in a 8X improvement in tessellation performance than the GT200, according to the company's internal benchmarks.

Also supported will be DirectX 11's DirectCompute APIs, which developers can use to offload such highly parallelized tasks such as media processing from a system's CPU to the GF100.

Although GF100 technology will find eventually find its way into less-ambitious parts, the full-bore spec released this Sunday includes 512 CUDA cores arrayed in four graphics processing clusters (GPCs), each of which contain four streaming multiprocessors (SMs).

Nvidia GF100 - full die

Each of those wee green squares is a processing core - there are 512 of them

Each SM contains 32 CUDA processors, four times more than the company's previous SM designs. Each CUDA processor has both an arithmetic logic unit (ALU) and a floating point unit (FPU). The FPUs are based on the IEEE 754-2008 floating-point standard using the fused multiply-add (FMA) instruction, which Nvidia claims provides improved precision over the older multiply-add (MAD) instruction, minimizing rendering errors in closely overlapping triangles.

Nvidia GF100 - graphics processing cluster

Four GPCs each have four SMs communicating with a with a unified raster engine

Each SM also includes four special function units (SFUs), which Nvidia says are used for such functions as sine, cosine, reciprocal, square root, and graphics interpolation. All the SFUs' math mojo, according to Nvidia, is especially helpful for complex procedural shaders.

Nvidia GF100 - streaming multiprocessor

Each SM has 32 CUDA cores - that's 4X the cores of its previous generation

Also inside those 16 SMs is what Nvidia call its PolyMorph Engine, which includes, among other items, the GF100's tesselators. Placing a tesselator in each SM allows the bandwidth of the tessellation to be greatly increased - which accounts for much of that aforementioned 8X bump over the tesselation performance of the GT200.

Each SM also has its own 64KB of L1 cache, plus the GF100 as a whole has 768KB of fully coherent, read/write L2 cache - a step up from the GT200, where the 256KB L2 was read-only for the texture engine. According to Nvidia, this improved cache architecture will not only help texture coverage, but will also boost the GF100's compute performance.

Word on the street is that the GF100 will be available in late March. Unfortunately, Nvidia has remained silent about how much the part will cost and how much power it will consume - meaning how much of a power-supply and cooling-system upgrade you may be facing. Even the part's clock rate remains under wraps.

For more detail on the GF100, check out HardOCP's excellent "Deep Dive," or download Nvidia's own white papers detailing the GF100 and the Fermi compute architecture. ®

Build a business case: developing custom apps

More from The Register

next story
The agony and ecstasy of SteamOS: WHERE ARE MY GAMES?
And yes it does need a fat HDD (or SSD, it's cool with either)
Apple's iWatch? They cannae do it ... they don't have the POWER
Analyst predicts fanbois will have to wait until next year
Barnes & Noble: Swallow a Samsung Nook tablet, please ... pretty please
Novelslab finally on sale with ($199 - $20) price tag
Kate Bush: Don't make me HAVE CONTACT with your iPHONE
Can't face sea of wobbling fondle implements. What happened to lighters, eh?
Apple to build WORLD'S BIGGEST iStore in Dubai
It's not the size of your shiny-shiny...
Just in case? Unverified 'supersize me' iPhone 6 pics in sneak leak peek
Is bigger necessarily better for the fruity firm's flagship phone?
Steve Jobs had BETTER BALLS than Atari, says Apple mouse designer
Xerox? Pff, not even in the same league as His Jobsiness
Apple analyst: fruity firm set to shift 75 million iPhones
We'll have some of whatever he's having please
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Top 10 endpoint backup mistakes
Avoid the ten endpoint backup mistakes to ensure that your critical corporate data is protected and end user productivity is improved.
Top 8 considerations to enable and simplify mobility
In this whitepaper learn how to successfully add mobile capabilities simply and cost effectively.
Rethinking backup and recovery in the modern data center
Combining intelligence, operational analytics, and automation to enable efficient, data-driven IT organizations using the HP ABR approach.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.