Nvidia GeForce 7800 GTX

More evolution than revolution?

Review A year goes by pretty quickly for me these days. I can vividly remember the last few days before Nvidia's NV40 launch 14 months ago, and the last few days have brought some serious déjà vu. I've run much the same tests and much the same analysis. Pretty much the same pixels are being painted. The last part of that statement is the telling one - today's hardware is more evolution than revolution compared to what Nvidia delivered last time.

The G70 builds upon NV40's Shader Model 3.0 plus SLI foundation with some (admittedly significant) nips and tucks, a wider architecture (without actually going that wide in silicon) and more speed for the fps freaks to dine on.

The SKU Nvidia has chosen to debut G70 with is the GeForce 7800 GTX. The GTX isn't the flagship hardware - Nvidia is saving that to force ATI's hand with R520 and Radeon X900 (if indeed it's called that when it shows up). Why release the big daddy product if you don't need to? Holding it back gives you some time to play with clocks and (hopefully) stay on top of the performance league table. Those few percentage points when you're massively CPU-limited really matter. The trouble is, ATI might well do the same.

Regardless, there's a shiny new GPU to talk about. Nvidia's G70 represents its take on 3D hardware for at least the next nine months or so. It'll power, either directly or indirectly, a range of top-to-bottom products with the GeForce 7-series moniker, and its core performance and feature set will define those products until the next, massively-faster-than-the-rest-of-your-system GPU comes along. There are no new base features compared to NV40, with G70 still a Shader Model 3.0 part, so where exactly are the differences? Let's take a look.

The fragment processor - usually called a pixel processor - handles fragments output by the GPU's rasteriser, which in turn creates rasterised fragments from the geometry spat out by the vertex hardware. So vertex hardware is first in the render chain, but since G70's main differences compared to NV40 are in the fragment units, I'll cover those first.

NV40 and G70's fragment units are made up of a pair of sub-units. Sub-unit one in NV40 can texture (that is, use 'texture' data as input to a fragment program being run by the fragment units; the data doesn't have to be a coloured image texture), and issue a MUL vector instruction or use its mini-ALU to issue a non-vector instruction like RSQ (reciprocal square root). Sub-unit two can issue a MADD vector instruction (single-cycle MUL and ADD combined) or use its own mini-ALU, which has the same capability as the mini-ALU attached to sub-unit one.

G70 differs in sub-unit one, which can now issue a MADD as well. Everything else is the same in terms of ALU ability (all mini-ALU instructions are still single-cycle). So G70 widens internally with the power to run two MADDs on a pair of vec4 vectors, in SIMD. That's twice the SIMD MADD power of NV40, per cycle. Nvidia's reasoning - which flies in the face of the reasoning it gave for not allowing sub-unit one to issue a MADD for NV40 - is that the majority of complex fragment shader programs being run today in released and upcoming games will make heavy use of the MADD instruction, which can be used for calculating vector dot products (indeed, the single-cycle vec4 MADD is the equivalent of a single-cycle DP4 instruction).
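To put rough numbers on the "twice the SIMD MADD power" claim, here is a minimal Python sketch. It assumes only what the text above states: each fragment unit has two sub-units, one of which can issue a vec4 MADD on NV40 and both of which can on G70. The `FragmentUnit` class and `vec4_madd` helper are hypothetical names invented for illustration, and counting a MADD as one multiply plus one add per component is a bookkeeping convention, not an Nvidia figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentUnit:
    """Toy model of one fragment unit (illustrative, not a hardware spec)."""
    name: str
    madd_sub_units: int  # how many of the two sub-units can issue a vec4 MADD

    def madd_ops_per_cycle(self) -> int:
        # One vec4 MADD = 4 multiplies + 4 adds = 8 scalar operations.
        return self.madd_sub_units * 4 * 2


def vec4_madd(a, b, c):
    """Component-wise multiply-add: result[i] = a[i] * b[i] + c[i]."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


NV40 = FragmentUnit("NV40", madd_sub_units=1)  # only sub-unit two issues MADD
G70 = FragmentUnit("G70", madd_sub_units=2)    # both sub-units issue MADD

if __name__ == "__main__":
    for unit in (NV40, G70):
        print(unit.name, unit.madd_ops_per_cycle(), "MADD ops per unit per cycle")
    print(vec4_madd((1, 2, 3, 4), (2, 2, 2, 2), (1, 1, 1, 1)))
```

The model covers a single fragment unit only; it says nothing about how many such units each chip has or what clocks they run at.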

Calculating vector dot products is an integral part of many fragment shader effects that it's desirable to run on a 3D GPU. NV35 could issue two MADDs per cycle, per fragment ALU, and G70 regains that processing ability.
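As a rough illustration of why multiply-add matters for dot products, the Python sketch below accumulates a four-component dot product through repeated scalar multiply-adds and then uses it for a simple diffuse lighting term. The function names `madd`, `dp4` and `diffuse` are hypothetical, and the code shows the arithmetic only; it makes no claim about how NV40 or G70 actually schedule these operations.

```python
def madd(a, b, c):
    """Scalar multiply-add: a * b + c."""
    return a * b + c


def dp4(a, b):
    """Four-component dot product built from chained multiply-adds."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = madd(x, y, acc)  # acc += x * y
    return acc


def diffuse(normal, light_dir):
    """Lambert-style diffuse term: the dot product clamped at zero."""
    return max(dp4(normal, light_dir), 0.0)


if __name__ == "__main__":
    print(dp4((1, 2, 3, 4), (5, 6, 7, 8)))
    # Surface facing straight up, light coming from directly above.
    print(diffuse((0.0, 1.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)))
```

Every step of the accumulation is a multiply followed by an add, which is why shader code heavy on dot products leans so hard on MADD throughput.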
