Nvidia GeForce 7800 GTX
More evolution than revolution?
Review A year goes by pretty quickly for me these days. I can vividly remember the last few days before Nvidia's NV40 launch 14 months ago, and the last few days have brought some serious déjà vu. I've run much the same tests and much the same analysis. Pretty much the same pixels are being painted. The last part of that statement is the telling one - today's hardware is more evolution than revolution compared to what Nvidia delivered last time.
The G70 builds upon NV40's Shader Model 3.0 plus SLI foundation with some (admittedly significant) nips and tucks, a wider architecture (without actually going that wide in silicon) and more speed for the fps freaks to dine on.
The SKU Nvidia has chosen to debut G70 with is the GeForce 7800 GTX. The GTX isn't the flagship hardware - Nvidia is saving that to force ATI's hand with R520 and Radeon X900 (if indeed it's called that when it shows up). Why release the big daddy product if you don't need to, when holding it back gives you some time to play with clocks and (hopefully) stay on top of the performance league table? Those few percentage points when you're massively CPU-limited really matter. The trouble is, ATI might well do the same.
Regardless, there's a shiny new GPU to talk about. Nvidia's G70 represents its take on 3D hardware for at least the next nine months or so. It'll power, either directly or indirectly, a range of top-to-bottom products with the GeForce 7-series moniker, and its core performance and feature set will define those products until the next, massively-faster-than-the-rest-of-your-system GPU comes along. There are no new base features compared to NV40, with G70 still a Shader Model 3.0 part, so where exactly are the differences? Let's take a look.
The fragment processor - usually called a pixel processor - handles fragments output by the GPU's rasteriser, which in turn creates rasterised fragments from the geometry spat out by the vertex hardware. So vertex hardware is first in the render chain, but since G70's main differences compared to NV40 are in the fragment units, I'll cover those first.
NV40 and G70's fragment units are made up of a pair of sub-units. Sub-unit one in NV40 can texture (use 'texture' data as input to a fragment program being run by the fragment units, but it doesn't have to be a coloured image texture), and issue a MUL vector instruction or use its mini-ALU to issue a non-vector instruction like RSQ (reciprocal square root). Sub-unit two can issue a MADD vector instruction (single-cycle MUL and ADD combined) or use its own mini-ALU with the same capability as the mini-ALU attached to sub-unit one.
G70 differs in sub-unit one, which can now issue a MADD as well. Everything else is the same in terms of ALU ability (all mini-ALU instructions are still single-cycle). So G70 widens internally with the power to run two MADDs on a pair of vec4 vectors, in SIMD. That's twice the SIMD MADD power of NV40, per cycle. Nvidia's reasoning - which flies in the face of the reasoning it gave for not allowing sub-unit one to issue a MADD for NV40 - is that the majority of complex fragment shader programs being run today in released and upcoming games will make heavy use of the MADD instruction, which can be used for calculating vector dot products (indeed, the single-cycle vec4 MADD is the equivalent of a single-cycle DP4 instruction).
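To put rough numbers on that doubling, here's a minimal sketch in Python that tallies vector MADD issue per fragment unit, per cycle, from the sub-unit capabilities described above. The dictionary layout and function names are my own illustration rather than anything from Nvidia, and the model deliberately ignores texturing, the mini-ALUs and clock speed.

```python
# Illustrative model only: which vector instructions each sub-unit of a
# fragment unit can issue in one cycle, as described in the text.
# The names below are hypothetical, not Nvidia terminology.
NV40_UNIT = {"sub_unit_1": {"MUL"}, "sub_unit_2": {"MADD"}}
G70_UNIT = {"sub_unit_1": {"MUL", "MADD"}, "sub_unit_2": {"MADD"}}

VEC_WIDTH = 4  # both sub-units operate on vec4 data


def madds_per_cycle(unit):
    """Count the sub-units able to issue a vector MADD each cycle."""
    return sum(1 for ops in unit.values() if "MADD" in ops)


def madd_flops_per_cycle(unit):
    """Peak MADD work per cycle: one multiply plus one add per component."""
    return madds_per_cycle(unit) * VEC_WIDTH * 2


if __name__ == "__main__":
    for name, unit in (("NV40", NV40_UNIT), ("G70", G70_UNIT)):
        print(name, madds_per_cycle(unit), madd_flops_per_cycle(unit))
```

On this simplified count, one NV40 fragment unit peaks at a single vec4 MADD per cycle and one G70 fragment unit at two, which is the doubling in question. Real shader throughput depends on the instruction mix, so treat it as a ceiling rather than a prediction.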
Calculating vector dot products is an integral part of many fragment shader effects that it's desirable to run on a 3D GPU. NV35 could issue two MADDs per cycle, per fragment ALU, and G70 regains that processing ability.
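For anyone wondering how a multiply-add relates to a dot product, here's a small Python sketch. The function names are mine, and it says nothing about how G70 actually schedules a DP4 in hardware. It only shows that a four-component dot product works out to four multiplies, each folded into a running sum, which is about the same arithmetic as one vec4 MADD.

```python
def madd(a, b, c):
    """Component-wise multiply-add: a[i] * b[i] + c[i] for each component."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def dp4_via_madds(a, b):
    """Four-component dot product written as a chain of scalar multiply-adds."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = x * y + acc  # one multiply and one add per component
    return acc


if __name__ == "__main__":
    normal = [0.0, 1.0, 0.0, 0.0]
    light = [0.0, 0.5, 0.5, 0.0]
    # A diffuse lighting term is a typical dot product in a fragment shader.
    print(dp4_via_madds(normal, light))
```

The difference between the two is that a vector MADD leaves four separate results while a dot product sums across the components into one, so the comparison is about the amount of work, not the shape of the output.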