ATI Radeon X1900 XT and XTX
The world's fastest graphics cards?
Review The smart money says that most other articles on the launch of ATI Radeon X1900 XT and XTX - aka the R580 - will start with chatter about R5xx being late. It's a valid way to kick off the copy on a new ATI high-end GPU product. The delays in getting the silicon into a shipping state - mostly due to a problem in the wafer production that was out of ATI's hands - mean that R580 and its associated SKUs are presented barely three months after the R520...
Generally, the major graphics chips companies want to wait at least two quarters before debuting a significantly new high end product, lest they risk pissing off the early adopters of the previous generation, not to mention the board partners who have to sell the things. At the first sniff of something new just around the corner, the knowledgeable crowd with the cash to drop on the flagship part will hold off spending.
The chip companies make no secret of the fact they make huge margins on the high end, almost boutique parts. So while they'll still sell the chips eventually - or at least they cross their collective appendages that they will - they'll do so at a much lower price. First-run silicon is an expensive business, especially when the chip is comparatively big. So they want the Big Pockets™ to splash out - and do so as early as possible.
All that combines to make the timing of the introduction of R580 and the first two SKUs it powers interesting. ATI and its board partners will tell you the X1800 XT, and especially X1800 XL, have sold strongly since their release. We've no doubt they have. But again, good money says there's bound to have been a tail-off in XT sales in recent weeks as information and rumours on R580 have begun to flow.
Things are what they are, though, and we'd rather not dwell too much on the business and industry side of things during a technical evaluation of a new pixel pusher. And what a pixel pusher R580 seems to be. There are significant differences in how the R580 goes about the business of data processing, compared to the R520. Be under no illusions - the R580 is no simple R520 speed bump.
For a more in-depth technical appraisal of the R580, head over to Hexus.net here.
The reference board
The ATI Radeon X1900 XT and XTX share an almost identical board to the X1800 XT, although the higher power consumption of the X1900 XTX and XT means their respective voltage regulation circuits are different.
The XTX reference board uses an identical cooler to X1800 XT, which sadly means the same loud and annoying noise profile when the fan is doing its work. Even the slowest fan setting on X1900 XTX and X1800 XT is noisier than an Nvidia GeForce 7800 GTX 512 reference cooler at full tilt. I'll be looking to the board makers to change the cooler for something quieter.
It's dual-slot, dual DL-DVI-I, with VIVO ability thanks to an on-board ATI Rage Theater 200. Weighing some 700g and measuring the same 228mm long that the X1800 XT does, it's officially physically the heftiest desktop reference board ATI have ever created. A little label confirms that it's an XTX version.
Bereft of clothes you can see the eight 512Mb Samsung BJ11 GDDR3 DRAMs arranged around the chip package. You can find those particular devices on Nvidia GeForce 7800 GTX 512 boards and in Microsoft's Xbox 360.
The fastest GDDR3 that Samsung currently produce and rated to 900MHz, BJ11 is likely the last GDDR3 speed grade the company will make before shipping GDDR4 at 1GHz or more in H2 2006. Just one DRAM - 64MB on its own, remember - has a data rate of 7.2GBps at 900MHz. This is the bit where you cackle out loud at the big number.
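For anyone who wants to check the big number, here is a minimal sketch of the arithmetic. It assumes each DRAM has a 32-bit interface (not stated above, but it is what makes the 7.2GBps figure work out) and that the memory moves data twice per clock. The function names are ours, purely for illustration.

```python
# Back-of-envelope GDDR3 bandwidth, following the figures quoted above.
# Assumption (not stated in the article): each DRAM has a 32-bit interface.


def device_bandwidth_gbps(clock_mhz: float, bus_bits: int = 32) -> float:
    """Peak data rate of one DRAM in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * 2  # double data rate: two transfers per clock
    bytes_per_transfer = bus_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


def board_bandwidth_gbps(clock_mhz: float, devices: int = 8, bus_bits: int = 32) -> float:
    """Peak aggregate data rate for all DRAMs on the board, in GB/s."""
    return devices * device_bandwidth_gbps(clock_mhz, bus_bits)


if __name__ == "__main__":
    print(f"one DRAM at 900MHz:    {device_bandwidth_gbps(900):.1f} GB/s")
    print(f"eight DRAMs at 900MHz: {board_bandwidth_gbps(900):.1f} GB/s")
```

Bear in mind that 900MHz is the rated ceiling of the DRAMs, not necessarily the clock a given board ships at, so the eight-device figure is an upper bound.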
It's worth noting that the rear of the board doesn't get hot, both DVI ports are dual-link capable should you attach something that needs it, the heatsink design still uses skived fins and, while noisy, the heatsink design and fan are more effective than anything Nvidia has made since the infamous FX-Flow at getting heat out of your chassis.
For those things we salute ATI's board and thermal engineers.
This review, for those who've forgotten the article title, concerns itself with the XTX and XT variations. We'll do Crossfire soon. Doing away with the Platinum Edition moniker for the first two-tier release of the R580 - although don't bet on that moniker being dead and buried for ever, despite what you might read elsewhere - XTX is a 25MHz GPU and memory clock bump over XT.
With XT retail boards using the same BJ11 DRAMs and R580 GPU, at the same core voltage no less, XTX is a slightly curious SKU. ATI is asking for a fairly large leap of faith that XTX is something special on top of XT in order to extract the other $100 it wants for the second X in the name. We're not convinced.
For the purposes of this article we underclocked the XTX to generate our Radeon X1900 XT numbers. Comparison comes in the form of ATI's own Radeon X1800 XT - the outgoing red champion - and Nvidia's GeForce 7800 GTX 512. Heavyweight contenders for the champion-elect to try and smack in the mouth, just the way we like it.
The ATI hardware runs on ATI RD480 core logic, Nvidia hardware on nForce 4 SLI. Driver defaults were used throughout, and stock clocks were used for all cards unless noted. If in-game controls could be used for both antialiasing and anisotropic texture filtering, they were, otherwise the driver was used to force the required levels (if applicable and the game allowed it without rendering errors). Tested resolutions were 1024 x 768, 1280 x 1024, 1600 x 1200 and 1920 x 1200.
Game tests were run a minimum of three times at each setting, and the median value reported. In the case of manual 'run-through' testing with FRAPS, three consecutive runs that produced repeatable results, after further analysis, were used. If values weren't part of a repeatable set, they were discarded and obtained again.
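As a rough sketch of that procedure: take at least three runs, check they agree with each other, and report the median. The three per cent tolerance and the frame rates below are illustrative placeholders of ours, not the thresholds or results actually used in testing.

```python
from statistics import median


def is_repeatable(runs, tolerance=0.03):
    """True if every run lies within `tolerance` (a fraction) of the median."""
    mid = median(runs)
    return all(abs(r - mid) <= tolerance * mid for r in runs)


def reported_score(runs, tolerance=0.03):
    """Return the median of a repeatable set of runs, or None to signal a re-test."""
    if len(runs) < 3:
        raise ValueError("need at least three runs per setting")
    if not is_repeatable(runs, tolerance):
        return None  # discard the set and obtain the values again
    return median(runs)


if __name__ == "__main__":
    # Hypothetical frame rates, for illustration only.
    print(reported_score([61.8, 62.4, 62.1]))  # a tight set: median is reported
    print(reported_score([61.8, 62.4, 48.0]))  # one outlier: set is rejected
```

Using the median rather than the mean keeps a single slow run (a background task kicking in, say) from dragging the reported figure around.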
The format of the graphs is the same throughout this review. The line plots are the baseline scores without antialiasing or anisotropic texture filtering applied, the column plots the values with AA and AF applied.
Awesomely, Call of Duty 2 would crash intermittently with FX-60 on RD480, forcing us to drop it completely. Chronicles of Riddick went the same way with the GTX 512 on nForce 4 SLI with the FX-60 and the 81.89 driver: the game ran at 800 x 600 regardless of what was chosen in the game. Need for Speed: Most Wanted had serious 1920 x 1200 performance issues with the 6.2 driver and dual-core. Swapping in a single-core processor fixed all issues.
F.E.A.R. is fast becoming a poster child for fragment-shader bound games, with an approximately 7:1 ALU:TEX ratio in its PS programs. The R580 is therefore well designed to take on F.E.A.R.'s game engine.
The X1900 XT is 17 per cent faster than X1800 XT at the highest resolution with IQ on, while the X1900 XTX is over 46 per cent faster than GTX 512. Yay for lots of ALUs!
Despite being a bit old, Splinter Cell: CT maintains a rough 5:1 ALU:TEX ratio in its main fragment shader programs on SM3.0 hardware. There's a 17 per cent win for the X1900 XTX over the GTX 512, and the X1900 XT is 14 per cent faster even with a four per cent engine clock deficit.
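For reference, the percentages quoted in these results are simple ratios. The sketch below shows the sums, assuming core clocks of 650MHz for the XTX and 625MHz for the XT (this article only states the 25MHz gap) and using made-up frame rates purely to illustrate the calculation.

```python
def percent_faster(candidate_fps: float, baseline_fps: float) -> float:
    """How much faster the candidate is than the baseline, in per cent."""
    return (candidate_fps / baseline_fps - 1.0) * 100.0


def clock_deficit_percent(slower_mhz: float, faster_mhz: float) -> float:
    """How far the slower clock sits below the faster one, in per cent."""
    return (1.0 - slower_mhz / faster_mhz) * 100.0


if __name__ == "__main__":
    # Assumed core clocks; only the 25MHz difference is given in the text.
    print(f"XT clock deficit: {clock_deficit_percent(625, 650):.1f} per cent")
    # Hypothetical frame rates, not measured results.
    print(f"speed-up: {percent_faster(58.5, 50.0):.0f} per cent")
```

Under those assumed clocks the deficit comes out at a shade under four per cent, which squares with the figure quoted above.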
The X1900 XT and XTX have incremental performance gains over X1800 XT in Far Cry at all resolutions, with all the ATI boards making the GTX 512 look rather daft in Crytek's FPS title. The ATI products are faster as a group with AA and AF applied than the GTX 512 without, in our benchmark (a section of Pier).
Another Direct3D 9 game with high ALU:TEX, Black and White 2 with all the graphical options at max, as we test it, is a killer of modern graphics hardware. The X1900 XTX is 15 per cent faster than the GTX 512 at 1920 x 1200, and over 20 per cent faster at the playable resolution at the settings we test at. The R580 offers up 30 per cent more performance than R520, comparing X1900 XTX to X1800 XT.
With Call of Duty 2 throwing its toys out of the pram, CoR not working properly on Nvidia hardware (*titter*) and Need For Speed throwing hissy fits, Quake 4 is our last game test for the time being, until we fix the issues. Keep an eye out over the next couple of days.
If your choice is between the boards on test, and you love Quake 4 and you're looking forward to Quake Wars so much you drool in pints, Nvidia's GTX 512 is the hardware for you. The basic design of G70 lends itself to nice Doom 3 engine performance, and ATI's OpenGL ICD isn't the hottest component to come bundled with CATALYST, if we're all honest. R580 offers up usable increases in performance over R520, but X1900 XTX isn't a detectable performance win over X1900 XT.
Multitexturing is bandwidth-limited as texture layers increase, hence the large lead for the GTX 512. It makes good use of 850MHz memory in this test. Even with the same ROP count, texture units and memory bandwidth as X1800 XT, X1900 XT still exhibits less performance loss, although only to the tune of 3.5 per cent.
Floating-point texture bandwidth
We measure available texture bandwidth by asking the hardware to read from sequential locations in an FP16 texture, the second texel sample relying on the first, creating a dependent texture read.
For some reason the R580 can't execute the texture read as fast as the R520 in the X1800 XT can, despite there being no theoretical reason other than a driver problem or issue scheduling the texture fetch. Software defeats the hardware, likely. We'll come back to that theme later.
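To make the idea of a dependent read concrete, here is a CPU-side analogue in Python. It is not our shader test, just a sketch of the access pattern: each "texel" holds the address of the next one, so a fetch cannot be issued until the previous one has returned. The function names and the tiny eight-entry "texture" are hypothetical.

```python
def build_chain_texture(size: int) -> list:
    """Each 'texel' stores the index of the next texel to fetch (sequential, wrapping)."""
    return [(i + 1) % size for i in range(size)]


def dependent_reads(texture: list, start: int, fetches: int) -> int:
    """Perform `fetches` reads where each address comes from the previous result."""
    index = start
    for _ in range(fetches):
        index = texture[index]  # the next address is unknown until this read completes
    return index


if __name__ == "__main__":
    texture = build_chain_texture(8)
    # Two chained fetches starting at texel 0: the second relies on the first.
    print(dependent_reads(texture, 0, 2))
```

Because every fetch waits on the one before it, the pattern exposes fetch latency and scheduling rather than raw parallel throughput, which is why it is a useful probe.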
In our texture-fetch test the R580 is again slightly slower than the R520 when fetching texels from a small texture. The entire texture fits nicely inside the GPU's texture cache on all three tested chips, indicating G70 has a larger texel cache bandwidth than either ATI chip.
When using a texture that - even with compression - wouldn't fit inside the texture cache, ATI's high-end GPUs do better than the G70, although the R580 is still ever so slightly slower than the R520 at the same clocks.
PCI Express bandwidth
As a nod to the modern GPU's ability to act perfectly well as a generally programmable parallel stream processor, and with R5-series GPUs supporting scatter writes into card memory from shader programs via an OpenGL extension, testing the PCI Express bandwidth of these high-end graphics boards when pushing data back to the host is prudent.
The R580 and R520 have over 1000MBps of writeback bandwidth to use, given a suitable host platform. Indeed, the R580, backed by nearly 50GBps of local memory bandwidth, is the first GPU we've tested that breaks the 1GBps barrier when squirting data back to the host.
ATI clearly designed the R580 to be extremely fast in one kind of processing, while keeping the rest of the chip in pretty much the same very healthy state as the R520, at the same clock speeds. Anticipating - which is the key word - games titles with very high reliance on fragment shader processing means that the R580 can almost come off looking too much like the R520.
The R580, more so than any modern GPU that meets the Direct3D 9 spec, therefore relies on software to show it off. Even current synthetic benchmarks designed to show off theoretical rates in 3D hardware can have a hard time exploiting the tripling in fragment processing ability. That's not to say the performance increases at the same clock speeds as the R520 are invisible. Clearly they're not, especially at the higher resolutions, with gains of up to 30 per cent in the games we tested.
A good glimpse of shader rate throughput in our instruction issue test also gives a clue to where R580's strengths lie. Furthermore, it has no real-world weakness when it comes to comparison with the R520. Vertex processing rate is intact, performance drops for antialiasing and texture filtering are almost identical, and it shares exactly the same feature base, even besting the R520 with a working implementation of ATI's Fetch4 feature.
Therefore it's fair to sum up that the R580, clocked very conservatively, gives a staggering fragment shader rate first and foremost. Following that, its considered engineering means that's followed by balanced assistance to the other major facets of today's modern game rendering, general stream programming and Direct3D 9 games still to come. Double Z-only rate sustained with MSAA, plenty of memory bandwidth in XT and XTX configuration, more than 1GBps for GPU-to-host writebacks for the first time, and very low penalty PS branching seal this particular deal in a big way.
At $649 for the X1900 XTX and $549 for X1900 XT, it'll push X1800 XT and XL down in price in short order, putting the two GPUs and their SKUs in the kind of price place we'd have expected over time, given an earlier R520 introduction. It's just somewhat maddening to see it happen so soon, annoying the early adopter of R520 hardware. UK pricing before VAT is applied is confirmed at £399 for XTX and £349 for XT, on launch day.
The GeForce 7800 GTX 512 is generally bested in all modern games, and Radeon X1000-series products have enough significant image quality advantages to give X1900 XT the nod even if its performance were only slightly better, or even slightly worse, than the Nvidia product's. We're seeing all the early XT boards come with the 1.1ns BJ11 DRAMs of the XTX, making the XTX a choice only for those with carefree finances.
Will the software needed to show off the R580 to its best come in time, especially with Direct3D 10/Vista games programming already under way at most major developers? Even if it doesn't fully realise its potential, the X1900 is a blindingly fast 3D graphics product with the best IQ possible.
For a more in-depth, technology-focused version of this review, head over to Hexus.net here.