Ageia PhysX physics accelerator chip
Very special effects?
Review Accelerating game physics is a hot topic for gamers. The concept of using add-on hardware - be it a GPU or even a new kind of dedicated physics processing unit (PPU) - to speed up physics calculations that would otherwise run on the CPU is back at the forefront of developers' discussions...
The idea isn't new - games middleware and individual developers have already been making CPU-hosted physics calculations go faster by multi-threading their game engines and letting game physics run alongside other parts of the engine. Multi-core CPUs will really help this technique.
Havok, a physics middleware developer, recently announced Havok FX, which performs game physics calculations on a Shader Model 3.0 graphics processor. As far as PPU acceleration goes, there is currently only one vendor out there. Ageia, a Californian start-up fresh from a round of venture capital and new employee hires, not only has a PPU design in production but is selling the product too.
Recently shipping in high-end systems from a range of mostly boutique vendors, Ageia's PhysX PPU even has support in a few games. Worth a peek, then?
I've recently spent time with a board and a couple of supported games, and there's some data worth sharing with you. First, let's look at the PhysX PPU itself and see what it's capable of. Ageia's silicon is made by TSMC in Taiwan, on a 130nm process. Measuring 14 x 13.5mm and comprising around 125m transistors, the chip appears to be clocked in the range 250-266MHz or 500-533MHz.
PhysX is ahead of Havok FX in what it can accelerate. The PhysX PPU is able to process more than just large-scale collision or 'effects' physics via its API. There's support for limited fluid dynamics simulations, vehicles (wheel, torque and tire simulation), object raycasting and more, which the PPU can fully or partly accelerate, with Ageia moving more onto the hardware as time goes by.
At its core the chip is just a wide parallel-stream processor with a command core - sometimes called the control engine - to run it all, and a memory controller to move data on and off the chip during processing.
The parallel elements are themselves made up of multiple fully 32-bit floating-point processing units arranged MIMD-style (Multiple Instruction, Multiple Data), each with an array of SIMD (Single Instruction, Multiple Data) vector units. Think of the hardware as an array of vector units, likely 4x4 for this first iteration of PhysX hardware, which fits with their likely make-up and implementation in silicon. Ageia is reluctant to tell us how it works, so we're left poking at patents and developer information to glean our ideas about the hardware.
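To make the MIMD-of-SIMD idea concrete, here's a toy sketch in Python - emphatically not Ageia's actual design, just an illustration of the concept, with the 4x4 unit count taken from our guess above:

```python
import numpy as np

def vector_madd(a, b, c):
    """One SIMD vector unit's bread and butter: d = a * b + c,
    applied across every lane of the vector in a single step."""
    return a * b + c

# A hypothetical 4x4 array of units (16 in total), each holding its own
# independent 5D operands - MIMD across units, SIMD within each unit.
rng = np.random.default_rng(0)
a, b, c = (rng.standard_normal((16, 5), dtype=np.float32) for _ in range(3))

d = vector_madd(a, b, c)   # all 16 units' worth of lanes in one go
assert d.shape == (16, 5)
```

In real silicon each of the 16 units would be running its own instruction stream; the numpy broadcast here just stands in for all of them issuing the same MADD on the same cycle.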
The units aren't generally programmable, at least not in the way you might be used to with a GPU. You can't (easily) 'shade' the physics interactions and expect the hardware to execute an instruction stream you control. There's no compiling of physics programs as you would compile a shader program for a programmable GPU.
In terms of its data rate, the hardware is supposedly capable of six 5D vector MADDs per cycle, per vector unit. In the 4x4 design we suspect first hardware has, that's a near 50Gflop (all 32-bit) rate when fully utilised, at 250MHz.
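A back-of-envelope check shows how that figure hangs together - with the caveat that Ageia doesn't publish its flop accounting, so the reading below (each vector MADD counted as two ops, a multiply and an add; 16 units; 250MHz) is our assumption, chosen because it reconciles with the near-50Gflop number. Counting all five lanes of each 5D MADD separately would give a figure ten times higher:

```python
# Back-of-envelope peak rate, under assumed accounting (see above).
units = 4 * 4            # suspected 4x4 array of vector units
madds_per_cycle = 6      # per vector unit, per Ageia's figure
flops_per_madd = 2       # one multiply + one add per MADD
clock_hz = 250e6         # low end of the suspected clock range

peak_gflops = units * madds_per_cycle * flops_per_madd * clock_hz / 1e9
print(peak_gflops)  # 48.0 - "near 50Gflop", as stated
```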
Our sample is one of BFG Tech's PhysX boards. Measuring 167x99mm, the PCB for these first BFG PhysX boards is pretty much identical in dimension to a modern sound card. Indeed, my Audigy 2 ZS is much the same size. I've yet to come across a sound card that needs an active cooler and auxiliary power input, though.
The 40mm fan and heatsink combo reminds me of those fitted to graphics boards in days gone by. It spins devoid of any thermostatic control or rotational regulation. It's loud enough to intrude on a silent PC, and I estimate something in the order of 32-36dBA - yes, it's just a guess, and we know the scale is logarithmic, but we don't have the tools to measure accurately.
The heatsink gets pretty damn hot over time, indicating the aluminium heatsink and fan aren't the best pairing to get rid of the current PPU's heat, but a well ventilated chassis should see you right. Oh, and it lights up.
The PhysX PPU's memory controller supports the same GDDR3 memory that current graphics hardware does, and the BFG sample is equipped with four 32MB 500MHz Samsung DRAMs. Each populates a 32-bit channel, and they combine to give the chip a maximum read bandwidth of 16GBps and 128MB of total on-board storage.
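The 16GBps figure checks out from the memory configuration alone, given GDDR3's double data rate (two transfers per clock):

```python
# Peak read bandwidth from the memory configuration described above.
channels = 4                 # four DRAMs, one per channel
bytes_per_transfer = 32 // 8 # each channel is 32 bits wide
clock_hz = 500e6             # 500MHz DRAM clock
transfers_per_clock = 2      # GDDR3 is double data rate

bandwidth_gbps = channels * bytes_per_transfer * clock_hz * transfers_per_clock / 1e9
print(bandwidth_gbps)  # 16.0 GBps, matching the quoted figure
```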
Game support just now is limited to Ghost Recon: Advanced Warfighter - my thanks to Ubisoft for access to the full game prior to release - Bet on Soldier and Rise of Nations.
GR:AW is largely graphics limited at 1280x1024 - our chosen test resolution - so keep that in mind when you look at the following graph data.
You can see three pronounced dips in performance, corresponding each time to your author lobbing a grenade at a car and blowing it up. GR:AW uses the PhysX PPU to enable more bits of debris during explosions, enhancing their visuals. Each performance dip is a combination of data being generated for the PPU to work on, moved to the PhysX board, sent back when done and then made use of - all of which adds up to latency - plus the extra visuals the game then has to render, largely just some alpha-blended sprites.
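To get a feel for why that round trip hurts, consider a rough model - the payload size is purely hypothetical, and we're assuming the board's plain 32-bit/33MHz PCI slot with its roughly 133MBps theoretical peak:

```python
# Illustrative numbers only: a hypothetical 1MB of debris state shipped
# to the PPU and back over standard 32-bit/33MHz PCI (~133MBps peak).
pci_bytes_per_sec = 133e6
payload_bytes = 1e6          # hypothetical per-explosion payload

round_trip_ms = 2 * payload_bytes / pci_bytes_per_sec * 1000
frame_budget_ms = 1000 / 60  # one frame at 60fps

print(round(round_trip_ms, 1))    # ~15.0ms in bus transfer alone
print(round(frame_budget_ms, 1))  # against a 16.7ms frame budget
```

Even granting generous assumptions, a burst of that size eats most of a frame's budget in transfer time before the PPU has computed anything - which goes some way to explaining the dips in the graph.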
The sprites are identical - if there's more than one variation I can't spot it - and just scaled, rotated and blended to give the illusion of randomness. They don't persist, fading out after a short time.
While it's impressive the first few times you witness it, you soon spot the illusion, and the smaller particles' failure to persist ruins the realism. Videos abound on the web of the GR:AW effects in action. It's 'Ooooh'-worthy the first few times, but that's about it.
Bet on Soldier employs much the same tactic, using the PPU to enable visual effects. The PPU does work intermittently, but a large amount of game physics is still done on the CPU in both titles.
If you think that such a limited analysis is a bit off, it's largely because there's simply not much to show you, or explain. And therein lies the rub with this first wave of PhysX-enabled games. Let me explain.
The premise of physics acceleration by dedicated hardware is a solid one. While there's always a downside to adding an extra piece of hardware to a PC to do something better - cost, noise, heat and power consumption - the upsides usually make it worthwhile.
However, much like the early days of 3D graphics, the PhysX PPU needs traction with a killer title or two to make the world sit up and notice. GR:AW isn't the one, and neither is Bet on Soldier or Rise of Nations.
I remember the feeling the first time I fired up GLQuake on my Voodoo card. I get almost the same feeling when I fire it up now. I don't get that feeling when I play PhysX-enabled titles - the effects presented aren't overwhelming and certainly not persistent.
Developer support is pretty strong, though, with Unreal Engine 3.0 the biggest current proponent. There are over 20 PhysX-compatible titles in the works.
Technically, the API needs to mature to encourage developers to make the investment, and anyone making it has to think hard about integration, lest they run into problems with the latency of return data. GPU effects physics - just Havok FX right now - is the easier bet for a developer looking to tack effects-based stuff onto an existing engine, especially if they already license Havok's physics API.
Outside of gaming there's some scope for the PhysX hardware to be used for general-purpose parallel floating-point computation, but the API simply doesn't cater for that in any meaningful way just yet. The interconnect it's sat on somewhat limits its usefulness in that respect, and the driver only supports 32-bit Windows at the time of writing. The goodly chunk of on-board storage counts in its favour for such applications, though.
The limited number of titles and their disappointing use of the PhysX PPU means that, currently, there's no reason to spend the £200+ to acquire a PhysX card. The current effects in the supported games aren't worth the price and potential performance drop. Cell Factor and awesome Unreal Engine 3.0 games, where art thou? Without them, the PhysX hardware is merely a curiosity. But one to watch.