by Dustin Sklavos
One of the major technology buzzes right now is what’s called “heterogeneous computing” or “General Purpose GPU” computing. The basic idea behind these strange terms is that graphics hardware has become so powerful and flexible that you can use it to do more than run games. In fact, your laptop’s graphics card (GPU) can actually augment and run alongside the CPU to substantially improve system performance in certain cases. Microsoft already began leveraging graphics processor power a bit with the Aero interface in Windows Vista, but I’m talking about moving beyond immediately visual tasks.
Nvidia has been a major proponent of using the GPU to handle other tasks, and has invested a lot of time and marketing into GPU computing … specifically with its own hardware. ATI has been playing a bit of second fiddle here, with its Stream software underperforming compared to Nvidia’s CUDA and reaching nowhere near CUDA’s still limited market penetration. This stands to change thanks to Windows 7 and the release of more software that takes advantage of Stream.
So why is this relevant to you? That’s a good question. The answer is simple: Speed. By making your GPU work together with your CPU for general tasks you can accomplish those tasks in a fraction of the time it takes for your laptop’s CPU to do it alone.
The fact is that while both desktop and notebook users stand to benefit from this technology, notebook processing power is still at a bit of a premium, probably more than it has been in some time. Intel is currently the only CPU manufacturer to offer mobile quad-core processors, but these quads still draw a lot of power and generate a lot of heat, making them unfit for general notebook implementation and sidelining them to larger desktop replacement machines. For the mobile user, being able to leverage some of the processing power of a GPU that ordinarily sits idle could be a welcome improvement.
GPU computing isn’t widespread yet, but it’s coming … at least if Nvidia and AMD have anything to say about it.
Prior to Windows Vista and DirectX 10, graphics hardware was largely specialized, with each part of the chip dedicated to one job, and really only useful for rendering 3D graphics. Without getting into too much technical detail about what the parts of the GPU did, we can simply say there was dedicated hardware for handling pixel shaders, dedicated hardware for handling vertex shaders, and dedicated hardware for handling texturing.
Even if Windows Vista and DirectX 10 can be considered marginal failures (DirectX 10 has seen woefully minimal market penetration), the technological jump that DirectX 10 precipitated should not be underestimated. DX10-class hardware had to shift to unified shader hardware: shader units that could handle pixel, vertex, or the new geometry shader work. This jump had two major consequences.
The first impact was the immediate one: With pixel and vertex shaders handled by the same hardware, GPU resources could be balanced more effectively in games, resulting in improved performance and less untapped potential in the GPU itself. Before DX10, ATI in particular had been playing a balancing act trying to figure out whether or not a GPU should have more pixel shaders than vertex shaders, and how many more. With DX10, this became irrelevant.
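To put rough numbers on that balancing act, here is a minimal sketch in Python. It is an idealized model, not a description of any real GPU: the unit counts and workload figures are made up for illustration, and the function names are hypothetical. It simply assumes work divides evenly across units, and compares a chip with a fixed split of pixel and vertex units against one with the same number of unified units.

```python
def fixed_split_time(pixel_work, vertex_work, pixel_units, vertex_units):
    """Time per frame when each kind of work can only run on its own units.

    The frame is finished only when the slower of the two groups is done.
    """
    return max(pixel_work / pixel_units, vertex_work / vertex_units)


def unified_time(pixel_work, vertex_work, total_units):
    """Time per frame when any unit can take any kind of work."""
    return (pixel_work + vertex_work) / total_units


# Hypothetical chip: 32 shader units, either split 24 pixel + 8 vertex
# or pooled as 32 unified units. Workloads are in arbitrary units.
workloads = {
    "pixel-heavy scene": (240, 80),
    "vertex-heavy scene": (80, 240),
}

for name, (pixel_work, vertex_work) in workloads.items():
    fixed = fixed_split_time(pixel_work, vertex_work, 24, 8)
    unified = unified_time(pixel_work, vertex_work, 32)
    print(f"{name}: fixed split = {fixed}, unified = {unified}")
```

In this toy model the fixed split only keeps up when the scene happens to match the ratio the chip designer guessed. When the workload leans the other way, most of the pixel units sit idle while the vertex units become the bottleneck, and the unified design finishes the same work in a third of the time.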
The second is the more far-reaching impact: The shader hardware itself became substantially more programmable to the point where it could now be used for more generalized computing and calculation instead of just graphics. Though this change has very few immediate applications, it is rippling out gradually.
This gave rise to “General Purpose GPU” computing … and it’s changing the future of PCs.
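For readers curious what “generalized computing” on shader hardware looks like, here is a minimal sketch of the programming model, written in plain Python so it runs anywhere. It does not use any real GPU API; the names `saxpy_kernel` and `launch` are hypothetical. The idea it illustrates is that you write one small function (a “kernel”) that works on a single element, and the hardware runs that same function across thousands of elements at once. Here the “launch” is just an ordinary loop on the CPU standing in for that parallel hardware.

```python
def saxpy_kernel(i, a, x, y, out):
    """Compute one element of out = a * x + y.

    Each call touches only index i, so no call depends on any other.
    That independence is what would let a GPU run them all in parallel.
    """
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a GPU launch: run the kernel once per index.

    A real GPU would spread these n calls across its shader units;
    this sketch just runs them one after another on the CPU.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

Nothing about this calculation involves pixels or triangles, which is the point: once shader units can run arbitrary small functions like this one, any job that breaks down into many independent pieces becomes a candidate for the GPU.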
The Present: CUDA and Stream
Right now, GPU computing is fairly limited. As far as the average consumer is concerned there are basically two types of GPU computing options on the market: Nvidia’s CUDA and ATI’s Stream.
Nvidia has been pushing its CUDA platform very aggressively, but the platform’s proprietary nature has no doubt limited its appeal. I will admit to having a bias for ATI and AMD, and a large part of this is because while Nvidia has laid a substantial amount of groundwork for GPU computing in general, it has kept that work wholly proprietary with CUDA and tends to be fairly aggressive about locking out the competition. Still, one shouldn’t underestimate Nvidia’s efforts to make consumers aware of GPU computing.
One of Nvidia’s biggest wins with CUDA has been Elemental’s Badaboom video transcoding software, which is able to use the unified shaders in Nvidia dedicated graphics to substantially accelerate video encoding. Translation: You can convert a massive video file on your computer for your iPod in a matter of seconds rather than a matter of minutes.
ATI has a similar application with its own AVIVO encoder (part of ATI’s Stream technology), but recent tests have shown that ATI’s Stream produces somewhat inferior end results to Nvidia’s CUDA. Still, video encoding is a task that can take a substantial amount of time and processing power, and with more people transcoding videos downloaded from the internet for use on iPhones, PSPs, Zunes, and so on, it’s a handy application of the technology.
The other major consumer-level victory Nvidia has had with CUDA is its PhysX middleware, formerly Ageia’s PhysX. PhysX is used for accelerating physics calculations in games which support it (games like Mirror’s Edge and Batman: Arkham Asylum) and it can produce a notably improved gaming experience. The key point here is that while the technology is being used in a game, the GPU itself is also actually handling raw physics calculations (traditionally only handled by the CPU).
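To give a feel for why physics is a good fit for this kind of hardware, here is a minimal sketch in Python of one simulation step for a handful of particles falling under gravity. This is not PhysX code and does not reflect how PhysX works internally; the function name, the time step, and the particle values are all assumptions chosen for illustration. What it shows is that each particle’s update depends only on that particle’s own position and velocity, so every particle could in principle be handed to a different shader unit.

```python
GRAVITY = -9.8  # assumed acceleration along y, in units per second squared


def step_particle(position, velocity, dt):
    """Advance one particle by one time step (simple Euler-style update).

    Takes and returns (x, y) tuples. Only this particle's own state is
    read, which is what makes the work easy to run in parallel.
    """
    vx, vy = velocity
    vy = vy + GRAVITY * dt              # gravity changes vertical speed
    x, y = position
    new_position = (x + vx * dt, y + vy * dt)
    return new_position, (vx, vy)


# Hypothetical debris particles: (position, velocity) pairs.
particles = [
    ((0.0, 10.0), (1.0, 0.0)),
    ((5.0, 20.0), (-2.0, 3.0)),
    ((2.0, 15.0), (0.0, 0.0)),
]

# On a CPU this loop handles one particle at a time. A GPU would run
# the same update for every particle at once.
particles = [step_particle(pos, vel, dt=0.5) for pos, vel in particles]

for pos, vel in particles:
    print(pos, vel)
```

A game scene with thousands of pieces of debris, cloth points, or smoke particles repeats an update like this for every one of them on every frame, which is exactly the sort of repetitive, independent math that hundreds of shader units can chew through together.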
This is all ignoring the major inroads Nvidia and CUDA have made in the enterprise and educational sectors, where Nvidia’s hardware is used to accelerate all kinds of different calculations. These applications aren’t as relevant to you, but they exist as proof of concept.
AMD/ATI’s Stream has seen somewhat less adoption, but marketing has never been AMD’s strong suit. I love them dearly, but the benefits of Stream just aren’t getting as much press as CUDA. Presently I can only name the mediocre in-house AVIVO video encoder and ArcSoft’s TotalMedia Theatre as software that uses Stream, and TotalMedia Theatre can also utilize CUDA for the same tasks.
What you end up with is a tenuous sort of situation where Nvidia has done a boatload of legwork in trying to standardize GPU computing, and while its head is in the right place, its heart is not. Nvidia is more or less locking PhysX to Nvidia chips and pushing CUDA as a proprietary standard rather than trying to develop something that will run on all PCs. This kind of approach tends to be short-sighted and threatens to put an end to what would otherwise be a popular new technology.
The Future: OpenCL and DirectCompute
Of course, there is still plenty of potential in GPU computing. Two vendor-neutral standards have materialized and are gradually being accepted … standards which will run on any compliant hardware from ATI or Nvidia: OpenCL and DirectCompute.
OpenCL is handled by the Khronos Group, the same partnership that (mis)handles the OpenGL platform, and it has already been integrated into the most recent release of Apple’s Mac OS X, Snow Leopard. Both ATI and Nvidia graphics hardware can run OpenCL code, and there’s no reason to believe that OpenCL won’t be utilized. OpenCL is also more platform agnostic, and OpenCL code can run in Linux or Windows in addition to Mac OS X.
The other standard is DirectCompute, which comes part and parcel with DirectX 11, which in turn comes with Windows 7 (and will be available on Windows Vista as well). As part of the DirectX 11 package, DirectCompute has the benefit of higher visibility for programmers in the Windows environment and allows for some impressive additional calculations even within games, similar to PhysX but with even greater flexibility.
The most important thing to note is that there are now two vendor-agnostic standards in place for GPU computing, which will hopefully lead to the widespread use of GPU computing in notebooks in 2010. Budget notebook users in particular may want to take note: if you previously thought it was okay to buy a laptop with Intel integrated graphics, you may finally have a reason to buy a notebook with dedicated graphics.
In all of this, the big takeaway is that graphics hardware is growing beyond its original purpose. The GPU is turning into a useful coprocessor capable of doing some very impressive heavy-lifting. This serves to more fully utilize the hardware available in modern notebooks. GPU computing has the potential to provide another major leap forward in overall performance and reinforce “laptops” as worthwhile mobile workstations … regardless of whether you’re using an older 17-inch gaming notebook or a new 12-inch ultraportable with entry-level discrete graphics.
The best part is the price. Users who already have the hardware on hand capable of using OpenCL (Nvidia 8-series and newer, ATI Radeon HD 4000-series and newer) or DirectCompute (Nvidia 8-series and newer, ATI Radeon HD 2000-series and newer) will enjoy this increased functionality just by updating their video drivers.
Over the next year or so you can expect to see software which takes advantage of GPU acceleration gradually coming out of the woodwork … one more reason to buy notebooks with discrete graphics.