Secrets of Nvidia Maxwell, Pascal power efficiency revealed in targeted testing



Back when Nvidia unveiled its second-generation Maxwell architecture, it delivered a huge improvement in performance-per-watt, all while keeping the same 28nm process node that powered its previous GPU architecture, codenamed Kepler. While the company shared some details on how Maxwell improved on Kepler, including a larger L2 cache, improved memory efficiency and a new Streaming Multiprocessor (SM) configuration, it played a number of cards extremely close to its chest.

Thanks to some investigative work by David Kanter of RealWorldTech, we now know one of the critical details that Nvidia didn't disclose. Unlike Kepler, or the various GPUs manufactured by Intel and AMD, Maxwell and the current-generation Pascal both use tiled rasterization. The alternative, immediate-mode rasterization, has been the industry standard for years. The full video and explanation of the tools that Kanter uses is embedded below, but we'll discuss the findings and their implications.

Those of you who have followed the graphics industry for the past 15 years may recall a GPU that emerged, for a brief time, as a potential challenger to ATI and Nvidia. The Kyro II was a tile-based rendering solution built by PowerVR that won some fame and market share as a strong low-cost solution. Unlike immediate-mode renderers, which draw the entire screen space left-to-right and top-to-bottom, tile-based renderers break a scene up into a tiled grid.


This image from a PowerVR presentation shows a tile-based rendering grid.
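To make the idea of a tiled grid concrete, here is a minimal Python sketch of the binning step a tile-based renderer performs: each primitive is assigned to every tile its bounding box touches. The function name `bin_primitives`, the 16-pixel tile size, and the use of axis-aligned bounding boxes are assumptions made for illustration; they do not describe PowerVR's or Nvidia's actual hardware.

```python
from collections import defaultdict

def bin_primitives(bboxes, tile_size):
    """Map each tile (tx, ty) to the indices of primitives touching it.

    bboxes: list of (x0, y0, x1, y1) pixel bounds, x1/y1 exclusive,
    assumed non-empty with non-negative coordinates.
    """
    bins = defaultdict(list)
    for index, (x0, y0, x1, y1) in enumerate(bboxes):
        # Range of tile rows and columns covered by this bounding box.
        for ty in range(y0 // tile_size, (y1 - 1) // tile_size + 1):
            for tx in range(x0 // tile_size, (x1 - 1) // tile_size + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)

# Two overlapping primitives on a grid of 16x16-pixel tiles (hypothetical data).
bboxes = [(0, 0, 20, 20), (10, 10, 40, 30)]
bins = bin_primitives(bboxes, tile_size=16)
for tile in sorted(bins):
    print(tile, bins[tile])
```

Once every primitive is binned, the renderer can process one tile's list at a time, which is what lets it keep that tile's working data in fast on-chip storage.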

By breaking a scene into tiles, the GPU can work on each tile individually, rather than attempting to render the entire scene at once. The other advantage of tile-based rendering is that the GPU can test to see whether pixels will be visible in the final image before it textures and shades them. The classic distinction between tiled rasterizers and immediate-mode rasterizers is that IMRs suffer from some degree of overdraw, meaning that they spend time shading, lighting, and drawing pixels that are then thrown away without ever being shown to the end user. There's a memory bandwidth and power cost to this, and apparently a significant chunk of Maxwell's heralded efficiency came from adopting a tiled approach.
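The cost of overdraw can be illustrated with a toy model. The sketch below counts how many pixels get shaded when rectangles are drawn in submission order with a depth test (a simplified immediate-mode model), versus shading each covered pixel exactly once after visibility has been resolved (a simplified deferred model). The function name, the rectangle data, and the 8x8 screen are all hypothetical, and real GPUs are far more sophisticated than this.

```python
def shade_counts(rects, width, height):
    """Return (immediate, deferred) counts of shaded pixels.

    rects: list of (x0, y0, x1, y1, depth) in submission order;
    a smaller depth is closer to the viewer.
    """
    inf = float("inf")
    depth_buffer = [[inf] * width for _ in range(height)]
    immediate = 0
    for x0, y0, x1, y1, depth in rects:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                if depth < depth_buffer[y][x]:
                    depth_buffer[y][x] = depth
                    immediate += 1  # shaded now, possibly overwritten later
    # Deferred model: every covered pixel is shaded once, for its nearest surface.
    deferred = sum(1 for row in depth_buffer for d in row if d != inf)
    return immediate, deferred

back_to_front = [(0, 0, 8, 8, 3.0), (0, 0, 8, 8, 2.0), (0, 0, 4, 8, 1.0)]
print(shade_counts(back_to_front, 8, 8))
print(shade_counts(list(reversed(back_to_front)), 8, 8))
```

In this toy scene, back-to-front submission shades 160 pixels in the immediate model against 64 in the deferred one, while front-to-back submission brings the immediate model down to 64. The wasted work depends heavily on draw order, which is why resolving visibility before shading is attractive.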


Immediate-mode rendering, AMD GPU

By using targeted tests, Kanter was able to show how AMD and NV GPUs differ. The image above is from an AMD GPU using standard immediate-mode rendering. The test application draws triangles across the screen, and what we see above is one triangle replacing another, top-to-bottom, right-to-left.


Tile-based rendering, Nvidia

Here's an example of the same test running on a Maxwell GPU. Instead of a contiguous surface, we see a series of blocks, or tiles, across the screen. Kanter spends time stepping through various test scenarios, illustrating how the size of the tiles shrinks as the amount of data within each tile grows. The current theory is that Maxwell and Pascal dynamically adjust tile size depending on how much work needs to be done in each tile. This keeps the amount of data that needs to be stored about each completed tile within whatever buffer or cache limits Nvidia has set.
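Nvidia has not disclosed how tile sizes are chosen, so the following Python sketch is purely speculative. It shows one way the observed behavior could arise: pick the largest tile whose per-pixel data still fits in a fixed on-chip budget. The function name `pick_tile_side`, the 256 KiB budget, the square power-of-two tiles, and the byte counts are all assumptions made for illustration, not measured or documented values.

```python
def pick_tile_side(bytes_per_pixel, budget_bytes, max_side=256, min_side=8):
    """Return the largest power-of-two tile side whose data fits the budget.

    All parameters are hypothetical; real hardware limits are not public.
    """
    side = max_side
    # Halve the tile side until the tile's data fits, or the minimum is reached.
    while side > min_side and side * side * bytes_per_pixel > budget_bytes:
        side //= 2
    return side

budget = 256 * 1024  # assumed on-chip budget in bytes
for bytes_per_pixel in (4, 16, 64):
    print(bytes_per_pixel, pick_tile_side(bytes_per_pixel, budget))
```

Under these assumed numbers, quadrupling the data stored per pixel halves the tile side, which is in line with the shrinking tiles seen in Kanter's tests.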

There's still much we don't know about Nvidia's solution. For starters, the typical tile-based renderers from Imagination Technologies are deferred renderers, while Maxwell and Pascal are thought to use a tile-based immediate-mode rasterizer. Exactly how this impacts power or GPU efficiency is unknown. In the forum thread, Kanter notes that: "

Real-world implications

There's no way to know exactly how much of Nvidia's power savings in Maxwell were specifically the result of this new rendering method, but we'd bet it's a non-trivial part of the equation. Nvidia has played this particular card tightly because switching from an IMR to a tile-based renderer gave it a significant advantage over AMD, an advantage the company wouldn't want its main competitor to ferret out.

Then again, major tech ideas have a way of making their way to products from both companies, and tile-based rendering isn't a new idea; it's just unusual to see it in desktop hardware. Companies like PowerVR have been building tile-based rendering solutions for many years. We'll have to wait for Vega to find out what AMD is planning, but the company's first new architecture in five years is expected to debut at the end of this year or in early 2017. We'll have to wait and see if AMD leaps for a similar design or has some other tricks up its sleeve.