NVIDIA Announces Fermi Architecture

Today at the GPU Technology Conference (the successor to last year’s NVISION), NVIDIA announced Fermi, its new GPU architecture, exactly one week after AMD shipped the first GPU from its new Radeon HD 5800 series. NVIDIA has published a Fermi white paper, and writeups are popping up on the web. Of these, the ones from Real World Technologies and AnandTech seem the most informative.

With this announcement, NVIDIA is focusing firmly on the GPGPU market rather than on graphics. The graphics-specific parts of the chip (such as triangle rasterizers and texture units) were not mentioned at all. The chip looks like it will be significantly more expensive to manufacture than AMD’s, and at least some of that extra die area has been devoted to features that will not benefit most graphics applications (such as improved double-precision floating-point support and support for more general programming models). With full support for indirect branches, a unified address space, and fine-grained exception handling, Fermi is about as general purpose as a GPU gets. NVIDIA is even adding C++ support to CUDA, whereas the first iterations of OpenCL and DirectCompute will likely not enable the most general programming models.

Relative to its previous architecture, NVIDIA has reshuffled the allocation of ALUs, thread scheduling units, and other resources. To make sense of the soup of marketing terms such as “warps”, “cores”, and “SMs”, I again recommend Kayvon Fatahalian’s SIGGRAPH 2009 presentation on GPU architecture.