Tag Archives: Morphological Antialiasing

More on God of War III Antialiasing

Since my recent post discussing the antialiasing method used in God of War III, Cedric Perthuis (a graphics programmer on the God of War III development team) was kind enough to email some additional details on how the technique was developed, which I will quote here:

“It was extremely expensive at first. The first not so naive SPU version, which was considered decent, was taking more than 120 ms, at which point, we had decided to pass on the technique. It quickly went down to 80 and then 60 ms when some kind of bottleneck was reached. Our worst scene remained at 60ms for a very long time, but simpler scenes got cheaper and cheaper. Finally, and after many breakthroughs and long hours from our technology teams, especially our technology team in Europe, we shipped with the cheapest scenes around 7 ms, the average Gow3 scene at 12 ms, and the most expensive scene at 20 ms.

In term of quality, the latest version is also significantly better than the initial 120+ ms version. It started with a quality way lower than your typical MSAA2x on more than half of the screen. It was equivalent on a good 25% and was already nicer on the rest. At that point we were only after speed, there could be a long post mortem, but it wasn’t immediately obvious that it would save us a lot of RSX time if any, so it would have been a no go if it hadn’t been optimized on the SPU. When it was clear that we were getting a nice RSX boost ( 2 to 3 ms at first, 6 or 7 ms in the shipped version ), we actually focused on evaluating if it was a valid option visually. Despite of any great performance gain, the team couldn’t compromise on quality, there was a pretty high level to reach to even consider the option. And as for the speed, the improvements on the quality front were dramatic. A few months before shipping, we finally reached a quality similar to MSAA2x on almost the entire screen, and a few weeks later, all the pixelated edges disappeared and the quality became significantly higher than MSAA2x or even MSAA4x on all our still shots, without any exception. In motion it became globally better too, few minor issues remained which just can’t be solved without sub-pixel sampling.

There would be a lot to say about the integration of the technique in the engine and what we did to avoid adding any latency. Contrarily to what I have read on few forums, we are not firing the SPUs at the end of the frame and then wait for the results the next frame. We couldn’t afford to add any significant latency. For this kind of game, gameplay is first, then quality, then framerate. We had the same issue with vsync, we had to come up with ways to use the existing latency. So instead of waiting for the results next frame, we are using the SPUs as parallel coprocessors of the RSX and we use the time we would have spent on the RSX to start the next frame. With 3 ms or 4 ms of SPU latency at most, we are faster than the original 6ms of RSX time we saved. In the end it’s probably a wash in term of latency due to some SPU scheduling consideration. We had to make sure we could kick off the jobs as soon as the RSX was done with the frame, and likewise, when the SPU are done, we need the RSX to pick up where it left and finish the frame. Integrating the technique without adding any latency was really a major task, it involved almost half of the team, and a lot of SPU optimization was required very late in the game.”

“For a long time we worked with a reference code, algorithm changes were made in the reference code and in parallel the optimized code was being optimized further. the optimized version never deviated from the reference code. I assume that doing any kind of cheap approximation would prevent any changes to the algorithm. There’s a point though, where the team got such a good grip of the optimized version that the slow reference code wasn’t useful anymore and got removed. We tweaked some values, made few major changes to the edge detection code and did a lot of testing. I can’t stress it enough. every iteration was carefully checked and evaluated.”

So it looks like my first impression of such techniques – that they are too expensive to be feasible on current consoles – was not that far off the mark; I just hadn’t accounted for what a truly heroic SPU optimization effort could achieve. I wonder what other graphics techniques could be made fast enough for games, given a similar effort?

Morphological Antialiasing in God of War III

Eric wrote a post back in July about a paper called Morphological Antialiasing which had been presented at HPG 2009 (source code for the paper is available here).  The paper described a post-processing algorithm which could smooth out edges as if by magic.  Although the screenshots were impressive, the technique seemed too expensive to be practical for games on current hardware; also there were reportedly bad artifacts when applied to moving images.  For these reasons I didn’t pay too much attention to the technique.  It was reported (including by us) that the game The Saboteur was using this technique on the PS3 but this turned out to be a false alarm.

However, God of War III is actually using Morphological Antialiasing.  I’ve looked closely at the game and the technique they use appears not to exhibit significant motion artifacts; it definitely looks better than the MSAA2X technique it replaced (which was used in the E3 2009 demo).  According to the game’s art director, the technique used “goes beyond” the original paper; this may mean that they improved it in ways that reduce the motion artifacts.

My initial impression that the technique is too expensive did not take into account the impressive horsepower of the PS3’s Cell chip.  After optimization, the technique runs in 20 milliseconds on a single SPU; running it on 5 SPUs in parallel enables it to complete execution in 4 milliseconds.  Most importantly, turning off MSAA saved them 5 milliseconds of GPU time, which on the PS3 is a significant gain (the GPU is most often the bottleneck on PS3 games).