Tag Archives: performance

Don’t be mean

[Some on Twitter noted that I should be using milliseconds instead of FPS. This kind of misses the point, but let’s avoid distractions, here’s the article with that change. The sad part is that you then miss my hilarious joke about how I use FPS in the article, because if I used SPF you’d think I was talking about tanning. Which makes me think of another joke about rendering cows and the time it then takes to tan their hides. I’m full of great dad jokes.]

I think I’m reading “The Economist” too much, as I keep trying to come up with punny article titles. Sorry.

So, how do you measure a representative value for milliseconds per frame?

I don’t care about the mechanics, which timer call you use, etc. Just assume you successfully start timer/end timer and get some length of time in milliseconds for the frame. What do you do with these timings?

I usually see things such as an average, or a running average (average of last 20 or 50 or 100 or whatever frame times). I think this is mostly bad. As someone pointed out, almost everyone has more than the average number of legs. I find the same: in a given run there can sometimes be some frames where things noticeably slow down for whatever reason, some load on the computer. What you’re often trying to measure (as a graphics developer) is the performance of the rendering system itself, not the computer’s overall performance.

So, I currently use one of these two, or both: shortest time, or median time, over whatever set of frame times I have. Both have their uses. Shortest time is justifiable (to me, at least) because, assuming you have a very fine-grained timer, your best time is in some sense the “purest” measurement of the time a frame takes. Whatever other processes in your system are slowing down the other frames isn’t your concern. The timer doesn’t lie, you really did go that fast for one frame.

The other measure I’m OK with is the median. If your benchmarking system is going through a series of different frames (an animation or simulation is running, or the camera is orbiting, etc.), then grabbing the median frame is good. Choosing it instead of the average then doesn’t give so much weight to outliers. Better yet, graph the results and see whether the outliers are consistent.

Update: A number of game and VR developers pointed out that their major interest is maximum frame time. Makes sense: for a good experience (especially with VR) you don’t want to drop below your target of 30 FPS, 60 FPS, or 90 FPS.

My point is that the average, the mean, is not so good: often external slowdowns throw off the average enough and at random enough intervals that the average is very noisy and so, pretty useless. Taking the median, the central time of the sorted set, cuts out much of this variance, making each sample have an equal effect on the result.

Anyway, that’s where I’m at with benchmarking. What do you do? Comment here, tweet-reply, or email me at erich@acm.org and I’ll summarize.

p.s. pro tip: walk through your rendering pipeline every once in awhile, watching each step. It’s hard to really know where the time goes without doing so. I did this last week while looking at another bug and found a little logic error was causing a certain path to always do an additional post-process when it usually wasn’t needed. Free performance boost with a two-line fix! But, not something discoverable by benchmarking, because the variance is too much to notice “just” a few frames of difference.

This happens every few years. My favorite lucky find was around 15 years ago, walking through code in an established project and seeing that it was rendering twice for each time it displayed. A one-line change gave us 2x performance.

Tools, tools, tools

Last month I mentioned gDEBugger being free and the joys of cppcheck. Here are some others that have crossed my path for one reason or another. Please do let me know (and so let us all know) about any worthwhile tools and libraries I haven’t blogged about – part of the reason for putting out this list is in hopes of learning of tools I haven’t heard of yet.

  • There is now a free version of AQTime, a commercial application that finds memory leaks and performance bottlenecks.
  • The Intel Graphics Performance Analyzers are supposed to be good stuff, and free – you just sign up for the Visual Adrenaline Program. I haven’t used them, but know people that have (hey, there’s Dan Baker on Intel’s page – nice).
  • Intel’s Parallel Inspector, despite its name, is particularly strong at finding memory leaks in any programs. Free month trial.
  • NVIDIA’s Parallel Nsight, also despite its name and focus of its advertising, is not just for CUDA and DirectCompute debugging and analysis, it also works on DirectX 10 and 11 shaders – you’ll need two machines networked together, one to run the shader and the other to control it. The Standard version is free, though when you sign up for it you also get a time-limited “we hope you get hooked” Professional license. Due to a currently-goofy pair of machines in my office (on different networks, and one’s a Mac I use purely as a Windows box), I haven’t gotten to try it out yet, but the demos look pretty great.
  • The Windows Performance Analysis Tools are evidently worthwhile for checking coarse-grained performance and bottlenecks for Windows programs. Again, free. I’ve heard that a number of groups have used xperf to good effect.
  • On an entirely different subject, HLSL2GLSL does a good job of translating most DirectX 9 (only) HLSL shaders to – wait for it – GLSL. Open source, and more info here, which discusses related efforts (like Mojoshader) and translation in the other direction.
  • Not really a tool per se, but still cool to see: here’s a way to find out how much free GPU memory is left for your OpenGL application. Anyone know any way to do this sort of thing with DirectX and Vista/Windows 7?
  • Will WebGL take off? Beats me, but it’s nice to see there’s an inspector, similar to gDEBugger and PIX.
  • GLM is a C++ math library particularly well-suited for use with (but not at all dependent on) OpenGL.
  • Humus points out that the old workhorse PIX now has new functionality that lets you assign names to objects, making debugging easier.
  • While I was messing with his binvox and viewvox programs, Patrick Min pointed out there’s a free 3DS file format library out there, lib3ds. I tried it out and it did the job well, taking very little time for me to integrate into my own private copy of binvox.