Tag Archives: optimization

“Video Game Optimization” – a good book

I had the chance to spend some quality time with Preisz & Garney’s recent book “Video Game Optimization” a few weeks back, as I was trapped on a 14-hour plane flight. I hardly spent all that time with it, though I probably should have spent more. Instead, “Shutter Island” and “It’s Complicated” (with bad audio) are four hours out of my life I’ll never get back.

This book goes from soup to nuts on the topic: types of optimization, how to set and achieve goals, discussion of specific tools (VTune, PIX, PerfHUD, etc.), where bottlenecks can occur and how to test for them, and in-depth coverage of CPU and GPU issues. Graphics and engine performance are the focus, including multicore and networking optimization, plus a chapter on consoles and another on managed languages. Some of the information is in the “obvious if you’ve done it before” category, but critical knowledge if you haven’t, e.g., the first thing to do when optimizing is to create some good benchmark tests and lay down the baselines.
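
That last bit of advice is easy to skip past, so here’s roughly what it means, as a minimal sketch (the function names and the stand-in workload are mine):

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>

// Stand-in workload; in a real project this is the actual frame update/draw.
void renderFrame() {
    volatile double sink = 0.0;
    for (int i = 0; i < 100000; ++i)
        sink = sink + std::sqrt(double(i));
}

// Average frame cost over many runs: record this baseline before touching
// anything, then rerun after each change to verify the gain is real.
double benchmarkMs(int frames) {
    using clock = std::chrono::steady_clock;
    auto start = clock::now();
    for (int i = 0; i < frames; ++i)
        renderFrame();
    std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / frames;
}

int main() {
    std::printf("baseline: %.3f ms/frame\n", benchmarkMs(200));
}
```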

There are many specific tips, such as turning on the DirectX Debug runtime and seeing if any bugs are found. Even if your application appears to run fine with problems flagged, the fact that they’re being flagged is a sign of lost performance (the API has to recover from your problem) or possible bugs. I hadn’t really considered that aspect (“code works even with the warnings, why fix it?”), so I plan to go back to work with renewed vigor and eliminate these warnings whenever I see them.
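
For the DX9-era runtime the book discusses, the debug runtime is switched on in the DirectX Control Panel rather than in code; in Direct3D 11 the equivalent is a device-creation flag. A minimal sketch of the D3D11 version:

```cpp
#include <d3d11.h>

// Create the device with the debug layer enabled in development builds,
// so the runtime reports API misuse to the debug output. Warnings are
// worth fixing even when the app appears to run correctly.
HRESULT createDevice(ID3D11Device** device, ID3D11DeviceContext** context) {
    UINT flags = 0;
#if defined(_DEBUG)
    flags |= D3D11_CREATE_DEVICE_DEBUG;  // requires the SDK debug layers
#endif
    return D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
                             flags, nullptr, 0,  // default feature levels
                             D3D11_SDK_VERSION, device, nullptr, context);
}
```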

I also liked reading about how various optimizing compilers work nowadays. The main takeaway for me was to not worry about little syntactic tricks any more; most modern optimizers are good enough to make the code quite fast without them.

There’s very little in this book with which I can find fault. I tested a few terms against the index. About the only lack I found was for the “small batch problem”, where it pays to merge small static meshes into a single large mesh when possible. This topic does turn out to be covered (Chapter 8), but the index has nothing under “batch”, “batching”, “small batch”, etc. There is also no index entry for “mesh”. So the index, while present (and 12 pages long), does have at least one hole I could detect. There are other little index mismatches, like “NVIDIA PerfHUD Tool” and “NvPerfHud Tool” being separate entries, with different pages listed. Typo-wise, I found one small error: on page 123, the first line should say “stack” instead of “heap”, I believe.
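
As for the batching itself, the mechanics are simple in concept: concatenate the vertex data and rebase the indices. A toy sketch (my types, assuming the meshes share a material and are already in world space):

```cpp
#include <cstdint>
#include <vector>

struct Vertex { float px, py, pz, nx, ny, nz, u, v; };

struct Mesh {
    std::vector<Vertex>   vertices;
    std::vector<uint32_t> indices;
};

// Merge many small static meshes into one, so the whole group renders in
// a single draw call instead of paying per-call overhead for each piece.
Mesh mergeStatic(const std::vector<Mesh>& meshes) {
    Mesh merged;
    for (const Mesh& m : meshes) {
        uint32_t base = static_cast<uint32_t>(merged.vertices.size());
        merged.vertices.insert(merged.vertices.end(),
                               m.vertices.begin(), m.vertices.end());
        for (uint32_t i : m.indices)
            merged.indices.push_back(base + i);  // rebase into merged buffer
    }
    return merged;
}
```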

Executive summary: it’s a worthwhile book for just about anyone interested in optimization. These guys are veteran experts in this field, and the book gives specific advice and practical tips in many areas. A huge range of topics is covered, the authors like to run various experiments and show where problems can occur (sometimes the cases are a bit pathological, but still interesting), and there are lots of bits of information to mull over. Long and short, recommended if you want to know about this subject.

To learn more: first, look inside the book on Amazon. We’ve mentioned Eric Preisz’s worthwhile Gamasutra article on video game optimization here before. A very early outline of the book appears on vertexbuffer.com. For me, it’s great to see that this is a passion for the first author – that comes through loud and clear in this book. I’ve added it to our recommended books section.

One little update: Carmack’s inverse sqrt trick, mentioned in the book on page 155, is dated for the PC. According to Ian Ameline, “It has been obsolete since the Pentium 3 came out with SSE. The rsqrtss/rsqrtps instructions are faster still and have better and more predictable accuracy. Rsqrtss + one iteration of Newton/Raphson gives 22 (of 23) bits of accuracy, guaranteed.”
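
In code, that suggestion looks something like the sketch below (the arrangement is mine; the refinement step is the standard Newton-Raphson iteration for y ≈ 1/sqrt(x)):

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// rsqrtss gives roughly 12 bits of accuracy; one Newton-Raphson step,
// y' = y * (1.5 - 0.5 * x * y * y), refines it to ~22 of 23 mantissa bits.
float fastInvSqrt(float x) {
    float y = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    return y * (1.5f - 0.5f * x * y * y);
}
```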

Upcoming Optimization Book

Eric Preisz has a book coming out in time for GDC, “Video Game Optimization.” I haven’t seen it yet, but judging by his article on optimization on Gamasutra, it should be pretty good—he knows what he’s talking about.

By the way, assuming you’re using Google Chrome for browsing (it’s what the cool kids use), I found AutoPager Chrome to be a nice little extension. Instead of needing to click at the end of every page of an article, it glues page after page into one long scroll.

More With the Links

I love the movie sequel title “2 Fast 2 Furious”. How clever, and a great way to guarantee there will never be a third movie. Well, there was, but they had to go the colon route, “… : Tokyo Drift”.

Which is indicative of nothing, as I don’t think I’ve ever actually seen any of these movies. I was reminded of the title as my goal today is to whip through the backlog of 72 potential blog resource links I’ve been gathering on del.icio.us. [Well, as it turns out, I got through 39 of them (the fresher ones), 33 to go…]

ShaderX7 has been published. We hope to give it an overview sometime soon (mine’s on backorder from Amazon.com).

From various sources I heard that OnLive got a bit of notice at GDC. Think: pure server-side computation of all graphics for a game, i.e., a cloud computing model. Now your grandma’s computer or even a rigged-out TV can play Crysis, assuming the net bandwidth is there. Which of course makes me think: what about latency? Lag in how other players see your actions is always there, and causes mismatches (“how did I instantly die?”). But adding lag between your own actions and seeing their consequences seems like a non-starter for shooters, at least.

Mark DeLoura has a great two-part article on which game engines are being licensed for titles. The first part is a general survey, the second is about the technology involved. I found it interesting to see what people cared about, e.g., multicore is on people’s minds. Nothing too shocking here, but it’s fantastic to see what is getting used, and why, in this marketplace.

Related to this, I happened across a list of game engines on Wikipedia. Not massively useful (e.g., no sense of what’s popular), but a starting place.

John Ratcliff has a graphics math library available for download with an unrestrictive reuse license. He recently added best-fit methods for AABBs and OBBs.
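
The AABB half of that is the easy part, just a component-wise min/max sweep over the points; best-fit OBBs are where the real work is (covariance analysis or a search over orientations). A minimal AABB sketch (my code, not Ratcliff’s):

```cpp
#include <algorithm>
#include <cfloat>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Best-fit axis-aligned bounding box: the tightest box is simply the
// component-wise min and max over all points.
AABB fitAABB(const std::vector<Vec3>& pts) {
    AABB box = {{ FLT_MAX,  FLT_MAX,  FLT_MAX},
                {-FLT_MAX, -FLT_MAX, -FLT_MAX}};
    for (const Vec3& p : pts) {
        box.min.x = std::min(box.min.x, p.x); box.max.x = std::max(box.max.x, p.x);
        box.min.y = std::min(box.min.y, p.y); box.max.y = std::max(box.max.y, p.y);
        box.min.z = std::min(box.min.z, p.z); box.max.z = std::max(box.max.z, p.z);
    }
    return box;
}
```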

I was interested to look at the open source, cross-platform (!) model viewer GLC. I’ve wanted something like this for doing some experiments with mesh manipulation. Not a bad viewer, but that’s all it is at this point, unfortunately: you can’t even export to a different 3D format. The search continues… If you know of a reasonable open source 3D file viewer/converter out there, please tell me. I should probably bite the bullet and just use Blender, but that application is way overkill for my needs.

CUDA voxel rendering – pretty impressive!

I liked this post on optimization mainly because of the line “I went in and found out that some title bar was getting rendered 140 times every time you refreshed the screen”. I can entirely relate (though 140 must be some kind of record): too many times I’ve put in output debugging statements showing updates, only to see 2, 3, or 6 updates happening. I once started on a project and in the first few weeks increased performance by 100%, simply by noting the main draw path was being executed twice each frame.
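
The cheap way to catch this sort of redundancy is a counter on the suspect path, checked and reset once per frame. A sketch (names are mine):

```cpp
#include <cstdio>

static int g_titleBarDraws = 0;

void drawTitleBar() {
    ++g_titleBarDraws;  // instrument the path under suspicion
    // ... actual drawing ...
}

void endFrame() {
    if (g_titleBarDraws > 1)  // expected exactly once per frame
        std::fprintf(stderr, "title bar drawn %d times this frame\n",
                     g_titleBarDraws);
    g_titleBarDraws = 0;
}
```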

Speaking of performance, there’s an article on volume rendering optimizations when using a ray-casting approach on the GPU.

Wolfenstein source code for the free iPhone version, along with Carmack’s documentation on the project, is available.

Software patents are only slightly dumber than business method patents, which are patently absurd. I hadn’t noticed until now, but there was recently a ruling on a business method patent, In re Bilski, which has been used to strike down software patents.

A detailed data and execution flow diagram for the new DirectX 11 pipeline front-end is available from Jolly Jeffers.

People are still making ray-tracing specific hardware; witness Caustic Graphics. They have a rather amazing claim: “The CausticOne, however, thrives in incoherent raytracing situations: encouraging the use of multiple secondary rays per pixel. Its level of performance is not affected by the degree of incoherence.” Good trick. That said, I can’t say I see any large customer base for such a product. This seems like a company designed for acquisition, similar to Ageia. Fine by me, best of luck to them.

I’m happy to learn that the Humus site now has a news blog. This is a great site for demos of advanced techniques, and for honest comments about strengths and limitations of various approaches.

Another blog: The Geeks of 3D. Tracks demos, APIs, SDKs, and graphics card releases. Handy – some of the links here I found there.

There was a nice little article on data alignment on Gamasutra. Proper alignment is a key element in getting high performance.
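
Two of the standard tricks, as a sketch: order struct members largest-first so the compiler doesn’t insert padding holes, and over-align data meant for SIMD loads:

```cpp
// Mixed member ordering forces padding to each member's natural boundary.
struct Padded {      // 24 bytes on a typical 64-bit ABI
    char   tag;      // 1 byte + 7 bytes padding
    double value;    // 8 bytes
    char   flag;     // 1 byte + 7 bytes tail padding
};

// Same members, largest first: 16 bytes.
struct Packed {
    double value;
    char   tag;
    char   flag;     // 6 bytes tail padding
};

// Over-align hot vector data so 16-byte SIMD loads take the aligned path.
struct alignas(16) Vec4 { float v[4]; };

static_assert(sizeof(Packed) < sizeof(Padded), "reordering saved space");
```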

I was trying to find the name of the projection that maps a surrounding spherical environment with equally spaced latitude and longitude lines. From this interesting page (click on the “Wall Maps of the World” text) I found it: Plate Carrée.
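
The mapping is as simple as the projection’s look suggests: longitude and latitude scale linearly to texture coordinates. A sketch (the axis conventions are my assumption):

```cpp
#include <cmath>

// Map a unit direction to Plate Carrée (equirectangular) coordinates.
void dirToPlateCarree(float x, float y, float z, float& u, float& v) {
    const float pi = 3.14159265358979f;
    float lon = std::atan2(z, x);      // [-pi, pi], assuming y is "up"
    float lat = std::asin(y);          // [-pi/2, pi/2]
    u = (lon + pi) / (2.0f * pi);      // [0, 1] across the map
    v = (pi / 2.0f - lat) / pi;        // [0, 1], top row = north pole
}
```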

Predicting the future is so much more interesting than predicting the past. I love this: MIPS per $1000. It’s entertaining to equate raw computing power with structured processing. By the same equivalence, I should be able to hook up 1700 mice in parallel to get a human brain.

A great line from a GPU review: “Nvidia’s new line of unbelievably expensive cards will block out the sun, and ray-trace its own shadow in real time.”

Faber College’s motto is “Knowledge is Good”. Learning about the idea of metamers would have saved this article from confusion. Coming back to this article now, I see all the comments have been removed, and an apologia trying to convert confusion into enlightenment added, but I think this still misses the point. Sure, there is a color associated with a single wavelength of light. But, my guess is that 99.99% of the colors we perceive arrive at any location on the eye as light with a spectral mix of wavelengths, not a single wavelength (Naty will correct me if I’m wrong). Unless you’re Dr. Evil and deal with sharks with frickin’ laser beams on their heads on a daily basis. Hmmm, I’m probably forgetting some other single-wavelength phenomena, like fluorescence. Anyway, the article did lead me to look up more information on metamers on Wikipedia, where I learnt about metameric failure, a term I hadn’t heard before. One more reason a simple RGB representation of color isn’t sufficient.

Cute thing: Snapily lets you turn some set of images or video into lenticular prints.

I don’t have a lot to say about what I do at Autodesk. Here’s a tidbit.

Art for the day, crayons as pixels.

This and That

I’ll someday run out of titles for these occasional summaries of new(ish) resources, but in the meantime, this one’s “This and That”.

Christer Ericson’s article on dealing with grouping and sorting objects for rendering is excellent. It mostly assumes a retained mode model, but has concepts that can be applied in immediate mode.
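
The heart of the technique is packing render state into a single sortable key, so one sort groups expensive state changes together. A sketch with made-up field widths (not Ericson’s exact layout):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Most significant fields change least often: sorting by the whole key
// groups draws by pass, then shader, then texture, minimizing state changes.
uint64_t makeSortKey(uint32_t pass, uint32_t shader,
                     uint32_t texture, uint32_t depth) {
    return (uint64_t(pass    & 0xFF)     << 56) |
           (uint64_t(shader  & 0xFFFF)   << 40) |
           (uint64_t(texture & 0xFFFF)   << 24) |
            uint64_t(depth   & 0xFFFFFF);
}

struct DrawCall { uint64_t key; /* mesh, transform, ... */ };

void sortDrawCalls(std::vector<DrawCall>& calls) {
    std::sort(calls.begin(), calls.end(),
              [](const DrawCall& a, const DrawCall& b) { return a.key < b.key; });
}
```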

An element that continues to renew the field of computer graphics is that the rules change. This article is about taking Quake 2 (from 1997) and moving it to a modern GPU.

If you haven’t seen it yet, Farbrausch’s demo “debris” is truly impressive. It’s only 183,462 bytes, and is absolutely packed with procedural content. Download here (last link works). Or be lazy and watch on YouTube.

NVIDIA’s pulled together its resources for shadow generation and ambient occlusion all onto one handy page (plus ray tracing – just one entry so far, but it’s a good one).

How to deal with various rendering paradigms on multiple platforms? GRAMPS looks intriguing.

Gamasutra put a useful Game Developer article online, all about commercial middleware game engines currently available.

OpenGL will always exist, since Macs and Linux need it. It’s easier to use in college courses because of its clarity and readability. But otherwise the pendulum’s swung far towards DirectX. Phil Taylor comments on and gives some historical context to the controversy around the latest release, OpenGL 3.0.

A nice trend for OpenGL is that people continue to write useful bits, such as GLee, which manages extensions.

New info on older effects: blur and glow, volumetric clouds, and particle systems.

The glorious teapot. I like “a wireframe view”. Yes, the real thing is taller than the synthetic model, as the model makers were compensating for non-square pixels.

“What’s the future hold?” is always a fun topic, one we’ve used to end each edition of our book. I liked this presentation on SlideShare for its sheer “here are a hundred things that hurtle us towards the Singularity” feel, though I don’t buy it for a minute. SlideShare, where it is hosted, is a pleasant medium-attention-span kind of place, with all sorts of random and fun slidesets.

Finally, I am pleased to find that LittleBIGPlanet is just as gorgeous as it looked like it would be. I’ve played it myself for only a bit, but walking by when my kids are playing, I find I have to stop and stare.