_ __ ______ _ __ ' ) ) / ' ) ) /--' __. __ , --/ __ __. _. o ____ _, / / _ , , , _ / \_(_/|_/ (_/_ (_/ / (_(_/|_(__<_/ / <_(_)_ / (_= 0) { tmin = (bounds[0].x() - r.origin.x()) / r.direction.x(); tmax = (bounds[1].x() - r.origin.x()) / r.direction.x(); } else { tmin = (bounds[1].x() - r.origin.x()) / r.direction.x(); tmax = (bounds[0].x() - r.origin.x()) / r.direction.x(); } etc. for y and z. ---- Williams: divx = 1 / r.direction.x(); if (divx >= 0) { tmin = (bounds[0].x() - r.origin.x()) * divx; tmax = (bounds[1].x() - r.origin.x()) * divx; } else { tmin = (bounds[1].x() - r.origin.x()) * divx; tmax = (bounds[0].x() - r.origin.x()) * divx; } Two multiplies are faster than the divide, but also handles the -0 case. 'divx' captures the sign of the r.direction even when it is zero. Consider 1/0.0 = +INF, 1/-0.0 = -INF. Willams then proposes a faster version that precomputes the inverse directions as well as the result of the sign tests (such as divx >= 0) in the ray structure. tmin = (bounds[r.sign[0]].x() - r.origin.x()) * r.inv_direction.x(); tmax = (bounds[1-r.sign[0]].x() - r.origin.x()) * r.inv_direction.x(); tymin = (bounds[r.sign[1]].y() - r.origin.y()) * r.inv_direction.y(); tymax = (bounds[1-r.sign[1]].y() - r.origin.y()) * r.inv_direction.y(); if ( (tmin > tymax) || (tymin > tmax) ) return false; if (tymin > tmin) tmin = tymin; if (tymax < tmax) tmax = tymax; tzmin = (bounds[r.sign[2]].z() - r.origin.z()) * r.inv_direction.z(); tzmax = (bounds[1-r.sign[2]].z() - r.origin.z()) * r.inv_direction.z(); if ( (tmin > tzmax) || (tzmin > tmax) ) return false; if (tzmin > tmin) tmin = tzmin; if (tzmax < tmax) tmax = tzmax; return ( (tmin < t1) && (tmax > t0) ); ---- I like Williams' approach of storing the extra data (1/dir and signs of dir) with the ray, but all the tests still worried me. There is a lot of branch misprediction in there. So I adopted the following approach of using SSE's min/max and horizontal min/max (which is only available in SSE 4). Manny version: static const SSE::Vec3 kINDVec3 = SSE::Invert4( SSE::Vec3(0.f,0.f,0.f) ); //(-1.0#IND,-1.0#IND,-1.0#IND) BBox::ClipRay(const Ray& r, float* tmin, float* tmax, float raymin, float raymax) const { const SSE::Vec3& rayorig = (r.GetOrigin().m_v); const SSE::Vec3& rayinvdir = (r.GetInvDir().m_v); SSE::Vec3 t0v, t1v; t0v = ( (m_points[0] - rayorig) * rayinvdir ); //tmins - same as Williams t1v = ( (m_points[1] - rayorig) * rayinvdir ); //tmaxs - same as Williams if (result != 0) { //handle #IND ClampTs( t0v, t1v, indmask ); } SSE::Vec3 t0v_n, t1v_n; t0v_n = SSE::Min( t0v, t1v ); //_mm_min_ps t1v_n = SSE::Max( t0v, t1v ); //_mm_max_ps SSE::Scalar interval_min, interval_max; SSE::Scalar t0, t1; //help keep them within SSE registers t0 = SSE::MaxHorz3( t0v_n ); //horizontal-max t1 = SSE::MinHorz3( t1v_n ); //horizontal-min interval_min = MAX1( t0, raymin ); //SSE 1-compoment Max interval_max = MIN1( t1, raymax ); //SSE 1-compoment Min *tmin = interval_min * kBBEPSMIN; //* 0.9999f *tmax = interval_max * kBBEPSMAX; //* 1.0001f return (interval_min <= interval_max); } This code runs very fast and hardly takes any time at all. Instead of adding an epsilon to prevent a degenerate interval, I prefer to multiply by 0.9999 and 1.0001. Adding an epsilon is not always robust due to the nature of floating point math. See [Goldberg91] or Christer Ericson's book [Ericson05] for a more detailed exposition. The horizontal min/max can be replaced by several SSE instructions on older CPUs and the code still runs well. I also found doing: t0 = MaxHorz3( t0v_n ); //horizontal-max t1 = MinHorz3( t1v_n ); //horizontal-min interval_min = MAX1( t0, raymin ); //SSE 1-compoment Max interval_max = MIN1( t1, raymax ); //SSE 1-compoment Min is a little bit faster than: t0v_n = MaxVec3( t0v_n, Vec3(raymin) ); t1v_n = MinVec3( t1v_n, Vec3(raymax) ); interval_min = MaxHorz3( t0v_n ); //horizontal-max interval_max = MinHorz3( t1v_n ); //horizontal-min ---- References ---------- [Ericson05] Real-Time Collision Detection. Morgan Kaufmann, 2005. [Goldberg91] "What Every Computer Scientist Should Know About Floating-Point Arithmetic", ACM Computing Surveys, Vol 23, No 1, March 1991. [Smits99] "Efficiency Issues for Ray Tracing", JGT, vol 3:2, 1998, pp. 1-14. [Smits02] "Efficient Bounding Box Intersection", Ray Tracing News 15:1 (2002). [Williams03] "An efficient and robust ray-box intersection algorithm", JGT, vol 10:1 (2005), pp. 49–54. ---------------------------------------------------------------------------- Puzzle Answer: Triangle/Bounding Box Ratio, by Eric Haines Last issue I asked what the average ratio of an arbitrarily-oriented equilateral triangle's area is to its bounding box's surface area, http://tog.acm.org/resources/RTNews/html/rtnv21n1.html#art4. To be clearer, I should have said "take the set of all possible orientations of an equilateral triangle, uniformly distributed, and make an axis-aligned bounding box for each one; what's the average surface area ratio?" This ratio should be proportional to the chance that a ray that pierces the bounding box (ignore rays with endpoints inside the box) then hits the triangle. It's trivial to show the ratio must be less than 1/2: if you put an oriented bounding box around any triangle and the area of the triangle will be half of base times height, and any set of axis-aligned boxes will naturally be larger than this oriented bounding box. Unfortunately, no one took the bait and solved it for me. Gilbert Bernstein did reply about this problem, proving that the length of the perimeter of any triangle which is randomly oriented is proportional to the average mean width of the set of axis-aligned bounding boxes put around this triangle. The mean width of a bounding box is the sum of its three axes, height + width + depth (see puzzle answer RTNews, http://tog.acm.org/resources/RTNews/html/rtnv21n1.html#art3). From testing below, it looks like the ratio is 4/3. To summarize: Sum of triangle's edge lengths = 4/3 * ( height + width + depth ) of the "average summed" bounding box. I like that this is true for any triangle, but it wasn't the answer I wanted - cold, hard numbers is what I was hoping to get. I decided to hack a quick perl program to give me an approximation, if not a formula. I tried it on a few common triangles: Equilateral triangle: 0.267 - 1 in 3.75 Isosceles right triangle (edges 1,1,sqrt(2): 0.243 - 1 in 4.12 Golden ratio right triangle (short edges 1,1.618): 0.226 - 1 in 4.42 Thin right triangle (short edges 1, 0.1): 0.069 - 1 in 14 Very thin right triangle (short edges 1, 0.01): 0.0077 - 1 in 130 The equilateral triangle seems like it would be a "best case", since it has the highest area to perimeter-squared ratio, and this indeed appears to be so, it giving the highest hit ratio of the triangles tested. The isosceles right triangle is a worthwhile case, in that two of these form a square, a common element in a mesh primitive. I recall reading in some "Mathematical Games" column that some person went around and measured rectangle ratios, wherever he'd see them. The average ratio was around the golden ratio, phi. A golden ratio right triangle forms such rectangles, so should also be a fairly common triangle type. The thin triangles were run just to see how bad things can get; no surprise, a randomly-oriented triangle that is needle-like, where area/perimeter is low, is likely to be contained by a proportionally huge bounding box. I think this is worth knowing, as many of us are constantly putting bounding boxes around triangles. For axis-aligned triangles, the best we can ever do is 0.5, half the time the triangle will be hit by a ray piercing the bounding box. For arbitrary triangles, the best is 0.267 for an equilateral, and more likely the triangle will be hit by a box-piercing ray less than one time in four. This hit ratio has a bearing on the performance of various efficiency schemes. For example, it appears BIH (bounding interval hierarchy) works well on laser- scanned data, where the triangles are more equilateral and rarely thin. Synthetic model data, such as long pipes in the Factory model, or even endcaps of a cylinder, produce long, thin triangles that are less likely to be hit. It seems to me an important characteristic of any scene is what its hit ratio would be for its triangles, weighted by their areas (their chance of getting hit). For a static scene, this is easily done: look at every triangle, compute its bounding box, and get the ratio of areas directly. This would be an interesting number to know: sum of triangle areas / sum of bounding box areas. Where it gets trickier is when objects in the scene are animated by rotation, which is where a formula would be handy, to know how bad a model can become when it is rotated. A model full of arbitrarily-oriented equilateral triangles to start won't be worse when rotated; a model with long, thin, axis-aligned triangles will have tight bounds initially, but terrible bounding boxes once rotated. The paper "Edge Volume Heuristic - Robust Triangle Subdivision for Improved BVH Performance" by Holger Dammertz and Alexander Keller discusses the effects of such rotation and tackles the problem of subdividing triangles that give poor triangle/box hit ratios. A tiny project, maybe good for an undergrad's study project: what's the optimal triangulation of a polygon with the goal of minimizing the sum of the bounding box areas? For example, a polyline forming a circle could be triangulated into a fan (choose one vertex and fan from there) or a triangle strip (a zig-zag of triangles), or some mix of these, or some entirely arbitrary pattern. Which is best? I think there's much more work to be done here; what is of interest to me is whether we can find statistical characterizations of a scene's data and determine any correlations with the effectiveness of various efficiency schemes. It's probably impractical (and as the Arnold article in this issue shows, other factors such as motion blur can entirely change the choice of efficiency scheme), but I like the idea that a firm could feed some typical models into an analyzer to find out what sort of efficiency scheme would do their data the most good. ---------------------------------------------------------------------------- Where's the Rest of the Issue? by Eric Haines A lot of my energy these days is put into our "Real-Time Rendering" website and blog. Some of the articles there would normally go here, instead, as they pertain to ray tracing or graphics in general. The intersections page, http://realtimerendering.com/intersections.html, continues to be somewhat relevant to ray tracing (though, really, ray/triangle and ray/bezier are probably all you need). Our graphics books recommendation section, http://realtimerendering.com/books.html#recommended, also has some relevant bits. For blog entries, first of all, I edited a "book" on ray tracing, called "Another Introduction to Ray Tracing", http://pediapress.com/books/show/another-introduction-to-ray-tracing-eric/. 471 pages, cool cover, just $22.84, with one small caveat: it's a bunch of Wikipedia articles and took me two hours to make. The free version is here: http://en.wikipedia.org/wiki/User:Erich666/Books/AnotherIntroRT. Don't like my book? Simply grab the contents of that page and make your own! I wrote an article about it and other (much scammier) print-on-demand schemes, see http://www.realtimerendering.com/blog/tag/ray-tracing/. Writing this today, I decided to actually order a physical copy, goofy as it is, just to see what that's like. It's sort of like an existence proof: a print-on-demand book doesn't really exist until you print at least one physical copy. I do recommend taking the quiz: http://www.realtimerendering.com/blog/questionable-content/, answers here: http://www.realtimerendering.com/blog/questionable-answers/ There's an entry about a trick that everyone should know, but many don't, about looping through a list of vertices: http://www.realtimerendering.com/blog/looping-through-polygon-edges/. This would have been an RTNews article if I hadn't put it in the blog. My article on morphological antialiasing (MLAA), http://www.realtimerendering.com/blog/morphological-antialiasing/, gives a brief rundown of the idea and some experiments. For scenes with sharp edges in need of antialiasing, this is an interesting option. This can be quite cheap for ray tracing, compared to the cost of shooting more rays. I find it falls apart on thin objects that span a pixel or two in width and pixels at the edges of the image are problematic, but otherwise the results are pretty convincing. Naty has a few followup articles of this type of technique being used in a game, http://www.realtimerendering.com/blog/morphological-antialiasing-in-god-of-war-iii/ and http://www.realtimerendering.com/blog/more-on-god-of-war-iii-antialiasing/. I wrote two articles on left-handed vs. right-handed coordinate systems: http://www.realtimerendering.com/blog/left-handed-vs-right-handed-world-coordinates/ and http://www.realtimerendering.com/blog/left-handed-vs-right-handed-viewing/. I'm not really satisfied with them, they're still a work in progress for me. I think this area still needs a thorough *clear* treatment from world to screen; I ignore the Y-flipping problem often run into going from Cartesian coordinates to screen coordinates. Still, I got a bit sick of seeing "just flip the Z coordinate" as the flip answer for converting from one coordinate system to another. Yes, this can work, but you have to understand what in the world you're doing and what you're giving up. Nothing to do with ray tracing, but I liked the idea of a constant luma palette: http://www.realtimerendering.com/blog/constant-luma-palette/ The Octane 2 renderer mentioned here, http://www.realtimerendering.com/blog/7-things-for-may-2/ The next-to-last entry here shows the dangers of ray tracing: http://www.realtimerendering.com/blog/clearing-the-queue/ Our most popular post over the years has been this simple one: http://www.realtimerendering.com/blog/thought-for-the-day/ Finally, some visual treats, all the stuff I didn't stick in the Roundup: http://www.realtimerendering.com/blog/visual-treats/, http://www.realtimerendering.com/blog/visual-stuff/, and http://www.realtimerendering.com/blog/shanghai-3d/. ---------------------------------------------------------------------------- END OF RTNEWS