_ __ ______ _ __
' ) ) / ' ) )
/--' __. __ , --/ __ __. _. o ____ _, / / _ , , , _
/ \_(_/|_/ (_/_ (_/ / (_(_/|_(__<_/ / <_(_)_ / (_= 0) {
tmin = (bounds[0].x() - r.origin.x()) / r.direction.x();
tmax = (bounds[1].x() - r.origin.x()) / r.direction.x();
}
else {
tmin = (bounds[1].x() - r.origin.x()) / r.direction.x();
tmax = (bounds[0].x() - r.origin.x()) / r.direction.x();
}
etc. for y and z.
----
Williams:
divx = 1 / r.direction.x();
if (divx >= 0) {
tmin = (bounds[0].x() - r.origin.x()) * divx;
tmax = (bounds[1].x() - r.origin.x()) * divx;
}
else {
tmin = (bounds[1].x() - r.origin.x()) * divx;
tmax = (bounds[0].x() - r.origin.x()) * divx;
}
Two multiplies are faster than the divide, but also handles the -0 case.
'divx' captures the sign of the r.direction even when it is zero. Consider
1/0.0 = +INF, 1/-0.0 = -INF. Willams then proposes a faster version that
precomputes the inverse directions as well as the result of the sign tests
(such as divx >= 0) in the ray structure.
tmin = (bounds[r.sign[0]].x() - r.origin.x()) * r.inv_direction.x();
tmax = (bounds[1-r.sign[0]].x() - r.origin.x()) * r.inv_direction.x();
tymin = (bounds[r.sign[1]].y() - r.origin.y()) * r.inv_direction.y();
tymax = (bounds[1-r.sign[1]].y() - r.origin.y()) * r.inv_direction.y();
if ( (tmin > tymax) || (tymin > tmax) )
return false;
if (tymin > tmin)
tmin = tymin;
if (tymax < tmax)
tmax = tymax;
tzmin = (bounds[r.sign[2]].z() - r.origin.z()) * r.inv_direction.z();
tzmax = (bounds[1-r.sign[2]].z() - r.origin.z()) * r.inv_direction.z();
if ( (tmin > tzmax) || (tzmin > tmax) )
return false;
if (tzmin > tmin)
tmin = tzmin;
if (tzmax < tmax)
tmax = tzmax;
return ( (tmin < t1) && (tmax > t0) );
----
I like Williams' approach of storing the extra data (1/dir and signs of dir)
with the ray, but all the tests still worried me. There is a lot of branch
misprediction in there. So I adopted the following approach of using SSE's
min/max and horizontal min/max (which is only available in SSE 4).
Manny version:
static const SSE::Vec3 kINDVec3 = SSE::Invert4( SSE::Vec3(0.f,0.f,0.f) ); //(-1.0#IND,-1.0#IND,-1.0#IND)
BBox::ClipRay(const Ray& r, float* tmin, float* tmax, float raymin, float raymax) const
{
const SSE::Vec3& rayorig = (r.GetOrigin().m_v);
const SSE::Vec3& rayinvdir = (r.GetInvDir().m_v);
SSE::Vec3 t0v, t1v;
t0v = ( (m_points[0] - rayorig) * rayinvdir ); //tmins - same as Williams
t1v = ( (m_points[1] - rayorig) * rayinvdir ); //tmaxs - same as Williams
if (result != 0) { //handle #IND
ClampTs( t0v, t1v, indmask );
}
SSE::Vec3 t0v_n, t1v_n;
t0v_n = SSE::Min( t0v, t1v ); //_mm_min_ps
t1v_n = SSE::Max( t0v, t1v ); //_mm_max_ps
SSE::Scalar interval_min, interval_max;
SSE::Scalar t0, t1; //help keep them within SSE registers
t0 = SSE::MaxHorz3( t0v_n ); //horizontal-max
t1 = SSE::MinHorz3( t1v_n ); //horizontal-min
interval_min = MAX1( t0, raymin ); //SSE 1-compoment Max
interval_max = MIN1( t1, raymax ); //SSE 1-compoment Min
*tmin = interval_min * kBBEPSMIN; //* 0.9999f
*tmax = interval_max * kBBEPSMAX; //* 1.0001f
return (interval_min <= interval_max);
}
This code runs very fast and hardly takes any time at all. Instead of adding
an epsilon to prevent a degenerate interval, I prefer to multiply by 0.9999
and 1.0001. Adding an epsilon is not always robust due to the nature of
floating point math. See [Goldberg91] or Christer Ericson's book [Ericson05]
for a more detailed exposition. The horizontal min/max can be replaced by
several SSE instructions on older CPUs and the code still runs well.
I also found doing:
t0 = MaxHorz3( t0v_n ); //horizontal-max
t1 = MinHorz3( t1v_n ); //horizontal-min
interval_min = MAX1( t0, raymin ); //SSE 1-compoment Max
interval_max = MIN1( t1, raymax ); //SSE 1-compoment Min
is a little bit faster than:
t0v_n = MaxVec3( t0v_n, Vec3(raymin) );
t1v_n = MinVec3( t1v_n, Vec3(raymax) );
interval_min = MaxHorz3( t0v_n ); //horizontal-max
interval_max = MinHorz3( t1v_n ); //horizontal-min
----
References
----------
[Ericson05] Real-Time Collision Detection. Morgan Kaufmann, 2005.
[Goldberg91] "What Every Computer Scientist Should Know About Floating-Point Arithmetic",
ACM Computing Surveys, Vol 23, No 1, March 1991.
[Smits99] "Efficiency Issues for Ray Tracing", JGT, vol 3:2, 1998, pp. 1-14.
[Smits02] "Efficient Bounding Box Intersection", Ray Tracing News 15:1 (2002).
[Williams03] "An efficient and robust ray-box intersection algorithm", JGT, vol 10:1 (2005), pp. 49–54.
----------------------------------------------------------------------------
Puzzle Answer: Triangle/Bounding Box Ratio, by Eric Haines
Last issue I asked what the average ratio of an arbitrarily-oriented
equilateral triangle's area is to its bounding box's surface area,
http://tog.acm.org/resources/RTNews/html/rtnv21n1.html#art4. To be clearer, I
should have said "take the set of all possible orientations of an equilateral
triangle, uniformly distributed, and make an axis-aligned bounding box for
each one; what's the average surface area ratio?" This ratio should be
proportional to the chance that a ray that pierces the bounding box (ignore
rays with endpoints inside the box) then hits the triangle. It's trivial to
show the ratio must be less than 1/2: if you put an oriented bounding box
around any triangle and the area of the triangle will be half of base times
height, and any set of axis-aligned boxes will naturally be larger than this
oriented bounding box.
Unfortunately, no one took the bait and solved it for me. Gilbert Bernstein
did reply about this problem, proving that the length of the perimeter of any
triangle which is randomly oriented is proportional to the average mean width
of the set of axis-aligned bounding boxes put around this triangle. The mean
width of a bounding box is the sum of its three axes, height + width + depth
(see puzzle answer RTNews,
http://tog.acm.org/resources/RTNews/html/rtnv21n1.html#art3). From testing
below, it looks like the ratio is 4/3. To summarize:
Sum of triangle's edge lengths = 4/3 * ( height + width + depth ) of the "average summed" bounding box.
I like that this is true for any triangle, but it wasn't the answer I wanted -
cold, hard numbers is what I was hoping to get. I decided to hack a quick perl
program to give me an approximation, if not a formula. I tried it on a few
common triangles:
Equilateral triangle: 0.267 - 1 in 3.75
Isosceles right triangle (edges 1,1,sqrt(2): 0.243 - 1 in 4.12
Golden ratio right triangle (short edges 1,1.618): 0.226 - 1 in 4.42
Thin right triangle (short edges 1, 0.1): 0.069 - 1 in 14
Very thin right triangle (short edges 1, 0.01): 0.0077 - 1 in 130
The equilateral triangle seems like it would be a "best case", since it has
the highest area to perimeter-squared ratio, and this indeed appears to be so,
it giving the highest hit ratio of the triangles tested. The isosceles right
triangle is a worthwhile case, in that two of these form a square, a common
element in a mesh primitive. I recall reading in some "Mathematical Games"
column that some person went around and measured rectangle ratios, wherever
he'd see them. The average ratio was around the golden ratio, phi. A golden
ratio right triangle forms such rectangles, so should also be a fairly common
triangle type. The thin triangles were run just to see how bad things can get;
no surprise, a randomly-oriented triangle that is needle-like, where
area/perimeter is low, is likely to be contained by a proportionally huge
bounding box.
I think this is worth knowing, as many of us are constantly putting bounding
boxes around triangles. For axis-aligned triangles, the best we can ever do is
0.5, half the time the triangle will be hit by a ray piercing the bounding
box. For arbitrary triangles, the best is 0.267 for an equilateral, and more
likely the triangle will be hit by a box-piercing ray less than one time in
four.
This hit ratio has a bearing on the performance of various efficiency schemes.
For example, it appears BIH (bounding interval hierarchy) works well on laser-
scanned data, where the triangles are more equilateral and rarely thin.
Synthetic model data, such as long pipes in the Factory model, or even endcaps
of a cylinder, produce long, thin triangles that are less likely to be hit.
It seems to me an important characteristic of any scene is what its hit ratio
would be for its triangles, weighted by their areas (their chance of getting
hit). For a static scene, this is easily done: look at every triangle, compute
its bounding box, and get the ratio of areas directly. This would be an
interesting number to know: sum of triangle areas / sum of bounding box areas.
Where it gets trickier is when objects in the scene are animated by rotation,
which is where a formula would be handy, to know how bad a model can become
when it is rotated. A model full of arbitrarily-oriented equilateral triangles
to start won't be worse when rotated; a model with long, thin, axis-aligned
triangles will have tight bounds initially, but terrible bounding boxes once
rotated. The paper "Edge Volume Heuristic - Robust Triangle Subdivision for
Improved BVH Performance" by Holger Dammertz and Alexander Keller discusses
the effects of such rotation and tackles the problem of subdividing triangles
that give poor triangle/box hit ratios.
A tiny project, maybe good for an undergrad's study project: what's the
optimal triangulation of a polygon with the goal of minimizing the sum of the
bounding box areas? For example, a polyline forming a circle could be
triangulated into a fan (choose one vertex and fan from there) or a triangle
strip (a zig-zag of triangles), or some mix of these, or some entirely
arbitrary pattern. Which is best?
I think there's much more work to be done here; what is of interest to me is
whether we can find statistical characterizations of a scene's data and
determine any correlations with the effectiveness of various efficiency
schemes. It's probably impractical (and as the Arnold article in this issue
shows, other factors such as motion blur can entirely change the choice of
efficiency scheme), but I like the idea that a firm could feed some typical
models into an analyzer to find out what sort of efficiency scheme would do
their data the most good.
----------------------------------------------------------------------------
Where's the Rest of the Issue? by Eric Haines
A lot of my energy these days is put into our "Real-Time Rendering" website
and blog. Some of the articles there would normally go here, instead, as they
pertain to ray tracing or graphics in general.
The intersections page, http://realtimerendering.com/intersections.html,
continues to be somewhat relevant to ray tracing (though, really, ray/triangle
and ray/bezier are probably all you need). Our graphics books recommendation
section, http://realtimerendering.com/books.html#recommended, also has some
relevant bits.
For blog entries, first of all, I edited a "book" on ray tracing, called
"Another Introduction to Ray Tracing",
http://pediapress.com/books/show/another-introduction-to-ray-tracing-eric/.
471 pages, cool cover, just $22.84, with one small caveat: it's a bunch of
Wikipedia articles and took me two hours to make. The free version is here:
http://en.wikipedia.org/wiki/User:Erich666/Books/AnotherIntroRT. Don't like my
book? Simply grab the contents of that page and make your own! I wrote an
article about it and other (much scammier) print-on-demand schemes, see
http://www.realtimerendering.com/blog/tag/ray-tracing/. Writing this today, I
decided to actually order a physical copy, goofy as it is, just to see what
that's like. It's sort of like an existence proof: a print-on-demand book
doesn't really exist until you print at least one physical copy.
I do recommend taking the quiz:
http://www.realtimerendering.com/blog/questionable-content/, answers here:
http://www.realtimerendering.com/blog/questionable-answers/
There's an entry about a trick that everyone should know, but many don't,
about looping through a list of vertices:
http://www.realtimerendering.com/blog/looping-through-polygon-edges/. This
would have been an RTNews article if I hadn't put it in the blog.
My article on morphological antialiasing (MLAA),
http://www.realtimerendering.com/blog/morphological-antialiasing/, gives a
brief rundown of the idea and some experiments. For scenes with sharp edges in
need of antialiasing, this is an interesting option. This can be quite cheap
for ray tracing, compared to the cost of shooting more rays. I find it falls
apart on thin objects that span a pixel or two in width and pixels at the
edges of the image are problematic, but otherwise the results are pretty
convincing. Naty has a few followup articles of this type of technique being
used in a game, http://www.realtimerendering.com/blog/morphological-antialiasing-in-god-of-war-iii/ and
http://www.realtimerendering.com/blog/more-on-god-of-war-iii-antialiasing/.
I wrote two articles on left-handed vs. right-handed coordinate systems:
http://www.realtimerendering.com/blog/left-handed-vs-right-handed-world-coordinates/
and http://www.realtimerendering.com/blog/left-handed-vs-right-handed-viewing/.
I'm not really satisfied with them, they're still a work in progress for me.
I think this area still needs a thorough *clear* treatment from world to
screen; I ignore the Y-flipping problem often run into going from Cartesian
coordinates to screen coordinates. Still, I got a bit sick of seeing "just
flip the Z coordinate" as the flip answer for converting from one coordinate
system to another. Yes, this can work, but you have to understand what in
the world you're doing and what you're giving up.
Nothing to do with ray tracing, but I liked the idea of a constant luma
palette: http://www.realtimerendering.com/blog/constant-luma-palette/
The Octane 2 renderer mentioned here, http://www.realtimerendering.com/blog/7-things-for-may-2/
The next-to-last entry here shows the dangers of ray tracing:
http://www.realtimerendering.com/blog/clearing-the-queue/
Our most popular post over the years has been this simple one:
http://www.realtimerendering.com/blog/thought-for-the-day/
Finally, some visual treats, all the stuff I didn't stick in the Roundup:
http://www.realtimerendering.com/blog/visual-treats/,
http://www.realtimerendering.com/blog/visual-stuff/, and
http://www.realtimerendering.com/blog/shanghai-3d/.
----------------------------------------------------------------------------
END OF RTNEWS