Ray Tracing News

A nice page on the classic point in polygon problem:
http://www.geometryalgorithms.com/Archive/algorithm_0103/algorithm_0103.htm
This whole site, http://www.geometryalgorithms.com/, has a lot of worthwhile
material and links.

A couple of people suggested that I put my Eurographics talk on "Rendering:
Input/Output" online, so it is now available at:
http://www.research.ibm.com/people/h/holly/pres.html.
It's in PDF and, with all the images, still pretty bulky, so I divided it
into 2 to 6 Mb pieces. I also put up a couple of other presentations,
including my attempt to be controversial for the "Newton's Nightmare" panel
at SIGGRAPH.
- Holly Rushmeier

lRay, a ray tracer with caustics, soft shadows, and implicit surfaces, is
available at:
http://www.cfxweb.net/~lyc/lray.html

It had to happen eventually:
http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2061898269
This page may have disappeared by the time you look, so here's the gist:
Selling NURBS Sphere for Maya 4.0!!! Cheap!!
That's right, you lucky animators! I'm selling a collection of control
vertexes in 3D space! If you are the lucky bidder I will send you the MEL
script file containing the power to create the one, the only, the ...
NURBS Sphere!!! Payment will be accepted by PayPal only. Good Luck! May
the Best Animator Win!!!!!
With 4 days to go, the auction had 13 bids and the current price was $10.50.

A little realtime ray tracer that raytraces only cubes:
http://www.perilith.com/~lode/trixel/Files/Trixel002.zip
Seems a bit slow, but it's kind of cute.

A followup on RTNv9n2, which was all about the pitfalls of writing a book,
is this article, sent in by John Peterson:
http://www.mediachannel.org/views/oped/bookcontract.shtml
I should mention that my original "fear and loathing" column was not
representative of my personal views (I had never dealt with a publisher at
that point), but rather of a number of anonymous authors. My advice to
prospective authors is to ask other authors how they feel about the
various publishers - there are excellent publishers out there (and I'm not
going to tread on any toes by making a list here and inadvertently leaving
any out). That said, it does pay to be careful. Two other pieces of advice
are to (a) get a lawyer to read over the book contract and (b) work your
hardest to make your deadlines. If you're going to miss a deadline, tell
your publisher sooner rather than later; in this way they can schedule
resources better to meet everyone's needs.

For an entertaining use of global illumination (especially the Cornell Box)
in computer games, check out this page: http://www.sync.dk/~khn/stuff/ -
now with extra color bleeding.

RealStorm: A Realtime Raytracing Engine, by Michael Piepgras
[You owe it to yourself to visit http://www.realstorm.com, download their
benchmark, and give it a whirl; it's pretty impressive. I find Michael's
approach to be very interesting in that it uses techniques normally used
in the graphics accelerator world, such as view frustum culling, shadow
volumes, and reflecting the viewpoint, instead applying them to the
acceleration of ray tracing. - EAH]
RealStorm is an RTRT (real-time ray tracing) engine based on polynomial
primitives like planes and quadrics. The idea is to build an engine for
games, one that can handle ray tracing effects with large scenes at
interactive rates. In this way we could improve the quality of game engines
and give them a more realistic look. As the number of polygons representing
a curved surface gets higher and higher, the model representation converges
to polygons being smaller than a screen pixel, which wouldn't make any
sense at all for a traditional Z-buffer to render.
The raytracing engine is based on adaptive supersampling, with a 4x4 pixel
grid for visibility, shadows, 1st recursion reflection, and shadows in 1st
recursion reflections, each with its own buffer. In other words, only
every 16th pixel is rendered, with more pixels sampled as needed. The
primitives used are polynomial objects like polygons, planes and quadrics,
because intersection calculations with these can be solved very rapidly.
Each primitive uses its own optimized intersection code.
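To make the adaptive scheme concrete, here is a little sketch of the idea
(mine, not RealStorm's code; trace(), similar(), and fillBlock() are
stand-ins for whatever the engine really uses): trace the corner rays of
each block, and only refine where the corners disagree.

typedef unsigned int Color;
extern Color trace(int x, int y);        /* full ray trace of one pixel */
extern bool  similar(Color a, Color b);  /* e.g. per-channel threshold */
extern void  fillBlock(Color *img, int stride, int x, int y, int size,
                       Color c00, Color c10, Color c01, Color c11);

void renderBlock(Color *img, int stride, int x, int y, int size)
{
    /* In a real engine the corner results would be cached in the
       per-effect buffers so shared corners are traced only once. */
    Color c00 = trace(x, y),        c10 = trace(x + size, y);
    Color c01 = trace(x, y + size), c11 = trace(x + size, y + size);
    if (size == 1 || (similar(c00, c10) && similar(c00, c01) &&
                      similar(c00, c11) && similar(c10, c11))) {
        fillBlock(img, stride, x, y, size, c00, c10, c01, c11);
        return;
    }
    int h = size / 2;   /* corners disagree: refine each quadrant */
    renderBlock(img, stride, x,     y,     h);
    renderBlock(img, stride, x + h, y,     h);
    renderBlock(img, stride, x,     y + h, h);
    renderBlock(img, stride, x + h, y + h, h);
}

Called with size = 4 this gives the 4x4 behavior described above; shadows
and 1st recursion reflections would each get their own such pass and buffer.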
In the text that follows, whenever "view frustum" is used, it means to
check the viewcone first (the cone that contains the view frustum), and if
the viewcone is intersected, then check the view frustum (both are tested
against the object's bounding sphere and bounding box). Checking against a
cone is faster than checking against a frustum, so we often can obtain
quick object rejection.
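For reference, one standard sphere-vs-cone rejection test, along the lines
of the formulation in Eberly's Geometric Tools (my sketch, not RealStorm's
code; the cone axis is assumed normalized), looks like this:

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  sub(Vec3 a, Vec3 b)
{ Vec3 r; r.x = a.x-b.x; r.y = a.y-b.y; r.z = a.z-b.z; return r; }

/* Returns false only when the sphere certainly misses the (infinite)
   cone, so it is safe as a conservative first test before the frustum. */
bool sphereIntersectsCone(Vec3 center, float radius,
                          Vec3 apex, Vec3 axis, float sinA, float cosA)
{
    /* Push the apex back along the axis; the enlarged cone contains
       every point within 'radius' of the original cone. */
    float k = radius / sinA;
    Vec3 u = apex;
    u.x -= axis.x * k; u.y -= axis.y * k; u.z -= axis.z * k;
    Vec3 d = sub(center, u);
    float e = dot(axis, d);
    if (e > 0.0f && e * e >= dot(d, d) * cosA * cosA) {
        /* Inside the enlarged cone; handle the region behind the apex. */
        d = sub(center, apex);
        e = -dot(axis, d);
        if (e > 0.0f && e * e >= dot(d, d) * sinA * sinA)
            return dot(d, d) <= radius * radius;
        return true;
    }
    return false;
}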
The engine lets each light be given a maximum radius of effect; beyond this
radius the light does not affect any surface's shade. A far plane distance
is also defined, as this is normally used in most game engines. These scene
attributes are typically set by the world designer.
The optimization techniques used per frame are:
(a) build a local scene out of the global scene objects which have any
effect on the picture.
(b) visibility check with the viewport, and build up an orthogonal BSP tree
per frame.
(c) find the minimum projected viewplane area for each object.
(d) "object receives light" check per light and object.
(e) global shadow checking: obj<>obj shadows, light<>obj shadows.
(f) shadow checking for 1st recursion shadows by checking the shadow volumes
against the view frustum and the BSP tree.
(g) for 1st recursion reflective surfaces, do a visibility check for possible
reflected objects by reflecting the view frustum.
(h) visibility checking for 1st recursion shadows per object and shadow
volume by reflecting the viewport. In other words, reflect the viewport
in the reflective surface and do a shadow volume test with the reflected
viewport to see if an object on a reflected surface can receive a shadow.
(i) as needed, tessellate the projection plane into 2^n by 2^n picture parts
(depending on the actual frame complexity) and do (b)-(g) for each part.
Some more details about (a)-(i):
for (a):
One possibility for checking an object against the local render space is:
if (object.distance + object.radius < max_view_distance + max_light_radius)
    add_to_localscene;
In other words, if an object's distance from the eye is less than that of the
far clipping plane's distance plus the maximum radius a light is allowed to
affect, then that object potentially affects the scene.
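As a loop, this might look like the sketch below (the Object fields and the
helper are my assumptions; note the test above adds the radius, while a
strictly conservative test would subtract it, so that objects straddling
the boundary are kept):

#include <math.h>
#include <vector>

struct Vec3 { float x, y, z; };
struct Object { Vec3 center; float radius; /* ... */ };

static float dist(Vec3 a, Vec3 b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return sqrtf(dx*dx + dy*dy + dz*dz);
}

void buildLocalScene(const Object *scene, int n, Vec3 eye,
                     float max_view_distance, float max_light_radius,
                     std::vector<const Object *> &localScene)
{
    for (int i = 0; i < n; i++)
        /* the article's test; a conservative variant uses "- radius" */
        if (dist(scene[i].center, eye) + scene[i].radius
                < max_view_distance + max_light_radius)
            localScene.push_back(&scene[i]);
}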
for (b):
An orthogonal BSP tree is used because it can be built very quickly.
for (c):
This rectangle, i.e., the near plane of the view frustum, helps make the
visibility check for (i) faster.
for (d):
The "object receives light" test simply checks if the object's bounding sphere
is close enough to the light source, as each light is given a maximum radius
of effect. If not, this reduces the loop calls for lighting and shadows
during rendering. That is, the "for (l=0; l < number_of_lights; l++)"
lighting loops become shorter.
for (e):
"obj<>obj shadows" means finding what object can cause a shadow on another
object:
if (distance(object_a, object_b) - BoundingSphereSize(object_a) -
    BoundingSphereSize(object_b) > max_light_radius)
then object_a can't cause a shadow on object_b (for all visibility and
reflection rays).
"light<>obj shadows" means checking if object_a can cause any shadow in the
picture. For example, if (ShadowVolume(object_a,light) == inside viewport)
object_a can cause a shadow; (works for visibility, not for reflection). In
other words, take the shadow volume for object_a and test it for intersection
with the viewport frustum (using, say, bounding spheres around each of these
as possible to try to obtain a quick exit).
for (f):
"1st recursion shadows" are those which are directly cast onto a visible
object, and not the shadows which can be seen on reflected objects. Surviving
shadow volumes from (e) are then checked with the orthogonal BspTree down to
the leaves. If the shadow volume does not overlap any (visible) receiving
object, then the cast shadow has no effect on the scene.
Another approach is to compute the minimum projected shadow volume, which
means to get the minimum shadow volume rectangle on the projection plane to
reduce the shadow tests in (i). In other words, I project the shadow volume
onto the view plane and find the 2D rectangle it covers on the screen. Using
this 2D rectangle, a simple "is pixel in rectangle" test tells if a shadow
test is needed or not.
for (g):
This means building up a list of objects which can be seen in the reflecting
object from the camera. This can be done easily for planes by reflecting the
viewcone/frustum at the reflection plane. For quadrics I use something
different: calculate the maximum rectangle inside the quadric parallel to
the projection plane. For example, for a cylinder you can compute a rectangle
formed of the two silhouette edges of the curved sides of the cylinder, along
with two edges for the endcaps. This rectangle is then used to cull away
anything behind the cylinder. In other words, a rectangle inside the cylinder
is used to perform occlusion culling, and any objects behind the cylinder are
thus eliminated from consideration.
for (h):
Here the obj<>obj shadow lists are used.
for (i):
The list of visible objects has already been built for the whole image, as
well as lists of possible shadowing objects. The next step is to find if
image subdivision is needed. What I do is: subdivide the image if the number
of rays needed for visibility tracing is more than 4 times higher than the
resulting view frustum tests for the next subdivision level. For example:
if (needed_rays > 4 * view_frustum_tests_for_next_subdivision_level)
    subdivide_subimage;
(I am sure this is not the best way, but it's the best I found so far ;). In
other words, say I know that 8000 rays will be needed for the image, and I
have a current list of 1700 objects. I know I'll need to do 1700 view
frustum/object tests. As 8000 is greater than 4*1700, the image will be
subdivided.
Every subdivision level uses the list of the level above it, so in the first
step you get 4 sublists from the single list, from each of these 4 you get 4
new sublists, and so on. Normally I don't use more than 3 subdivision levels,
to avoid getting too much overhead.
All these pre-frame checking techniques are optimized with special routines
for each primitive (sphere<>plane, sphere<>cylinder, sphere<>sphere,
sphere<>ellipsoid, ...) and the use of bounding boxes and bounding spheres.
With our benchmark at 320x180 over 2375 frames, (b)-(i) reduce the number
of intersection calculations by a high ratio (of course, this depends on
the scene to be ray traced):

No. of Rays                 without (b)-(i)    with (b)-(i)      ratio
Visibility                    8,589,935,274      84,251,189    101.9:1
Shadows                      15,533,963,921      69,563,770    223.2:1
Visibility in reflections     6,444,228,737      52,383,307    123.0:1
Shadows in reflections        7,920,241,335     112,368,826     70.4:1
Avg. fps (on AMD 1200)                 1.18           24.26     20.5:1
The main code optimization used is to reduce cache misses, so lots of lists
are copied around during rendering. For the rest, the code is plain C,
compiled with Pentium II optimization options. Assembly optimizations,
such as MMX, SSE or 3DNow!, are not used yet.
About image quality:
To provide a realistic look it is important to handle different material
shaders, so 7 layers of textures are used. These are color, reflection,
bump, diffuse intensity, specular intensity, glow, and translucency (some of
these can be procedural textures). Another improvement to materials is to
handle more than Lambert and Phong (cos, cos()^n) shading. In fact, 12
different shading models for diffuse and specular light are implemented.
To improve the lighting quality, volume lights (and, in the next benchmark,
radiosity) are implemented, but these are still too slow. Some more work has
to be done there to make them realtime with dynamic scenes.
For more information about RealStorm: http://www.realstorm.com

Efficient Bounding Box Intersection, by Brian Smits
[Brian commented on my article in the previous issue, RTNv14n1, in
which I give a somewhat crazy way of testing boxes for intersection.
Rereading that article, I realize that I did not stress enough that
the approach there was not meant to be considered efficient or
practical. I was more interested in the fact that an alternate approach
was possible. Anyway, the silver lining was that I received a nice
little article from Brian. It's something he's discussed in passing
elsewhere, and I'm glad to have it written out in full here. - EAH]
I looked over the BBox intersection idea at the end of the last RTN.
It seems way too complex.
As I recall:
intervalMin = ray.T.min;
intervalMax = ray.T.max;
t1 = (minX - originX) * invDirX;
t2 = (maxX - originX) * invDirX;
if (invDirX > 0) {
    if (t1 > intervalMin) intervalMin = t1;
    if (t2 < intervalMax) intervalMax = t2;
    if (intervalMin > intervalMax) return false;
}
else {
    if (t2 > intervalMin) intervalMin = t2;
    if (t1 < intervalMax) intervalMax = t1;
    if (intervalMin > intervalMax) return false;
}
repeat for each direction.
Note that the first "if" statement is constant for all bboxes intersected by
the ray, so if you want to uglify your code you can create 8 versions of
this function [i.e., one for each octant, dependent on the ray's direction],
have no nested ifs (although you still have ifs that are dependent on
operations done in previous ifs), and each function gets a lot simpler.
bboxPosXDirPosYDirPosZDir:
t1 = (minX - originX) * invDirX;
t2 = (maxX - originX) * invDirX;
if (t1 > intervalMin) intervalMin = t1;
if (t2 < intervalMax) intervalMax = t2;
if (intervalMin > intervalMax) return false;
repeat for each direction.
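Assembled into a single general function (my arrangement of Brian's
fragments, not his actual code; the Ray fields are assumptions), the test
might look like:

struct Ray {
    float origin[3];
    float invDir[3];    /* 1/direction, precomputed once per ray */
    float tMin, tMax;   /* valid parameter interval */
};

/* Slab test against an axis-aligned box given as bmin[3]/bmax[3].
   Note invDir entries are +/-infinity for zero direction components;
   a production version needs care with that case. */
bool hitBBox(const Ray &ray, const float bmin[3], const float bmax[3])
{
    float intervalMin = ray.tMin;
    float intervalMax = ray.tMax;
    for (int axis = 0; axis < 3; axis++) {
        float t1 = (bmin[axis] - ray.origin[axis]) * ray.invDir[axis];
        float t2 = (bmax[axis] - ray.origin[axis]) * ray.invDir[axis];
        /* the swap replaces the per-sign branch duplication above */
        if (ray.invDir[axis] < 0.0f) { float t = t1; t1 = t2; t2 = t; }
        if (t1 > intervalMin) intervalMin = t1;
        if (t2 < intervalMax) intervalMax = t2;
        if (intervalMin > intervalMax) return false;  /* slabs disjoint */
    }
    return true;
}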
It seems unlikely that anything involving projecting points and doing any
sort of polygon containment tests could be faster. A further argument is
that, for me, the bounding box test is either 4 or 10 times (memory is
going) faster than my fastest ray-triangle intersection (which does
significant precomputation for each triangle).
Let me know if this isn't clear or I made a mistake in the above code.
[While we're on the topic, I wanted to add this note. There is an interesting
little variant on ray/box testing. The article is:
Schroeder, Tim, "Collision Detection Using Ray Casting," Game Developer,
vol. 8, no. 8, pp. 50-56, August 2001. Code at http://www.gdmag.com/code.htm
(it's the last routine).
It has an odd variant on ray/box intersection there, actually line segment/box
intersection (but that's really what we all do, anyway). He takes the
endpoints of the line segment and forms clip codes; you know, are both
endpoints to the left or right of the bounding box, above or below, etc.
If they are, he's done, he's missed. If they aren't, then from the codes
he has knowledge of which faces could possibly be hit. Then test each possible
face for an intersection using Woo's method, essentially. For example, if he
finds the endpoints are in X.lo, X.middle; Y.middle, Y.hi; and both in
Z.middle, he knows he has to test only the Xmin and Ymax faces of the box.
Sounds like too much work for me, but maybe the clip codes give him an
early-on high rejection rate and that's a win. - EAH]

Array Good, Linked List Bad, by Matt Pharr
[My title, not Matt's. Matt wrote me about my article in RTNv12n1, on
allocating grids. - EAH]
I was looking around for something else when I came across that RT news
from 1999 that had that suggestion about replacing linked lists in grids
with memory that was allocated contiguously. So I made the corresponding
changes to the little ray tracer project I've been working on here
and... blammo! 10-20% faster! Who knew lists were *that* bad? Very
exciting!
My code used to be something like:
struct Mailbox {
    int id;
    Primitive *prim;
};
struct Grid {
    ...
    list<Mailbox *> *voxels;
};
Where voxels is indexed by z*XRes*YRes + y*XRes + x.
I basically just changed that STL list<> in Grid to a vector<>, which is an
automatically-growing array, doubling in size whenever you fill it up. So
that 10% came from just keeping the Mailbox *'s contiguous in each voxel,
and getting rid of the linked-list traversal overhead.
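For concreteness, the changed declaration might look something like this
(a sketch; Matt's actual code surely differs in the details):

#include <vector>

struct Primitive;                  /* defined elsewhere */

struct Mailbox {
    int id;                        /* last ray that tested this primitive */
    Primitive *prim;
};

struct Grid {
    /* ... */
    /* one contiguous, automatically grown array per voxel, still
       indexed by z*XRes*YRes + y*XRes + x */
    std::vector<Mailbox *> *voxels;
};

The traversal inner loop then marches through a flat array of pointers
instead of chasing list nodes, which is where the cache win comes from.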

Questions and Answers, by Vlastimil Havran
[Vlastimil Havran had a number of comments on some questions I sent out to
the RTNews subscription list before SIGGRAPH 2001. I've excerpted two of his
responses here. - EAH]
Mailboxing, yea or nay?
I tested mailboxing within the GOLEM project; I can switch it on and off by
recompiling the source code, using macros for conditional compilation
(#ifdef ...). My experience is that for the primitives in GOLEM the use of
mailboxing improves the running time by at most 5-6%, but it can also
increase the time by the same amount. This is heavily dependent on the
surfaces, scene geometry and camera setting used. For triangles, spheres,
cones, cylinders and other simpler primitives (as used within GOLEM),
together with a good ray shooting algorithm, the use of mailboxing does
not make sense, or is at best questionable. For objects where the
ray-object intersection is computationally demanding (such as NURBS and
other spline or algebraic surfaces) it probably makes sense in most cases.
The worse the ray shooting algorithm, the higher the probability of
unsuccessfully testing the same objects repeatedly, and then logically
mailboxing will be more successful. A possible topic for research: given
some ray shooting algorithm and a scene, predict whether the use of
mailboxing makes sense or not. But the improvement in time may not be so
impressive. Since mailboxing also causes problems when used in a parallel
environment (in particular, it requires writing the result of the first
access to an object back into main memory), my current feeling and all
previous experience is "no mailboxes," or use them only for some classes
of complex objects.
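For readers who haven't implemented it, mailboxing amounts to something
like the following (a generic sketch, not GOLEM's code; the Ray fields and
the real intersection routine are assumptions):

struct Ray { int id; /* origin, direction, hit record, ... */ };
struct Primitive { int mailbox; /* geometry ... */ };

extern bool expensiveIntersection(Ray &ray, Primitive &prim);

bool intersect(Ray &ray, Primitive &prim)
{
    if (prim.mailbox == ray.id)  /* this ray already tested this primitive
                                    in an earlier grid cell; any hit was
                                    recorded then, so skip the retest */
        return false;
    prim.mailbox = ray.id;       /* stamp, then do the real test */
    return expensiveIntersection(ray, prim);
}

The win is skipping expensiveIntersection() for primitives that span
several cells; the cost is the extra read/write per test, which is why it
only pays off when the intersection itself is expensive.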
Shared platform:
Another very important question in the category of RT philosophy is whether
people in the RT community would be able to use some common rendering
framework, at least in the research field. I am aware of about 25 frameworks
that have been in use for this purpose; some of them are publicly available,
some of them not. In some sense, many of them have features completely in
common. It is a phenomenon that everyone (including me) who gets to know
about the existence of ray tracing must write his own ray tracer, usually a
very simple one the first time, then another one, etc. But overall it is, in
some sense, quite a loss of time. If people were to collaborate on some good
international project for ray tracing and global illumination, it would be
very good to have a common library which would cover at least the research
community (about a common RT library for commerce I am much more doubtful).
Some of the diversity of research makes this difficult, but I think that
this is just a problem of people. If people used a shared library, then
their research work would be 100% reproducible (which also has minuses,
since it decreases one's advance over competing researchers). And it also
requires a professional approach from the people involved in such a project.
I have thought about the topic very much in the last year, and I think it is
a really challenging problem.

Scanline vs. Ray Tracing, by Piero Foscari, Paul Kahler, and Matt Pharr
Piero Foscari writes, on the Real-Time Raytracing email list
(http://www.onelist.com/community/realtime_raytracing):
But one side question... why do people keep including scanline rendering in
commercial packages when, even with real medium-sized scenes and unoptimized
engines, ray tracing is faster?
Paul Kahler responds:
Because change takes time. First you have to convince people, THEN they
can change the software :) Generally you must SHOW people in order to
convince them, which often involves showing them with their own system.
That means one of their own developers has to become convinced (on their
own of course) and have time to do a prototype to show management. There
is also the "if it ain't broke, don't fix it" problem.
Matt Pharr replies:
Frankly, I don't think that's it at all. Rather, the simple reason that
commercial packages still include scanline support is that raytracing the
eye rays isn't in fact faster than a good scanline algorithm. Obviously
this depends on how good the raytracer is versus how good the scanline
algorithm is, but I think that the gap is pretty significant, if you're
comparing with a scanline algorithm that does good occlusion culling, etc.
Particularly when you need to sample a pixel area very well, e.g. in order
to get smooth motion blur or good antialiasing of fine geometry (think
hair), ray tracing just doesn't make sense, efficiency-wise - tracing
hundreds of rays per pixel just takes too much time.

13th Annual Eurographics Workshop on Rendering, by Nelson L. Max
[Nelson has been writing these worthwhile summaries of EGWR for a few years
now. Here's this year's installment. - EAH]
The Eurographics Workshop on Rendering is an established conference, and
the 13th annual session was held in Pisa, Italy, on June 26-28. Here
are summaries of all the papers presented.
Fernandez, Bala, and Greenberg computed "Local Illumination
Environments" which store in octree cells the fully and partially
visible light sources and potential occluders for the cell, accelerating
direct lighting computations. Wald, Kollig, Benthin, Keller, and
Slusallek do parallel global illumination by tracing coherent ray groups
with quasi-Monte Carlo integration sampling. Dmitriev, Brabec,
Myszkowski, and Seidel do selective photon tracing for moving scenes,
first tracing "pilot photons" to detect regions of change, and then
tracing "corrective photons" for more accurate integration there.
Coconu and Hege have an octree-based point rendering system, which
composites RGBA splats in hardware, using the recursive back-to-front
sort for the octree cells, and the LDI occlusion-compatible ordering for
the points in an octree node. Botsch, Wiratanaya, and Kobbelt replace
the points by the octree cells themselves, splatting the cell centers.
Since most cells will be empty for surfaces, they suggest an efficient
encoding taking about 2 bits per full cell. They also encode the normals
and colors, and shade the normal codes, rather than the individual
points.
Yang, Everett, Buehler, and McMillan described a system of 64 unsynchronized
cheap firewire video cameras connected to 6 PCs, which warp on graphics cards
only the parts of their input images that will be needed in the final novel
image, composited by a seventh PC. With no use of geometry to enhance the
light field reconstruction, multiple exposures are visible outside of the
plane of focus.
Sander, Gortler, Snyder, and Hoppe construct a metric tensor representing
the detail in a color or normal texture, and reparametrize the texture
coordinate charts to minimize the stretch in this metric, putting the texture
detail where it is needed most. Zelinka and Garland do very fast texture
synthesis from examples using a jump map, which lists for each input pixel
a set of matching input pixels. A new texture pixel is usually synthesized
by extending the input patch of one of its neighbors, but occasionally a
jump to a new patch is taken using the jump map. Lefebvre and Neyret generate
either color or bump-map tree bark textures by an approximate simulation of
the cracking process of the inelastic bark layers as the tree grows in
diameter beneath them.
Ward and Eydelberg-Vileshin render color accurately by defining the
surface colors in CIE XYZ space using spectral integration with the
spectrum of the dominant illuminant (good for direct illumination from a
single kind of source, but not for multiple bounces or spectrally
different sources) and then correct for white-point balancing in the
"Sharp color space", which is based on supersaturated RGB primaries.
Bekaert, Sbert, and Halton speed up path tracing by reusing all but the
first leg of the ray paths at nearby pixels. Cammarano and Wann Jensen
do time-dependent photon mapping for correct motion blur of caustics by
storing time as well as color, energy, and direction at each photon hit,
and then averaging photons that are near in time as well as in space
when rendering with a distributed ray caster.
Ashikhmin maps high dynamic range input images to a limited output range
by determining a local adaptation level (the average over the largest
neighborhood without high contrast), tone mapping this adaptation level
using linear compression, and then putting back the detailed contrast by
multiplying by the ratio of the input pixel value to the local
adaptation level.
Sawhney, Arpa, Kumar, Samarasekera, Aggarwal, Hsu, Nister, and Hanna map
multiple realtime video surveillance images as textures onto a navigable
urban geometric environment model. Yamazaki, Sagawa, Kawasaki, Ikeuchi,
and Sakauchi extract RGBZ microfacets from laser-scanner / color-camera
input images by clipping the color/range data to small cubical volumes,
and projecting it onto small quadrilaterals perpendicular to the output
viewing directions. The projection and depth clipping are done in hardware
by converting the depth to an opacity, and using register combiners and
alpha clipping.
Jeschke and Wimmer simplify a complex model into textured polygonal
meshes suitable for a specific view cell by scan converting the model
into a voxel representation using depth-clipped hardware rendering into
layers. The layer spacing is chosen so that parallax from a moving
viewpoint within the view cell is no more than one pixel. The layers are
processed in front-to-back order, in order to delete all occluded
voxels, with special treatment for the one-pixel-wide border between
filled and empty pixels in each layer. An initial complex polygonal mesh
is constructed and then simplified, with the constraint that the
simplified mesh must cover exactly all the nonoccluded voxels.
Nirenstein, Blake, and Gain do exact polygon-to-polygon visibility by
representing all stabbing lines connecting the two polygons in a 5D
Euclidean space using Pluecker coordinates (with one of the six coordinates
normalized to 1). By computational geometry in this space, volumes are
subtracted for stabbing lines hitting each occluder, leaving the
non-occluded stabbing lines. This is applied to give exact visibility
culling for a polyhedral viewpoint volume.
Baxter, Sud, Govindaraju, and Manocha achieve interactive walkthroughs
for huge complex environments, with a parallel pipelined system using
two graphics pipelines and three CPU processes, based on an axis aligned
bounding box scene hierarchy, hierarchical Z buffer occlusion culling,
and level of detail decisions. The first process drives one pipeline to
render an item/Z buffer from the culled geometry for the previous
frame, which is read back for the hierarchical Z buffer occlusion cull.
Another process performs the view frustum and occlusion culls and level
of detail decisions in software, and the third process drives the final
rendering in the second graphics pipeline.
Secord, Heidrich, and Streit place point or stroke primitives for
illustration-style rendering of a grey scale image, by redistributing a
predefined Poisson-disc or Halton sequence 2D point distribution. The
marginal y cumulative distribution of intensity in the input image
(found by integration in x along scan lines) is used to redistribute the
y coordinates, and then the cumulative distribution in x on the scan
lines is used to redistribute the x coordinates. Corrections are made to
account for the probability that multiple strokes are assigned to the
same pixel. Stroke direction can be determined from image gradients,
colors, or information from a 3D model, if available.
Freudenberg, Masuch, and Strothotte use blending operations on graphics
hardware to emulate the halftone screening process traditionally used in
the printing industry. Hertzmann, Oliver, Curless, and Seitz create line
drawing line styles from examples, by adapting ideas from texture synthesis
from examples. Masselus, Dutre, and Anrys capture the reflectance field of
an object by determining the position of a hand-moved light source from its
effects on four small diffusely reflecting spheres. Furukawa, Kawasaki,
Ikeuchi, and Sakauchi compress a 4D bidirectional texture function
measured from images of an object under varying illumination, using a
tensor product expansion.
Matusik, Pfister, Ziegler, Ngan, and McMillan acquire low resolution
reflectance information for a reflecting and refracting object using a
traditional setup of multiple rotating lights, a fixed set of cameras,
and a rotating turntable for the object. But they add two large plasma
panel monitors, a horizontal one below the turntable, and a vertical one
behind the object, to obtain high resolution reflection and refraction
information, each represented by a gaussian weight distribution about a
reflection or refraction direction, as determined by fitting the response
to 1D sinusoidal wave patterns in three orientations on the monitors. These
monitor views are also used for detecting an alpha-matted visual hull. The
parameters of the gaussians are determined for a new view using unstructured
lumigraph interpolation, and used to convolve with a new environment map.
Wexler, Fitzgibbon, and Zisserman solve a similar problem by using a
sequence of images of a transparent object, moving in front of a fixed
planar background image, but use a weighted combination of a few nearby
pixels instead of a gaussian, and consider only refraction.
Kautz, Sloan, and Snyder convolve a BRDF with an environment map by
representing both with spherical harmonics. The necessary rotations of
the spherical harmonic coefficients for the environment into the local
frame for the surface are done per vertex in software. The rotation of
the viewing vector, the lookup of the spherical harmonic coefficients
for expansion in the lighting direction from a texture function of the
viewing vector, and the dot product of the two aligned spherical harmonic
coefficients to do the convolution in frequency space, are done per pixel
in hardware.
Akenine-Moller and Assarsson generalize Frank Crow's shadow volume
method to penumbras by following along profile contour polylines,
creating a shadow wedge for each edge, bounded by the umbra plane, the
penumbra plane, and two side planes separating the wedge for the current
edge from those from the preceding and following edges on the contour.
If a surface point is found to lie inside a wedge, a light intensity is
found by bilinear interpolation within the quadrilateral where the plane
through the viewing ray parallel to the profile edge intersects the
wedge.

SSM and OOM Become Enigma, by Mark Manning
In regards to RTNews8b, I'm one of the original programmers for SSM (way back
in 1988 is when I started). I thought I'd bring you up to date a bit.
First - we have no control over what the University of Georgia does with
the software. We write it - they sell it. I've written to them recently,
though, and asked them why the price for the software is so high, especially
since SSM and OOM are now very, very old in terms of 3D software. I've asked
them to reduce the cost to something like $50.00. Only time will tell if
they do this or not. :)
Next - SSM and OOM went by the wayside back in 1994, when we combined the
two pieces of software into a single package called Enigma. The match
hasn't gone all that well for many years, but is now beginning to come
together at last. That is to say - we went through the change, then had
to redo Enigma so it worked with X Windows (SSM/OOM were written before
X Windows came along), then OpenGL caused another revamp, and finally C++
appeared and we have had several problems converting several hundred
thousand lines of code from C to C++'s methods. Along the way Enigma
has won several awards at NASA both for its capabilities as well as its
general usefulness. We are also revamping several methodologies within
Enigma which will enhance its speed considerably. We hope to win even
more awards by doing this. Our highest achievement was to come in
second place among the thousands of other programs across America
which have been written for and used at NASA. The first-place program
was truly awesome and deserved first place.
In going to Enigma, we have combined the creation of models with the
animation of those models. This might not seem too big of a deal in
today's world of EverQuest, but this capability was put in back before
the Internet even began having an impact on everyone's lives. Remember
too that this is an engineering tool and not really meant for games or
movies. It's meant to be as accurate as possible. So we also sacrifice
speed sometimes for precision.
Some of the new additions to Enigma are the capability to adjust the
entire model, one of the objects within the model, or individual surfaces
(polygons). I am presently expanding upon this by making it so someone can
adjust an individual polygon and stretch all of the polygons around that
polygon, snap the polygon off from the rest and move it, or stretch a
polygon to fill holes and the like. We are also enhancing the animation
capabilities, the speed of redrawing scenes, the speed of locating a
particular model/object/surface, and many other things which were not
a part of the original SSM and OOM programs.
If you would like more information about what SSM and OOM have evolved
into - just let me know.

A Simple RayTriangle Pluecker Optimization, by Jeffrey Mahovsky
[Though I personally prefer methods such as Moeller/Trumbore (see
http://www.ce.chalmers.se/staff/tomasm/code/) for ray/triangle
intersection, there are occasions when using Pluecker coordinates
can make sense. The test is straightforward and elegant. For some
previous work on the subject, see Ray Jones' article in RTNv13n1. - EAH]
There is a simple way to reduce the memory requirements of ray-triangle
intersections by 17% when using Pluecker coordinates. The optimization
also improves performance by roughly 5%.
Ray-triangle tests using Pluecker coordinates will look something like
this:
if (r0 * a4 + r1 * a5 + r2 * a3 + r3 * a2 + r4 * a0 + r5 * a1 < 0)
    return MISS;
if (r0 * b4 + r1 * b5 + r2 * b3 + r3 * b2 + r4 * b0 + r5 * b1 < 0)
    return MISS;
if (r0 * c4 + r1 * c5 + r2 * c3 + r3 * c2 + r4 * c0 + r5 * c1 < 0)
    return MISS;
else
    return HIT;
Where r0..r5 are the Pluecker coefficients for the ray and a0..a5, b0..b5,
and c0..c5 are the Pluecker coefficients for the three triangle edges.
Say the triangle is defined counterclockwise as vertices (x0, y0, z0),
(x1, y1, z1), (x2, y2, z2). Then the equations for generating the Pluecker
edge coefficients are:
a0 = x0 * y1 - x1 * y0
a1 = x0 * z1 - x1 * z0
a2 = x0 - x1
a3 = y0 * z1 - y1 * z0
a4 = z0 - z1
a5 = y1 - y0
b0 = x1 * y2 - x2 * y1
b1 = x1 * z2 - x2 * z1
b2 = x1 - x2
b3 = y1 * z2 - y2 * z1
b4 = z1 - z2
b5 = y2 - y1
c0 = x2 * y0 - x0 * y2
c1 = x2 * z0 - x0 * z2
c2 = x2 - x0
c3 = y2 * z0 - y0 * z2
c4 = z2 - z0
c5 = y0 - y2
Realize that:
a2 + b2 + c2 = (x0 - x1) + (x1 - x2) + (x2 - x0) = 0
a4 + b4 + c4 = (z0 - z1) + (z1 - z2) + (z2 - z0) = 0
a5 + b5 + c5 = (y1 - y0) + (y2 - y1) + (y0 - y2) = 0
therefore:
c2 = -a2 - b2
c4 = -a4 - b4
c5 = -a5 - b5
Substituting into the 'if' statements:
if (r0 * a4 + r1 * a5 + r2 * a3 + r3 * a2 + r4 * a0 + r5 * a1 < 0)
    return MISS;
if (r0 * b4 + r1 * b5 + r2 * b3 + r3 * b2 + r4 * b0 + r5 * b1 < 0)
    return MISS;
if (-r0 * a4 - r0 * b4 - r1 * a5 - r1 * b5 + r2 * c3
    - r3 * a2 - r3 * b2 + r4 * c0 + r5 * c1 < 0)
    return MISS;
else
    return HIT;
Simplifying, the optimized form is obtained:
A = r0 * a4 + r1 * a5 + r3 * a2
if (A + r2 * a3 + r4 * a0 + r5 * a1 < 0)
    return MISS;
B = r0 * b4 + r1 * b5 + r3 * b2
if (B + r2 * b3 + r4 * b0 + r5 * b1 < 0)
    return MISS;
if (r2 * c3 + r4 * c0 + r5 * c1 - A - B < 0)
    return MISS;
else
    return HIT;
Note that we no longer need to precompute and store c2, c4, and c5.
Hence only 15 Pluecker coefficients need to be stored for the triangle
instead of 18 - a 17% memory savings.
There should be no penalty for separately computing values A and B, as the
total number of multiplies and adds is the same for both methods up until
the third 'if' statement. The optimized third 'if' has 3 multiplies and 4
adds/subtracts, instead of 6 multiplies and 5 adds in the unoptimized
version. Hence the savings is 3 multiplies and 1 add. This results in a
roughly 5% speed increase. Few ray-triangle tests make it all the way to
the third 'if' so the overall speed increase is not as great as one might
think.
This optimization also extends to convex polygons with more sides,
although the percentage savings is not as great. For example, a 4-sided
poly would require 21 coefficients instead of 24 (a 13% savings) and the
fourth 'if' statement would require 3 multiplies and 5 adds/subtracts, a
savings of 3 multiplies. (Note the additional subtract!)
(Note that the Pluecker test shown here doesn't compute an intersection
distance along the ray; a separate calculation is needed for that.)
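Putting the pieces together, the whole test might look like the sketch
below (my code, not Jeff's; it assumes the ray's Pluecker coefficients are
built from the origin o and a second point o+d, using the same formulas as
for the triangle edges, in which case the terms simplify as shown):

struct PlueckerRay { float r0, r1, r2, r3, r4, r5; };

/* Build the ray coefficients by treating (o, o+d) as the two "vertices"
   of the ray's line. */
PlueckerRay makePlueckerRay(const float o[3], const float d[3])
{
    PlueckerRay r;
    r.r0 = o[0] * d[1] - d[0] * o[1];   /* x0*y1 - x1*y0 */
    r.r1 = o[0] * d[2] - d[0] * o[2];   /* x0*z1 - x1*z0 */
    r.r2 = -d[0];                       /* x0 - x1 */
    r.r3 = o[1] * d[2] - d[1] * o[2];   /* y0*z1 - y1*z0 */
    r.r4 = -d[2];                       /* z0 - z1 */
    r.r5 = d[1];                        /* y1 - y0 */
    return r;
}

/* The optimized 15-coefficient test; a and b hold a0..a5 and b0..b5, and
   only c0, c1, c3 are stored for the third edge. Like the original, this
   is one-sided: it accepts only front-facing (counterclockwise) hits. */
bool hitTriangle(const PlueckerRay &r, const float a[6], const float b[6],
                 float c0, float c1, float c3)
{
    float A = r.r0 * a[4] + r.r1 * a[5] + r.r3 * a[2];
    if (A + r.r2 * a[3] + r.r4 * a[0] + r.r5 * a[1] < 0.0f) return false;
    float B = r.r0 * b[4] + r.r1 * b[5] + r.r3 * b[2];
    if (B + r.r2 * b[3] + r.r4 * b[0] + r.r5 * b[1] < 0.0f) return false;
    return r.r2 * c3 + r.r4 * c0 + r.r5 * c1 - A - B >= 0.0f;
}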

Recovering a Triangle's Surface Normal from the Pluecker Coordinates of its
Edges, by Jeffrey Mahovsky
There is a simple way to recover the unnormalized surface normal of a
triangle from the Pluecker coordinates of its edges. This avoids the need to
store the surface normal separately from the Pluecker coefficients.
Given a triangle with vertices (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), the
Pluecker coordinates of the edges A, B, and C are given by the following 18
coefficients:
A0 = x0 * y1 - x1 * y0;
A1 = x0 * z1 - x1 * z0;
A2 = x0 - x1;
A3 = y0 * z1 - y1 * z0;
A4 = z0 - z1;
A5 = y1 - y0;
B0 = x1 * y2 - x2 * y1;
B1 = x1 * z2 - x2 * z1;
B2 = x1 - x2;
B3 = y1 * z2 - y2 * z1;
B4 = z1 - z2;
B5 = y2 - y1;
C0 = x2 * y0 - x0 * y2;
C1 = x2 * z0 - x0 * z2;
C2 = x2 - x0;
C3 = y2 * z0 - y0 * z2;
C4 = z2 - z0;
C5 = y0 - y2;
The unnormalized surface normal can be recovered with the following equations:
ni = A3 + B3 + C3
nj = -A1 - B1 - C1
nk = A0 + B0 + C0
Proof:
The surface normal components (ni, nj, and nk) of a 3D triangle are directly
proportional to the areas of the 2D triangles that result when the 3D
triangle is projected onto each of the x, y, and z planes.
The area of the 2D triangle projected onto the z plane (and the k component
of the surface normal) is given by the equation:
                        | x0 y0 1 |
areaZ or nk = 1/2  *    | x1 y1 1 |
                        | x2 y2 1 |
(This assumes the vertices are defined in counterclockwise fashion.)
(Any good book on geometry should have this area equation.)
Expanding the determinant gives:
areaZ or nk = 1/2 * (x0 * y1 - y0 * x1 - x0 * y2 + y0 * x2 + x1 * y2 - y1 * x2)
The factor of 1/2 can be removed, as the resulting nk component isn't
normalized anyway:
areaZ or nk = x0 * y1 - y0 * x1 - x0 * y2 + y0 * x2 + x1 * y2 - y1 * x2
Similarly, for ni and nj:
                        | y0 z0 1 |
areaX or ni = 1/2  *    | y1 z1 1 |
                        | y2 z2 1 |
areaX or ni = y0 * z1 - z0 * y1 - y0 * z2 + z0 * y2 + y1 * z2 - z1 * y2
                        | x0 z0 1 |
areaY or nj = -1/2 *    | x1 z1 1 |
                        | x2 z2 1 |
areaY or nj = -x0 * z1 + z0 * x1 + x0 * z2 - z0 * x2 - x1 * z2 + z1 * x2
Comparing these equations with the Pluecker coefficient equations, it is easy to see that:
ni = A3 + B3 + C3
nj = -A1 - B1 - C1
nk = A0 + B0 + C0
This idea will probably extend to any convex n-sided polygon with edges
defined as Pluecker coordinates. The number of addition operations needed
to recover each normal component would be (n-1), where n is the number of
sides.
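In code, the recovery is just a handful of adds and negations (a small
sketch, using coefficient arrays in the ordering above):

/* A, B, C are the six Pluecker coefficients of the three edges; n comes
   back unnormalized, which is fine for backface tests and the like. Note
   that the 15-coefficient optimization from the previous article still
   stores A and B in full, and C0 and C1 are among the kept values. */
void normalFromPluecker(const float A[6], const float B[6],
                        const float C[6], float n[3])
{
    n[0] =  A[3] + B[3] + C[3];    /* ni */
    n[1] = -A[1] - B[1] - C[1];    /* nj */
    n[2] =  A[0] + B[0] + C[0];    /* nk */
}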

Objects per Grid Cell, by Pete Shirley
I wrote Pete about uniform grid efficiency structures, and how many cells
they should have compared to the number of objects in the scene:
In your book "Realistic Ray Tracing" you mention that the number of cells
in a grid should be within an order of magnitude of the number of objects.
True, but interestingly, people seem to find that the number of cells
should be noticeably larger than the number of objects for best
performance, e.g. 3x as many cells as objects. Again, mileage varies, but
if this turns out to be the case, then mailboxing becomes very important,
as it is more likely that objects will span two or more grid cells.
Pete's reply:
Bruce Walter and I did a bunch of work on analyzing this in an as-yet
unpublished paper. Definitely, if (cost of object intersection) is greater
than (cost of cell advance), then you want more cells than objects. My
guess is that between 2 and 10 cells per object has worked best in my code
for most models; 8 is my default.
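One common way to turn such a rule of thumb into actual grid resolutions
is the sketch below (my code, not Pete's; "density" is the cells-per-object
knob he describes, distributed over the axes in proportion to the scene's
bounding-box extents):

#include <math.h>

/* Choose per-axis voxel counts so that nx*ny*nz is roughly
   density * nObjects, for a scene bounding box of size dx by dy by dz
   (assumed nondegenerate). */
void gridResolution(int nObjects, float density,
                    float dx, float dy, float dz,
                    int *nx, int *ny, int *nz)
{
    float volume = dx * dy * dz;
    float s = powf(density * nObjects / volume, 1.0f / 3.0f);
    *nx = (int)ceilf(dx * s);   /* cells along each axis scale with */
    *ny = (int)ceilf(dy * s);   /* that axis's extent              */
    *nz = (int)ceilf(dz * s);
}

With density = 8, a 10,000-object scene in a cube gets about 43x43x43
cells; setting density between 2 and 10 spans Pete's suggested range.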

END OF RTNEWS