Two ways to think about transforms

I was just answering a question for the Udacity Interactive Graphics MOOC. I had made a rather confusing lecture, much more involved and less informative than I would have liked, so today I wrote a re-do (sadly, it’s not easy to make a new video, since step 1 is “fly from Boston to San Francisco”). I’m still not thrilled with my description – what do you think? Is there a better way to talk about this subject? Anything I could improve? Surprisingly, this course still gets about 35 sign-ups a day (though I’m guessing maybe one of those actually finishes), so it’d be nice to make this lesson better.

Background: up to this point in the course I’d been showing how you typically write down transforms from right to left (the OpenGL-style convention, where matrices multiply column vectors), e.g. “TR” means “rotate the object, then translate it (in world space) to some location.” In this lesson I wanted to point out that you can also read the transform order from left to right.

=================

You’re at 41 Avenue George V in Paris. Someone comes up and asks “How can I see the Arc de Triomphe?” You tell them, “Go up two blocks and then turn to the left – you can’t miss it.” Indeed, at 101 Avenue des Champs-Élysées he can see L’Arc de Triomphe.

So if you wanted to take this person and apply these two transforms, translation T (walk two blocks) and rotation R (turn about 60 degrees to the left), how would you write that out? Think about it for a minute, then scroll down for the answer. (And I like the disembodied arm to the right, from Google’s Street View.)

The order, read right to left (the “application order”), is TR. That is, you apply the rotation first, so that it doesn’t affect the translation. So you rotate the person 60 degrees to the left, then you translate him two blocks north; the translation is then not affected by the rotation.

If you incorrectly used order RT, you would first translate him two blocks north; so far so good. But, as you saw in the snowman lesson, a rotation applied after a translation rotates the object around the origin, not around its present location; in this case, the origin is the person’s starting location. So performing a translation, then a rotation, would move him two blocks north, then swing him 60 degrees along a circle with a two-block radius centered on his starting point, putting him somewhere else in the city (Rue Euler, I guess; it’s a great coincidence that it’s named for a famous mathematician).
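To check the two orderings numerically, here is a minimal Python sketch using 2D homogeneous matrices and column vectors. The coordinate setup is an assumption made for illustration, not part of the lesson: x points east, y points north, one unit is one block, and the person starts at the origin facing north. The helper names (`translate`, `rotate`, `matmul`, `apply`) are made up for this example.

```python
import math

def translate(tx, ty):
    # 3x3 homogeneous 2D translation matrix.
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

def rotate(degrees):
    # 3x3 homogeneous 2D rotation about the origin, counterclockwise (to the left).
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    # Matrix product a * b; with column vectors, b is applied first.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    # Multiply matrix m by column vector v.
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

T = translate(0.0, 2.0)    # walk two blocks north
R = rotate(60.0)           # turn 60 degrees to the left

start = [0.0, 0.0, 1.0]    # a point (w = 1): where the person stands
facing = [0.0, 1.0, 0.0]   # a direction (w = 0): facing north

TR = matmul(T, R)          # rotate first, then translate
RT = matmul(R, T)          # translate first, then rotate about the origin

pos_tr = apply(TR, start)   # about (0, 2): two blocks north, as intended
dir_tr = apply(TR, facing)  # about (-0.87, 0.5): turned 60 degrees left of north
pos_rt = apply(RT, start)   # about (-1.73, 1): swung around the starting point

print([round(c, 3) for c in pos_tr[:2]])
print([round(c, 3) for c in dir_tr[:2]])
print([round(c, 3) for c in pos_rt[:2]])
```

Both orders leave him facing the same way, since the translation does not change directions; only the position differs.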

I hope you accept TR is the right order, then. But, to describe directions we definitely first said “perform T” – walk two blocks north – “then perform R” – rotate to the left 60 degrees. So we talk about directions in a left-to-right fashion. This may seem odd, since T, the transform we mention first, is the last one we apply when we actually position the man in his environment.

The key thing here, and the point of the lesson, is that by specifying T first, we’re saying to the man, change your frame of reference to be 2 blocks north. From this new frame of reference, then rotate 60 degrees around where you’re standing, your new origin. It’s how we talk about directions. We don’t say “when you get to your final position, rotate 60 degrees left. Then, to get to your final position, walk two blocks north.”

The person walking has his own frame of reference, where he’s always the origin, and rotations are done relative to whichever way he’s facing at the time. To specify transforms when talking in these terms, an object-centric way of describing things, we describe “from left to right.” When we’re looking at the world and want to think how to make some other object take on a particular orientation and position, we tend to work from right to left, getting it oriented and then moving it into position.
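The object-centric reading can be sketched without any matrices at all. The `Walker` class below is a hypothetical helper written for this illustration: it stores a position and a heading, walks along whatever direction it currently faces, and turns in place. The units and axes are assumptions (x east, y north, heading in degrees counterclockwise from east, so 90 means north).

```python
import math

class Walker:
    """A person who carries his own frame of reference (illustrative only)."""

    def __init__(self):
        self.x = 0.0         # east, in blocks
        self.y = 0.0         # north, in blocks
        self.heading = 90.0  # degrees counterclockwise from east; 90 = north

    def walk(self, blocks):
        # Move along whichever way the walker is facing right now.
        self.x += blocks * math.cos(math.radians(self.heading))
        self.y += blocks * math.sin(math.radians(self.heading))

    def turn_left(self, degrees):
        # Rotate in place, around the walker's own position.
        self.heading += degrees

# "TR" read left to right: walk two blocks, then turn left 60 degrees.
a = Walker()
a.walk(2.0)
a.turn_left(60.0)

# "RT" read left to right: turn left 60 degrees, then walk two blocks
# along the new heading.
b = Walker()
b.turn_left(60.0)
b.walk(2.0)

print(round(a.x, 3), round(a.y, 3), a.heading)
print(round(b.x, 3), round(b.y, 3), b.heading)
```

Walker `a` ends up two blocks north of the start, facing 150 degrees, which is the same pose you get by applying TR right to left in world space (rotate about the origin, then translate north). Walker `b` ends up at about (-1.73, 1), which matches applying RT right to left. The matrices are the same either way; only the reading direction and the frame of reference change.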

However, it all depends. Moving a couch up a flight of stairs, down a hall, and next to a wall in a room is a series of transforms, and again we specify them from left to right. We could also shortcut the process if we don’t care about the intermediate steps along the way. Say the couch is facing north, and we know it’ll end up facing east. We could specify the one 90 degree rotation to get it to face east, then the one XYZ translation to move it directly to its desired location – right to left order, so that the rotation doesn’t interfere with the translation.
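As a rough numerical check of the couch idea, the Python sketch below flattens the move to a 2D floor plan (no stairs) and compares a step-by-step sequence against the single rotation plus single translation. The distances, the axes (x east, y north, the couch's "forward" being its local y axis) and the helper names are all assumptions made up for the example.

```python
import math

def translate(tx, ty):
    # 3x3 homogeneous 2D translation matrix.
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

def rotate(degrees):
    # 3x3 homogeneous 2D rotation, counterclockwise for positive angles.
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    # Matrix product a * b; with column vectors, b is applied first.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Step by step, read left to right in the couch's own frame:
# carry it 3 units forward (north), turn it 90 degrees clockwise so it
# faces east, then carry it 4 units forward (which is now east).
steps = matmul(matmul(translate(0.0, 3.0), rotate(-90.0)), translate(0.0, 4.0))

# Shortcut, read right to left in world space: one rotation to face east,
# then one translation straight to the final spot.
direct = matmul(translate(4.0, 3.0), rotate(-90.0))

same = all(abs(steps[i][j] - direct[i][j]) < 1e-9
           for i in range(3) for j in range(3))
print(same)  # prints True
```

The intermediate moves collapse into one matrix whose rotation part is the couch's final orientation and whose last column is its final position, which is why the shortcut works when the path in between doesn't matter.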

Both approaches – a series of moves, or the direct rotation and translation – have the same final effect. The point is, each way of thinking has its uses.

1. When teaching this stuff, I always used the metaphor of a spaceship, with transformations from the point of view of the pilot, versus those of Mission Control back on the ground.

2. In his course notes for his graphics classes at NYU, Ken Perlin started embedding simple JavaScript code to animate the concepts. The ability to click ‘n drag on a concept can really help. Examples:

Another good example is this text on Bezier curves, fully illustrated with interactive diagrams: