CS184 Lecture 7 summary

3D to 2D projections and cameras.

To view 3D objects on a screen, the 3D coordinates must be mapped to 2D by a projection. The main projections we are interested in are orthographic and perspective projections.

First of all, we assume a change of coordinates such that all objects are transformed to the reference frame of the viewer. In VRML, the default viewpoint is at (0, 0, 10) and looks along the -Z axis. The viewer's position is sometimes called the camera position.

For orthographic projection, we simply ignore the z-coordinate. That is, all 3D points (x, y, z) project to (x, y). Here is a VRML cube in orthographic projection.

Perspective projection is a much better approximation to what physical cameras (including the human eye) see when viewing a 3D scene.

To compute perspective projection, draw light rays from object points to the viewer's origin, and intersect those with the viewing screen. Algebraically, projection to a screen at z = -1 is given by the mapping (x, y, z) → (-x/z, -y/z). More generally, if the screen is at z = zS, the projection is given by:

(x, y, z) → zS (x/z, y/z)

Here is the same VRML cube in perspective projection. Note that perspective projection exhibits foreshortening, that is, close objects appear larger (because their z-coordinate magnitude is smaller).
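
The two projections can be compared numerically. The following is a minimal Python sketch, assuming points are already in viewer coordinates (eye at the origin, looking along -Z). The function names and the sample points are illustrative and not part of the lecture.

```python
def project_orthographic(point):
    """Orthographic projection: ignore the z-coordinate."""
    x, y, z = point
    return (x, y)


def project_perspective(point, z_screen=-1.0):
    """Perspective projection onto the plane z = z_screen.

    Implements (x, y, z) -> z_screen * (x/z, y/z).
    """
    x, y, z = point
    if z == 0:
        raise ValueError("point lies in the plane of the eye (z = 0)")
    scale = z_screen / z
    return (scale * x, scale * y)


# Two points with the same x and y, one twice as far from the viewer.
near = (1.0, 1.0, -2.0)
far = (1.0, 1.0, -4.0)

print(project_orthographic(near), project_orthographic(far))  # (1.0, 1.0) (1.0, 1.0)
print(project_perspective(near), project_perspective(far))    # (0.5, 0.5) (0.25, 0.25)
```

The orthographic images coincide, while the perspective image of the nearer point lies twice as far from the screen center. This is the foreshortening effect described above.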

Using Homogeneous Coordinates

Homogeneous coordinates, as we have seen, are good for representing both rotation and translation in 2D and 3D. They are also ideal for perspective projection. So far we have assumed that the last coordinate in the homogeneous coordinates of a point is 1. That is,

(x, y, z) is represented as (x, y, z, 1)

But in general, we can represent

(x, y, z) as (dx, dy, dz, d)

for any nonzero value of d. So the homogeneous point

(x, y, z, h) represents the physical point (x/h,  y/h,   z/h)

This allows us to do a perspective projection by a linear map on homogeneous coordinates:

(x, y, z, 1) → (x, y, 0, -z) which represents the same physical point as (-x/z, -y/z, 0)

The projection is therefore representable as a single 4x4 matrix T which is

[ 1  0  0  0 ]
[ 0  1  0  0 ]
[ 0  0  0  0 ]
[ 0  0 -1  0 ]

Since the transformation from world to viewer-centered coordinates is also a 4x4 matrix, we can compute the perspective projection to an arbitrary viewpoint in a single 4x4 matrix by composing them.
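
As a rough illustration of this composition, the Python sketch below multiplies the projection matrix by a world-to-viewer matrix and applies the product to a homogeneous point. The world-to-viewer matrix used here is an assumption chosen for simplicity: a pure translation by (0, 0, -10), which corresponds to a viewer at (0, 0, 10) with no rotation. The helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, p):
    """Apply a 4x4 matrix to a homogeneous point (x, y, z, h)."""
    return [sum(m[i][j] * p[j] for j in range(4)) for i in range(4)]


def to_physical(p):
    """Divide by the last coordinate h to recover (x/h, y/h, z/h)."""
    x, y, z, h = p
    if h == 0:
        raise ValueError("h = 0 does not represent a finite point")
    return (x / h, y / h, z / h)


# Projection matrix T from the text (screen at z = -1).
T = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, -1, 0],
]

# Assumed world-to-viewer matrix: translate by (0, 0, -10).
WORLD_TO_VIEWER = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, -10],
    [0, 0, 0, 1],
]

# A single matrix that performs both steps.
COMBINED = mat_mul(T, WORLD_TO_VIEWER)

world_point = [1.0, 2.0, 6.0, 1.0]
print(to_physical(mat_vec(COMBINED, world_point)))  # (0.25, 0.5, 0.0)
```

The world point (1, 2, 6) sits at (1, 2, -4) in viewer coordinates, so its projection is (1/4, 2/4, 0), as printed.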

Finding the object

It is often desirable to place the camera at some arbitrary position v = (vX, vY, vZ). But we must also rotate the viewpoint so that the origin (where most objects are centered) projects to the center of the screen. Furthermore, objects look "natural" when their Y-axis points up on the screen, that is, orthogonal to the screen's X-axis.

If the viewpoint corresponds to a coordinate frame X', Y', Z', then

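Z' = v/|v| where v is the viewpoint position, so that Z' points from the origin toward the viewpoint and the camera looks back along -Z' at the origin.

X' = Y × v / |Y × v| which ensures that X' is normal to both Y and Z'. (This requires that v is not parallel to Y.)

Y' = Z' × X' which ensures that Y' is normal to both Z' and X', and lies in the plane spanned by Y and Z', so it is as close to Y as the viewing direction allows.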

The rotation R = (X', Y', Z'), whose columns are the viewer axes expressed in world coordinates, transforms points represented in viewer coordinates to world coordinates, if the viewer system requires rotation only. To translate points as well, we use the 4x4 viewer-to-world transformation

[ R    v ]
[ 0    1 ]

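which maps the origin in the viewer frame to v in the world frame. Its inverse maps world coordinates to viewer coordinates, which is the transformation to compose with the perspective projection.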
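
The construction of the viewer frame can be sketched in Python as follows. This is a sketch under the assumptions of this section (world up vector Y = (0, 1, 0), camera aimed at the origin). The function names are illustrative only.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in a))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / length for c in a)


def viewer_frame(v, up=(0.0, 1.0, 0.0)):
    """Return the axes (X', Y', Z') for a viewpoint v looking at the origin.

    Raises ValueError if v is zero or parallel to `up`, since X' is then undefined.
    """
    z_axis = normalize(v)
    x_axis = normalize(cross(up, v))
    y_axis = cross(z_axis, x_axis)
    return x_axis, y_axis, z_axis


def viewer_to_world(v, up=(0.0, 1.0, 0.0)):
    """4x4 matrix [R v; 0 1] with the viewer axes as the columns of R."""
    x_axis, y_axis, z_axis = viewer_frame(v, up)
    return [
        [x_axis[0], y_axis[0], z_axis[0], v[0]],
        [x_axis[1], y_axis[1], z_axis[1], v[1]],
        [x_axis[2], y_axis[2], z_axis[2], v[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Sanity check with the default VRML viewpoint.
for row in viewer_to_world((0.0, 0.0, 10.0)):
    print(row)
```

For the default viewpoint (0, 0, 10) the rotation part comes out as the identity, as expected, since that viewer already looks along -Z with Y up.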

Example

Suppose we place the viewpoint at (5, 10, 15). Then using the above expressions the viewer frame is defined by

Z' = ( 0.2673,  0.5345,  0.8018)
X' = ( 0.9487,  0.0,    -0.3162)
Y' = (-0.1690,  0.8452, -0.5071)

v  = ( 5, 10, 15 )

So the matrix R is (X' Y' Z') which is

[  0.9487  -0.1690   0.2673 ]
[  0.0      0.8452   0.5345 ]
[ -0.3162  -0.5071   0.8018 ]

To represent the matrix R = (X', Y', Z') in axis-angle form, we compute R - R^T (R minus its transpose), which is

[  0.0     -0.1690   0.5835 ]
[  0.1690   0.0      1.0416 ]
[ -0.5835  -1.0416   0.0    ]

Writing A for this antisymmetric matrix, the vector that determines it is s = (-A23, A13, -A12) = (-1.0416, 0.5835, 0.1690). The rotation axis is the normalized vector

a = s/|s| = (-0.8638, 0.4839, 0.1402),

while the angle is the arcsin of half of |s| = 1.2058 (see lecture 4), which is 0.6471 radians.

Thus the VRML rotation is (a θ) =

-0.8638   0.4839   0.1402   0.6471
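
The axis-angle extraction above can be checked with a short Python sketch. One deliberate difference from the text: the sketch takes the angle from atan2, using |s|/2 as the sine and (trace(R) - 1)/2 as the cosine, so that angles beyond 90 degrees would also be handled. For this example it agrees with the arcsin. The function name is hypothetical, and the matrix entries are the rounded values from the example.

```python
import math


def axis_angle(r):
    """Return (axis, angle) for a 3x3 rotation matrix r given as nested lists."""
    # Vector read off the antisymmetric matrix A = R - R^T:
    # s = (-A23, A13, -A12), which equals 2 sin(angle) times the axis.
    s = (r[2][1] - r[1][2],
         r[0][2] - r[2][0],
         r[1][0] - r[0][1])
    norm = math.sqrt(sum(c * c for c in s))
    if norm < 1e-12:
        # Angle is 0 or pi; the axis cannot be recovered from R - R^T alone.
        raise ValueError("axis is not determined by R - R^T")
    axis = tuple(c / norm for c in s)
    sin_angle = norm / 2.0
    cos_angle = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return axis, math.atan2(sin_angle, cos_angle)


# Rotation matrix R = (X' Y' Z') from the example, rounded to 4 places.
R = [
    [0.9487, -0.1690, 0.2673],
    [0.0, 0.8452, 0.5345],
    [-0.3162, -0.5071, 0.8018],
]

axis, angle = axis_angle(R)
print(axis, angle)
```

Up to rounding, the printed axis and angle should match the four numbers of the VRML rotation above.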

Here is a view of the cube from that perspective.