CS184 Lecture 37 summary

Aliasing and Anti-Aliasing

Aliasing is a problem caused by sampling in images. e.g. when you draw a line or polygon using a filling scheme, you are really sampling a shape with infinitely many points at a finite subset of those points (the pixels). Sampling theory says that sampling works OK (in terms of preserving all the information in a signal) when the sampling rate is at least twice the frequency of the highest-frequency component in the signal. This minimum sampling rate is called the Nyquist rate.
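
Here is a small numeric illustration in Python (a sketch of my own, not from the lecture, with made-up frequencies): a 6 Hz sine sampled at only 8 samples per second, below its Nyquist rate of 12, yields exactly the same sample values as a 2 Hz sine, so once you have only the samples the two signals cannot be told apart.

    import math

    rate = 8.0                                    # samples per second; too low for a 6 Hz signal
    times = [n / rate for n in range(8)]

    high  = [math.sin(2 * math.pi * 6 * t) for t in times]   # the real 6 Hz signal
    alias = [math.sin(-2 * math.pi * 2 * t) for t in times]  # a 2 Hz sine (phase-flipped)

    for h, a in zip(high, alias):
        assert abs(h - a) < 1e-9                  # sample-for-sample identical: that is aliasing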

When we talk about 2D shapes, frequency is spatial frequency. If you took the 2D Fourier transform of a polygon (thought of as a 0-1 valued function of x and y, where 0 is outside, 1 is inside), you would get lots of spatial frequencies that are much higher than half the sampling rate (the sampling rate being one over the pixel spacing). Therefore you get aliasing and the "jaggies" (staircase effect).

You can eliminate aliasing in various ways. Mathematically, what you want to do is to remove those high-frequency components in the signal before you sample it. Removing high frequencies corresponds to low-pass filtering. In the spatial case, that means smoothing. If you think of smoothing out that 0-1 valued polygon function, you would get a gentle mesa with sides that slope gracefully down to 0. When you sampled it, you would get 1 when you're well inside and 0 when you're well outside, but near the boundary, you would get intermediate values. If you display those samples on a monitor, you will probably not see any jaggies. That's the theory behind anti-aliasing.
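
Here is a minimal 1-D sketch of that idea in Python (the edge position and filter width are made-up numbers, purely for illustration): average the 0-1 inside/outside function over a small window before sampling at the pixel centers, and the samples near the boundary come out at intermediate gray levels instead of jumping straight from 1 to 0.

    def inside(x):
        return 1.0 if x < 3.4 else 0.0            # a 1-D "polygon" whose edge sits at x = 3.4

    def smoothed(x, width=1.0, taps=101):
        """Box-average inside() over a window of the given width centered at x."""
        total = 0.0
        for k in range(taps):
            total += inside(x - width / 2 + width * k / (taps - 1))
        return total / taps

    # Point sampling at pixel centers gives a hard 1 -> 0 jump; smoothing first gives a ramp.
    print([inside(i + 0.5) for i in range(6)])               # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    print([round(smoothed(i + 0.5), 2) for i in range(6)])   # [1.0, 1.0, 1.0, 0.4, 0.0, 0.0]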

Supersampling

One simple scheme for anti-aliasing is supersampling. Instead of taking a single sample at the middle of a pixel, we subdivide the area of the pixel into smaller squares and sample in the middle of each one. Then we compute the average. That amounts to low-pass filtering the image with a square averaging function. You might recall that this method is an option in BMRT, and we used it for distributed ray-tracing.
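
A minimal sketch of what such a sampler does (Python, with names of my own choosing; BMRT's actual implementation is of course different): take an n x n grid of samples inside the pixel and return their straight average.

    def supersample(inside, px, py, n=3):
        """Average an inside(x, y) -> 0/1 coverage test over an n x n subpixel grid."""
        total = 0.0
        for i in range(n):
            for j in range(n):
                x = px + (i + 0.5) / n            # subpixel centers within pixel (px, py)
                y = py + (j + 0.5) / n
                total += inside(x, y)
        return total / (n * n)

    # Example: a half-plane "polygon" whose vertical edge x = 10.4 cuts through pixel (10, 5).
    print(supersample(lambda x, y: 1.0 if x < 10.4 else 0.0, 10, 5))   # 3 of 9 samples hit: 0.333...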

For example, if you subdivide a pixel into 3x3 subpixels, then you make 9 samples per pixel (sketch). If the object being sampled is a Bresenham-drawn line, a maximum of 3 of those subpixels will be on. So this scheme gives 4 gray levels per pixel between 0 and 1 (corresponding to 0, 1, 2 or 3 subpixels being on).

It's often desirable to thicken the line (e.g. make it a polygon with parallel sides - sketch). Then more subpixels can be covered within the 3x3 window, and more gray levels will be produced. But note that this line will appear thicker as well.

Taking the direct average of the supersampled values isn't the best way to use them. You can weight the samples near the middle more heavily - the book argues that this is "natural" because they are more representative of the pixel value. But a better explanation is that this is a better smoothing function. An equal-weight averaging mask lets through high frequencies. By smoothing the sides of this mask, i.e. by reducing the weights of samples near the edge of the mask, it does a better job of filtering high spatial frequencies, and therefore a better job of anti-aliasing. e.g. a 3x3 weighted averaging mask is typically:

1   2   1
2   4   2
1   2   1
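
In code, the only change from the plain supersampler sketched earlier is that the 9 samples are combined with the weights above (which sum to 16) instead of with equal weights; again this is just an illustrative sketch.

    MASK = [[1, 2, 1],
            [2, 4, 2],
            [1, 2, 1]]                            # weights sum to 16

    def weighted_supersample(inside, px, py):
        total = 0.0
        for i in range(3):
            for j in range(3):
                x = px + (i + 0.5) / 3.0
                y = py + (j + 0.5) / 3.0
                total += MASK[j][i] * inside(x, y)
        return total / 16.0

    # The same half-plane edge x = 10.4 through pixel (10, 5) as before: the covered
    # column now carries weight (1 + 2 + 1)/16, so the pixel comes out 0.25 instead of 0.33.
    print(weighted_supersample(lambda x, y: 1.0 if x < 10.4 else 0.0, 10, 5))   # 0.25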

Area Sampling

Supersampling helps by applying an averaging filter to remove high frequencies. But this filter itself is sampled, albeit at a higher rate than the final image. That means some aliasing still happens when the mask is applied. A true continuous averaging mask would compute the exact area of the polygon or thickened line that intersects the pixel area. When the polygons are represented using their edges, it's possible to compute the overlap area exactly (sketch). Area sampling can do a better job than supersampling, but it's still not a good idea to use a square averaging function (with steep sides). What you really want is a continuous averaging function that tapers off smoothly at the edges.
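
As a sketch of the exact computation for a single edge (Python; a real scan converter would do this incrementally, and a full polygon needs all of its edges): clip the unit pixel square against the edge's half-plane and take the area of whatever is left.

    def edge_coverage(px, py, a, b, c):
        """Exact area of pixel (px, py) inside the half-plane a*x + b*y <= c."""
        square = [(px, py), (px + 1, py), (px + 1, py + 1), (px, py + 1)]
        clipped = []
        for i, p in enumerate(square):            # Sutherland-Hodgman clip against one edge
            q = square[(i + 1) % 4]
            dp = a * p[0] + b * p[1] - c
            dq = a * q[0] + b * q[1] - c
            if dp <= 0:
                clipped.append(p)                 # p is on the inside
            if (dp < 0) != (dq < 0):              # the segment p->q crosses the edge
                t = dp / (dp - dq)
                clipped.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        area = 0.0
        for i, p in enumerate(clipped):           # shoelace formula for the clipped polygon
            q = clipped[(i + 1) % len(clipped)]
            area += p[0] * q[1] - q[0] * p[1]
        return abs(area) / 2.0

    print(round(edge_coverage(10, 5, 1.0, 0.0, 10.4), 6))   # vertical edge x = 10.4: 0.4 of the pixel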

Spatial Filtering

Spatial filtering is the application of a smooth, continuous averaging function to the image before sampling. That is, you want the area of the primitive that's contained inside the pixel area, weighted by a smooth function over the pixel area. This can involve some hairy integrals, so you don't want to do it while trying to render 30 million pixels a second! But you can pre-compute a lookup table, indexed say by the x and y intersections of lines with the pixel boundary. The weights are additive, so you can use one lookup for the top of the line, and another for the bottom. This is a pretty exotic way to draw pixels, but it is available in some high-end systems.
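
Here is a sketch of the precomputation idea, simplified to index the table by the signed distance from the pixel center to the line rather than by the boundary intersections (a simplification for this sketch, not the scheme real hardware uses): build the table once, offline, with the expensive weighted-coverage integral, and then shading a pixel at rasterization time is a single lookup.

    STEPS = 64                                    # table resolution

    def weighted_coverage(d, n=100):
        """Coverage of a cone filter (radius 1, centered on the pixel) by the
        half-plane x <= d, estimated by brute force; only ever run offline."""
        total = weight = 0.0
        for i in range(n):
            for j in range(n):
                x = -1.0 + 2.0 * (i + 0.5) / n    # filter support is [-1, 1] x [-1, 1]
                y = -1.0 + 2.0 * (j + 0.5) / n
                w = max(1.0 - (x * x + y * y) ** 0.5, 0.0)   # cone filter weight
                weight += w
                if x <= d:
                    total += w
        return total / weight

    # Precompute once: filtered gray level as a function of the distance d in [-1, 1].
    FILTER_TABLE = [weighted_coverage(-1.0 + 2.0 * k / (STEPS - 1)) for k in range(STEPS)]

    def shade(d):
        """Runtime path: clamp the distance and do a single table lookup."""
        k = int(round((min(max(d, -1.0), 1.0) + 1.0) * 0.5 * (STEPS - 1)))
        return FILTER_TABLE[k]

    print(shade(-1.0), shade(1.0))                # 0.0 and 1.0; in-between distances give grays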