Glenn Ehrlich on comp.lang.lisp suggests:
The Object-Oriented Numerics page, as well as the Blitz++ page, may be handy references.
Expression templates are a C++ paradigm for lazy evaluation at compile time. CLOS does a lot more at runtime than the C++ class and template systems, so expression templates aren't necessarily as useful for Lisp. However, the the underlying idea of lazy evaluation is useful for avoiding temporaries. In fact, lazy evaluation is probably easier in Lisp with its less strict type system.
It would be interesting to see how much we could do in Lisp at compile time. Doing so may require constructing a "template system" or "compile-time class system." Actually, though, part of this is handled by series expressions (in the CL standard). By writing the appropriate scanner functions for e.g. vectors and vector views, we get part of the effect of lazy evaluation -- at least the part that describes how to combine different vectors.
Lazy evaluation is handy for vector-style operations, as the expression templates page explains. However, how useful would it be for matrix factorizations? It doesn't really help us with reasoning whether a particular operation could be performed in place rather than by copying -- that requires a compile-time code analysis. If we don't do that, it will be up to programmers to select an in-place operation when they know that they can overwrite the input array.
Where it can be helpful, though, is with implicit representations of factors. A good example is the Q factor from a QR factorization. We can save a lot if we know that we don't need to construct Q explicitly. That is a sort of "lazy evaluation." However, making the lazy evaluation effective in saving computation requires some reasoning. For example, the user may start asking for columns of the matrix. If it's just one or two columns, it's probably more efficient to keep the implicit representation and just operate on columns of the identiy matrix to get the columns. If the programmer asks for a lot of columns, however, it may be more efficient just to give him/her the whole thing. Reasoning about that at runtime is possible but incurs overhead: for example, the "Q factor expression" can notice that the programmer is calling for a lot of columns and thus decide to expand itself into an explicit representation. Part of the overhead in this case is examining the vectors to see if they are columns of the identity.
Here are some cases where lazy evaluation is useful:
The only rational option for getting Lisp to support parallel operations "natively" is to wrap some portable parallel processing library, like MPI or more probably GASNet. We can't just hide it under the covers (of e.g. ScaLAPACK) because setting up the data for calling a library such as ScaLAPACK will require some communication (you have to distribute the data first before you can work on it!).
Here is a paper: G. Cooperman, STAR/MPI: Binding a Parallel Library to Interactive Symbolic Algebra Systems, Proc. of International Symposium on Symbolic and Algebraic Computation (ISSAC '95), ACM Press, 126-132. You can get it on Citeseer. Gene's package, ParGCL, uses GCL as the underlying Lisp. It is open-source and available for download here. Last update on the FTP site appears to be 12/16/2005, as of 10/5/2006. According to the GCL website, the GCL authors plan to incorporate ParGCL into GCL itself.
High-level wrappers for ScaLAPACK is a decidedly non-trivial task. Matlab*P does it, as well as Cleve Moler's upcoming parallel Matlab. Again, it's not enough to wrap ScaLAPACK; there need to be parallel primitive operations in order to distribute the data.