RABBIT: A Compiler for SCHEME (A Dialect of LISP) A Study in Compiler Optimization Based on Viewing LAMBDA as RENAME and PROCEDURE CALL as GOTO using the techniques of Macro Definition of Control and Environment Structures Source-to-Source Transformation Procedure Integration and Tail-Recursion Guy Lewis Steele Jr. Massachusetts Institute of Technology May 1978 Revised version of a dissertation submitted (under the title "Compiler Optimization Based on Viewing LAMBDA as RENAME plus GOTO") to the Department of Electrical Engineering and Computer Science on May 12, 1977, in partial fulfillment of the requirements for the degree of Master of Science. RABBIT: A Compiler for SCHEME (A Dialect of LISP) A Study in Compiler Optimization Based on Viewing LAMBDA as RENAME and PROCEDURE CALL as GOTO using the techniques of Macro Definition of Control and Environment Structures, Source-to-Source Transformation, Procedure Integration, and Tail-Recursion Guy Lewis Steele Jr. Massachusetts Institute of Technology May 1978 ABSTRACT We have developed a compiler for the lexically-scoped dialect of LISP known as SCHEME. The compiler knows relatively little about specific data manipulation primitives such as arithmetic operators, but concentrates on general issues of environment and control. Rather than having specialized knowledge about a large variety of control and environment constructs, the compiler handles only a small basis set which reflects the semantics of lambda-calculus. All of the traditional imperative constructs, such as sequencing, assignment, looping, (iOTO, as well as many standard LISP constructs such as AND, OR, and COND, are expressed as macros in terms of the applicative basis set. A small number of optimization techniques, coupled with the treatment of function calls as GOTO statements, serve to produce code as good as that produced by more traditional compilers. The macro approach enables speedy implementation of new constructs as desired without sacrificing efficiency in the generated code. A fair amount of analysis is devoted to determining whether environments may be stack-allocated or must be heap-allocated. Heap-allocated environments are necessary in general because SCHEME (unlike Algol 60 and Algol 68, for example) allows procedures with free lexically scoped variables to be returned as the values of other procedures; the Algol stack-allocation environment strategy does not suffice. The methods used here indicate that a heap-allocating generalization of the "display" technique leads to an efficient implementation of such "upward funargs". Moreover, compile-time optimization and analysis can eliminate many "funargs" entirely, and so far fewer environment structures need be allocated at run time than might be expected. A subset of SCHEME (rather than triples, for example) serves as the representation intermediate between the optimized SCHEME code and the final output code; code is expressed in this subset in the so-called continuation- passing style. As a subset of SCHEME, it enjoys the same theoretical properties; one could even apply the same optimizer used on the input code to the intermediate code. However, the subset is so chosen that all temporary quantities are made manifest as variables, and no control stack is needed to evaluate it. As a result, this apparently applicative representation admits an imperative interpretation which permits easy transcription to final imperative machine code. These qualities suggest that an applicative language like SCHEME is a better candidate for an UNCOL than the more imperative candidates proposed to date. Thesis Supervisor: Gerald Jay Sussman Title: Associate Professor of Electrical Engineering Note The first part of this report is a slightly revised version of a dissertation submitted in May 1977. Where it was of historical interest to reflect changes in the SCHEME language which ocurred in the following year and the effect they had on RABBIT, the text was left intact, with notes added of the form, "Since the dissertation was written, thus-and-so occurred." The second part, the Appendix, was not part of the dissertation, and is a complete listing of the source code for RABBIT, with extensive commentary. It is intended that the first part should be self-contained, and provide a qualitative overview of the compilation methods used in RABBIT. The second part is provided for those readers who would like to examine the precise mechanisms used to carry out the general methods. Thus there are five levels of thoroughness at which the reader may consume this document: (1) The reader who wishes only to skim is advised to read sections 1, 5, 6, possibly 7, 8A, 8B, 8C, 10, 11, and 12. This will give a basic overview, including the use of macros and the optimizing techniques. (2) The reader who also wants to know about the details of SCHEME, the run-time system, and a long example is advised to read the entire main text (about a third of the document). (3) The reader who wants to understand the low-level organization of the algorithms, and read about the more tricky special cases, should read the main text and then the commentary on the code. (4) The reader who additionally wants to understand the nit-picking details should read the code along with the commentary. (5) The reader who wants a real feel for the techniques involved should read the entire document, invent three new SCHEME constructs and write macros for them, and then reimplement the compiler for another run-time environment. (He ought please also to send a copy of any documents on such a project to this author, who would be very interested!) Acknowle4g~q~ I would like to acknowledge the contributions to this work of the following people and other entities: Gerald Jay Sussman, who is not only my thesis advisor but a colleague and a good friend; who is fun to hack programs with; who not only provided insights on the issues of programing, but also was willing to give me a kick in the right direction when necessary; Jon Doyle, one of the first real "users" of SCHEME, who was always willing to discuss my problems, and who carefully proofread the thesis in one day when no one else would or could; Richard Zippel, the other first real SCHEME user, who has discussed with me many possibilities for the practical use of SCHEME-like languages in such large systems as MACSYMA; Carl Hewitt, whose actors metaphor inspired in part first SCHEME and then the investigations presented here; Scott Fahlman, who has Great Ideas, and who paid some of his dues at the same place I did; Jon L White, resident LISP compiler expert and agreeable office-mate, who likes both tea and (4; Dan Weinreb, Bernie Greenberg, Richard Stallman, Dave Moon, Howard Cannon, Alan Bawden, Henry Baker, and Richard Greenblatt for their companionship, advice, comments, enthusiasm, criticism, and/or constructive opposition; the rest of the gang at the AL Lab and Project MAC (loosely known as the Lab for Computer Science), for their continued interest in my work and for the pleasant social atmosphere they provide; Bill Wulf, Charles Geschke, Richard Johnsson, Charles Weinstock, and Steven Hobbs, whose work on BLISS-li I found a great inspiration, for it told me that there was at least one beautiful compiler already; Dan Friedman and Dave Wise, who also know that LISP is the One True Way; Dick Gabriel, a most singular person (that's odd...), who knows that Lapin is best dealt with gingerly; the National Science Foundation, which provided the fellowship under which this work was done; Cindy Ellis and J.J. McCabe, who always treated me as just a regular guy; Julie Genovese, my main (and only) groupie; the congregation at the Brighton Evangelical Congregational Church, for their social and moral support; Mittens Jr., our cat, who was willing to communicate when the rest of the world was asleep; Chuck, the peculiar poodle, who carried on as best she could after Mittens Jr. had gone, and who still barks in the night; my brother, David A. Steele, who has kept me up to date on cultural affairs, and who probably understands me better than anyone else; and my parents, Guy L. Steele Sr. and Nalora Steele, who provided unbounded amounts of patience, encouragement, opportunity, and support. Con t en t s 1. Introduction 7 A. Background 7 B. The Thesis 10 2. The Source Language - SCHEME 15 3. The Tarq9j~pj~e 18 4. The Targ~Jj~fline 22 5. Lanyuage Design Considerations 25 6. The Use of Macros 28 7. The Imperative Treatment of Applicative Constructs 37 8. Compilation Strategy 44 A. ~ 45 B. Preliming~yp~jjsis 46 C. Optimization 49 P. ~ 56 E. Environment and closure analysis 60 f~fflpg~9eneration 64 9. Example: Compilation of Iterative Factorial 69 Performance Measurements 86 11. Comparison with Other Work 88 12. Conclusions and Future Work 90 Notes 93 References 113 4ppen dix 117 7 I. Introduction The work described here is a continuation (!) of that described in [SCHEME], [Imperative], and [Declarative]. Before enumerating the points of the thesis, we summarize here each of these documents. A. Background In [SCHEME] we (Gerald Jay Sussman and the author) described the implementation of a dialect of LISP named SCHEME with the properties of lexical scoping and tail-recursion; this implementation is embedded within MacLISP [Moon], a version of LISP which does not have these properties. The property of lexical scoping (that a variable can be referenced only from points textually within the expression which binds it) is a consequence of the fact that all functions are closed in the "binding environmentTM. [Moses] That is, SCHEME is a "full-funarg" LISP dialect. (Note Full-Funarg Example) The property of tail- recursion implies that loops written in an apparently recursive form will actually be executed in an iterative fashion. Intuitively, function calls do not "push cpntrol stack"; instead, it is argument evaluation which pushes control stack. The two properties of lexical scoping and tail-recursion are not independent. In most LISP systems [LISPl.5M] [Moon] [Teitelman], which use dynamic scoping rather than lexical, tail-recursion is impossible because function calls must push control stack in order to be able to undo the dynamic bindings after the return of the function. On the other hand, it is possible to have a lexically scoped LISP which does not tail-recurse, but it is easily seen that such an implementation only wastes storage space needlessly compared to a tail-recursing implementation. [Steele] Together, these two properties cause 8 SCHEME to reflect lambda-calculus semantics much more closely than dynamically scoped LISP systems. SCHEME also permits the treatment of functions as full- fledged data objects; they may be passed as arguments, returned as values, made part of composite data structures, and notated as independent, unnamed ("anonymous") entities. (Contrast this with most ALGOL-like languages, in which a fuhction can be written only by declaring it and giving it a name; imagine being able to use an integer value only by giving it a name in a declaration!) The property of lexical scoping allows this to be done in a consistent manner without the possibility of identifier conflicts (that is, SCHEME "solves the FUNARO problem" [Noses]). In [SCHEME] we also discussed the technique of "continuation-passing style", a way of writing programs in SCHEME such that no function ever returns a value. In [Imperative] we explored ways of exploiting these properties to implement most traditional programming constructs, such as assignment, looping, and call-by-name, in terms of function application. Such applicative (lambda- calculus) models of programing language constructs are well-known to theoreticians (see (Stoy], for example), but have not been used in a practical programming system. All of these constructs are actually made available in SCHEME by macros which expand into these applicative definitions. This technique has permitted the speedy implementation of a rich user-level language in terms of a very small, easy-to-implement basis set of primitive constructs. In [Imperative] we continued the exploration of continuation-passing style, and noted that the escape operator CATCH is easily modelled by transforming a program into this style. We also pointed out that transforming a program into this style enforces a particular order of argument evaluation, and makes all intermediate computational quantities manifest as variables. In [Declarative] we examined more closely the issue of tail-recursion, 9 and demonstrated that the usual view of function calls as pushing a return address must lead to an either inefficient or inconsistent implementation, while the tail-recursive approach of SCHEME leads to a uniform discipline in which function calls are treated as GOTO statements which also pass arguments. We also noted that a consequence of lexical scoping is that the only code which can reference the value of a variable in a given environment is code which is closed in that environment or which receives the value as an argument; this in turn implies that a compiler can structure a run-time environment in any arbitrary fashion, because it will compile all the code which can reference that environment, and so can arrange for that code to reference it in the appropriate manner. Such references do not require any kind of search (as is commonly and incorrectly believed in the LISP community because of early experience with LISP interpreters which search a-lists) because the compiler can determine the precise location of each variable in an environment at compile time. It is not necessary to use a standard format, because neither interpreted code nor other compiled code can refer to that environment. (This is to be constrasted with "spaghetti stacks" [Bobrow and Wegbreit].) In [Declarative] we also carried on the analysis of continuation-passing style, and noted that transforming a program into this style elucidates traditional compilation issues such as register allocation because user variables and intermediate quantities alike are made manifest as variables on an equal footing. Appendix A of [Declarative] contained an algorithm for converting any SCHEME program (not containing ASET) to continuation-passing style. We have implemented two compilers for the language SCHEME. The purpose was to explore compilation techniques for a language modelled on lambda-calculus, using lambda-calculus-style models of imperative programming constructs. Both compilers use the strategy of converting the source program to continuation- 10 passing style. The first compiler (known as CHEAPY) was written as a throw-away implementation to test the concept of conversion to continuation-passing style. The first half of CHEAPY is essentially the algorithm which appears in Appendix A of [Declarative], and the second is a simple code generator with almost no optimization. In conjunction with the writing of CHEAPY, the SCHEME interpreter was modified to interface to compiled functions. (This interface is described later in this report.) The second compiler, with which we are primarily concerned here, is known as RABBIT. It, like CHEAPY, is written almost entirely in SCHEME (with minor exceptions due only to problems in interfacing with certain MacLISP I/O facilities). Unlike CHEAPY, it is fairly clever. It is intended to demonstrate a number of optimization techniques relevant to lexical environments and tail- recursive control structures. (The code for RABBIT, with commentary, appears in the Appendix.) B. The Thesis (1) Function calls are not expensive when compiled correctly; they should be thought of as GOTO statements that happen to pass arguments. (2) The combination of cheap function calls, lexical scoping, tail-recursion, and "anonymous" notation of functions (which are not independent properties of a language, but aspects of a single unified approach) permits the definition of a wide variety of "imperative" constructs in applicative terms. Because these properties result from adhering to the principles of the well-known lambda-calculus [Church], such definitions can be liftefl intact from existing literature and used directly. 11 (3) A macro facility (the ability to specify syntactic transformations) makes it practical to use these as the only definitions of imperative constructs in a programming system. Such a facility makes it extremely easy to define new constructs. (4) A few well-chosen optimization strategies enable the compilation of these applicative definitions into the imperative low-level code which one would expect from a traditional compiler. (5) The macro facility and the optimization techniques used by the compiler can be conceptually unified. The same properties which make it easy to write the macros make it easy to define optimizations correctly. In the same way that many programing constructs are defined in terms of a small, well- chosen basis set, so a large number of traditional optimization techniques fall out as special cases of the few used in RABBIT. This is no accident. The separate treatment of a large and diverse set of constructs necessitates separate optimization techniques for each. As the basis set of constructs is reduced, so is the set of interesting transformations. If the basis set is properly chosen, their combined effect is "multiplicative" rather than "additive". (6) The technique of compiling by converting to continuation-passing style elucidates some important compilation issues in a natural way. Intermediate quantities are made manifest; so is the precise order of evaluation. Moreover, this is all expressed in a language isomorphic to a subset of the source language SCHEME; as a result the continuation-passing style version of a program inherits many of the philosophical and practical advantages. For example, the same optimization techniques can be applied at this level as at the original source level. While the use of continuation-passing style may not make the decisions any easier, it provides an effective and 12 natural way to express the results of those decisions. (7) Continuation-passing style, while apparently applicative in nature, admits a peculiarly imperative interpretation as a consequence of the facts that it requires no control stack to be evaluated and that no functions ever return values. As a result, it is easily converted to an imperative machine language. (8) A SCHEME compiler should ideally be a designer of good data structures, since it may choose any representation whatsoever for environments. RABBIT has a rudimentary design knowledge, involving primarily the preferral of registers to heap-allocated storage. However, there is room for knowledge of "bit-diddling" representations. (9) We suggest that those who have tried to design useful UNCOL's (UNiversal Computer-Oriented Languages) [Sammet] [Coleman] have perhaps been thinking too imperatively, and worrying more about data manipulation primitives than about environment and control issues. As a result, proposed UNCOLs have been little more than generalizations of contemporary machine languages. We suggest that SCHEME makes an ideal UNCOL at two levels. The first level is the fully applicative level, to which most source-language constructs are easily reduced; the second is the continuation-passing style level, which is easily reduced to machine language. We envision building a compiler in three stages: (a) reduction of a user language to basic SCHEME, whether by macros, a parser of algebraic syntax, or some other means; (b) optimization by means of SCHEME-level source-to-source transformations, and conversion to continuation-passing style; and (c) generation of code for a particular machine. RABBIT addresses itself to the second stage. Data manipulation primitives are completely ignored at this stage, and are just passed along from input to output. These primitives, whether integer arithmetic, string 13 concatenation and parsing, or list structure manipulators, are chosen as a function of a particular source language and a particular target machine. RABBIT deals only with fundamental environment and control issues common to most modes of algorithmic expression. (10) While syntactic issues tend to be rather superficial, we point out that algebraic syntax tends to obscure the fundamental nature of function calling and tail-recursion by arbitrarily dividing functions into syntactic classes such as "operators" and "functions". ([Standish], for example, uses much space to exhibit each conceptually singular transformation in a multiplicity •of syntactic manifestations.) The lack of an "anonymous" notation for functions in most algebraic languages, and the inability to treat functions as data objects, is a distinct disadvantage. The uniformity of LISP syntax makes these issues easier to deal with. To the LISP community in particular we address these additional points: (11) Lexical scoping need not be as expensive as is commonly thought. Experience with lexically-scoped interpreters is misleading; lexical scoping is not inherently slower than dynamic scoping. While some implementations may entail access through multiple levels of structure, this occurs only under circumstances (accessing of variables through multiple levels of closure) which could not even be expressed in a dynamically scoped language. Unlike deep-bound dynamic variables, compiled lexical access requires no search; unlike shallow-bound dynamic variables, lexical binding does not require that values be put in a canonical value cell. The compiler has complete discretion over the manipulation of environments and variable values. The "display" technique used in Algol implementations can be generalized to provide an efficient solution to the FUNARO problem. (12) Lexical scoping does not necessarily make LISP programming unduly difficult. 14 The very existence of RABBIT, a working compiler some fifty pages in length written in SCHEME, first implemented in about a month, part-time, substantiates this claim (which is, however, admitted to be mostly a matter of taste and experience). (Note Refinement of RABBIT) SCHEME has also been used to implement several Al problem-solving languages, including AMORD [Doyle]. 15 ~ The basic language processed by RABBIT is a subset of the SCHEME language as described in [SCHEME] and [Revised Report], the primary restrictions being that the first argument to ASET must be quoted and that the multiprocessing primitives are not accommodated. This subset is summarized here. SCHEME is essentially a lexically scoped ("full funarg") dialect of LISP. Interpreted programs are represented by S-expressions in the usual manner. Numbers represent themselves. Atomic symbols are used as identifiers (with the conventional exception of T and NIL, which are conceptually treated as constants). All other constructs are represented as lists. In order to distinguish the various other constructs, SCHEME follows the usual convention that a list whose car is one of a set of distinguished atomic symbols is treated as directed by a rule associated with that symbol. All other lists (those with non-atomic cars, or with undistinguished. atoms in their cars) are combinations, or function calls. All subforms of the list are uniformly evaluated in an unspecified order, and then the value of the first (the function) is applied to the values of all the others (the arguments). Notice that the function position is evaluated in the same way as the argument positions (unlike most other LISP systems). (In order to be able to refer to MacLISP functions, global identifiers evaluate to a special kind of functional object if they have definitions as MacLISP functions of the EXPR, SUBR, or LSUBR varieties. Thus "(PLUS 1 2)" evaluates to 3 because the values of the subforms are (functional object for PLUS>, 1, and 2; and applying the first to the other two causes invocation of the MacLISP primitive PLUS.) The atomic symbols which distinguish special constructs are as follows: LAMBDA This denotes a function. A form (LAMBDA (van var2 ... yarn) body) 16 will evaluate to a function of n arguments. The parameters van are ids~ntifiers (atomic symbols) which may be used in the body to refer to the respective ~ when the function is invoked. Note that a LAMBDA-expression is not a function, but evaluates to one, a crucial distinction. IF This denotes a conditional form. (IF a b c) evaluates the predicate a, producing. a value x; if x is non-NIL, then the ~ b is evaluated, and otherwise the alternative c. If c is omitted, NIL is assumed. QUOTE As in all LISP systems, this provides a way to specify any S-expression as a constant. (QUOTE x) evaluates to the S-expression x. This may be abbreviated to 'x, thanks to the MacLISP read-macro-character feature. LABELS This primitive permits the local definition of one or more mutually recursive functions. The format is: (LABELS ((namel (LAMBDA ...)) (name2 (LAMBDA ...)) (namen (LAMBDA ...))) body) This evaluates the body in an environment in which the names refer to the respective functions, which are themselves closed in that same environment. Thus references to these names in the bodies of the LAMBDA-expressions will refer to the labelled functions. (Note Generalized LABELS) ASET' This is the primitive side-effect on variables. (ASET' var body) evaluates the body, assigns the resulting value to the variable var, and returns that value. (Note Non-quoted ASET) For implementation- dependent reasons, it is forbidden by RABBIT to use ASET' on a global 17 variable which is the name of a primitive MacLISP function, or on a variable bound by LABELS. (ASET' is actually used very seldom in practice anyway, and all these restrictions are "good programming practice". RABBIT could be altered to lift these restrictions, at some expense and labor.) CATCH This provides an escape operator facility. [Landin] [Reynolds] (CATCH var body) evaluates the body, which may refer to the variable var, which will denote an "escape function" of one argument which, when called, will return from the CATCH-form with the given argument as the value of the CATCH-form. Note that it is entirely possible to return from the CATCH-form several times. This raises a difficulty with optimization which will be discussed later. Macros Any atomic symbol which has been defined in one of various ways to be a macro distinguishes a special construct whose meaning is determined by a macro function. This function has the responsibility of rewriting the form and returning a new form to be evaluated in place of the old one. In this way complex syntactic constructs can be expressed in terms of simpler ones. 18 3. The Target Language The "target language" is a highly restricted subset of MacLISP, rather than any particular machine language for an actual hardware machine such as the PDP-1O. RABBIT produces MacLISP function definitions which are then compiled by the standard MacLISP compiler. In this way we do not need to deal with the uninteresting vagaries of a particular piece of hardware, nor with the peculiarities of the many and various data-manipulation primitives (CAR, RPLACA, +, etc.). We allow the MacLISP compiler to deal with them, and concentrate on the issues of environment and control which are unique to SCHEME. While for production use this is mildly inconvenient (since the code must be passed through two compilers before use), for research purposes it has saved the wasteful re- implementation of much knowledge already contained in the MacLISP compiler. On the other hand, the use of MacLISP as a target language does not by any means trivialize the task of RABBIT. The MacLISP function-calling mechanism cannot be used as a target construct for the SCHEME function call, because MacLISP's function calls are not guaranteed to behave tail-recursively. Since tail-recursion is a most crucial characteristic distinguishing SCHEME from most LISP systems, we must implement SCHEME function calls by more primitive methods. Similarly, since SCHEME is a full-funarg dialect of LISP while MacLISP is not, we cannot in general use MacLISP's variable-binding mechanisms to implement those of SCHEME. On the other hand, it is a perfectly legitimate optimization to use MacLISP mechanisms in those limited situations where they are applicable. Aside from ordinary MacLISP data-manipulation primitives, the only MacLISP constructs used in the target language are PROG, GO, RETURN, PROGN, COND, SETQ, and ((LAMBDA ...) ...). PROG is never nested; there is only a single, outer PROG. RETURN is used only in the form (RETURN NIL) to exit this outer 19 PROG; it is never used to return a value of any kind. LAMBDA-expressions are used only to bind temporary variables. In addition, CONS, CAR, CDR, RPLACA, and RPLACD are used in the creation and manipulation of environments. We may draw a parallel between each of these constructs and an equivalent machine-language (or rather, assembly language) construct: PROG A single program module. GO A branch instruction. PROG tags correspond to instruction labels. RETURN Exit from program module. PROGN Sequencing of several instructions. COND Conditional branches, used in a disciplined manner. One may think of (COND (predl valuel) (predz value2) (predn valuen)) as representing the sequence of code (code for predl> JUMP-IF-NIL regl,TAG1 (code for valuel> JUMP ENDTAG TAGi: (code for pred2> JUMP-IF-NIL regl,TAGZ (code for valueZ> JUMP ENDTAG TAG2: (code for predn> JUMP-IF-NIL regl,TAGn (code for valuen> JUMP ENDTAG TAGn: LOAD-VALUE NIL ENDTAG: which admits of some optimizations, but we shall not worry about this. (The MacLISP compiler does, but we do not depend at all on this fact.) SETQ Load register, or store into memory. 20 LAMBDA We use this primarily in the form: ((LAMBDA (qi ... qn) (setq van qi) (setq yarn qn)) valuel ... valuen) which we may think of as saving values on a temporary stack and then popping them into the variables: (code for valuel> ;leaves result in regi PUSH regi (code for valuen> PUSH regi POP yarn POP van This is in fact approximately how the MacLISP compiler will treat this construct. This is used to effect the simultaneous assignment of several values to several registers. It would be possible to do without the MacLISP LAMBDA in this case, by using extra intermediate variables, but it was decided that this task was less interesting than other issues within RABBIT, and that assignments of this kind would occur sufficiently often that it was desirable to get the MacLISP compiler to produce the best possible code in this case. The form ((LAMBDA ...) ...) is also used in some situation where the user wrote such a form in the SCHEME code, and the arguments and LAMBDA-body are all "trivial", in a sense to be defined later. CONS CONS is used, among other things, to "push" new values onto the current environment. While SCHEME variables can sometimes be represented as temporary MacLISP variables using LAMBDA, in general they must be kept 21 in a "consed environment" in the heap; CAR and CDR are used to "index" the environment "stack" (which is not really a stack, but in general tree-like). (N.B. By using CONS for this purpose we can push the entire issue of environment retention off onto the LISP garbage collector. It would be possible to use array-like blocks for environments, and an Algol-like "display" pointer discipline for variable access. However, a retention strategy as opposed to a deletion strategy must be used in general, because SCHEME, unlike Algol 60 and 68, permits procedures to be the values of other procedures. Stack allocation does not suffice in general -- a heap must be used. Later we will see that RABBIT uses stack allocation of environments and a deletion strategy in simple cases, and reverts to heap allocation of environments and a retention strategy in more complicated situations.) CAR, + Primitive MacLISP operators such as + and CAR are analogous to machine- language instructions such as ADD and LOAD-INDEXED. We leave to the MacLISP compiler the task of compiling large expressions involving these; but we are not avoiding the associated difficult issues such as register allocation, for we shall have to deal with them in compiling calls to SCHEME functions. 22 4. The Target Machine Compiled code is interfaced to the SCHEME interpreter in two ways. The interpreter must be able to recognize functional objects which happen to be compiled and to invoke them with given arguments; and compiled code must be able to invoke any function, whether interpreted or compiled, with given arguments. (This latter interface is traditionally known as the "UUO Handler" as the result of the widespread use of the PDP-10 in implementing LISP systems. [DEC] [Moon] [Teitelroan]) We define here an arbitrary standard form for functional objects, and a standard means for invoking them. In the PDP-1O MacLISP implementation of SCHEME, a function is, in general, represented as a list whose car contains one of a set of distinguished atomic symbols. (Notice that LAMBDA is not one of these; a LAMBDA-expression may evaluate to a function, but is not itself a valid function.) This set of symbols includes EXPR, SUBR, and LSUBR, denoting primitive MacLISP functions of those respective types; BETA, denoting a SCHEME function whose code is interpretive; DELTA, denoting an escape function created by the interpreter for a CATCH form, or a continuation given by the interpreter to compiled code; CBETA, denoting a SCHEME function or continuation whose code is compiled; and EPSILON, denoting a continuation created when compiled code invokes interpreted code. Each of these function types requires a different invocation convention; the interpreter must distinguish these types and invoke them in the appropriate manner. For example, to invoke an EXPR the MacLISP FUNCALL construct must be used. A BETA must be invoked by creating an appropriate environment, using the given arguments, and then interpreting the code of the function. We have arbitrarily defined the CBETA interface as follows: there are a number of "registers", in the form of global variables. Nine registers called 23 **CONT**, **ONE**, **TWO**, ..., **EIGHT** are used to pass arguments to compiled functions. **CONT** contains the continuation. The others contain the arguments prescribed by the user; if there are more than eight arguments, however, then they are passed as a list of all the arguments in register **ONE**, and the others are unused. (Any of a large variety of other conventions could have been chosen, such as the first seven arguments in seven registers and a list of all remaining arguments in **EIGHT**. We merely chose a convention which would be workable and convenient, reflect the typical finiteness of hardware register sets, and mirror familiar LISP conventions. The use of a list of arguments is analogous to the passing of an arbitrary number of arguments on a stack, sometimes known as the LSUBR convention. [Moon] [Declarative]) There is another register called **FUN**. A function is invoked by putting the functional object in **FUN**, its arguments in the registers already described, and the number of arguments in the register **NARGS**, and then exiting the current function. Control (at the MacLlSP level) is then transferred to a routine (the "SCHEME UUO handler") which determines the type of the function in **FUN** and invokes it. A continuation is invoked in exactly the same manner as any other kind of function, with two exceptions: a continuation does not itself require a continuation, so **CONT** need not be set up; and a continuation always takes a single argument, so **NARGS** need not be set to 1. (Note Multiple- Argument Continuations) A CBETA form has additional fixed structure. Besides the atomic symbol CBETA in the car, there is always in the cadr the address of the code, and in the cddr the environment. The form of the environment is completely arbitrary as far as the SCHEME interpreter is concerned; indeed, the CHEAPY compiler and the RABBIT compiler use completely different formats for environments for compiled 24 function. (Recall that this cannot matter since the only code which will ever be able to access that environment is the code belonging the the functional closure of which that environment is a part.) The "UUO handler" puts the cddr of **FUN** in the register **ENV**, and then transfers to the address in the cadr of **FUN**. When that code eventually exits, control returns to the "UUO handler which expects the code to have set up **FUN** and any necessary arguments. There is a set of "memory locations" -11-, -12-, ... which are used to hold intermediate quantities within a single user function. (Later we shall see that we think of these as being used to pass values between internally generated functions within a module. For this purpose we think of the "registers" and "memory locations" being arranged in a single sequence **CONT**, **ONE**, **EIGHT**, -11-, -12-, ... There is in principle an unbounded number of these I' memory locations", but RABBIT can determine (and in fact outputs as a declaration for the MacLISP compiler) the exact set of such locations used by any given function.) One may think of the "memory locations" as being local to each module, since they are never used to pass information between modules; in practice they are implemented as global MacLISP variables. The registers **FUN**, **NARGS**, **ENV**, and the argument registers are the only global registers used by compiled SCHEME code (other than the "memory locations"). Except for global variables explicitly mentioned by the user program, all communication between compiled SCHEME functions is through these registers. It is useful to note that the continuation in **CONT** is generally analogous to the usual "control stack" which contains return addresses, and so we may think of **CONT** as our "stack pointer register". 25 5. Language Design Considerations SCHEME is a lexically scoped ("full-funarg") dialect of LISP, and so is an applicative language which conforms to the spirit of the lambda-calculus. [Church] We divide the definition of the SCHEME language into two parts: the environment and control constructs, and the data manipulation primitives. Examples of the former are LAMBDA-expressions, combinations, and IF; examples of the latter are CONS, CAR, EQ. and PLUS. Note that we can conceive of a version of SCHEME which did not have CONS, for example, and more generally did not have S-expressions in its data domain. Such a version would still have the same environment and control constructs, and so would hold the same theoretical interest for our purposes here. (Such a version, however, would be less convenient for purposes of writing a meta-circular description of the language, however!) By the "spirit of lambda-calculus" we mean the essential properties of the axioms obeyed by lambda-calculus expressions. Among these are the rules of alpha-conversion and beta-conversion. The first intuitively implies that we can uniformly rename a function parameter and all references to it without altering the meaning of the function. An important corollary to this is that we can in fact effectively locate all the references. The second implies that in a situation where a known function is being called with known argument expressions, we may substitute an argument expression for a parameter reference within the body of the function (provided no naming conflicts result, and that certain restrictions involving side effects are met). Both of these operations are of importance to an optimizing compiler. Another property which follows indirectly is that of tail-recursion. This property is exploited in expressing iteration in terms of applicative constructs, and is discussed in some detail in 26 [Declarative]. We realize that other systems of environment and control constructs also are reasonably concise, clear, and elegant, and can be axiomatized in useful ways, for example the guarded commands of Dijkstra. [Dijkstra] However, that of lambda-calculus is extremely well-understood, lends itself well to certain kinds of optimizations in a natural manner, and has behind it a body of literature which can be used directly by RABBIT to express non-primitive constructs. The desire for uniform lexical scoping arises from other motives as well, some pragmatic, some philosophical. Many of these are described in [SCHEME], [Imperative], [Declarative], and [Revised Report]. It is often difficult to explain some of these to those who are used to dynamically scoped LISP systems. Any one advantage of lexical scoping may often be countered with "Yes, but you can do that in this other way in a dynamically scoped LISP." However, we are convinced that lexical scoping in its totality provides all of the advantages to be described in a natural, elegant, and integrated manner, largely as a consequence of its great simplicity. There are those to whom lexical scoping is nothing new, for example the ALGOL community. For this audience, however, we should draw attention to another important feature of SCHEME, which is that functions are first-class data objects. They may be assigned or bound to variables, returned as values of other functions, placed in arrays, and in general treated as any other data object. Just as numbers have certain operations defined on them, such as addition, so functions have an important operation defined on them, namely invocation. The ability to treat functions as objects is not at all the same as the ability to treat representations of functions as objects. It is the latter ability that is traditionally associated with LISP; functions can be represented as S-expressions. In a version of SCHEME which had no S-expression primitives, 27 however, one could still deal with functions (i.e. closures) as such, for that ability is part of the fundamental environment and control facilities. Conversely, in a SCHEME which does have CONS, CAR, and CDR, there is no defined way to use CONS by itself to construct a function (although a primitive ENCLOSE is now provided which converts an S-expression representation of a function into a function), and the CAR or CDR of a function is in general undefined. The only defined operation on a function is invocation. (Note Operations on Functions) We draw this sharp distinction between environment and control constructs on the one hand and data manipulation primitives on the other because only the former are treated in any depth by RABBIT, whereas much of the knowledge of a real" compiler deals with the latter. A PL/I compiler must have much specific knowledge about numbers, arrays, strings, and so on. We have no new ideas to present here on such issues, and so have avoided this entire area. RABBIT itself knows almost nothing about data manipulation primitives beyond being able to recognize them and pass them along to the output code, which is a small subset of MacLISP. In this way RABBIT can concentrate on the interesting issues of environment and control, and exploit the expert knowledge of data manipulation primitives already built into the MacLISP compiler. 28 6. The Use of Macros An important characteristic of the SCHEME language is that its set of primitive constructs is quite small. This set is not always convenient for expressing programs, however, and so a macro facility is provided for extending the expressive power of the language. A macro is best thought of as a syntax rewrite rule. As a simple example, suppose we have a primitive GCD which takes only two arguments, and we wish to be able to write an invocation of a GCD function with any number of arguments. We might then define (in a "production- rule" style) the conditional rule: (XGCD) => 0 (XGCD x) => x (XGCD x . rest) 0 (GCD x (XGCD . rest)) (Notice the use of LISP dots to refer to the rest of a list.) This is not considered to be a definition of a function XGCD, but a purely syntactic transformation. In principle all such transformations could be performed bpfore executing the program. In fact, RABBIT does exactly this, although the SCHEME interpreter naturally does it incrementally, as each macro call is encountered. Rather than use a separate production-rule/pattern-matching language, in practice SCHEME macros are defined as transformation functions from macro-call expressions to resulting S-expressions, just as they are in MacLISP. (Here, however, we shall continue to use production rules for purposes of exposition.) It is important to note that macros need not be written in the language for which they express rewrite rules; rather, they should be considered an adjunct to the interpreter, and written in the same language as the interpreter (or the compiler). To see this more clearly, consider a version of SCHEME which does not have S-expressions in its data domain. If programs in this language are 29 represented as S-expressions, then the interpreter for that language cannot be written in that language, but in another meta-language which does deal with 5- expressions. Macros, which transform one S-expression (representing a macro call) to another (the replacement form, or the interpretation of the call), clearly should be expressed in this meta-language also. The fact that in most LISP systems the language and the meta-language appear to coincide is a source of both power and confusion. • In the PDP-10 MacLISP implementation of SCHEME, four separate macro mechanisms are used in practice. One is the MacLISP read-macro mechanism [Moon], which performs transformations such as 'FOO => (QUOTE FOO) when an expression is read from a file. The other three are as described earlier, processed by the interpreter or compiler, and differ only in that one kind is recognized by the MacLISP interpreter as well while the other two are used only by SCHEME, and that of the latter two one kind is written in MacLISP and the other kind in SCHEME itself. There is a growing library of SCHEME macros which express a variety of traditional programming constructs in terms of other, more primitive constructs, and eventually in terms of the small set of primitives. A number of these are catalogued in [Imperative] and [Revised Report]. Others were invented in the course of writing RABBIT. We shall give some examples here. The BLOCK macro is similar to the MacLISP PROON; it evaluates all its arguments and returns the value of the last one. One critical characteristic is that the last argument is evaluated "tail-recursively" (I use horror quotes because normally we speak of invocation, not evaluation, as being tail- recursive). An expansion rule is given for this in [Imperative] equivalent to: (BLOCK x) => x (BLOCK x . rest) => ((LAMBDA (DUMMY) (BLOCK . rest)) x) 30 This definition exploits the fact that SCHEME is evaluated in applicative order, and so will evaluate all arguments before applying a function to them. Thus, in the second subrule, x must be evaluated, and then the block of all the rest is. It is then clear from the first subrule that the last argument is evaluated "tail-recursively". One problem with this definition is the occurrence of the variable DUMMY, which must be chosen so as not to conflict with any variable used by the user. This we refer to as the "GENSYM problem", in honor of the traditional LISP function which creates a "fresh" symbol. It would be nicer to write the macro in such a way that no conflict could arise no matter what names were used by the user. There is indeed a way, which ALGOL programmers will recognize as equivalent to the use of "thunks", or call-by-name parameters: (BLOCK x) :> x (BLOCK x . rest) => ((LAMBDA (A B) (B)) x (LAMBDA () (BLOCK . rest))) Consider carefully the meaning of the right-hand side of the second subrule. First the expression x and the (LAMBDA () ...) must be evaluated (it doesn't matter in which order!); the result of the latter is a function (that is, a closure), which is later invoked in order to evaluate the rest of the arguments. There can be no naming conflicts here, because the scope of the variables A and B (which is lexical) does not contain any of the arguments to BLOCK written by the user. (We should note that we have been sloppy in speaking of the "arguments" to BLOCK, when BLOCK is properly speaking not a function at all, but merely a syntactic keyword used to recognize a situation where a syntactic rewriting rule is applicable. We would do better to speak of "argument expressions" or "macro 31 arguments", but we shall continue to be sloppy where no confusion should arise.) This is a technique which should be understood quite thoroughly, since it is the key to writing correct macro rules without any possibility of conflicts between names used by the user and those needed by the macro. As another example, let us consider the AND and OR constructs as used by most LISP systems. OR evaluates its arguments one by one, in order, returning the first non-NIL value obtained (without evaluating any of the following arguments), or NIL if all arguments produce NIL. AND is the dual to this; it returns NIL if any argument does, and otherwise the value of the last argument. A simple-minded approach to • OR would be: (OR) => 'NIL. (OR x . rest) => (IF x x (OR . rest)) There is an objection to this, which is that the code for x is duplicated. Not only does this consume extra space, but it can execute erroneously if x has any side-effects. We must arrange to evaluate x only once, and then test its value: (OR) => 'NIL (OR x . rest) => ((LAMBDA (V) (IF V V (OR . rest))) x) This certainly evaluates x only once, but admits a possible naming conflict between the variable V and any variables used by rest. This is avoided by the same technique used for BLOCK: (OR) => 'NIL (OR x . rest) => ((LAMBDA (V R) (IF V V (R))) x (LAMBDA () (OR . rest))) Similarly, we can express AND as follows: 32 (AND) (AND x) => x (AND x . rest) ~> ((LAMBDA (V R) (IF V (R) 'NIL)) x (LAMBDA () (AND . rest))) (The macro rules are not precise duals because of the non-duality between NIL- ness and non-NIL-ness, and the requirement that a successful AND return the actual value of the last argument and not just T.) (Note Tail-Recursive OR) As yet another example, consider a modification to BLOCK to allow a limited form of assignment statement: if (v := x) appears as a statement in a block, it "assigns" a value to the variable v whose scope is the remainder of the block. Let us assume that such a statement cannot occur as the last statement of a block (it would be useless to have one in that position, as the variable would have a null scope). We can write the rule: (BLOCK x) > x (BLOCK (v :: x) . rest) => ((LAMBDA (v) (BLOCK . rest)) x) (BLOCK x . rest) :> ((LAMBDA (A B) (B)) x (LAMBDA () (BLOCK . rest))) The second subrule states that an "assignment" causes x to be evaluated and then bound to v, and that the variable v is visible to the rest of the block. We may think of :~ as a "sub-macro keyword" which is used to mark an expression as suitable for transformation, but only in the context of a certain larger transformation. This idea is easily extended to allow other constructions, such as "simultaneous assignments" of the form ((van var2 ... yarn) :: value! value2 ... valuen) which first compute all the values and then assign to all the variables, and "exchange assignments" of the form (X :~: Y), as follows: 33 (BLOCK x) => x (BLOCK (v := x) . rest) -> ((LAMBDA (v) (BLOCK . rest)) x) (BLOCK (vars := . values) . rest) -> ((LAMBDA vars (BLOCK . rest)) . values) (BLOCK (x :=: y) . rest) -> ((LAMBDA (x y) (BLOCK . rest)) y x) (BLOCK x . rest) => ((LAMBDA (A B) (B)) x (LAMBDA () (BLOCK . rest))) Let us now consider a rule for the more complicated COND construct: (COND) => 'NIL (COND (x) . rest) ~> (OR x (COND . rest)) (COND (x . r) . rest) => (IF x (BLOCK . r) (COND rest)) This defines the "extended" COND of modern LISP systems, which produces NIL if no clauses succeed, which returns the value of the predicate in the case of a singleton clause, and which allows more than one consequent in a clause. An important point here is that one can write these rules in terms of other macro constructs such as OR and BLOCK; moreover, any extensions to BLOCK, such as the limited assignment feature described above, are automatically inherited by COND. Thus with the above definition one could write (COND ((NUMBERP X) (Y (SQRT X)) (' Y (SQRT Y))) (T (HACK X))) where the scope of the variable Y is the remainder of the first COND clause. SCHEME also provides macros for such constructs as DO and PROG, all of which expand into similar kinds of code using LAMBDA, IF, and LABELS (see below). In particular, PROG permits the use of GO and RETURN in the usual manner. In this manner all the traditional imperative constructs are expressed in an applicative manner. (Note ASET' Is Imperative) None of this is particularly new; theoreticians have modelled imperative 34 constructs in these terms for years. What is new, we think, is the serious proposal that a practical interpreter and compiler can be designed for a language in which such models serve as the sole definitions of these imperative constructs. (Note Dijkstra's Opinion) This approach has both advantages and disadvantages. One advantage is that the base language is small. A simple-minded interpreter or compiler can be written in a few hours. (We have re-implemented the SCHEME interpreter from scratch a dozen times or more to test various representation strategies; this was practical only because of the small size of the language. Similarly, the CHEAPY compiler is fewer than ten pages of code, and could be rewritten in a day or less.) Once the basic interpreter is written, the macro definitions for all the complex constructs can be used without revision. Moreover, the same macro definitions can be used by both interpreter and compiler (or by several versions of interpreter and compiler!). Excepting the very few primitives such as LAMBDA and IF, it is not necessary to "implement a construct twice", once each in interpreter and compiler. Another advantage is that new macros are very easy to write (using facilities provided in SCHEME). One can easily invent a new kind of DO loop, for example, and implement it in SCHEME for both interpreter and all compilers in less than five minutes. (In practice such new control constructs, such as the ITERATE loop described in [Revised Report], are indeed installed within five to ten minutes of conception, in a routine manner.) A third advantage is that the attention of the compiler can be focused on the basic constructs. Rather than having specialized code for two dozen different constructs, the compiler can have much deeper knowledge about each of a few basic constructs. One might object that this "deeper knowledge consists of recognizing the two dozen special cases represented by the separate constructs of 35 the former case. This is true to some extent. It is also true, however, that in the latter case such deep knowledge will carry over to any new constructs which are invented and represented as macros. Among the disadvantages of the macro approach are lack of speed and the discarding of information. Many people have objected that macros are of necessity slower than, say, the FSUBR implementation used by most LISP systems. This is true in many current interpretive implementations, but need not be true of compilers or more cleverly designed interpreters. Moreover, the FSUBR implementation is not general; it is very hard for a user to write a meaningful FSUBR and then describe to the compiler the best way to compile it. The macro approach handles this difficulty automatically. We do not object to the use of the FSUBR mechanism as a special-case "speed hack" to improve the performance of an interpreter, but we insist on recognizing the fact that it is not as generally useful as the macro approach. Another objection relating to speed is that the macros produce convoluted code involving the temporary creation and subsequent invocation of many closures. We feel, first of all, that the macro writer should concern himself more with producing correct code than fast code. Furthermore, convolutedness can be eliminated by a few simple optimization techniques in the compiler, to be discussed below. Finally, function calls need not be as expensive as is popularly supposed. [Steele] Information is discarded by macros in the situation, for example, where a DO macro expands into a large mess that is not obviously a simple loop; later compiler analysis must recover this information. This is indeed a problem. We feel that the compiler is probably better off having to recover the information anyway, since a deep analysis allows it to catch other loops which the user did not use DO to express for one reason or another. Mother is the possibility that 36 DO could leave clues around in the form of declarations if desired. Another difficulty with the discarding of information is the issuing of meaningful diagnostic messages. The user would prefer to see diagnostics mention the originally-written source constructs, rather than the constructs into which the macros expanded. (An example of this problem from another LISP compiler is that it may convert (MEMQ X '(A B C)) into (OR (EQ X 'A) (EQ X 'B) (EQ X 'C)); when by the same rule it converts (MEMQ X '(A)) (a piece of code generated by a macro) into (OR (EQ X 'A)), it later issues a warning that an OR had only one subform.) This problem can be partially circumvented if the responsibility for syntax-checking is placed on the macro definition at each level of expansion. 37 7. The Imperative Treatment of Applicative Constructs Given the characteristics of lexical scoping and tail-recursive invocations, it is possible to assign a peculiarly imperative interpretation to the applicative constructs of SCHEME, which consists primarily of treating a function call as a GOTO. More generally, a function call is a GOTO that can pass one or more items to its target; the special case of passing no arguments is precisely a GOTO. It is never necessary for a function call to save a return address of any kind. It is true that return addresses are generated, but we adopt one of two other points of view, depending on context. One is that the return address, plus any other data needed to carry on the computation after the called function has returned (such as previously computed intermediate values and other return addresses) are considered to be packaged up into an additional argument (the continuation) which is passed to the target. This lends itself to a non-functional interpretation of LAMBDA, and a method of expressing programs called the continuation-passing style (similar to the message-passing actors paradigm [Hewitt]), to be discussed further below. The other view, more intuitive in terms of the traditional stack implementation, is that the return address should be pushed before evaluating arguments rather than before calling a function. This view leads to a more uniform function-calling discipline, and is discussed in [Declarative] and [Steele]. We are led by these points of view to consider a compilation strategy in which function calling is to be considered very cheap (unlike the situation with PL/I and ALGOL, where programmers avoid procedure calls like the plague -- see [Steele] for a discussion of this). In this light the code produced by the sample macros above does not seem inefficient, or even particularly convoluted. Consider the expansion of (OR a b c): 38 ((LAMBDA (V R) (IF V V (R))) a (LAMBDA () ((LAMBDA (V R) (IF V V (R))) b (LAMBDA () ((LAMBDA (V R) (IF V V (R))) c (LAMBDA () 'NIL)))))) Then we might imagine the following (slightly contrived) compilation scenario. First, for expository purposes, we shall rename the variables in order to be able to distinguish them. ((LAMBDA (Vi Ri) (IF VI Vi (RI))) a (LAMBDA () ((LAMBDA (V2 R2) (IF V2 V2 (R2))) b (LAMBDA () ((LAMBDA (V3 R3) (IF V3 V3 (R3))) c (LAMBDA () 'NIL)))))) We shall assign a generated name to each LAMBDA-expression, which we shall notate by writing the name after the word LAMBDA. These names will be used as tags in the output code. ((LAMBDA namel (Vi Ri) (IF Vi VI (Rl))) a (LAMBDA name? () ((LAMBDA name3 (V2 R2) (IF V2 V2 (R2))) b (LAMBDA name4 () ((LAMBDA name5 (V3 R3) (IF V3 V3 (R3))) c (LAMBDA name6 () 'NIL)))))) Next, a simple analysis shows that the variables Rl, R2, and R3 always denote the LAMBDA-expressions named name?, name4, and name6, respectively. Now an optimizer might simply have substituted these values into the bodies of namel, name3, and nameS using the rule of beta-conversion, but we shall not apply that technique here. Instead we shall compile the six functions in a straightforward manner. We make use of the additional fact that all six functions are closed in identical 39 environments (we count two environments as identical if they involve the same variable bindings, regardless of the number of "frames" involved; that is, the environment is the same inside and outside a (LAMBDA () ...)). Assume a simple target machine with argument registers called regi, reg2, etc. main: (code for a> ;result in regi LOAD reg2,[name2] ;[name?] is the closure for name? CALL-FUNCTION 2jnamei] ;call namel with 2 arguments namel: JUMP-IF-NIL regl,namela RETURN ;return the value in regl namela: CALL-FUNCTION 0,regz ;call function in reg?, 0 arguments name?: (code for b> ;result in regi LOAD reg?,[name4] ;[name4] is the closure for name4 CALL-FUNCTION 2,[name3] ;call name3 with 2 arguments name3: JUMP-IF-NIL regl,name3a RETURN ;return the value in regi name3a: CALL-FUNCTION 0,reg2 ;call function in reg2, 0 arguments name4: (code for c> ;result in regl LOAD reg2,[name6] ;[name6] is the closure for nameb CALL-FUNCTION 2,[name5] ;call nameS with 2 arguments nameS: JUMP-IF-NIL regl,name5a RETURN ;return the value in regi nameSa: CALL-FUNCTION 0,reg2 ;call function in reg2, 0 arguments name6: LOAD regl,'NIL ;constant NIL in regi RETURN Now we make use of our knowledge that certain variables always denote certain functions, and convert CALL-FUNCTION of a known function to a simple GOTO. (We have actually done things backwards here; in practice this knowledge is used before generating any code. We have fudged over this issue here, but will return to it later. Our purpose here is merely to demonstrate the treatment of function calls as GOTOs.) main: (code for a> ;result in regi LOAD reg2,[name2] ;[name2] is the closure for name2 GOTO name I 40 namel: JUMP-IF-NIL regl,namela RETURN • ;return the value in regi namela: GOTO name? name?: (code for b> ;result in regl LOAD reg?,[name4] ;[name4] is the closure for name4 GOTO name3 name3: JUMP-IF-NIL regl,name3a RETURN ;return the value in regl name3a: GOTO name4 name4: (code for c> ;result in regi LOAD reg2,[name6] ;[name6] is the closure for name6 GOTO nameS nameS: JUMP-IF-NIL regl,name5a RETURN ;return the value in regi nameSa: GOTO name6 name6: LOAD regl,'NIL ;constant NIL in regi RETURN The construction [foo] indicates the creation of a closure for foo in the current environment. This will actually require additional instructions, but we shall ignore the mechanics of this for now since analysis will remove the need for the construction in this case. The fact that the only references to the variables Ri, RZ, and R3 are function calls can be detected and the unnecessary LOAD instructions eliminated. (Once again, this would actually be determined ahead of time, and no LOAD instructions would be generated in the first place. All of this is determined by a general pre-analysis, rather than a peephole post-pass.) Moreover, a GOTO to a tag which immediately follows the GOTO can be eliminated. 41 main: (code for a> ;result in regi namel: JUMP-IF-NIL regl,namela RETURN ;return the value in regi name Ia: name?: (code for b> ;result in regi name3: JUMP-IF-NIL regl,name3a RETURN ;return the value in regi name3a: name4: (code for c> ;result in regi nameS: JUMP-IF-NIL regl,name5a RETURN ;return the value in regl nameSa: name6: LOAD regl,'NIL ;constant NIL in regi RETURN This code is in fact about what one would expect out of an ordinary LISP compiler. (There is admittedly room for a little more improvement.) RABBIT indeed produces code of essentially this form, by the method of analysis outlined here. Sijni lar considerations hold for the BLOCK macro. Consider the expression (BLOCK a b c); conceptually this should perform a, b, and c sequentially. Let us examine the code produced: ((LAMBDA (A B) (B)) a (LAMBDA () ((LAMBDA (A B) (B)) b (LAMBDA () c)))) Renaming the variables and assigning names to LAMBDA-expressions: ((LAMBDA namel (Al Bi) (Bi)) a (LAMBDA name? () ((LAMBDA name3 (AZ B?) (B?)) b (LAMBDA name4 () c)))) Producing code for the functions: 42 main: (code for a> LOAD reg?,[name2] CALL-FUNCTION 2,[namel] namel: CALL-FUNCTION 0,reg2 name?: (code for b> LOAD reg?,[name4] CALL-FUNCTION 2,[name3] name3: CALL-FUNCTION 0,reg2 name4: (code for c> RETURN Turning general function calls into direct GO's, on the basis of analysis of what variables must refer to constant functions: main: (code for a> LOAD reg?,[name2] GOTO name 1 namel: GOTO name? name?: (code for b> LOAD regZ,[name4] GOTO name3 name3: GOTO name4 name4: (code for c> RETURN Eliminating useless GOTO and LOAD instructions: main: (code fora> name 1: name?: (code forb> name3: name4: (code forc> RETURN What more could one ask for? Notice that this has fallen out of a general strategy involving only an approach to compiling function calls, and has involved no special knowledge of OR 43 or BLOCK not encoded in the macro rules. The cases shown so far are actually special cases of a more general approach, special in that all the conceptual closures involved are closed in the same environment, and called from places that have not disturbed that environment, but only used "registers". In the more general case, the environments of caller and called function will be different. This divides into two subcases, corresponding to whether the closure was created by a simple LAMBDA or by a LABELS construction. The latter involves circular references, and so is somewhat more complicated; but it is easy to show that in the former case the environment of the caller must be that of the (known) called function, possibly with additional values added on. This is a consequence of lexical scoping. As a result, the function call can be compiled as a GOTO preceded by an environment adjustment which consists merely of lopping off some leading portion of the current one (intuitively, one simply "pops the unnecessary crud off the stack"). LABELS-closed functions also can be treated in this way, if one closes all the functions in the same way (which RABBIT presently does, but this is not always desirable). If one does, then it is easy to see the effect of expanding a PROG into a giant LABELS as outlined in [Imperative] and elsewhere: normally, a GOTO to a tag at the same level of PROG will involve no adjustment of environment, and so compile into a simple GOTO instruction, whereas a GOTO to a tag at an outer level of PROG probably will involve adjusting the environment from that of the inner PROG to that of the outer. All of this falls out of the proper imperative treatment of function calls. 44 8. Compilation Strategy The overall approach RABBIT takes to the compilation of SCHEME code may be summarized as follows: (1) Alpha-conversion (renaming of variables) • and macro-expansion (expansion of syntactic rewrite rules). (2) Preliminary analysis (variable references, "trivial" expressions, and side effects). (3) Optimization (meta-evaluation). (4) Conversion to continuation-passing style. (5) Environment and closure analysis. (6) Code generation. During (1) a data structure is built which is structurally a copy of the user program but in which all variables have been renamed and in which at each "node" of the program tree are additional slots for extra information. These slots are filled in during (2). In (3) the topology of the structure may be modified to reflect transformations made to the program; routines from (2) may be called to update the information slots. In (4) a new data structure is contructed from the old one, radically different in structure, but nevertheless also tree-like in form. During (5) information is added to slots in the second structure. In (6) this information is used to produce the final code. 45 A. Alpha-conversion and macro-expansion In this phase a copy of the user program is made. The user program is conceptually a tree structure; each node is one of several kinds of construct (constant, variable, LAMBDA-expression, IF-expression, combination, etc.). Some kinds of nodes have subnodes; for example, a LAMBDA-expression node has a subnode representing the body, and a combination node has a subnode for each argument. The copying is performed in the obvious way by a recursive tree-walk. In the process all bound variables are renamed. Each bound variable is assigned a new generated name at the point of binding, and each node for a reference to a bound variable contains this generated name, not the original name. From this point on all variables are dealt with in terms of their new names. (This is possible because, as a consequence of lexical scoping, we can identify all references to each bound variable.) These new names are represented as atomic symbols, and the property lists of these symbols will later be used to store information about the variables. As each subform of the user program is examined, a check is made for a macro call, which is a list whose car is an atomic symbol with one of several macro-defining properties. When such a call is encountered, the macro call is expanded, and the tree-walk is resumed on the code returned by the expansion process. 46 B. Preliminary analysis The preliminary analysis ("phase 1") is in three passes, each involving a tree-walk of the node structure, filling in information slots at each node. (Two passes would have sufficed, but for reasons of clarity and modularity there is one pass for each type of analysis.) The first pass (ENV-ANALYZE) analyzes variable references. For each node we determine the set of all local (bound) variables referenced at or below that node. For example, for a variable-reference node this set is empty (for a global variable), or the singleton set of the variable itself (for a local variable); for a LAMBDA-expression, it is the set for its body minus the variables bound by that LAMBDA-expression; for an IF-expression, it is the union of the sets for the predicate, consequent, and alternative; and so on. We also compute for each node the set of bound variables which appear in an ASET' at or below the node. (This set will be a subset of the first set, but no non-trivial use of this property is used in this pass.) Finally, for each variable we store several properties on its property list, including a list of all nodes which reference the variable (for "reading") and a list of of all ASET' nodes which modify the variable. These lists are built incrementally, with an entry added as each reference is encountered during the tree walk. (This exemplifies the general strategy for passing data around; any information which cannot be passed conveniently up and down the tree, but which must jump laterally between branches, is accumulated on the property lists of the variables. It may appear to be "lucky" that all such information has to do with variables, but this is actually an extremely deep property of our notation. The entire point of using identifiers is to relate textually separated constructions. We depend on alpha- conversion to give all variables distinct names (by "names" we really mean 47 I' compile-time data structures") so that the information for variables which the user happened to give the same name will not be confused.) The second pass (TRIV-ANALYZE) locates "trivial" portions of the program. (Cf. [Wand and Friedman].) Constants and variables are trivial; an IF- expressions is trivial iff the predicate, consequent, and alternative are all trivial; an ASET' is trivial iff its body is trivial; a combination is trivial iff the function is either a global variable which is the name of a MacLISP primitive, or a LAMBDA-expression whose body is trivial, and the arguments are all trivial. LAMBDA-expressions, LABELS-forms (which contain LAMBDA- expressions), and CATCH-forms are never trivial. The idea is that a trivial expression is one that MacLISP could evaluate itself, without benefit of SCHEME control structures. (No denigration of MacLISP's ability is intended by this terminology!) Note particularly the two special cases of combinations distinguished here (in which the function position contains either the name of a MacLISP primitive or a LAMBDA-expression); they are very important, and shall be referred to respectively as TRIVFN-combinations and LAMBDA-combinations. The third pass (EFFS-ANALYZE) analyzes the possible side-effects caused by each node, and the side-effects which could affect it. It actually produces two sets of analyses, one liberal and one conservative. Where there is any uncertainty as to what side-effects may be involved, it assumes none in one case and all possible in the other. The liberal estimation is used only to issue error messages to the user about possible conflicts which might result as a consequence of the freedom to evaluate arguments to combinations in any order. The user is given the benefit of doubt, and warned only of a "provable" conflict. (Actually, the "proof" is a little sloppy, and can err in both directions, but in practice it has issued no false alarms and a number of helpful warnings.) The conservative estimation is used by the optimizer, which will move expressions 48 only if it can prove that there will be no conflict. Side effects are grouped into classes: ASET, RPLACA and RPLACD (which are considered distinct), FILE (input/output operations), and CONS. These are not intended to be exhaustive; there is also an internal notation for "any side- effect whatever". The use of classes enables the analysis to realize, for example, that RPLACA cannot affect the value of a variable p~t~g~ There is a moderately large body of data in RABBIT about the side-effects of MacLISP primitive functions. For example, CAR, CDR, CAAR, CADR, and so on are known not to have side-effects, and to be respectively affected only by RPLACA, RPLACD, RPLACA, RPLACA or RPLACD, and so on. Similarly, RABBIT knows that ASET' affects the values of variables, but cannot affect the outcome of a CAR operation. (It may affect the value of the expression (CAR X), but only because a variable reference is a subnode of the combination. The effects, or affectability, of a combination are the union of the effects, or affectibility, of all arguments plus those of the function.) The CONS side-effect is a special case. This side— effect cannot affect anything, and two instances of it may be performed in the "wrong" order, but performing a single instance twice will produce distinct (as determined by EQ) and therefore incorrect results. In particular, closures of LAMBDA-expressions involve the CONS side-effect. (The definition of SCHEME says nothing about whether EQ is a valid operation on closures, but in general it is not a good idea to produce unnecessary multiple copies.) On the other hand, LAMBDA-expressions occurring in function position of a LAMBDA-combination do not incur the CONS side-effect. The CONS side-effect is given special treatment in the optimizer. (Note Side-Effect Classifications) 49 C. Optimization Once the preliminary analysis is done, the optimization phase performs certain transformations on the code. The result is an equivalent program which will (probably) compile into more efficient code. This new program is itself structurally a valid SCHEME program; that is, all transformations are contained within the language. The transformations are thus similar to those performed on macro calls, consisting of a syntactic rewriting of an expression, except that the situations where such transformations are applicable are more easily recognized in the case of macro calls. It should be clear that the optimizer and the macro-functions are conceptually at the same level in that they may be written in the same meta-language that operates on representations of SCHEME programs. (Note Non-deterministic Optimization) The ~implest transformation is that a combination whose function position contains a LAMBDA-expression and which has no arguments can be replaced by the body of the LAMBDA-expression: ((LAMBDA () body)) 0 body Another is that, in the case of a LAMBDA-combination, if some parameter of the LAMBDA-expression is not referenced and the corresponding argument can be proved to have no side-effects (with an exception discussed below), then the parameter and argument can be eliminated: ((LAMBDA (xl x2 x3) body) al a2 a3) 0 ((LAMBDA (xi x3) body) al a3) if x? is unreferenced in body and a2 has no side-effects Repeated applications of this rule can lead to the preceding case. A third rule is that, in a LAMBDA-combination, an argument can be 50 substituted for one or more occurrences of a parameter in the body of the LAMBDA- expression. (This rule is related to the view of LAMBDA as a renaming operator discussed in [Declarative], and together with the two preceding rules make up the rule of beta-conversion.) Such a substitution is permissible only if (a) either the parameter is referred to only once or the argument has no side effects, and (b) the substitution will not alter the order in which expressions are evaluated in such a way as to allow possible side-effects to produce different results. Before performing the substitution it is necessary to show that side-effects will not interfere in this manner. This issue is discussed in [Allen and Cocke], [Geschke], and [WuIf], and characterized more accurately in [Standish]. There is also some difficulty if the parameter appears in an ASET'. Presently RABBIT does not attempt any form of substitution for such a parameter. (ASET' is so seldom used in SCHEME programs that this restriction makes very little difference.) This third rule creates an exception to the second. If an argument with a side effect is referred to once, and is substituted for the reference, then the second rule must be invoked to eliminate the original occurrence of the argument, so that the side effect will not occur twice. This requires a little collusion between the two rules. Even if such a substitution is permissible, it is not always desirable; time/space tradeoffs are involved. The current heuristic is that a substitution is desirable if (1) the parameter is referred to only once; or (2) the argument to be substituted in is a constant or variable; or (3) the argument is a LAMBDA- expression whose body is (3a) a constant, or (3b) a variable reference, or (3c) a combination which has no more arguments than the LAMBDA-expression requires and for which the arguments are all constants or variables. This heuristic was designed to be as conservative as possible while handling most cases which arise from typical macro-expansions. 51 The case where the expression substituted for a variable is a LAMBDA- expression constitutes an instance of procedure integration [Allen and Cocke]. The more general kind of procedure integration proposed in [Declarative], which would involve block compilation of several user functions, and possibly also user declarations or data type analysis, has not been implemented yet. It would be possible to substitute a LAMBDA-expression for a variable reference in the case of a variable bound by a LABELS. This might be useful in the case of a LABELS produced by a simple-minded PROG macro, which produced a labelled function for each statement of the PROG; in such a case most labelled functions would be referred to only once. We have not implemented this yet in RABBIT. (Note Loop Unrolling) Currently there is not any attempt to perform the inverse of beta- conversion. This process would be that of locating common subexpressions of some single large expression, making that large expression the body of a LAMBDA- • expression of one parameter, replacing all occurrences of the common subexpression by a reference to the parameter, and replacing the large expression by a combination whose function position contained the LAMBDA-expression and whose argument was a copy of the common subexpression. More generally, several common subexpressions could be isolated at once and made into several parameters of the LAMBDA-expression. For example, consider: (LAMBDA (A B C) (LIST (1 (+ (- B) (SQRT (~ (A B ?) (* 4 A C)))) (* 2 A)) (I (- (- B) (SQRT(- (ABZ) (*4AC)))) (* 2 A)))) Within the large expression (LIST ...) we might detect the common subexpressions (- B), (SQRT ...), and (* ? A). Thus we would invent three parameters Qi, Q2, Q3 and transform the expression into: 52 (LAMBDA (A B C) ((LAMBDA (Qi Q? Q3) (LIST (1 (+ Qi Q?) Q3) (I (- Qi Q?) Q3))) (- B) (SQRT (- (~ B ?) (* 4 A C))) (* ? A))) (There would be no problem of conflicting names as there is for macro rules, because we are operating on code for which all variables have already been renamed; QI, Q?, and Q3 can be chosen as the next variables in the renaming sequence.) This approach doesn't always work if side-effects are present; the abstracted (!) common subexpression may be evaluated too soon, or the wrong number of times. This can be solved by wrapping (LAMBDA () .) around the common subexpression, and replacing references by a combination instead of a simple variable reference. For example: (IF (HAIRYP X) (BLOCK (PRINT 'IHere is some hair:j) (PRINT X) (PRINT 'lEnd of hair.I)) (BLOCK (PRINT 'iThis one is bald:l) (PRINT X) (PRINT 'lEnd of baldness.l))) We could not transform it into this: ((LAMBDA (Ql) (IF (HAIRYP X) (BLOCK (PRINT 'IHere is some hair:I) Qi (PRINT 'lEnd of hair.l)) (BLOCK (PRINT 'IThis one is bald:I) Qi (PRINT 'lEnd of baldness.l)))) (PRINT X)) because x would be printed before the appropriate leading message. Instead, we 53 transform it into: ((LAMBDA (QL) (IF (HAIRYP X) (BLOCK (PRINT 'IHere is some hair:I) (Qi) (PRINT 'lEnd of hair.l)) (BLOCK (PRINT 'IThis one is bald:l) (Qi) (PRINT 'lEnd of baldness,j)))) (LAMBDA () (PRINT X))) This is similar to the call-by-name trick used in writing macro rules. A more general transformation would detect nearly common subexpressions as follows: ((LAMBDA (QI) (IF (HAIRYP X) (Qi 'IHere is some hair:l 'lEnd of hair.I) (Qi 'IThis one is bald:I 'lEnd of baldness.l))) (LAMBDA (Q? Q3) (BLOCK (PRINT Q?) (PRINT X) (PRINT Q3)))) In this way we can express the notion of subroutinization. (Note Subroutinization) We point out these possibilities despite the fact that they have not been implemented in RABBIT because the problem of isolating common subexpressions seems not to have been expressed in quite this way in the literature on compilation strategies. We might speculate that this is because most compilers which use complex optimization strategies have been for ALGOL-like languages which do not treat functions as full-fledged data objects, or even permit the writing of "anonymous" functions in functions calls as LISP does. RABBIT does perform folding on constant expressions (Allen and Cocke]; that is, any combination whose function is a side-effect-less MacLISP primitive 54 and whose arguments are all constants is replaced by the result of applying the primitive to the arguments. There is presently no attempt to do the same thing for side-effect-less SCHEME functions, although this is conceptually no problem. Finally, there are two transformations on IF expressions. One is simply that an IF expression with a constant predicate is simplified to its consequent or alternative (resulting in elimination of dead code). The other was adapted from [Standish], which does not have this precise transformation listed, but gives a more general rule. In its original form this transformation is: (IF (IF a b c) d e) 0 (IF a (IF b d e) (IF c d e)) One problem with this is that the code for d and e is duplicated. This can be avoided by the use of LAMBDA-expressions: ((LAMBDA (QI Q?) (IF a (IF b (QL) (Q2)) (IF c (Qi) (Q2)))) (LAMBDA () d) (LAMBDA () e)) As before, there is no problem of name conflicts with Qi and Q2. While this code may appear unnecessarily complex, the calls to the functions QI and Q2 will typically, as shown above, be compiled as simple GOTO's. As an example, consider the expression: (IF (AND PREDi PRED?) (PRINT 'WIN) (ERROR 'LOSE)) Expansion of the AND macro will result in: (IF ((LAMBDA (V R) (IF V (R) 'NIL)) PREDI (LAMBDA () PRED?)) (PRINT 'WIN) (ERROR 'LOSE)) 55 (For expository clarity we will not bother to rename all the variables, inasmuch as they are already distinct.) Because V and R have only one reference apiece (and there are no possible interfering side-effects), the corresponding arguments can be substituted for them. (IF ((LAMBDA (V R) (IF PREDi ((LAMBDA () PRED?)) 'NIL)) PREDI (LAMBDA () PRED?)) (PRINT 'WIN) (ERROR 'LOSE)) Now, since V and R have no referents at all, they and the corresponding arguments can be eliminated, since the arguments have no side-effects. (IF ((LAMBDA () (IF PREDi ((LAMBDA () PRED?)) 'NIL))) (PRINT 'WIN) (ERROR 'LOSE)) Next, the combination ((LAMBDA () ...)) is eliminated in two places: (IF (IF PREDi PRED? 'NIL) (PRINT 'WIN) (ERROR 'LOSE)) Now, the transformation on the nested IF's: ((LAMBDA (Qi Q?) (IF PREDi (IF PRED? (Qi) (Q?)) (IF 'NIL (Qi) (Q2)))) (LAMBDA () (PRINT 'WIN)) (LAMBDA () (ERROR 'LOSE))) Now one IF has a constant predicate and can be simplified: 56 ((LAMBDA (Qi Q?) (IF PREDI (IF PRED? (Qi) (Q2)) (Q?))) (LAMBDA () (PRINT 'WIN)) (LAMBDA () (ERROR 'LOSE))) The variable QI has only one referent, and so we substitute in, eliminate the variable and argument, and collapse a ((LAMBDA () ..)): ((LAMBDA (Q?) (IF PREDL (IF PRED? (PRINT 'WIN) (Q2)) (Q?))) (LAMBDA () (ERROR 'LOSE))) Recalling that (Q?) is, in effect, a GOTO branching to the common piece of code, and that by virtue of later analysis no actual closure will be created for either LAMBDA-expression, this result is quite reasonable. (Note Evaluation for Control) D. Conversion to Continuation-Passin St le This phase is the real meat of the compilation process. It is of interest primarily in that it transforms a program written in SCHEME into an equivalent program (the continuation-passing-style version, or CPS version), written in a language ~ with the property that interpreting it requires no control stack or other unbounded temporary storage and no decisions as to the order of evaluation of (non-trivial) subexpressions. The importance of these properties cannot be overemphasized. The fact that it is essentially a subset of SCHEME implies that its semantics are as clean, elegant, and well-understood as those of the original language. It is easy to build an 57 interpreter for this subset, given the existence of a SCHEME interpreter, which can execute the transformed program directly at this level. This cannot be said for other traditional intermediate compilation forms; building an interpreter for triples [Gries], for example, would be a tremendous undertaking. The continuation-passing version expresses all temporary intermediate results explicitly as variables appearing in the program text, and all temporary control structure in the form of LAMBDA-expressions (that is, closures). It is explicit in directing the order of operations; there is no non-trivial freedom at any point in the evaluation process. As a result, once the CPS version of a program has been generated, the remainder of the compilation process is fairly easy. There is a reasonably direct correspondence between constructs in the CPS language and "machine- language" operations (if one assumes CONS to be a "machine-language" primitive for augmenting environment structure, which we do). The later passes are complicated only by the desire to handle certain special cases in an optimal manner, most particularly the case of a function call whose function position contains a variable which can be determined to refer to a known LAMBDA- expression. This analysis must be done after the CPS conversion because it applies to continuations as well as LAMBDA-expressions written by the user or generated by macros. The CPS language differs from SCHEME in only two respects. First, each primitive function is different, in that it returns no value; instead, it accepts an additional argument, the continuation, which must be a "function" of one argument, and by definition invokes the continuation tail-recursively, giving it as an argument the computed "value" of the primitive function. We extend this by convention to non-primitive functions, and so all functions are considered to take a continuation as one of its arguments (by convention the first -- this 58 differs from the convention used in the examples in [SCHEME], [Imperative], and [Declarative]). Continuations, however, do not themselves take continuations as arguments. Second, no combination may have a non-trivial argument. In strict continuation-passing style (as described in note (Evalorder) of [Imperative]), this implies that no combination can have another combination as an argument, or an IF-expression with a non-trivial consequent or alternative, etc. We relax this to allow as arguments any trivial form in the sense described above for the preliminary triviality analysis. We note that, in principle, trivial expressions require no unbounded space on the part of the SCHEME interpreter to evaluate, and that the compiler need not worry about control and environment issues for trivial expressions. (Trivial expressions do require unbounded space on the part of the MacLISP run-time system, because the point of the triviality analysis is that trivial expressions can be handled by MacLISP! The question of what should be considered trivial is actually a function of the characteristics of our target machine. We note that, at the least, variables, constants, and LAMBDA- expressions should be considered trivial. That the preliminary triviality analysis does not consider LAMBDA-expressions trivial is a trick so that all closures will be processed by the CPS-conversion process, and the fact that we call it a triviality analysis is a white lie. See, however, [Wand and Friedman].) The effect of the restriction on combinations is startling. On the one hand, they do not so constrain the language as to be useless; on the other hand, they require a radically different approach to the expression of algorithms. It is easy to see that no control stack is necessary to evaluate such code, for, as mentioned in [SCHEME], control stack is used only to keep track of intermediate values and return addresses, and these arise only in the case of combinations 59 with non-trivial arguments, and conditionals with non-trivial predicates. An algorithm for converting SCHEME programs to continuation-passing style was given in Appendix A of [Declarative]. (Note Old CPS Algorithm) The one used in RABBIT is almost identical, except that for the convenience of the code- generation phase a distinction is made between ordinary LAMBDA-expressions and continuations, and between combinations used to invoke "functions" and those used to invoke continuations. These sets can in fact be consistently distinguished, and it affords a certain amount of error-checking; for example, a LAMBDA- expression should never appear in the "function" position of a continuation- invoking combination. Another fine point is that ASET' can never be applied to a variable bound by a continuation. Except for such differences arising from their uses, the two sets of constructs are treated more or less identically in later phases. An additional difference between the algorithm in [Declarative] and the one in RABBIT is that trivial subforms are treated as single nodes in the CPS version; these nodes have pointers to the non-CPS versions of the relevant code, which are largely ignored by later processing until the final code is to be generated. It must be emphasized that there is not necessarily a unique CPS version for a given SCHEME program; there is as much freedom as there is in the original program to re-order the evaluation of subexpressions. In the course of the conversion process decisions must be made as to what order to use in evaluating arguments to combinations. The current decision procedure is fairly simple- minded, consisting mostly of not making copies of constants and the values of variables. The point here, as earlier, is not so much that RABBIT has a much better algorithm than other compilers as that it has a far cleaner way of expressing the result. (For a complex decision procedure for argument ordering, see [Wulf].) (Note Non-deterministic CPS Conversion) 60 E. Environment and closure analysis This phase consists of four passes over the CPS version of the program. As with the earlier preliminary analysis, each pass determines one related set of information and attaches this information to nodes of the program tree and to property lists. The first pass (CENV-ANALYZE) analyzes variable references for the CPS version in a manner similar to that of the first pass of the preliminary analysis. The results of this previous analysis are used here in the case of trivial expressions; with this exception the analysis is redone completely, because additional variables are introduced by the CPS conversion. (None of these new variables can appear in an ASET', however, and so the analysis of written variables need not be done over.) In addition, for each variable reference which does not occur in the function position of a combination, we mark that variable with a non-nil VARIABLE-REFP property, used later to determine whether closures need to be created for known functions. The second pass (BIND-ANALYZE) determines for each LAMBDA-expression whether a closure will be needed for it at run time. There are three possibilities: (1) If the function denoted by the LAMBDA-expression is bound to some variable, and that variable is referenced other than in function position, then the closure is being treated as data, and must be a full (standard CBETA format) closure. If the function itself occurs in non-function position other than in a LAMBDA-combination, it must be fully closed. (2) If the closure is bound to some variable, and that variable is referenced