User control of garbage collection


Protecting array storage from garbage collection

Multithreaded Lisps that allow passing a Lisp array directly (without copy in and out) into a C function need some means to prevent the GC from moving the Lisp array around in memory during the C function call. There are at least three ways to accomplish this. Some Lisps (e.g. ACL) have extra MAKE-ARRAY parameters that specify that an array should not be moved by the GC (is "static"). With a multithreaded Lisp, this would (in theory at least) free other threads to execute garbage collection during the execution of the foreign function. Other Lisps (e.g. SCL) can pin a specific array during the body of a macro; SCL's strategy in particular allows other threads to execute garbage collection while the current thread is executing the foreign function. Finally, some Lisps (e.g. SBCL) have user functions to pause all garbage collection during the execution of a form. The basic abstraction in that case is a WITH-GC-OFF form:

  (WITH-GC-OFF
    (c-function lisp-array ...))
or GC-ON and GC-OFF functions (which can be used to implement a WITH-GC-OFF macro).

The absence of any of these features limits the implementation of our matrix library.

Finalization

Finalization is another important feature that determines how we store data in our matrix library. "Finalization" refers to a user-level hook into the GC that lets us call a function when an object is garbage-collected. This is very useful if the Lisp object has a pointer to some external resource (e.g. memory that was dynamically allocated in C) that should be freed.

Finalization on exit

Some Lisps, including SBCL, don't call finalizations on exit of the Lisp system. This isn't important for matrix storage because the OS will reclaim heap space on exit of the Lisp process. Where it does become important is if we use handles from external libraries that save state on disk. We need to finalize those handles on exit.

SBCL in particular has hooks so that you can call a function on exit of the Lisp system. This can be used to finalize named handles.

How to do finalization

The CFFI source is a good way to learn how to do finalization. Actually you can use CFFI to do finalization, in Lisps that support it. We used a CFFI source tarball (the first with ECL support), version 061012, to help us figure out which Lisps support finalization. The GCL port of CFFI is not complete; read the info manual (internal.texi) to learn more about the GCL FFI. You won't find a GCL manual online; you'll have to download the GCL package and build the info manual, or examine the .texi files by hand (as of 15 Oct 2006).

The presence of finalization allows us to use C arrays (allocated on the heap) to represent vector and matrix data, as long as the C arrays are wrapped in a Lisp object. A finalization can be set on the Lisp object so that when Lisp GC's it, free (or whatever C or Lisp function you want) is called on the C array.

Alternatives to finalization

There are other techniques besides finalization for freeing a C pointer automatically upon garbage collection of a wrapper Lisp object. In particular, Bruno Haible has written an essay that describes so-called "weak" data structures, such as weak pointers and weak hash tables. For example, we could keep a hash table that keys from weak pointer to a vector object, to the C pointer of the associated array. Periodically, we could search the keys of the hash table for weak pointers that have been broken; if they have, then free the associated C array. For example, we could keep track of the amount of C memory used every time we allocate an array. If the C memory usage passes a certain threshold, we can initiate a "collection." This introduces some overhead at object allocation time, which may be significant if we have to collect a large number of objects.

Note however that the classic technique of reference counting won't work with CLOS, because CLOS doesn't have destructors (as in C++ or Java) or destroy-instance methods. (The classic koan about Moon's warning to the student doesn't seem to apply here because there are no circular references: any view has reference only to the underlying array, and the array itself doesn't have reference to any other arrays. However, it is important that the views have a reference to the original object (not just the array) so that the object doesn't fall out of scope and cause the finalization of the object to deallocate the array.)

Feature table

Lisp implementation Static arrays or array pinning? WITH-GC-OFF? Finalizations? Weak pointers / hash tables?
ACL Static No (6) Yes Yes
Clisp (4) No No Yes Yes
CMUCL No Yes (10) Yes Yes
Corman No? (8) No? Yes Yes
ECL No? (3) No Yes No
GCL Static Yes ??? (5) No
LispWorks Static Not really (7) Yes Yes
OpenMCL No Yes (1) Yes Yes
SBCL No Yes (2) Yes Yes
SCL Pinned (9) No? Yes ?
Notes:

  1. OpenMCL: see the OpenMCL documentation. In particular, (EGC T) enables the GC and (ECG nil) disables it, though this is a moot point because OpenMCL probably boxes Lisp arrays.
  2. SBCL: see the SBCL manual (actually it's not there, but SB-EXT:GC-OFF and SB-EXT:GC-ON are the functions you want). The Matlisp developers have successfully exploited this feature for correct implementations of calling BLAS and LAPACK functions with Lisp arrays.
  3. ECL: as of around 14 Oct 2006, finalization is supported, but weak pointers are not, according to Juanjo. The old info manual says that MAKE-ARRAY takes an extra keyword argument :STATIC. If it is given a non-null value, then the array is static. I'm checking with Juanjo to see if this is still the case.
  4. Clisp: I don't think either feature is supported. See the developer notes on garbage collection.
  5. As of GCL <= 2.6.7, you can turn off SGC (stratified garbage collection) by calling (SI:SGC-ON nil) and turn it back on by calling (SI:SGC-ON T). GCL has a C function interface but no CFFI port yet; you can learn more about the interface in the Info documentation. The syntax is significantly different from that of CFFI. For example, DEFCFUN requires an entire C function header, rather than just the name of the function. Furthermore, the example in the info manual suggests that it's only meant to work with GCL-defined C objects. DEFENTRY seems to support more general C functions and could presumably help us. The internal.texi info file also gives an example of calling MAKE-ARRAY with :STATIC T. Note that the finalization interface isn't supported in CFFI; I don't know if GCL supports finalization.
  6. ACL 8.0: see the implementation notes. The :STATIC-RECLAIMABLE keyword is what you want. It's supported for all four datatypes (real and complex double and single float) that LAPACK 3.0 supports. If we move to arbitrary-precision floating-point, we'll want to use :LISPSTATIC-RECLAIMABLE for those arrays, since they can hold objects of any type (are they boxed?). The ACL 8.0 documentation specifically says that there is no way to stop scavenges from occurring, even temporarily.
  7. LispWorks: see the Reference Manual, which specifically says that if you want to pass a Lisp array into a C function, you should allocate the Lisp array with :ALLOCATION :STATIC.
  8. 32-bit LispWorks (and NOT 64-bit LispWorks) has an AVOID-GC function (use NORMAL-GC afterwards to restore things) but it isn't guaranteed to prevent a GC.
  9. Corman Lisp 3.0: see the manual. There are weak pointers (e.g. MAKE-WEAK-POINTER function creates a weak pointer to an object, and WEAK-POINTER-OBJ gets the object to which it points). There is a way to suspend the GC but it's only meant to work for a few instructions.
  10. Scieneer Common Lisp: see this post on comp.lang.lisp, in particular the response by Douglas Crosher on 22 Oct 2006. Mr. Crosher says the following:

    The recommended approach is to use the 'ext:with-pinned-object macro which pins the address of a lisp object within the context of the macro, and by default also inhibits write protection of pages occupied by the object. This allows other threads to execute garbage collection while the current thread is executing the foreign function. For example:

    (defun example (array)
       (ext:with-pinned-object (array)
         (let ((sap (sys:vector-sap array)))
            ... foreign call that may read or write directly to the lisp array ...
            )))
      
    If the foreign function does not write to the lisp object then the following faster form may be used:
       (ext:with-pinned-object (array :inhibit-write-protection nil) ...)
      

  11. CMUCL 19c: updates 16 Oct 2006 thanks to Raymond Toy. EXT:GC-OFF and EXT:GC-ON work, but SYS:WITHOUT-GCING is the preferred idiom. No static arrays. Weak pointers are supported but weak hash table support is currently (16 Oct 2006) experimental. Finalizers are implemented with weak pointers to the object you want to finalize.


Last modified: Mon Oct 23 17:27:33 PDT 2006