On this page:
1.12.1 Writing your Garbage Collectors
1.12.2 Testing your Garbage Collectors
1.12.3 Sample Code
1.12.3.1 Trivial Collector
1.12.4 Notes

1.12 Garbage Collectors

Complete this assignment with Team Two.

In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap.

1.12.1 Writing your Garbage Collectors

You must complete this assignment using a specific language. Choose Determine language from source in the DrRacket menu, then write
as the first line of your program, overwritting the default #lang at the top of the file.

This language defines an interface to a program’s stack and heap that you will use to implement garbage collection.

If you look at the documentation for plai/collector, it gives you a list of the functions that are available to you as a collector author (Garbage Collector Interface) and a list of the functions that you must write (Garbage Collector Exports). It is essential that you understand what these functions do, since it is the entirety of the assignment.

The most misunderstood pieces of the interface are get-root-set and procedure-roots.

get-root-set is a macro that takes any number of identifiers (including zero) and gives you the list of roots. If you give it identifiers, then it includes them on the list as roots (even though they may not actually be). Sometimes you have to fake roots in case a live object is only reachable from arguments passed to a collector function. For example, you should pass the first and rest addresses to get-root-set in your implementation of gc:cons. (This is not the only example!)

procedure-roots is a function that takes a closure and returns its environment as a list of roots. Since your mutators can contain closures, it is essential for you to call this function to get the environment so you can trace and copy it correctly.

1.12.2 Testing your Garbage Collectors

You must complete this assignment using a specific language. Choose Determine language from source in the DrRacket menu, then write
as the first line of your program, overwritting the default #lang at the top of the file.

This language is a subset of Racket that uses a garbage collector that you specify. The first line of a test program must be:

(allocator-setup "collector.rkt" heap-size)

where "collector.rkt" must be the name of your collector’s file. heap-size is the size of the heap your collector will use.

The remainder of the program is in a subset of Racket with numbers, symbols, lists, etc. The primitives of the language map directly to the procedures you define in your garbage collector.

Make sure you read the documentation for plai/mutator

1.12.3 Sample Code

To get you started, we’ve provided a trivial collector that signals an error when the heap fills up.

1.12.3.1 Trivial Collector

"collector.rkt":

#lang plai/collector
(define heap-ptr 'uninitialized-heap-ptr)
 
(define (init-allocator)
 (set! heap-ptr 0))
 
(define (gc:alloc-flat p)
 (begin
   (when (> (+ heap-ptr 2) (heap-size))
     (error 'gc:alloc-flat "out of memory"))
   (heap-set! heap-ptr 'prim)
   (heap-set! (+ 1 heap-ptr) p)
   (set! heap-ptr (+ 2 heap-ptr))
 
   (- heap-ptr 2)))
 
(define (gc:cons f r)
 (begin
   (when (> (+ heap-ptr 3) (heap-size))
     (error 'gc:cons "out of memory"))
   (heap-set! heap-ptr 'cons)
   (heap-set! (+ 1 heap-ptr) f)
   (heap-set! (+ 2 heap-ptr) r)
   (set! heap-ptr (+ 3 heap-ptr))
   (- heap-ptr 3)))
 
(define (gc:cons? a)
 (eq? (heap-ref a) 'cons))
 
(define (gc:first a)
 (heap-ref (+ 1 a)))
 
(define (gc:rest a)
 (heap-ref (+ 2 a)))
 
(define (gc:set-first! a f)
 (if (gc:cons? a)
     (heap-set! (+ 1 a) f)
     (error 'set-first! "expects address of cons")))
 
(define (gc:set-rest! a r)
 (heap-set! (+ 2 a) r))
 
(define (gc:flat? a)
 (eq? (heap-ref a) 'prim))
 
(define (gc:deref a)
 (heap-ref (+ 1 a)))

This is a very bad collector. In particular, it is unsafe, because it doesn’t verify that, for example, gc:set-rest! is only called on a cons. If you use this collector without fixing these problems, you will get a zero. The errors are there to force you to understand this program, rather than just copying it blindly

1.12.4 Notes

You must use a free list for mark & sweep. Do not simply scan memory looking for free blocks. (How would that work for 8 GB of RAM?)

You must store bookkeeping data on the heap provided by GC Collector Racket. You may store 2-3 atomic values, such as addresses into the heap as variables in your garbage collector. We will assume they represent machine registers. However, all compound data structures must be on the heap.

In particular, in the mark & sweep collector, you should maintain the free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!

Some final words of advice:
  • You will want to test your program with small heap sizes, so that the garbage collector runs frequently. Try running it with various sizes so that the collector kicks in at different points in your code in order test it under various conditions.

  • You do not need to compact memory in mark-and-sweep.

  • You may find it convenient to use the Racket construct set!, which allows you to mutate a variable without using boxes. We recommend you use this feature only in one instance: when swapping the semispaces in the stop & copy collector. This will save you the trouble of unboxing the heap every time you use it.