NOTE The implementation of eval_expr and the design of the stack in this chapter are rather ad-hoc, and I'm not particularly proud of them. Please skip to the next chapter if they offend you.

Continuations and tail recursion

Our eval_expr function has been implemented recursively — that is to say, when in the course of evaluating an expression it is necessary to evaluate a sub-expression, eval_expr calls itself to obtain the result.

This works fairly well, and is easy to follow, but the depth of recursion in our LISP environment is limited by the stack size of the interpreter. LISP code traditionally makes heavy use of recursion, and we would like to support this up to the limit of available memory.

Take the following pathological example:

(define (count n)
  (if (= n 0)
      0
      (+ 1 (count (- n 1)))))

The COUNT function will recurse to depth n and return the sum of n ones. Expressions such as (COUNT 10) should compute OK with our current interpreter, but even (COUNT 10000) is enough to cause a stack overflow on my machine.

To achieve this we will rewrite eval_expr as a loop, with helper functions to keep track of evaluations in progress and return the next expression to be evaluated. When there are no more expressions left, eval_expr can return the final result to the caller.

As eval_expr works through the tree of expressions, we will keep track of arguments evaluated and pending evaluation in a series of frames, linked together to form a stack. This is broadly the same way that the compiled version of the recursive eval_expr works; in this case we are replacing the machine code stack with a LISP data structure and manipulating it explicitly.

The stack can also be thought of as representing the future of the computation once the present expression has been evaluated. In this sense it is referred to as the current continuation.

Since any function which is called by eval_expr may not call eval_expr (to avoid recursion), we must integrate apply and builtin_apply into the body of eval_expr.

Implementation

A stack frame has the following form.

(parent env evaluated-op (pending-arg...) (evaluated-arg...) (body...))

parent is the stack frame corresponding to the parent expression (that is, the one which is waiting for the result of the current expression). env is the current environment, evaluated-op is the evaluated operator, and pending-arg... and evaluated-arg are the arguments pending and following evaluation respectively. body... are the expressions in the function body which are pending execution.

Rather than writing out long lists of car() and cdr(), we will define some helper functions to manipulate members of a list.

Atom list_get(Atom list, int k)
{
	while (k--)
		list = cdr(list);
	return car(list);
}

void list_set(Atom list, int k, Atom value)
{
	while (k--)
		list = cdr(list);
	car(list) = value;
}

void list_reverse(Atom *list)
{
	Atom tail = nil;
	while (!nilp(*list)) {
		Atom p = cdr(*list);
		cdr(*list) = tail;
		tail = *list;
		*list = p;
	}
	*list = tail;
}

Another function creates a new stack frame ready to start evaluating a new function call, with the specified parent, environment and list of arguments pending evaluation (the tail).

Atom make_frame(Atom parent, Atom env, Atom tail)
{
	return cons(parent,
		cons(env,
		cons(nil, /* op */
		cons(tail,
		cons(nil, /* args */
		cons(nil, /* body */
		nil))))));
}

Here is the innermost part of our new exec_expr, which sets expr to the next part of the function body, and pops the stack when we have reached end of the body.

int eval_do_exec(Atom *stack, Atom *expr, Atom *env)
{
	Atom body;

	*env = list_get(*stack, 1);
	body = list_get(*stack, 5);
	*expr = car(body);
	body = cdr(body);
	if (nilp(body)) {
		/* Finished function; pop the stack */
		*stack = car(*stack);
	} else {
		list_set(*stack, 5, body);
	}

	return Error_OK;
}

This helper binds the function arguments into a new environment if they have not already been bound, then calls eval_do_exec to get the next expression in the body.

int eval_do_bind(Atom *stack, Atom *expr, Atom *env)
{
	Atom op, args, arg_names, body;

	body = list_get(*stack, 5);
	if (!nilp(body))
		return eval_do_exec(stack, expr, env);

	op = list_get(*stack, 2);
	args = list_get(*stack, 4);

	*env = env_create(car(op));
	arg_names = car(cdr(op));
	body = cdr(cdr(op));
	list_set(*stack, 1, *env);
	list_set(*stack, 5, body);

	/* Bind the arguments */
	while (!nilp(arg_names)) {
		if (arg_names.type == AtomType_Symbol) {
			env_set(*env, arg_names, args);
			args = nil;
			break;
		}

		if (nilp(args))
			return Error_Args;
		env_set(*env, car(arg_names), car(args));
		arg_names = cdr(arg_names);
		args = cdr(args);
	}
	if (!nilp(args))
		return Error_Args;

	list_set(*stack, 4, nil);

	return eval_do_exec(stack, expr, env);
}

The next function is called once all arguments have been evaluated, and is responsible either generating an expression to call a builtin, or delegating to eval_do_bind.

int eval_do_apply(Atom *stack, Atom *expr, Atom *env, Atom *result)
{
	Atom op, args;

	op = list_get(*stack, 2);
	args = list_get(*stack, 4);

	if (!nilp(args)) {
		list_reverse(&args);
		list_set(*stack, 4, args);
	}

	if (op.type == AtomType_Symbol) {
		if (strcmp(op.value.symbol, "APPLY") == 0) {
			/* Replace the current frame */
			*stack = car(*stack);
			*stack = make_frame(*stack, *env, nil);
			op = car(args);
			args = car(cdr(args));
			if (!listp(args))
				return Error_Syntax;

			list_set(*stack, 2, op);
			list_set(*stack, 4, args);
		}
	}

	if (op.type == AtomType_Builtin) {
		*stack = car(*stack);
		*expr = cons(op, args);
		return Error_OK;
	} else if (op.type != AtomType_Closure) {
		return Error_Type;
	}

	return eval_do_bind(stack, expr, env);
}

This part is called once an expression has been evaluated, and is responsible for storing the result, which is either an operator, an argument, or an intermediate body expression, and fetching the next expression to evaluate.

int eval_do_return(Atom *stack, Atom *expr, Atom *env, Atom *result)
{
	Atom op, args, body;

	*env = list_get(*stack, 1);
	op = list_get(*stack, 2);
	body = list_get(*stack, 5);

	if (!nilp(body)) {
		/* Still running a procedure; ignore the result */
		return eval_do_apply(stack, expr, env, result);
	}

	if (nilp(op)) {
		/* Finished evaluating operator */
		op = *result;
		list_set(*stack, 2, op);

		if (op.type == AtomType_Macro) {
			/* Don't evaluate macro arguments */
			args = list_get(*stack, 3);
			*stack = make_frame(*stack, *env, nil);
			op.type = AtomType_Closure;
			list_set(*stack, 2, op);
			list_set(*stack, 4, args);
			return eval_do_bind(stack, expr, env);
		}
	} else if (op.type == AtomType_Symbol) {
		/* Finished working on special form */
		if (strcmp(op.value.symbol, "DEFINE") == 0) {
			Atom sym = list_get(*stack, 4);
			(void) env_set(*env, sym, *result);
			*stack = car(*stack);
			*expr = cons(make_sym("QUOTE"), cons(sym, nil));
			return Error_OK;
		} else if (strcmp(op.value.symbol, "IF") == 0) {
			args = list_get(*stack, 3);
			*expr = nilp(*result) ? car(cdr(args)) : car(args);
			*stack = car(*stack);
			return Error_OK;
		} else {
			goto store_arg;
		}
	} else if (op.type == AtomType_Macro) {
		/* Finished evaluating macro */
		*expr = *result;
		*stack = car(*stack);
		return Error_OK;
	} else {
	store_arg:
		/* Store evaluated argument */
		args = list_get(*stack, 4);
		list_set(*stack, 4, cons(*result, args));
	}

	args = list_get(*stack, 3);
	if (nilp(args)) {
		/* No more arguments left to evaluate */
		return eval_do_apply(stack, expr, env, result);
	}

	/* Evaluate next argument */
	*expr = car(args);
	list_set(*stack, 3, cdr(args));
	return Error_OK;
}

And here we are at last with the new eval_expr. There is a lot of code for setting up special forms, but the rest is simply a loop waiting for the stack to clear.

int eval_expr(Atom expr, Atom env, Atom *result)
{
	Error err = Error_OK;
	Atom stack = nil;

	do {
		if (expr.type == AtomType_Symbol) {
			err = env_get(env, expr, result);
		} else if (expr.type != AtomType_Pair) {
			*result = expr;
		} else if (!listp(expr)) {
			return Error_Syntax;
		} else {
			Atom op = car(expr);
			Atom args = cdr(expr);

			if (op.type == AtomType_Symbol) {
				/* Handle special forms */

				if (strcmp(op.value.symbol, "QUOTE") == 0) {
					if (nilp(args) || !nilp(cdr(args)))
						return Error_Args;

					*result = car(args);
				} else if (strcmp(op.value.symbol, "DEFINE") == 0) {
					Atom sym;

					if (nilp(args) || nilp(cdr(args)))
						return Error_Args;

					sym = car(args);
					if (sym.type == AtomType_Pair) {
						err = make_closure(env, cdr(sym), cdr(args), result);
						sym = car(sym);
						if (sym.type != AtomType_Symbol)
							return Error_Type;
						(void) env_set(env, sym, *result);
						*result = sym;
					} else if (sym.type == AtomType_Symbol) {
						if (!nilp(cdr(cdr(args))))
							return Error_Args;
						stack = make_frame(stack, env, nil);
						list_set(stack, 2, op);
						list_set(stack, 4, sym);
						expr = car(cdr(args));
						continue;
					} else {
						return Error_Type;
					}
				} else if (strcmp(op.value.symbol, "LAMBDA") == 0) {
					if (nilp(args) || nilp(cdr(args)))
						return Error_Args;

					err = make_closure(env, car(args), cdr(args), result);
				} else if (strcmp(op.value.symbol, "IF") == 0) {
					if (nilp(args) || nilp(cdr(args)) || nilp(cdr(cdr(args)))
							|| !nilp(cdr(cdr(cdr(args)))))
						return Error_Args;

					stack = make_frame(stack, env, cdr(args));
					list_set(stack, 2, op);
					expr = car(args);
					continue;
				} else if (strcmp(op.value.symbol, "DEFMACRO") == 0) {
					Atom name, macro;

					if (nilp(args) || nilp(cdr(args)))
						return Error_Args;

					if (car(args).type != AtomType_Pair)
						return Error_Syntax;

					name = car(car(args));
					if (name.type != AtomType_Symbol)
						return Error_Type;

					err = make_closure(env, cdr(car(args)),
						cdr(args), &macro);
					if (!err) {
						macro.type = AtomType_Macro;
						*result = name;
						(void) env_set(env, name, macro);
					}
				} else if (strcmp(op.value.symbol, "APPLY") == 0) {
					if (nilp(args) || nilp(cdr(args)) || !nilp(cdr(cdr(args))))
						return Error_Args;

					stack = make_frame(stack, env, cdr(args));
					list_set(stack, 2, op);
					expr = car(args);
					continue;
				} else {
					goto push;
				}
			} else if (op.type == AtomType_Builtin) {
				err = (*op.value.builtin)(args, result);
			} else {
			push:
				/* Handle function application */
				stack = make_frame(stack, env, args);
				expr = op;
				continue;
			}
		}

		if (nilp(stack))
			break;

		if (!err)
			err = eval_do_return(&stack, &expr, &env, result);
	} while (!err);

	return err;
}

Testing

Let's try our COUNT function again.

> (count 100000)
100000

Hooray! We can now recurse as much as we like without causing a stack overflow. If you have a lot of RAM, you should even be able to do a million levels deep.

Tail recursion

If the last expression in a function is a call to another function, then the result can be returned directly to the first function's caller. This is known as a tail call. If the called function, through a series of tail calls, causes the first function to be called, we have tail recursion.

Tail calls do not require the caller's stack frame to be retained, so a tail-recursive function can recurse as many levels as necessary without increasing the stack depth.

The count function could be formulated as a tail-recursive procedure as follows:

(define (count n a)
  (if (= n 0)
      a
      (count (- n 1) (+ a 1))))

(count 100000 0)

If you watch eval_expr with a debugger you can confirm that the stack never grows above a few levels deep.

All that is left to do is clean up all the temporary objects created by our new evaluator.