by Manfred von Thun
This short paper contains the design of a Joy interpreter written in Joy itself.
Let L1 and L2 be two languages. An interpreter for language L1,
written in language L2 is a program P which takes as a parameter any
program written in L1 and executes it in the same way as a processor for
L1 would have. If the two languages are the same language L, such an
interpreter is called metacircular. Languages in which
program = data make it particularly easy to write an
interpreter in its own language. The best known example is the
Lisp interpreter eval
. A metacircular interpreter
for Joy can also be written.
The remainder of this paper assumes some familiarity with Joy.
A Joy interpreter in Joy is a program which expects a quoted program
on top of the stack and executes it. The following is the design of Joy
interpreter joy written in Joy itself. The first version of the
interpreter is merely a reminder that Joy already has a combinator, the
i
combinator, which removes a program from the top of the
stack and executes it:
joy == i
The next version makes explicit the fact that Joy programs are
sequences which are executed by stepping through the members of the
sequence. For each member in the sequence an appropriate action is
taken. The step
combinator will indiscriminately leave
literals, operators and combinators on top of the stack. But any
operators and combinators on top of the stack cannot actually
be executed. However, their unitlist
can be executed by the
i
combinator. So this is the next version of the
interpreter:
joy ==
[ unitlist
i ]
step
The last interpreter does not actually specify what is to be done what a particular element of the sequence has appeared on top of the stack. A better one should say things like this for the operators:
If the top element is "+"
then pop off the "+" and add the two numbers below
If the top element is "rest"
then pop off the "rest" and take the rest of the aggregate below
For the literals it is even simpler:
If the top element is any number
then leave it there and do not do anything
If the top number is any list
then leave it there and do not do anything
Essentially we need a way of examining what is on top of the stack
and executing the appropriate action. So in the previous version of the
interpreter we must replace the unitlist
by something more
specific. It has to consist of several cases, for the various literals,
operators and combinators. The opcase
operator is suitable
for just that. It expects any kind of item on top of the stack, and
above that a list of cases. Each case is a list consisting of a test
item and a (possibly) empty rest. The effect of the opcase
operator is to remove the top item and the list of cases, and to leave
behind the rest of case for which the item matched the test item. The
last case in the list is the default, it does not have a test item. The
default is returned if there was no match. For the present design step
the default does nothing. The following is what has to replace the
unitlist
in the interpreter:
[ [ 0 ] (* sample number *)
[ [] ] (* sample list *)
[ + pop + ] (* addition operator *)
[ rest pop rest ] (* rest operator *)
[ ] ] (* default, do nothing *)
opcase
It is an easy matter to add the other cases for literals. They have to be treated just like numbers and list:
[ 'A ] (* sample character *)
[ true ] (* sample truth value *)
[ "" ] (* sample string *)
[ {} ] (* sample set *)
Similarly, other operators have to be added such as:
[ swap pop swap ] (* swap operator *)
[ cons pop cons ] (* cons operator *)
For the combinators it is tempting to treat them just like operators:
[ dip pop dip ] (* dip combinator - WRONG *)
[ map pop map ] (* map combinator - WRONG *)
This will work correctly, but it just uses the Joy system
inside the quoted program that is being called by
i
or map
. Instead it should use the Joy-in-Joy
interpreter that we are writing. To achieve that effect, the program
parameter [P]
for the combinator has to be replaced by
[[P] joy]
. For i
, dip
and
map
and other unary combinators this is quite
easy: after the pop
execute [joy] cons
. This
gives cases like the following:
[ i pop [joy] cons i ] (* i combinator *)
[ dip pop [joy] cons dip ] (* dip combinator *)
[ map pop [joy] cons map ] (* map combinator *)
The case for the i
combinator is unnecessarily
inefficient, it could be optimised to:
[ i pop joy ] (* i combinator *)
However, for uniformity this optimisation will not be used here.
So that we do not lose track, here is the interpreter as designed so far:
joy == (* literals *)
[ [ [ 0 ]
[ [] ]
[ true ]
[ 'A ]
[ "" ]
[ {} ]
(* operators *)
[ + pop + ]
[ rest pop rest ]
[ dup pop dup ]
[ swap pop swap ]
[ pop pop pop ]
[ - pop - ]
[ and pop and ]
[ cons pop cons ]
(* unary combinators *)
[ i pop [joy] cons i ]
[ dip pop [joy] cons dip ]
[ map pop [joy] cons map ]
[ filter pop [joy] cons filter ]
[ ] ] (* provisional default *)
opcase
i ]
step
The interpreter is getting close to its final shape now, but several
things need to be fixed. Obviously the binary combinators have
to be treated in a way that is similar to the unary ones: The two
program parameters [P]
and [Q]
have to be
replaced by [[P] joy]
and [[Q] joy]
. This is
best done by using the app2unary2
combinator as
follows:
[[joy] cons] unary2 (* app2 *)
So for the binary combinators the cases look like this:
(* binary combinators *)
[ branch pop [[joy] cons] unary2 branch ] (* app2 *)
[ cleave pop [[joy] cons] unary2 cleave ] (* app2 *)
For the ternary combinators and quaternary
combinators the pattern is much the same, there are now three or
four program parameters that need to be modified. This is easily done
using the app3unary3
and
app4unary4
combinators to effect the
modification:
(* ternary combinators *)
[ ifte pop [[joy] cons] unary3 ifte ] (* app3 *)
(* quaternary combinators *)
[ linrec pop [[joy] cons] unary4 linrec ] (* app4 *)
[ binrec pop [[joy] cons] unary4 binrec ] (* app4 *)
There are still two major amendments needed for the interpreter. The
first concerns user defined symbols as they might occur in the standard
library, the personal library or in the preamble to a particular run.
All Joy symbols have an internal tag, and the tags differ individually
only for the operators and combinators. However, all numbers
have the same internal tag, all characters have the same tag,
all lists have the same tag and so on. Similarly all
defined symbols have the same tag. The opcase
operator
looks at these tags, so all that is needed is one new case for user
defined symbols. Just as any number will do as the prototype for
numbers, so any user defined symbol will do as the prototype for defined
symbols. We might as well choose “joy”. When a defined symbol has been
encountered, it is necessary to find its definition, which is the
program that constitutes the body of that definition. The
body
operator will find that, it expects a user defined
symbol on top of the stack and returns the defining program in quoted
form. The Joy interpreter now has to execute this. But it is essential
that the Joy interpreter should execute inside that quotation. So it
will not do to use the i
combinator, but the
joy
interpreter itself has to be used. Hence the case for
user defined symbols is just:
[ joy body joy ]
The single most common case will be the call of a defined atom rather
than an inbuilt one. To improve efficiency this case (with
joy
) is placed to the front of the caselist.
It is always possible that the interpreter is used for programs which
contain operators or combinators that are part of the language (not user
defined) but have not been given cases in the interpreter itself, either
intentionally or through neglect. The joy interpreter should be able to
perform reasonably for those operators and combinators. So this is where
the default clause for the opcase
operator comes in handy.
When none of the listed cases apply, treat the symbol as we did in the
second version of the interpreter: take the unitlist
and
use the i
combinator to execute that. Instead of using
unitlist
it is better to use its definition
[] cons
. So the default clause looks like this:
[ [] cons i ] (* default *)
This completes the working design of the interpreter. The following is its structure with the last two additions written out fully:
joy ==
[ [ [ joy body joy ] (* user defined *)
... (* literals *)
... (* operators *)
... (* unary combinators *)
... (* binary combinators *)
... (* ternary combinators *)
... (* quaternary combinators *)
[ [] cons i ] ] (* default *)
opcase
i ]
step
Before we write out a more complete version of the interpreter, it is useful to make a number of changes to the design.
Firstly, the cases for the combinators of various arities contain
common code which is repeated again and again. The interpreter becomes
more readable if such instances are factored out and defined separately.
There are cases for unary,
binary, ternary and quaternary combinators, and for each of these the common code will be called
cr1,
cr2,
cr3and
cr4. For uniformity the code for
cr1is assimilated to that of the others, using ~~app1~~
i`.
Since they are not likely to be wanted anywhere else, their definitions
are given inside a HIDE declaration.
Secondly, it could be useful if the default case does its job not
silently but traces out each symbol that it hands over to the Joy
system. Such symbols are dup
licated and then written out by
put
. This now makes the default case:
[ dup put [] cons i ]
The interpreter joy now looks like this:
HIDE
cr1 == pop [[joy] cons] i; (* app1 *)
cr2 == pop [[joy] cons] unary2; (* app2 *)
cr3 == pop [[joy] cons] unary3; (* app3 *)
cr4 == pop [[joy] cons] unary4 (* app4 *)
IN
joy ==
[ [ [ joy body joy ]
[ [] ]
[ 0 ]
...
[ dup pop dup ]
[ + pop + ]
[ cons pop cons ]
[ put pop put ]
...
[ i cr1 i ]
[ dip cr1 dip ]
[ map cr1 map ]
[ filter cr1 filter ]
[ i cr1 i ] (* app1 *)
[ unary2 cr1 unary2 ] (* app2 *)
...
[ ifte cr3 ifte ]
[ linrec cr4 linrec ]
[ binrec cr4 binrec ]
...
[ dup put [] cons i ] ]
opcase
i ]
step
END
The interpreter has appropriate cases for all literals and for quite a few operators and combinators. Most operators and combinators are still missing and will therefore be handled by the default clause. However, it is straightforward to make the interpreter more comprehensive and even complete.
It is of some interest to write an interpreter that is just adequate
to interpret itself and leaves everything else to the default clause.
The following is the minimal Joy interpreter joy0; for variety it uses
the optimisation for the i
combinator mentioned
earlier:
joy0 ==
[ [ [ joy0 body joy0 ]
[ [] ]
[ pop pop pop ]
[ cons pop cons ]
[ opcase pop opcase ]
[ body pop body ]
[ i pop joy0 ]
[ step pop [joy0] cons step ]
[ [] cons i ] ]
opcase
i ]
step
The two versions joy and joy0 are in the file Joy in Joy.