by Manfred von Thun
This paper shows how to write simple programs in Joy. Since Joy is very different from familiar programming languages, it takes a while to become used to writing programs in Joy. One way to start the learning process is by way of writing some simple generally useful library programs. In an implementation these may be part of an actual library, or they may be built into the language.
Some general utility programs include operators for manipulating the Joy stack just below the top element, further operators for manipulating aggregate values, and a few output programs. Generally useful are the stack type and the queue type. Values and operators of these two types are easily implemented as Joy lists and list operators.
Another collection of useful operators take an aggregate as parameter and produce a list of subaggregates. These operators are polymorphic in the sense that the aggregate parameter can be a (small) set, a string, or a list. For example, one such operator can take a set as parameter and produces a list of its subsets. All of these operators are definable without recursion by using the linrec combinator.
Some arithmetic operators are often used to illustrate recursive definitions, although it is well known that iterative execution is more efficient. In particular the use of accumulating parameters can often replace recursion. This is easily done in Joy using various iteration combinators.
Values of sequence types, such as strings and lists, can be sorted, and sorted sequences can be merged. Programs for doing this are easily written in Joy without recursive definitions by using appropriate combinators instead.
Joy’s inbuilt set type is implemented just as bitstrings, and hence it is limited to small sets of small numbers. The more useful big set type, which allows large sets of elements of any type, can be implemented in any language which has lists. It is simple to do in Joy, and the usual set-theoretic operations are easily provided. A similar implementation can be used for the dictionary type, which uses lookup tables for finite functions.
Also useful is the tree type, of lists, or lists of lists, or lists of lists of lists … of elements other than lists.
The remainder of this paper illustrates programming in Joy by way of simple examples. Many of the programs are first written in pseudo-code and then translated into Joy. Some of the Joy programs here have been adapted from the literature on ML or Miranda (“Miranda” is a trademark of Research Software Ltd). The next section defines some useful general purpose operators and combinators. This is followed by a section on two little collections of operators for two datatypes, stacks and queues. A longer section deals with programs for creating and manipulating lists of subaggregates. A shorter section then illustrates the use of accumulating parameters for the efficient implementation of numeric functions. Then there is another section on sorting sequences and on merging sorted sequences. Another section gives an implementation of unrestricted sets and of lookup dictionaries.
This section describes some very simple utilities which are useful in very different settings. Joy definitions are of the form:
ATOM == PROGRAM
where the long equals == means that the atom on the left is being defined to cause the execution of the program on the right.
Joy has three important operations for manipulating the top few items
of the stack: pop
for removing the top item,
dup
for creating a copy of the top item, and
swap
for interchanging the top two items. Often it is
necessary to perform similar operations further below the stack. The
following define three similar operators which leave the top element of
the stack intact and perform the work just below that. All three use the
dip
combinator which takes as a parameter a quoted program
and below that a further item. The item is saved, the quoted program is
executed and the item then restored:
popd == [pop ] dip
dupd == [dup ] dip
swapd == [swap] dip
The popd
operator removes the second item. The
dupd
operator duplicates the second item. The
swapd
operator interchanges the second and third item.
Three similar operators affect the order of the top three items on
the stack. The rollup
operator places items one, two and
three from the top in the order two, three, one. The
rolldown
operator places items one, two and three in the
order three, one, two. The rotate
operator places them in
the order three, two, one:
rollup == swap [swap] dip
rolldown == [swap] dip swap
rotate == swap [swap] dip swap
The next examples are for unary operators which expect an
item on the stack and replace it with either a set or a string or a list
containing just that item. All three operators work by first pushing the
empty set {} or the empty string “” or the empty
list [] on top of the stack. Then they use the cons
operation to add the item below into the aggregate. The items have to be
of the right type. The unitset
operator requires a small
number, the unitstring
operator requires a character, and
the unitlist
operator requires anything. If the items are
not of the right type, an error occurs when the cons
is
executed:
unitset == {} cons
unitstring == "" cons
unitlist == [] cons
The action of all three is reversed by the single standard operator
first
.
Analogously one may define three operators pairset
,
pairstring
and pairlist
, which form a set,
string or list from two appropriate items on top of the
stack:
pairset == {} cons cons
pairstring == "" cons cons
pairlist == [] cons cons
The action of all three is reversed by a single operator
unpair
, which may be defined in either of two equivalent
ways:
unpair == uncons uncons pop
unpair == uncons first
Joy has two operators applicable to non-empty sets, strings
and lists: The operator first
extracts the first item of a
string or list, and that item from a set which is the first in the
underlying order. The operator rest
removes the first item
and returns the remaining set, string or list. Sometimes it is necessary
to extract the second or third item. Suitable definitions are these:
second == rest first
third == rest rest first
The operators uncons
and unswons
undo what
is done by cons
and swons
. Often it is useful
to dissect not just one aggregate into its first
and
rest
, but to dissect two aggregates. This can be done by
uncons2
and unswons2
, defined as follows:
uncons2 == [uncons ] dip uncons swapd
unswons2 == [unswons] dip unswons swapd
Both expect two aggregates on top of the stack, both leave two
first
s and two rest
s on the stack. For
uncons2
the two first
s are items 3 and 4 on
the stack, the two rest
s are items 1 and 2. For
unswons2
it is the other way around.
Similarly, it is sometimes necessary to test whether at least one of
two aggregates is empty, or whether at least one of two numeric values
is equal to zero. For a single parameter this is done by
null
, for two parameters it is done by null2
,
defined in either of two equivalent ways:
null2 == [null] [true] [pop null] ifte
null2 == [ [[null] true] [pop null] ] cond
Strings and lists are special kinds of aggregates, they are sequences. Sometimes it is necessary to reverse sequences. The naive way of doing this is recursively as follows:
To reverse a sequence S:
If S is empty
then return the empty sequence
else remove the first element
reverse the rest of S
append the first element of S at the tail
It is easy to see that this is very inefficient because the append
operation requires a lot of copying, and every element to be appended
requires portions of the rest to be copied again and again. A well-known
optimisation uses an extra parameter, an accumulating
parameter, to obtain the same effect. The idea is to prepend the
elements of the original list onto the accumulating parameter. Sometimes
this is expressed by analogy with railways. The shunt
operator takes two sequences as parameters and, starting at the front of
the topmost parameter, moves all items onto the front of the second
parameter. Joy has a combinator step
for stepping through
all items of its top parameter, and for each item executing a program
that is given as a further parameter. That program has to take the item
and add it to the accumulating parameter, so it is the
swons
operator. So this is how shunt
can be
defined:
shunt == [swons] step
To reverse a list or a string, an empty list or empty string has to
be supplied as an accumulating parameter just below the list or string
that is to be reversed. So here are definitions for
reverselist
and reversestring
:
reverselist == [] swap shunt
reversestring == "" swap shunt
But there is something unsatisfactory about this, the reversal
operation should be polymorphic. So the following version of
reverse
first tests whether the sequence to be reversed is
a list or not, and inserts the appropriate accumulating parameter. The
testing is done by the iflist
combinator which takes two
(here rather tiny) programs as parameters and below that one other item,
the list or string to be reversed. If the latter happens to be a list,
then the first quoted program is executed, and it will push an empty
list. Otherwise the second program is executed, and it will push an
empty string. In either case the two top items are now
swap
ped and then shunt
ed:
reverse == [[]] [""] iflist swap shunt
It comes as a surprise that lists can be reversed in another way. The
idea is this: When a list is executed by a combinator, all the members
of the list will be literals, so they will each be pushed onto the
stack. The last element of the list will end up on top of the stack. So
the elements of the list will then be in reverse order. To make use of
this idea we have to arrange that an initially empty list is treated as
a stack. This is what the infra
combinator does. It takes a
list as one parameter and a program as the second. It uses the list as a
temporary stack and executes the program. The resultant stack is then
pushed as a list. For the reversal problem the program is the list to be
reversed, and the other parameter has to be the empty list. That empty
list first has to be inserted below the list to be reversed. So another
program to reverse a list is this:
reverselist == [] swap infra
What makes this version possible is that in Joy the principle that program = data is extended to program = data = memory. This version is actually more efficient than the one given earlier. Of course it cannot be adapted for reversing strings.
The two principal operators for explicit output are put
,
which prints a single value of any type, and putch
, which
prints a single stand alone character without quote symbol. Two useful
little utility operators are worth defining. The putchars
operator uses the step
combinator to step through the
characters in a list or string and writes them to the output file
without the enclosing []
or ""
. The newline
operator just outputs the newline character \n
to terminate
a line:
putchars == [putch] step
newline == '\n put
Using the step
combinator it is easy to define several
conversion operators which can be useful. The first two produce sets
from aggregates of upper or lower case aggregates. The last two produce
strings of upper or lower case characters from aggregates of small
numbers:
upper2set == {} swap [64 - swons] step
lower2set == {} swap [96 - swons] step
set2upper == "" swap [64 + swons] step
set2lower == "" swap [96 + swons] step
The dip
combinator expects a quotation and below that an
item that will be saved before execution of the quotation and restored
afterwards. Sometimes one wants to save and restore two or even three
items, so it is useful to have further variants dip2
and
dip3
, defined as follows:
dip2 == [dip] cons dip
dip3 == [dip] cons dip2
Note that cons
is being used to build a constructed
program that is then supplied as a parameter to the last
combinator.
This section and the next two contain implementations of two simple
data types. Members of the stack type are linear structures
which allow read and write access at one end only, and members of the
queue type are linear structures which allow read access at one
end and write access at the other end. Both have this much in common:
the stack or queue remains on the Joy stack, and any stack or queue
operations using it leave it there. Of course it can be explicitly
removed with pop
.
1. First, the stack type. A stack is a linear
structure that can grow by having items added, inspected and removed all
from the same end. The simplest way to implement stacks in Joy is as
lists. One essential operator is st-new
for the creation of
a new empty stack, which is just an empty list. So the definition
is:
st-new == []
Stacks can have additional items pushed onto them, and this is done
by adding them to the front of the list. Since the stack is already
there, the new item will typically be first pushed onto the Joy stack,
and then it is to be pushed onto the stack. One way to do this is to
swap
the item and the stack and then perform a
cons
operation. But Joy has an operation which combines
these two, namely swons
. So the st-push
operation can be defined by:
st-push == swons
An essential predicate is st-null
for testing whether a
stack is empty or null. However it will not do to just use
null
, since this will remove the stack – but typically the
stack is intended for further applications. So, to avoid losing the
stack, it has to be dup
licated first, and then the
null
test can be applied to the duplicate:
st-null == dup null
The previous two operations both make sense when the stack is empty,
but this is not the case for the stack operations to follow. The first
of these is st-top
for extracting the top element of the
stack, while leaving the stack itself unchanged. The second is
st-pop
for removing the top element. The third is
st-pull
which combines the last two, it is the opposite of
push. It extracts the top item and pops the stack. Ignoring the
complication of an empty stack, the definitions would simply be:
st-top == dup first
st-pop == rest
st-pull == unswons
To guard against an empty stack, a test has to be performed to determine whether the stack is empty. If it is, then an error message should be given, otherwise the operation should be performed. So for all three operations the structure will be of the form:
== [null] [ERROR-MESSAGE] [PERFORM OPERATION] ifte
The error messages should state which of the operations was being
attempted, but otherwise they should be the same. So the name of the
operation is given as a string parameter to an error handling operation.
That particular operation will be called _st-error
, and we
leave the details of its implementation till a little later. The leading
underscore _
in the name has been added because this
operation is not intended to be used by the programmer; in the current
implementation the help command hides identifiers with a leading
underscore. The remaining stack operations are then:
st-top == [null] ["st-top" _st-error] [dup first] ifte
st-pop == [null] ["st-pop" _st-error] [rest ] ifte
st-pull == [null] ["st-pull" _st-error] [unswons ] ifte
As may be seen, the three operations still have a lot in common, and
one might consider extracting that further. However, the result is
likely to be less clear to the human reader. It remains to implement the
error operation. It should state that an error has occurred due to an
empty stack, and this part is the same for all three operations. It
should also state which of the operations failed. So a minimal
implementation of _st-error
would simply write one string
which is common for any call, and another string which is the specific
parameter. This is crude but very easy to implement:
_st-error == "non_empty stack needed for " put put
A minor improvement is to concatenate the two strings (in the right
order), so that only one string has to be written. But the double
quotation marks in the output still look silly. So instead of writing
the one or two strings with put
, it looks nicer to write
their constituent characters with putchars
. Also, the line
should be terminated with a newline
. Finally, there is
little sense in continuing the computation, so after the two parts of
the error message have been displayed, it is best to abort
,
to return to the top level.
In a library implementation for the collection of definitions of stack operations might look like this:
LIBRA (* stack *)
_st-error == "non-empty stack needed for " putchars putchars newline abort;
st-new == [];
st-push == swons;
st-null == dup null;
st-top == [null] ["st-top" _st-error] [dup first] ifte;
st-pop == [null] ["st-pop" _st-error] [rest ] ifte;
st-pull == [null] ["st-pull" _st-error] [unswons ] ifte.
As may be seen from the example, a library declaration begins with
the word LIBRA and terminates with a period. In between is a sequence of
definitions separated by semicolons ;. A definition consists of an
atomic symbol and then the symbol == followed by a Joy program. Note
again that the _st-error
operator is not really intended to
be used outside the remaining definitions. It could well be hidden
completely from outside. A mechanism for this will be illustrated
below.
2. Next, the queue type. A queue is a linear structure that can grow by having items added at one end, and inspected or removed from the other end. A simple minded implementation would consist of a single Joy list to which items are added at the end and removed from the front. But adding something at the end requires copying the entire list every time. Nothing would be gained by reversing the role of front and end, because in that case the removal requires copying of (almost) the entire list. A better implementation uses two lists. Conceptually one is the front of the queue, and items are removed at the front. Conceptually the other list is the back of the queue, but in reverse, and items are added to the front of this list. If at any time the list implementing the front of the queue becomes empty, the other list gets explicitly reversed and it becomes the front, and the empty list becomes the rear. There are two auxiliary operators that need only be visible to the remaining operators but not to the outside; in the following they are hidden in the private part of this module. Because they are hidden, there is no need to choose names which indicate the datatype on which they operate:
LIBRA (* queue *)
HIDE
error == "non_empty queue needed for " putchars putchars newline abort;
prepare == [null] [swap reverse] [] ifte
IN
q-new == [] [];
q-add == swap [swons] dip;
q-addl == swap [shunt] dip;
q-null == prepare dup null;
q-front == prepare [null] ["q-front" error] [dup first] ifte;
q-rem == prepare [null] ["q-rem " error] [unswons ] ifte
END.
As may be seen, such a declaration consists of the word HIDE, followed by a sequence of definitions, then the word IN followed by another sequence of definitions, and then the word END. A sequence of definitions is again separated by semicolons ;. The whole declaration can occur inside a library declaration where a single definition can occur. Any hiding declaration can occur wherever a single definition can occur, so they can be nested.
The first auxiliary error reporting procedure, error
, is
similar to the one for stacks. The second auxiliary operation,
prepare
, prepares the two lists: if the list implementing
the front happens to be empty, the roles of the two lists are
interchanged. If the front list is not empty, nothing is done.
A new queue is created by q-new
in the form of two empty
lists. An item can be added (to the “rear of the queue”) by
q-add
which adds it to the front of the second list. The
members of a whole list can be added to the rear by q-addl
.
The operator q-null
first prepares the two lists so that
the list implementing the front is not empty, if that is possible at
all. It then tests the front list. The operator q-front
and
q-rem
extract respectively a copy of the front element or
the front element itself. The copy or the original are left above the
two lists. Both operators require the queue to be prepared so that the
list implementing the front is not empty. Also, both operators need to
check whether the front really is non-empty. If it is not, the error
operator is called.
The definitions for stacks and queues are part of the library file TYPLIB.JOY.
The aggregate types of Joy comprises sets of small numbers,
or strings of characters, or lists of items of any kind. Much of this
section deals with lists of aggregates constructed from a given
aggregate. The principal tool is the linrec
combinator. It
expects four program parameters on the stack, an if-part, a then-part, a
rec1-part and a rec2-part. Execution begins by saving the four parts and
then executing the if-part. If that produces the truth value
true
on top of the stack, the then-part is executed and the
combinator exits. Otherwise the rec1-part is executed, then the
combinator calls itself with the same four parts, and then the rec2-part
is executed.
The first definition below is for an operator restlist
which takes any aggregate as parameter and produces the list of all
those subaggregates that would be formed by repeatedly taking the rests
of the aggregate. Such an operator can of course be defined recursively
and this could be done in any language. But in Joy it is possible to use
a non-recursive definition using the linrec
combinator.
Here is some pseudocode:
1. If the aggregate is empty
2. then form its unitlist
3. else take a copy and the rest of that,
recurse using this rest,
eventually forming a list of aggregates,
4. use cons to add the original aggregate
to the front of this list
The above pseudocode translates directly into a recursive form in any
language, but in Joy a non-recursive definition is also possible. The
four program parameters for the linrec
combinator
correspond exactly to the labelled lines. Nothing corresponds to the
unlabelled line, the linrec
combinator recurses here
automatically:
restlist ==
1. [ null ]
2. [ unitlist ]
3. [ dup rest ]
4. [ cons ]
linrec
The next program also takes an aggregate as parameter and produces a
list of subaggregates. But the subaggregates are those obtained by
successively deleting the last elements. In analogy with the previous
operator it will be called the frontlist operator. For empty aggregate
parameters again the unitlist
has to be returned, so the
if-part and the then-part are the same as before. Also, for non-empty
aggregates the aggregate has to be taken apart in the rec1-part. This
can be done in two ways. We can take the front aggregate and the last
element, but that would require defining a suitable operator, and it
would require expensive copying in the case of list or string
aggregates. Alternatively we can just uncons
. This leaves
only the rec2-part to be written. But it will be more complicated than
for the previous operator. Let us ignore for the moment that the
operator is intended to be used for aggregates of any of the three
types. When the anonymous recursion has completed, the stack will
contain the first item of the non-empty aggregate and above that the
frontlist
of its rest. The first item has to be
cons
ed into each member of the frontlist
, and
that is best done by:
[cons] map
Then that first item, which is now still the second element on the
stack, has to be deleted. This can be done by a variant of
pop
, namely popd
. Finally, assuming that the
operator is to be used for lists, the empty list has to be added to the
frontlist
, and the easiest way is by
[] swap cons
, or simply by [] swons
. This
gives the following provisional rec2-part:
[ [cons] map popd [] swons ]
But the assumption that the operator is to be used only for lists is
unnecessarily restrictive. The final part, adding an empty aggregate,
should depend on what the initial aggregate was, a set, a string or a
list. This can be achieved by looking up the first element of the
frontlist
, it is a one element aggregate and taking its
rest
produces the required empty aggregate of that type. So
the required rec2-part is:
[ [cons] map popd dup first rest swons ]
The entire definition for frontlist
, applicable to any
aggregate, now is:
frontlist ==
[ null ]
[ unitlist ]
[ uncons ]
[ [cons] map popd dup first rest swons ]
linrec
The next program defines an operator subseqlist
which is
in some ways a combination of the preceding ones. Again it takes any
aggregate as parameter and returns a list of subaggregates. This time
the subaggregates are all those obtainable from the parameter aggregate
by deleting first or last elements. For ordered aggregates, lists and
strings, the resulting subaggregates will still contain elements in the
same order as the parameter. It is tempting to define the operator very
simply by:
== frontlist [restlist] map
But this produces not a list of subsequences but a list of list of subsequences. This list of lists could then be flattened to a single list, even if this is somewhat inefficient. However, a different solution is possible.
The if-part and the then-part are as for restlist
and
frontlist
, of course. The rec2-part is easy, it is only
necessary to concat
enate two lists that were produced by
the rec1-part. But the rec1-part is rather complex, and this is what it
has to do: the first element of the aggregate has to be extracted and
later it has to be cons
ed into every subaggregate of the
frontlist
of the rest of the aggregate. But also the rest
of the aggregate has to be made available for the linrec
combinator to work on. So the rec1-part must uncons
the
aggregate, and produce a second copy of the rest. The second copy has to
be kept aside by using the dip
combinator to work on the
original copy. So an intermediate draft of the rec1-part looks like
this:
[ uncons dup
[ ... ]
dip ]
The [...]
program must take the frontlist
(of the original copy of the rest) and then cons
the first
element into each of the members of the result. We already know how to
do that, and how to delete the hidden first member. So the rec2-part is
the following:
[ uncons dup
[ frontlist [cons] map popd ]
dip ]
The entire program now is this:
subseqlist ==
[ null ]
[ unitlist ]
[ uncons dup [frontlist [cons] map popd] dip ]
[ concat ]
linrec
The program uses frontlist
, but because the latter is
defined without recursion, it is possible to simply use the RHS of the
definition of frontlist
and insert that textually into the
definition of subseqlist
. The frontlist
and
subseqlist
operators were adapted from recursive programs
in {Thompson91 p 247}.
The next program defines a unary operator powerlist which for any aggregate returns a list of all subaggregates. If the parameter aggregate has N members, the resulting list of subaggregates has members:
powerlist ==
[ null ]
[ unitlist ]
[ uncons ]
[ dup swapd [cons] map popd swoncat ]
linrec
It uses the linrec
combinator and the same if-part and
then-part as the previous programs. It also uses the same rec1-part as
the frontlist
program: before recursing, the parameter is
split into its first and its rest by uncons
. The recursion
then produces the powerlist
of the rest. The rec2-part then
uses that result to produce two copies, using dup
. One copy
is left untouched, the other has the original first element
cons
ed before every sublist using map
. For
this to work, the original first element has to be placed in the right
position by swapd
and eventually deleted by
popd
. The two resultant lists are finally combined with
swoncat
. This produces the list of subaggregates starting
with the empty one; if concat
is used instead of
swoncat
, the list ends with the empty one. Neither method
yields the subsequences sorted according to size
; but see
the later section on sorted sequences.
The next program defines a binary operator insertlist
.
This operator expects a list or a string as a parameter and below that a
potential list member or a character. It returns a list whose elements
are either lists or strings, each with the potential member or character
inserted once at all possible positions. So if the original list or
string has N members, the resultant list has lists or strings as
members:
insertlist == (* Item Sequence -> List(Sequence) *)
cons
[ small ]
[ unitlist ]
[ dup (* keep original *)
unswons [uncons] dip swons ] (* take out second *)
[ swap [swons] cons map (* swons in second *)
cons ] (* cons in original *)
linrec
The operator can also be used, with a set parameter instead of a string or list. Then it produces a list of identical sets, each with the new member added. But such a use will rarely be wanted.
The permlist
operator expects a sequence and returns the
list of all permutations of that sequence. So if the original
sequence has N members, the result list has factorial(N) members:
permlist ==
[ small ]
[ unitlist ]
[ uncons ]
[ swap [swap insertlist] cons map
flatten ]
linrec
Note again that in the two preceding programs the map
combinator uses a constructed program.
The zip
operator expects two aggregate parameters, not
necessarily of the same type, and not necessarily of the same
size
. It produces a list of two element lists by combining
corresponding elements from the two aggregates. The result list contains
as many pairs as the smaller of the two parameter aggregates. Here is
the definition:
zip ==
[ null2 ]
[ pop pop [] ]
[ uncons2 ]
[ [pairlist] dip cons ]
linrec
This might be paraphrased as: If one or the other of the two
parameter aggregates is null
, then pop
them
both and return the empty list []
, otherwise take out the
two first
elements and the two rest
s, recurse
with the two rest
s producing a result list of two element
lists of that, then dip
below that result list to combine
the two previous first
s with pairlist to form a two element
list, and cons
that into the front of the result list.
Related to this is the more general zipwith
combinator,
adapted from {Bird and Wadler 1988 p 57}.
It takes three parameters, two aggregates and one quotation which can be
used to combine members of the aggregates. The program again uses the
linrec
combinator which needs four quoted programs as
parameters. The fourth quotation now has to be a constructed
program, it is built from the program parameter of
zipwith
and the program stub [dip cons]
already seen for zip
. The other three program parameters
for linrec
first have to be dip
ped below the
parameter of zipwith
. The definition for this combinator
thus is:
zipwith ==
1. [ [ null2 ]
2. [ pop pop [] ]
3. [ uncons2 ] ]
dip
4. [ dip cons ] cons
linrec
A list of sequences can be concatenated into a single sequence by the
unary operator flatten
. The code is straightforward: if the
parameter list is empty, then there is nothing to concatenate, leave it
as it is. Otherwise use uncons
to take out its first and
its rest, recurse anonymously on the rest to produce the
flatten
ed result of that, and finally
concat
enate the saved first part to the front of the last
result:
flatten == [null] [] [uncons] [concat] linrec
A two dimensional matrix can be implemented as a list of lists. One important matrix operation is the interchange of rows and columns, performed by the unary operator transpose. A draft of the program is the following:
To transpose a list of lists LL :
1 If LL is empty or some sublist of LL is empty
2 then pop it off and return the empty list []
else (all sublists are non-empty)
3a construct a list of all the first s of sublists
3b construct a list of all the rest s of sublists
recurse anonymously on the list of rests
4 cons the list of firsts into the result.
This version has been adapted from {Reade
1989 p 133}. To test the disjunction whether LL is empty or some of
its sublists are empty, we use the conditional combinator
ifte
. The if-part has to test whether LL is empty, and if
it is, then the then-part has to return true
. Otherwise the
then-part will have to determine whether some sublists of LL are empty.
This is best done with the combinator some. So part 1 of the Joy version
is:
1 [null] [true] [[null] some] ifte
For parts 3a and 3b it is necessary to use the parameter LL to
produce two lists, of the first
s and the
rest
s. Either of these two can be obtained with the map
combinator. To obtain the two lists, the cleave
combinator
can be used, it takes two quotation parameters and a further parameter,
and produces two values, one from each of the two quotations. So parts
3a and 3b are just:
3a [ [first] map ]
3b [ [rest ] map ]
cleave
The entire Joy program thus is:
transpose ==
1 [ [null] [true] [[null] some] ifte ]
2 [ pop [] ]
3 [ [[first] map] [[rest] map] cleave ]
4 [ cons ]
linrec;
Alternatively, line 1 can be replaced by the following:
1 [ [null] [[null] some] disjoin i ]
Here the two tests are first disjoined to form a single test
predicate. This is then called by the i
combinator. The net
effect is exactly the same as in the version given earlier.
A common binary operation on aggregates is that of forming the
Cartesian product. It will take two aggregates as parameters
and produce the list of two element lists which each contain one element
from each of the two aggregates. If the two aggregates have M and N
members respectively, then the resultant list has M × N
elements. In order to form the Cartesian product, it is necessary to
consider each of the members of one aggregate with each of the members
of the other aggregate. This is like step
ping through an
aggregate, except that there are two aggregates to be
step
ped through. It will be useful first to define a
combinator step2
which does just that, leaving at each step
two items on top of the stack. Then the Cartesian product operator will
just form their pairs, and then form a list of all these pairs. An
application of the step2
combinator will look like
this:
A1 A2 [P] step2
The implementation is based on the simple idea that A2
and [P]
be used to construct a program which is then used
by the ordinary step
combinator to step through the
elements of A1
:
A1 [ .. A2 .. [P] .. ] step
We now fill in the dots. The program to be constructed has to
step
through the members of A2
using a program
which depends on [P]
. It has to do this for each member of
A1
, and when it has done that the current member of
A1
can be pop
ped off. So the program will have
to look like this:
A1 [ A2 [ .. P ] step pop ] step
Since P
must be allowed to consume a current member of
A1
but this still has to be available for the next inner
step
, that member of A1
first has to be
duplicated, below the current member of A2
. So the dots are
just [dup] dip
. In sum, the required program is:
A1 [ A2 [ [dup] dip P ] step pop ] step
The definition of step2
must construct that program from
A2
and [P]
and then call step
with just two parameters, the program just constructed and below that
the other aggregate A1
. The definition looks like this:
step2 ==
[[dup] dip] swoncat (* form inner quote *)
[step pop] cons (* form outer quote *)
cons (* insert A2 *)
step (* through A1 *)
It is now relatively easy to define the Cartesian product
operator as follows. First we need to insert an accumulating
parameter, an empty list []
. It has to be inserted
below the two aggregates of which the Cartesian product is to
be computed. This is easily done with [] rollup
. The
program which is then used by the step2
combinator has to
form the pairlist
of the two items on top of the stack. The
resultant pair has to be inserted into the accumulator with
swons
. But between the accumulator and the just formed pair
is the current original member of the first aggregate which must be left
intact. So the pair and the member have to be swap
ped, and
the swons
has to be done below the member, by
dip
. This is the program that is given as the parameter for
step2
. So the definition for the cartproduct
operator is:
cartproduct ==
[] rollup
[pairlist swap [swons] dip]
step2
The program works for aggregates of any type, and the two aggregates
do not have to be of the same type. If both are lists of numbers or sets
of small numbers, other variations are possible: By changing the pairing
operator pairlist
to, say, multiplication *
,
we obtain a program which produces a list of all products. By further
changing the accumulator []
to the number 0
and the insertion operation swons
to, say, addition
+
, we obtain a program which produces the sum of all
M × N
products. For two numeric aggregates of the same
size, say N, another binary operator can be defined, the
scalarproduct:
scalarproduct ==
0 rollup (* accumulator *)
[ null2 ]
[ popd ]
[ uncons2 [* +] dip2 ]
tailrec
It produces the sum of all N products of pairs taken from corresponding positions in the two aggregates.
This section gives efficient implementation of several well-known functions which are often used in the literature for demonstration purposes: the factorial, the Fibonacci-function, exponentiation and the greatest common divisor. All of them are often defined recursively, and of course they can be defined recursively in Joy. Using one of several suitable recursion combinators they can be computed recursively in Joy without a recursive definition. But recursive execution in any language can be inefficient.
There are well known techniques for removing linear recursion, see for example {Bauer and Wössner 1982 Chapter 4}. The same topic is discussed in {Henson 1987 Chapter 4} using the useful concept of eureka definitions due to Burstall and Darlington. These involve creative steps in the production of more efficient versions of programs, and hence would be difficult to perform by an optimising program.
Several of the functions to be defined require a little program to be
executed a number of times. A useful combinator for this is
times
. It requires the program to be repeated as the top
element of the stack and the required number of repetitions to be the
second element on the stack.
The factorial of a number n is simply the product of n factors from 1
to n. To compute it using times
, a small program has to be
pushed on top of the number n which is the parameter. The number itself
will be consumed by times
. The program works on two other
numbers on the stack. One of these is the accumulating parameter, it has
to start at 1. The other is the next factor to be used by the program
with which to multiply the accumulator. The multiplication has to be
done without losing the factor, so it has to be duplicated first. Apart
from doing the multiplication, the program also has to increment the
factor using the successor operator succ
. The program which
is the parameter to times
thus is:
[dup [*] dip succ]
Before the times
combinator can get to work on the
parameter n and the quoted program, the accumulator and the first factor
have to be placed in position, below the parameter n. Both begin with
the value 1, so the rolldown
operator can be used to push
these two values below n. Finally, after times
has
completed, the stack will contain the required accumulator value but
also on top of that the next factor. The latter is simply
pop
ped off. The entire definition of fact is:
fact == 1 1 rolldown [dup [*] dip succ] times pop
The Fibonacci function can be computed in a similar way.
Again there is a certain computation that has to be repeated a number of
times as given by the parameter n. Again the computation involves two
further numbers, the larger one is to be replaced by their sum, and the
smaller one is to be replaced by the former larger one. Adding the two
must not destroy the original larger number, so again it has to be
dup
licated. The addition is then performed under the
control of dip
. Then the two numbers are
swap
ped. This describes the little program that serves as
the parameter to the times
combinator. Before it can start,
the two initial values 0 and 1 have to be placed below the parameter n
with rolldown
. When it has completed, the required
accumulated sum is the second element, and the top element, the useless
next summand, is pop
ped. So this is the definition of
fib
:
fib == 0 1 rolldown [dup [+] dip swap] times pop
This version of the Fibonacci function requires a computation time which is a linear function of the parameter n.
The recursive version of the Fibonacci function requires quadratic
computation time. Since the result values are not very large, it is
often used as a test program. What is of interest is the number of
recursive calls made during the computation, to be divided by the total
time it took. To obtain the number of recursive calls it is often
convenient to use a variant of the Fibonacci function, sometimes called
nfib
. It has the property that the value returned is the
same as the number of calls made during recursive execution. The
following are recursive definitions of Fibonacci and its variant:
r-fib == [small] [] [pred dup pred [r-fib ] unary2 + ] ifte (* app2 *)
r-nfib == [small] [pop 1] [pred dup pred [r-nfib] unary2 + succ] ifte (* app2 *)
These are recursive definitions which of course are intended to run
in quadratic time. The following is a definition of nfib
which uses accumulators to run in linear time. Of course it does not
measure its own runtime, it is included here to illustrate a programming
technique:
nfib == 1 1 rolldown [dup [+ succ] dip swap] times pop
The next two programs are for binary operators which compute
functions of two parameters: the exponentiation function and
the greatest common divisor. Exponentiation can be computed by
performing a certain operation as many times as given by the exponent.
This description again suggests using the times
combinator
to execute a quoted program several times. The operation to be repeated
consists in multiplying an accumulator by the base which is the second
parameter. So it is necessary to construct a little program from the
base which for every call will multiply by the base. Assuming that the
base is in the right position on the stack, the program is easily
constructed, by:
[*] cons
Before the constructed program can be handed to
times
, the initial value 1 has to be placed as an
accumulator below the two numbers which are the two parameters, the base
and the exponent. This is done by 1 rollup
. To get the two
parameters into the order appropriate for cons
it is
necessary to perform a rotate
first. So here is the
exp
operator:
exp == 1 rotate [*] cons times
The technique of first constructing a program (here by
cons
) and then supplying it to a combinator (here
times
) is very useful in Joy.
The next program computes the greatest common divisor of two numbers,
using Euclid’s algorithm. The algorithm uses two numbers and
repeatedly takes the remainder after dividing one by the other. The
remainder obtained is then used to replace the dividend. The process is
repeated as long as the potential divisor is positive. So, unlike the
previous programs, we cannot use the times
combinator.
Instead a combinator called while
is used which resembles
while-loops in imperative languages. It takes two parameters: the
while-part is a quoted program which must return a truth value, and the
do-part is a quoted program which can compute anything. The while-part
in the following gcd
program is of course very similar to a
corresponding part in the fib
program:
gcd == [0 >] [dup [rem] dip swap] while pop
Two other arithmetic functions that are sometimes useful are for
computing the sum or the product of a set or a list of numbers. Both are
best implemented by stepping through all members of the set or list,
doing additions or multiplications with an accumulator every time. The
initial accumulator value, 0
or 1
, is first
pushed onto the stack below the parameter set or list. For comparison,
the third line below gives a definition of the size operator which is
applicable to any aggregate. The fourth line below gives a definition of
a similar operator size2
for determining the total number
of elements in a list of aggregates. If these aggregates are themselves
lists, then their members are counted but not the members of their
sublists:
sum == 0 swap [+ ] step
product == 1 swap [* ] step
size == 0 swap [pop succ] step
size2 == 0 swap [size + ] step
A generalisation of size2
for counting the leaves in
recursive lists or trees is treesize
, defined later.
The sequence types of Joy are the string type and the list type. Values of these types can be ordered. Strings contain just characters, but lists may contain anything. So for lists it only makes sense to speak of ordering if the elements are characters or integers or something else that has an ordering defined on it.
An informal description of the quicksort algorithm is this:
To sort a sequence S :
1 If S is small (has only 0 or 1 element)
2 then it is sorted already, leave it alone
else (S has at least one element)
3 using the first of S
as a "pivot" for comparison,
split the rest of S into two portions -
those that are less than the pivot
and those that are not
separately sort both portions P1 and P2
4 concatenate the now sorted lesser portion,
the pivot, and
the sorted other portion.
The following is a definition of an operator qsort
which
uses the above algorithm. But instead of using explicit binary
recursion it uses the binrec
combinator. This is like
the linrec
combinator except that it recurses twice, once
each on the top two elements of the stack. The recursions again occur
between the rec1-part and the rec2-part. The program also uses another
combinator split
which takes as parameter an aggregate and
above that a quoted program which must return a truth value. The
split
combinator returns two aggregates, containing those
elements for which the test yields false
and those for
which it yields true
. The split
combinator has
access to the remainder of the stack which in this case contains the
pivot. So the test determines whether the pivot is >
than the element being examined:
qsort ==
1 [ small ]
2 [ ]
3 [ uncons [>] split ]
4 [ swapd cons concat ]
binrec
Sometimes it is required to sort a list of aggregates on the basis of
their first elements. In that case it is necessary to supply to the
comparison operator >
not the pivot and the element to
be apportioned by split
, but their first elements instead.
This is conveniently done by the app2unary2 combinator which
applies a quoted program to two elements on top of the stack and
replaces them by whatever values the programs return:
qsort1 ==
1 [ small ]
2 [ ]
3 [ uncons [[first] unary2 > ] split ] (* app2 *)
4 [ swapd cons concat ]
binrec
Note that in part 3 when the first element of the pivot has to be
compared with the first element of the aggregate to be apportioned, the
first element of the pivot is being extracted every time. It would
perhaps be more efficient if the first element of the pivot is extracted
just once, as soon as the pivot is available. In that case it is
necessary to take the pivot apart with unswons
, but this
has to be done by dip
ping below the rest of the list still
to be sorted. Then the quotation parameter to split
just
needs to take out the first
of the current aggregate and
compare it with the first of the pivot. After split
has
done its job, the pivot has to be re-assembled by swons
,
but this now has to be done below the two portions with
dip2
. So part 3 can be replaced by:
3 [ uncons [unswons] dip [first >] split [swons] dip2 ]
Sometimes it might be necessary to sort a list of items on the basis
not of their first element but on the size or their second or third
element or even the size of the second of the third element. For the
last example it would only be necessary to use
[third second size]
instead of [first]
in the
qsort1
program. But it would be impossible to anticipate
all alternative sorting bases for a library, and it would be awkward to
have to write the appropriate sorting program on every special occasion.
It is possible to write a general quicksort program which takes as an
additional parameter something like [first]
or
[third second size]
. The mk-qsort
combinator
does just that:
mk_qsort ==
[ [small] [] ] dip
[ unary2 >] cons [split] cons [uncons] swoncat (* app2 *)
[ swapd cons concat ]
binrec
It begins in line 1 by inserting the standard if-part and then-part
below its parameters. In line 2 it uses the parameter to build a
constructed program, the required rec1-part. Then in line 3 it
pushes the standard rec2-part. At this point the top five elements of
the stack are the list to be sorted and above that the four program
parts needed for binrec
. The latter now executes. For
example the program:
[third second size] mk-qsort
will sort a list of lists of three or more elements whose third member are aggregates of two or more elements. It will sort according to the size of the second of the third element.
The binary operator insert
takes a sorted sequence and a
potential new member as parameters, it returns a new sequence with the
additional member inserted in the appropriate position. Here is a draft
program:
To insert an item into a sorted sequence :
1 If the sequence is empty or
its first element is >= than the item
2 then add the new item in the front of the sequence
3 else set aside the first item of the sequence
recurse with the rest of the sequence and the new item
4 add the previously set aside first item to the front
The disjunction in line 1 is best handled by the disjoin operator on
programs. It expects two quoted programs which return a truth value, and
it returns a single quoted program which computes their disjunction. So
line 1 consists of two quoted programs one of them tests whether the
sequence is empty, the other tests whether its first element is
>=
than the item to be inserted. The
disjoin
operator then produces their disjunction. The
resulting program is the if-part for the linrec
combinator.
The other three parts are now quite obvious. So the definition in Joy
is:
insert ==
1 [ pop null ] [ [first] dip ] disjoin
2 [ swons ]
3 [ [uncons] dip ]
4 [ cons ]
linrec
Two sorted sequences can be merged into a single sequence which respects the original ordering. Here is a very informal algorithm for a recursive version:
To merge two sorted sequences :
If the first sequence is empty,
then throw it away and return the second sequence.
If the second sequence is empty,
then throw it away and return the first sequence.
(Both sequences are non-empty, so both have a first element:)
If the first of the first sequence is less than the first of the second,
then set the lesser element aside,
recurse using the rest of the first sequence,
prepend the previously set aside element.
If the first of the first sequence is greater than the first of the second,
then set the lesser element aside,
recurse using the rest of the second sequence,
prepend the previously set aside element.
(The two first elements of the sequences are equal:)
Default
set both first elements aside,
recurse using the rests of both sequences,
prepend the two previously set aside elements.
Like just about all programming languages, Joy has an if-then-else
construct (ifte
) for two-way branching. Multiway branching
can be achieved by nested ifte
s, but this can become
difficult to read. Joy has another combinator for multi-way branching
borrowed from Lisp. The combinator cond
expects one
parameter which is a list of cases. The last case is the default case,
the other cases each consist of a condition or if-part and a program or
then-part. The condition is a quoted program in front of the program.
Execution of the cond
combinator tests successive
conditions, and for the first condition that yields true
the associated program is executed. If none of the conditions is true,
the default case is executed. The informal algorithm given earlier now
translates into the following recursive definition of r-merge:
r-merge ==
[ [ [null] pop]
[ [pop null] swap pop]
[ [unswons2 <] [uncons] dip r-merge cons]
[ [unswons2 >] uncons swapd r-merge cons]
[ uncons2 r-merge cons cons] ]
cond
As may be seen from the earlier informal version and the above Joy
version, for each case the program recurses at most once. Therefore the
program has the pattern of linear recursion. However, because
there are three cases in which recursion occurs, it is not possible to
use the linrec
combinator. However, Joy has a combinator
condlinrec
which has features of cond
and
linrec
. The combinator condlinrec
also expects
one parameter which is a list of cases. Again the last case is the
default case, and the other cases consist of a list of two or three
quoted programs. If there are just two parts, then they are called the
if-part and the then-part. Their meaning is as for cond
. If
there are three parts, then they are called the if-part, the rec1-part
and the rec2-part. In that case linear recursion occurs between
execution of the rec1-part and the rec2-part. The following is a
non-recursive definition of merge
:
merge ==
[ [ [null] [pop] ]
[ [pop null] [swap pop] ]
[ [unswons2 <] [[uncons] dip] [cons] ]
[ [unswons2 >] [uncons swapd] [cons] ]
[ [uncons2] [cons cons] ] ]
condlinrec;
Sometimes it is necessary to merge two lists of aggregates on the
basis of their first elements. In that case the comparisons
<
and >
should not be applied to the
elements of the sequences but to their first members. A simple solution
is to replace the two comparisons respectively by the following two:
[first] unary2 < [first] unary2 > (* app2 *)
So the definition of the merge1
operator could be:
merge1 ==
[ [ [null] [pop] ]
[ [pop null] [swap pop] ]
[ [unswons2 [first] unary2 <] [[uncons] dip] [cons] ] (* app2 *)
[ [unswons2 [first] unary2 >] [uncons swapd] [cons] ] (* app2 *)
[ [uncons2] [cons cons] ] ]
condlinrec
The definition of merge
(and especially
merge1
) could be optimised so that the unswons
(and the first
) is not done repeatedly for each comparison.
As the definitions stand, they are easy to understand and work
correctly.
Computer words are short bit-sequences and a common size is 32. These can be used to implement small sets of small numbers 0..31, with a few common set operations implemented in hardware. Joy uses this in its set type. But often it is necessary to have either much larger sets or sets of larger elements. Such a big set type can be implemented in various ways: as unordered lists, as ordered lists, as unbalanced trees or as balanced trees. Each implementation method has its advantages and disadvantages. The following implementation of big sets in terms of ordered lists has been adapted from {Bird and Wadler 1988 p 230 ff}.
The empty set is represented as an empty list, in this library it is
written as bs-new
:
LIBRA (* big sets *)
bs-new == [];
One very important binary set operation is union. The two
parameters are sorted lists, and the returned value also has to be a
sorted list. It would appear that the two lists should be simply merged.
But if they have an element in common, then the returned list would then
contain the element twice. However, in sets any element should occur at
most once. This consideration affects the default case, the last case of
the program list which is the parameter. The case occurs when the first
elements of the two parameter lists are equal. So in the definition of
bs-union
instead of saving and later restoring both, only
one is saved and later restored:
bs-union ==
[ [ [null] [pop] ]
[ [pop null] [swap pop] ]
[ [unswons2 <] [[uncons] dip] [cons] ]
[ [unswons2 >] [uncons swapd] [cons] ]
[ [rest [uncons] dip] [cons] ] ]
condlinrec;
The same situation arises for inserting or adding a new
member to a set. If the new member is already in the set, then it should
not be inserted again. So if the first member of the current list is
equal to the candidate new member, then the candidate is just popped off
in the third line below. In the definition of bs-insert
the
only recursion occurs in the last, the default case:
bs-insert ==
[ [ [pop null] [swons] ]
[ [[first] dip >] [swons] ]
[ [[first] dip =] [pop] ]
[ [[uncons] dip] [cons] ] ]
condlinrec;
The next operator tests for membership, so it must return a
truth value. If the list is null or its first element is
>
than the candidate, then false
is
returned. If the first element is =
to the candidate, then
true
is returned. In the default case, when the relation is
<
, the list has to be inspected recursively further
down. So the case must contain two programs to effect the recursion.
However, on the way back from the recursion, the last returned truth
value is the one to be used. Hence no further action is required, and
the second program is just []
. This is the definition of
bs-member
:
bs-member ==
[ [ [pop null] [pop2 false] ]
[ [[first] dip >] [pop2 false] ]
[ [[first] dip =] [pop2 true] ]
[ [[rest] dip] [] ] ]
condlinrec;
The same device is used in the default case of the definition of
bs-differ
for finding the difference between two
sets. As may be seen, there are two further recursive cases, for
<
and >
, and one of them uses the same
device again:
bs-differ ==
[ [ [null] [pop]]
[ [pop null] [pop pop []] ]
[ [unswons2 <] [[uncons] dip] [cons] ]
[ [unswons2 >] [rest] [] ]
[ [[rest] dip rest] [] ] ]
condlinrec;
The next definition is for bs-delete
, it
deletes a specified member from a set, if it is a member at
all. The only recursive case is the default case:
bs-delete ==
[ [ [pop null] [pop] ]
[ [[first] dip >] [pop] ]
[ [[first] dip =] [pop rest] ]
[ [[uncons] dip] [cons] ] ]
condlinrec.
The operations of inserting or deleting members into or from a set are essentially special cases of taking unions or differences with unitsets. So the following definitions might have been given instead of the earlier, more efficient definitions:
bs-insert == unitlist bs-union;
bs-delete == unitlist bs-differ;
A dictionary is a way of implementing finite functions as
argument-value pairs. A pair is best implemented in Joy as a two element
list. The totality of pairs is then essentially a big set, and any of
the ways of implementing these is suitable here. If the argument part of
pairs is subject to an ordering relation, the sets of pairs can be
implemented as lists ordered in accordance with the first element, the
argument of the pairs. Not surprisingly then, some of the code to follow
is reminiscent of code for qsort1
and merge1
.
The following is a library for the dictionary type. A new
dictionary is created by d-new
. A predicate
d-null
returns true
or false
according as the parameter dictionary is empty or not. New pairs are
added by d-add
, they are inserted in the correct place
based on the ordering of the first member of the pairs. The union or
difference of two dictionaries is given by the two binary operators
d-union
and d-differ
. A single pair is removed
by the binary operator d-rem
, it removes the pair whose
first member matches the given query parameter. Instead of a test for
membership there is a binary operator d-look
which extracts
the first pair whose first element matches the query.
Only the program for one of the operators will be developed here, the
program for d-union
:
To form the union of two dictionaries D1 and D2:
1 If D2 is empty, pop it off and return just D1
2 If D1 is empty, retain D2, pop D1 and return D2
3 Extract the first pairs from D1 and D2,
from both pairs compare their firsts with <
If the comparison is true,
below D2, uncons D1 into its first
and rest
recurse anonymously on the rest of D1 and D2
cons the saved first pair from D1
into the result
4 Extract the first pairs from D1 and D2,
from both pairs compare their firsts with >
If the comparison is true,
uncons D2, put its first below D2
recurse anonymously on D1 and the rest of D2
cons the saved first pair from D2
into the result
Default (the firsts of the first pairs of D1 and D2
are =):
uncons both D1 and D2 into their
first and rest,
recurse on the two rests to form their union,
cons the two saved firsts into the result.
In the default case both first pairs are retained, so that if one is deleted, the other one, which may well have a different second component, is still available.
As may be seen, the d-union
operator is very similar to
the bs-union
operator. The other three operators
d-differ
, d-look
and d-rem
are
similar to their counterparts for big sets. The entire library is the
following:
LIBRA (* dictionary *)
d_new == [];
d_null == null;
d_add ==
[ [ [pop null] [swons] ]
[ [[first] dip [first] unary2 >=] [swons] ] (* app2 *)
[ [[uncons] dip] [cons] ] ]
condlinrec;
d_union ==
[ [ [null] [pop] ]
[ [pop null] [popd] ]
[ [unswons2 [first] unary2 <] [[uncons] dip] [cons] ] (* app2 *)
[ [unswons2 [first] unary2 >] [uncons swapd] [cons] ] (* app2 *)
[ [uncons2] [cons cons] ] ]
condlinrec;
d_differ ==
[ [ [null] [pop]]
[ [pop null] [pop pop []] ]
[ [unswons2 [first] unary2 <] [[uncons] dip] [cons] ] (* app2 *)
[ [unswons2 [first] unary2 >] [rest] [] ] (* app2 *)
[ [[rest] dip rest] [] ] ]
condlinrec;
d_look == [dup] dip
[ [ [pop null] [pop pop "not found"] ]
[ [[first first] dip >] [pop pop "not found"] ]
[ [[first first] dip =] [pop first] ]
[ [[rest] dip] [] ] ]
condlinrec;
d_rem ==
[ [ [pop null] [pop] ]
[ [[first first] dip >] [pop] ]
[ [[first first] dip =] [pop rest] ]
[ [[uncons] dip] [cons] ] ]
condlinrec.
The definitions of big sets and dictionaries are part of the library file TYPLIB.JOY.
Apart from the aggregate types it is useful to have another type, the tree type. These are lists which can contain lists as members which might contain lists as members and so on. Formally define a leaf to be anything which is not a list. Then a tree is defined to be either a leaf or a list of trees. Sometimes one needs the concept of a proper tree – this is just a list of trees. Trees are similar to other aggregates, but since the tree datatype is recursive, a special treatment is generally needed.
Just as there is the step
combinator to step through the
elements of an aggregate, so there is a treestep
combinator
to step through the leaves of a tree. For example, the following are the
already familiar program for computing the sum of the numbers in an
aggregate and a similar program for computing the sum of the numbers in
a tree:
sum == 0 swap [+] step
treesum == 0 swap [+] treestep
In the same way, the following are a familiar program and a new one
for determining the size
of an aggregate and the
treesize
of a tree:
size == 0 swap [pop succ] step
treesize == 0 swap [pop succ] treestep
Similarly, the following are a familiar program and a new one for shunting members of an aggregate or a tree, respectively, into an initially empty list:
shunt == [swons] step
treeshunt == [swons] treestep
For the binary operator treeshunt
all leaves will appear
in the result list, but in reverse order.
A tree may be flattened completely, losing its entire internal
structure but retaining the order of the leaves by the unary operator
treeflatten
:
treeflatten == [] swap treeshunt reverse
From a given tree we can obtain the reverse list of its leaves by:
[] swap treeshunt
But this may not be what is wanted. To reverse the tree while
retaining its structure it is necessary to reverse the top level list,
reverse the second level lists, reverse the third level lists and so on.
For tasks such as this Joy has a ternary combinator
treegenrec
for general recursion through trees. It is used
like this:
[O1] [O2] [C] treerecgen
Here [O1]
must be a program applicable to leaves,
[O2]
must be an operator applicable to lists, and
[C]
must be a combinator applicable to lists with operators
such as [O2]
. Different choices of the three quotation
parameters yield surprisingly different operators for trees or
combinators applicable to trees. Using this combinator the unary
treereverse
operator is defined by:
treereverse == [] [reverse] [map] treegenrec
The same treegenrec
combinator can be used to define a
unary combinator treemap
which takes a tree and a quoted
program as parameters and returns a tree of the same structure but with
each leaf as modified by the program parameter:
treemap == [] [map] treegenrec
The same combinator can be used to define a unary combinator
treefilter
which expects a tree and a quoted predicate.
What is returned is a tree of the same structure but with only those
leaves which pass the test predicate:
treefilter == [] swap orlistfilter [map] treegenrec
The first portion, [] swap
just inserts the required
[O1]
which in this case does nothing. Following that is a
modification of the test predicate, to be explained presently. The rest
of the definition is familiar:
treefilter == [] swap orlistfilter [map] treegenrec
The [O2]
operator to be used here is constructed from
the test predicate [P]
by orlistfilter
, which
constructs:
[ [ [list] [P] disjoin ] filter ]
The orlistfilter
is defined in two steps:
orlist == [list] swap disjoin
orlistfilter == orlist [filter] cons
An operator to remove all leaves from a tree, but retaining its list
structure is treestrip
, defined as follows:
treestrip == [list] treefilter
Trees cannot have lists as leaves, but otherwise they are very
flexible. In particular they can be used as queues. The following is a
small collection of operations for manipulating trees when the focus is
only on their leaves. A new empty tree is generated by
t-new
. A new leaf or a whole tree of leaves is added to an
existing tree by the operator t-add
; it always ensures that
the tree is of a form suitable for the remaining operators. The tree
predicate t-null
tests whether the tree is empty. It first
has to prepare the tree by ensuring that it does not consist of lists of
lists and so on which ultimately only contain the empty list. Since this
is also required by two other operators, the preparing is done by a
hidden unary operator. Two other operators t-front
and
t-rem
produce, respectively, the first leaf together with
the remainder of the tree, or just the remainder of the tree after
removing the first leaf. Both operators first have to check that the
tree is non-empty; if it is, then an error is reported. A leaf or proper
tree can be turned into a suitable form by t-reset
.
The implementation is as follows. A proper tree is always a list, and
an empty tree starts off by t-new
as an empty list.
Anything can be added by t-add
to an existing tree, and
this has to ensure that the result has a suitable standard form. The
same is true for t-reset
which first makes a copy of an
existing tree. The other operators, t-null
,
t-front
and t-rem
all require the tree to be
in a suitable standard form. This is done by prepare
which
is defined using condlinrec
. If the tree is
null
, it is left as it is. If the first
is
null
, then the rest
is taken and
condlinrec
recurses. If the first
of the
first
is a list, then that is unswons
ed,
condlinrec
recurses and on return does nothing further. In
all other cases the tree is left as it is:
HIDE (* tree *)
error == "non-empty tree needed for" putchars putchars abort;
prepare == [ [ [null] [] ]
[ [first null] [rest] [] ]
[ [first first list] [[unswons] infra] [] ]
[ [] ] ]
condlinrec
IN
t-new == [];
t-reset == dup unitlist unitlist;
t-add == unitlist unitlist cons;
t-null == prepare
dup null;
t-front == prepare
[null]
["t-front\n" error]
[dup first first]
ifte;
t-rem == prepare
[null]
["t-rem\n" error]
[unswons unswons [swons] dip]
ifte
END
The definitions of trees is part of the library file TYPLIB.JOY.