by Manfred von Thun
Joy is a functional programming language which is not based on the application of functions to arguments but on the composition of functions. This paper compares and contrasts Joy with the theoretical basis of other functional formalisms and the programming languages based on them. One group comprises the lambda calculus and the programming languages Lisp, ML and Miranda. Another comprises combinatory logic and the language FP by Backus. A third comprises Cartesian closed categories. The paper concludes that Joy is significantly different from any of these formalisms and programming languages.
This paper outlines the principal similarities and differences between Joy and other high-level and low-level functional languages.
The best known functional languages are the lambda calculus and, based on it, the programming languages Lisp and its descendants. All of them rely heavily on two operations, abstraction and application, which are in some sense inverses of each other. Abstraction binds free variables in an expression, and it yields a function which is a first class value. The bound variables are the formal parameters of the function, and, importantly, they are named. Application of an abstracted function to some actual parameters can be understood as resulting in a substitution of actual for formal parameters and then evaluation of the modified expression. More efficiently, application can be implemented using an environment of name-value pairs. The lambda calculus does not need definitions, but all functional programming languages allow them as a matter of convenience. Definitions also use named formal parameters, and in applications these have to be substituted or an environment has to be maintained.
Two other functional languages are the combinatory logic of Curry and the FP system of Backus. They are not based on the lambda calculus; they eliminate abstraction completely and hence do not have to deal with substitution and environments. As a result these languages can be manipulated using simple algebraic techniques. But like the lambda calculus and the languages derived from it, both are based on the application of functions to arguments. However, application does not have attractive algebraic properties, and hence there is no theoretical reason for preferring one concrete notation over another.
The languages of category theory comprise another group of functional languages. Whereas the other functional languages use function application, these use function composition. No high-level programming language has been based on this formalism, but it has been used as a low-level machine language as a target for compilation from a (typed) lambda calculus source. Joy is a high-level programming language which resembles the categorical languages more than it resembles any of the other functional languages.
The remainder of this paper is organised as follows: The next section gives an overview of the lambda calculus and of the programming languages that are based on it. The section after that describes combinatory logic which eliminates lambda abstraction. Following that is a section on the FP language of Backus. The next section then outlines the language of categories in which even function application is eliminated. In each of these sections the principal difference between these systems and Joy is outlined. There is also a final section on programming in the large; most structuring devices could be adapted to Joy in the future.
For the present purposes it is sufficient to consider only the abstract syntax of the lambda calculus. A lambda expression is built from variables and constants by two constructions, abstraction and application. Constants are values such as numbers or lists, or they are functions such as addition and concatenation. In an expression an occurrence of a variable can be free or bound. Specifically, in an expression consisting of just a variable, that variable occurs free. In an expression consisting just of a constant there are no variables at all. An expression which is a lambda abstraction consists of a variable and a body which is an expression. The abstraction construction binds any free occurrences of that variable in the body. An expression which is an application consists of two expressions, the function and the argument (or actual parameter). Occurrences of free and bound variables in the function and the argument are also free and bound in the applicative expression.
Functions of several variables can be abstracted by successive abstractions, and they can be applied to several arguments by successive applications. Each application returns a function of one less argument than the original. This device of replacing standard functions by their curried form allows the theory to concentrate exclusively on functions of one argument.
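For example, writing add for a curried addition function, add applied to one argument yields another function:
(add 3) 4  =  3 + 4  =  7
Here add 3 is itself a function, namely the function which adds 3 to whatever argument it receives.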
Incidentally, a similar effect is achieved in Joy not by application but by composition. The program:
3 +
denotes the composition of two functions from stacks to stacks. The first pushes the number 3, the second adds two numbers on top of the stack. The entire program denotes a function which adds 3 to the top number of the stack. The space between the 3 and the + is not application written in reverse but composition.
The lambda calculus uses several syntactic operations called reductions. The most important of these is beta reduction of applicative expressions. If the function is an abstraction then the applicative expression reduces to the body of the function but with all free occurrences of the abstracted variable replaced by the argument of the application. Beta reduction corresponds exactly to calling a function in the terminology of programming languages. The difference with Joy is already apparent here, since Joy does not have any variables to be used in abstractions and beta reduction does not occur in Joy.
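For example, a single beta reduction step substitutes the argument for the bound variable:
(λx. x + 1) 3  →  3 + 1
The left hand side applies an abstraction to the argument 3; the right hand side is the body with 3 substituted for x, and it evaluates to 4.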
Abstraction creates anonymous functions, and the pure lambda calculus does not have any facility for defining functions. In particular, it cannot be used to give recursive definitions of functions. But to compute recursive functions it is possible to introduce a single device, the Y combinator. This might have been defined recursively if that were possible, or it can be provided as one of the constants. However, it also turns out to be equivalent to a particular lambda expression.
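The Y combinator can be given as the lambda expression:
Y  =  λf. (λx. f (x x)) (λx. f (x x))
It satisfies the equation Y f = f (Y f), so Y f is a fixed point of f, which is exactly what is needed to unfold a recursive definition.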
The simple lambda calculus can therefore be used to compute all recursive functions, and hence to compute any function that can be computed by a Turing machine. Even constants are not really necessary, since truth values, numerals, list operations and the like can be expressed as particular lambda expressions. It comes as a surprise that all computable functions can be expressed in the lambda calculus with just variables, abstraction and application, and can then be computed by reduction. However, any efficient implementation will need constants, and all practical programming languages based on the lambda calculus provide them.
The lambda calculus can be extended with simple let-expressions and recursive letrec-expressions and with definitions. The additions make programming significantly easier, and this is approximately the level of the core of (pure) Lisp and its earlier descendants.
Other extensions are pattern matching of formal and actual parameters, and static but polymorphic type checking. The best known functional programming languages that have these features are ML and Miranda. Being descendants of Lisp and ultimately the lambda calculus, they are also based on abstraction and application.
Peyton Jones (1987 Chapter 2) contains a good exposition of the lambda calculus, including many extensions. The survey paper Hudak (1989) compares many features of different functional languages, with a minor emphasis on Haskell. A very elegant general introduction to modern functional programming in a non-Lisp language can be found in Hughes (1990). A recent introduction to Miranda is in Turner (1990).
A notable variation on the lambda calculus is described in Cartwright (1991). Normally the binding operators (such as lambda) are special forms rather than operators in a semantic algebra. Here lambda is taken to be a true function; the universe of models is enlarged to include environments, and variables are interpreted as selector functions mapping environments to values.
All languages mentioned in this section were based on application and abstraction. By contrast, Joy uses neither of these; instead it is based on composition and quotation. Brus et al (1987 p 364) write: "if one wants to have a computational model for functional languages which is also close to their implementations, pure lambda calculus is not the obvious choice anymore". They present the language Clean in which programs consist of rewrite rules (for graphs) using pattern matching extensively. The implementation uses the rewrite rules more or less directly. Joy accepts the spirit of the above quotation, but draws a very different consequence.
Robinson (1969 p 125) remarked: "whatever can be expressed in a language based on application and abstraction as fundamental notions can be expressed in a far simpler language based on application alone." The simpler language is combinatory logic. It is not a way of doing logic in a combinatory way, but it deals with the logic of combinators which denote higher order functions. The key idea came from Schönfinkel (1924) but was greatly expanded by Curry. The classic reference, Curry and Feys (1958), uses the same notation as is used today. A recent short exposition of the basic combinators is given for example in Henson (1987).
The calculus of combinators can be understood without reference to its connection with the lambda calculus, as indeed it is done in many expositions. But for the present purposes it is best to keep in mind the goal of eliminating abstraction from the lambda calculus while retaining Turing completeness.
Abstraction is a construction in the object language, the lambda calculus. In combinatory logic this construction is replaced by an operation in the metalanguage. This new operation is called bracket abstraction. It takes an object language variable and an object language expression as arguments and it yields a new object language expression as value. The new expression will contain some function symbols specific to combinatory logic. If all object language lambda abstractions are removed from a lambda expression by this process of metalanguage bracket abstraction, then the final result will be equivalent to the original expression. So this process should be seen as a compilation. Since all lambda calculus expressions can be compiled in this manner, the language of combinators is again Turing complete.
The astonishing feature of this compilation is that it only needs two new function constants or combinators. However, to understand the rationale, it is best to start with three combinators. The three arise naturally from considering the cases of lambda expressions on which bracket abstraction with respect to a variable is to be performed. These will be lambda expressions without lambda abstractions, so they are just variables or applications.
Let x be the variable to be abstracted. 1) If the expression is the same variable x, then the bracket abstraction should give the identity function which just returns its argument. So the unary I combinator is introduced as the translation; it will receive its argument only when the abstracted expression is applied. 2) If the expression is a different variable y, then the abstraction should give a constant function which ignores its argument x and just returns y. A binary K combinator is introduced and given the argument y. The translation will receive its second argument only when the abstracted expression is applied, and then it will ignore that argument. 3) If the expression is an application of a function f to an argument g, then both the function and the argument first have to be abstracted with respect to x. The final bracket abstraction will be applied to an argument, and then it has to make this argument available to both subabstracts. A ternary S combinator is introduced which does just that. It is given as arguments the two subabstracts from f and g. The translation will receive its third argument only when the translation is applied to its argument. At that point it will supply that argument to the translations of f and g and then apply the result from f to the result from g.
These are the three principal combinators arising naturally. It so happens that the I combinator can actually be defined in terms of the other two. So any lambda calculus expression can be translated into a combinatory expression in which there are no variables but just two combinators K and S. Since lambda calculus even without constants is Turing complete, combinatory logic with just K and S and no other constants is also Turing complete. This is all the more surprising since an expression consisting exclusively of K and S is really just a tree in which the leaves hold only one bit.
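The reduction rules of the three combinators, and the clauses of bracket abstraction (written [x]E here for abstracting x from E), can be summarised as:
I x  =  x
K y x  =  y
S f g x  =  (f x) (g x)
[x]x  =  I
[x]y  =  K y
[x](E1 E2)  =  S ([x]E1) ([x]E2)
The definability of I is then just the equation I = S K K, since S K K x = (K x) (K x) = x.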
The simple compilation scheme yields translations whose length can be exponential in the length of the source expression. Optimisations have been known for a long time which produce translations of only quadratic length. These optimisations use further combinators that are special cases of the S combinator. But the size of the translation result was still prohibitive for any but the smallest lambda expressions.
Turner (1979) introduced some optimisations into the standard translator from lambda calculus to combinator notation. With these optimisations the size of the combinatory code was kept within an acceptable limit. The interpreter for the combinatory code used normal graph reduction, one form of lazy implementation in which actual parameters are evaluated only once. Turner’s implementation method using combinators has been used to build a hardware reduction machine, the CURRY chip, see Ramsdell (1986).
Peyton Jones (1987 Chapter 16) contains a good exposition on the translation from the lambda calculus to combinators and many details of the implementation of Miranda. Hindley and Seldin (1986) provide a very complete parallel account of the lambda calculus, combinatory logic and their relationship. Robinson (1969) shows how the language of Schönfinkel and Curry can be used in the mechanisation of theorem proving in higher order logic. Fradet and Le Métayer (1989) show how to compile lambda calculus into conventional machine code. Fradet (1991) uses what are described as low level indexed combinators as a target language to implement a lambda calculus language. Expressions using these combinators lend themselves to rewriting techniques, including optimisations. Impressive execution times are reported. The target code is again not intended to be read by human users.
The variables of the lambda calculus are very similar to the variables of predicate logic. The notions of free and bound variables are essentially the same in both fields, and so is the operation of substituting a term for a free variable. Henkin, Monk and Tarski (1971) show how simultaneous substitution for variables can be eliminated in terms of the identity relation. The idea is that every formula which uses variables in some arbitrary order is replaced by another formula in which all variables occur exactly once and in strictly alphabetical order. The replacement formula will typically contain more variables than the original. The replacement formula is then to be conjoined with several identity sentences of the form x=y. It can be shown that the resulting conjunction is logically equivalent to the original formula. It may be that this idea can be adapted to the lambda calculus. As an implementation technique it would amount to something like this: a lambda term is replaced by another term using consecutive variables, and the replacement is associated with some identity statements which guide the values. There may be some connection with the director strings used in some implementations of functional languages (see for example Peyton Jones (1987 p 274)). But it is unlikely that a high level programming language could be designed which uses these principles.
The difference between combinatory logic and Joy is best explained by a simple example. To multiply two numbers, say 2 and 3, in Joy one writes:
2 3 *
In combinatory logic one writes:
* 2 3
and there seems to be no significant difference between the two notations apart from the order of operators and operands. But this is deceptive. In combinatory logic a two-argument function like multiplication is understood to be curried. The binary * function is first applied to 2, yielding a function which doubles its argument. That function is then applied to 3, yielding the result 6. Fully parenthesised the expression is:
(* 2) 3
However, the convention is that application associates to the left, so the parentheses are not needed.
To compute the square of a number, say 3, it has to be multiplied by itself. In combinatory logic and in Joy one can write, respectively:
* 3 3
3 3 *
But in both notations it is possible to modify the binary multiplication function to turn it into the unary squaring function. In combinatory logic the W combinator can be applied to a function which then duplicates the (first) argument of the function. It is defined by:
(W f) x = (f x) x
Again the parentheses are not needed. So the square of 3 is given by:
(W *) 3
In Joy the simplest way to compute the square of 3 is by:
3 dup *
To facilitate the comparison between the two languages it is also possible to define a W combinator in Joy:
w == [dup] dip i
Then the square of 3 is also computed in Joy by:
3 [*] w
In both languages one can introduce a mapping combinator to apply a function to a list. In the examples to follow the list will be just [1 2 3 4]. In combinatory logic one might define a Map combinator by:
Map f [] = []
Map f [X | Xs] = [f X | Map f Xs]
where the bar | separates the first element of the list from the rest. Then the list of squares of the given list is computed by:
Map (W *) [1 2 3 4]
Note that the parentheses around (W *) are necessary.
The same computation is expressed in Joy by:
[1 2 3 4] [[*] w] map
Superficially one version is just the reverse of the other. Combinatory logic uses prefix notation, and Joy uses what looks like postfix notation.
But the apparent similarity is deceptive. To see this it will help to write both versions with the hidden operators made explicit. In combinatory logic the hidden binary operator is application of a function to an argument, which might be written explicitly as infix "@". Fully parenthesised the combinatory version thus is:
(Map @ (W @ *)) @ [1 2 3 4]
In Joy the hidden binary operator is composition of functions, which might be written explicitly as infix ".". The Joy version thus is:
[1 2 3 4] . [[*] . w] . map
There are as many compositions in the Joy version as there are applications in the combinatory logic version. Since composition is associative, it does not matter how the expression is parenthesised.
Because of associativity the following is also meaningful in Joy:
[1 2 3 4] . [[*] . w]
It denotes a function which pushes two items, a list and a quotation, onto the stack. By contrast its combinatory counterpart:
(W @ *) @ [1 2 3 4]
is not meaningful. This is because the squaring function on the left, (W @ *), expects to be applied to a number on the right, and not a list.
Another way of noting the difference between combinatory logic and Joy is in the following equations, here again with application and composition left implicit:
Map (W *) [1 2 3 4] = [1 4 9 16]
[1 2 3 4] [[*] w] map == [1 4 9 16]
The combinatory logic version denotes the identity of objects, in this case lists. The Joy version denotes the identity of functions, in this case functions which, when applied to a stack, will push a list. Stacks are the arguments to which all Joy functions are applied, but this application plays no role in the construction of programs. By contrast, application is the principal program constructor in combinatory logic, even if the application operator is left implicit.
Wald develops a theory of 'unary pairfunctions' with primitives L, S, D and B satisfying:
L(<a,b>) = a
S(<a,b>) = <b,a>
D(a) = <a,a>
B(<a,<b,c>>) = <<a,b>,c>
He gives a finite presentation (69 axioms) of a semigroup of such functions under composition. The theory is not intended as the basis of an implementation, but it would appear that there are some connections with Joy that are worth exploring.
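One possible connection can be sketched in Joy itself. If pairs are represented as two-element lists, then the four primitives might be defined as follows (the names lw, sw, dw and bw are hypothetical, chosen here only for the comparison):
lw == first                                        (* L(<a,b>) = a *)
sw == reverse                                      (* S(<a,b>) = <b,a> *)
dw == dup [] cons cons                             (* D(a) = <a,a> *)
bw == uncons first uncons [[] cons cons] dip cons  (* B(<a,<b,c>>) = <<a,b>,c> *)
For example, [1 [2 3]] bw yields [[1 2] 3]. Under this reading, composition of Wald's functions is simply concatenation of Joy programs.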
In his Turing award lecture Backus (1978) introduced his FP system, short for "Functional Programming system". The system is not about programming using functions, as Lisp and its descendants are, but about programming with functionals, also known as higher order functions or combinators or, in his terminology, functional forms.
Backus builds his FP systems on three kinds of entities. Firstly, there are objects. These are built recursively from atomic objects such as truth values and numbers, and the only constructor is that of forming sequences or lists of objects. Secondly there are primitive functions. These comprise the usual arithmetic operations and relations and several powerful operations on lists. Importantly, all primitive functions are technically unary, since a function requiring several arguments is given a single list of those arguments. Furthermore, all functions are first order. Thirdly there are functional forms, and these are the essential novelty of the system. They are second order functions used to build more complex functions from simpler ones. Since all primitive functions are unary, and the combining forms preserve this property, all functions in the system are unary. Combining forms, however, can have several functions as parameters. In detail, the combining forms are as follows.
The composition form requires two functions. The resulting function is that function which applied to its argument always gives the same value as applying first the one function and then the other. The conditional form takes three functions as parameters, an if-function, a then-function and an else-function. The if-function is applied to the argument; if that yields true the value returned is that given by applying the then-function, otherwise it is that of applying the else-function. The construction form takes as single parameter a list of functions. Applied to one argument the resulting function returns a list of values, each obtained by applying the functions to the argument. The apply-to-all form takes a single function as parameter; applied to a list argument the resulting function returns the list obtained by applying that function to each member, essentially the familiar map. The insert form, sometimes called reduce or fold, takes a single binary function as parameter; applied to a list the resulting function returns the value obtained by inserting the function between the members of the list. The constant form takes as parameter an object (considered as a function of no arguments). The resulting unary function ignores its argument and always returns the parameter object. There is no facility for user-defined forms; Backus held the view that this would lead to obscure programs. However, new functions may be defined, even recursively.
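The flavour of these forms is conveyed by the best known definition from Backus's lecture, the inner product of two vectors, which in his notation reads approximately:
Def IP  ≡  (/+) ∘ (α×) ∘ Trans
Trans transposes the pair of vectors into a list of pairs, the apply-to-all form α× multiplies each pair, and the insert form /+ sums the products; no argument variable is ever mentioned.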
The combining forms as operations on unary functions constitute a rich but unfamiliar algebra. Importantly, the arguments of the functions do not play any role at all. Backus gives an elaborate axiomatisation of the algebra; in Williams (1982) a smaller version is given comprising just 11 axioms. Two axioms deal with the interplay between composition and conditional, two deal with composition, construction and insert, and one deals with just composition and construction. Two deal with construction and indexing into a list. Another concerns nested conditionals with the same if-function. Two deal with the append-left function (elsewhere known as cons) and the apply-to-all form. A final one deals with composition and constant. As may be seen, each combining form has a significant relationship with at least some other combining form.
The FP system is further explained and expanded in Williams (1982). A very useful exposition of the FP systems is found in Henson (1987 Chapter 5). The book also gives a very extensive bibliography. For a good exposition of the relation between the lambda calculus, combinatory logic and the FP systems of Backus see Revesz (1988 section 5.3). Givler and Kieburtz (1984) present methods for automatically and reliably transforming clear and correct but possibly inefficient FP programs into possibly obscure but efficient equivalent programs. Bellegarde (1984) presents a set of convergent rewriting rules for the algebra of FP programs but without conditionals. Whereas FP is a strict language, Dosch and Möller (1984) describe the algebraic semantics of a lenient variant of FP allowing infinite objects and using both busy and lazy evaluation. Sheeran (1984) uses a variant of FP as a VLSI design language for describing the semantics and physical layout of hardware. For a critique of the FP systems, see Harland (1984 section 18.4).
A recent descendant of the FP system by Backus is the FL language described in Backus, Williams and Wimmers (1990). Another variant is the language GRAAL which implements ("infinite") streams using call-by-need; it is described in Bellot and Robinet (1985).
In FP there are three kinds of semantic entities, the objects, the functions and the combining forms. They correspond fairly well to three kinds of functions in Joy: those denoted by literals, by operators and by combinators. But the Joy functions are all of the same kind, they are functions taking one stack as argument and giving a new stack as value. In FP combining forms are applied to functions and the resulting functions are applied to objects. In Joy there is no application of functions to arguments at all, there is just composition of functions. In FP the function parameters of combining forms cannot depend on any objects supplied as arguments to functions. In Joy the quotation parameters of combinators can be manipulated at run time. Hence it is possible to call constructed programs which have been built on the fly.
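A small illustration of such run-time construction, using only standard Joy words:
3 [dup] [*] concat i
Here concat joins the two quotations into the new program [dup *], and the combinator i then executes this freshly constructed squaring program on the 3 below, leaving 9. The program that is run never appears as such in the source text.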
In his Turing award lecture Backus also introduced another language, the FFP system, short for "Formal Functional Programming system". In addition to objects as in FP there are now explicit expressions. In addition to the list-forming constructor as in FP there is now a new binary constructor to form applications consisting of an operator and an operand. Operator expressions which are atoms of course denote functions which can be applied to an argument. Operator expressions which are lists must have as their first element an expression denoting a function. When such an expression is applied to an argument, the function is applied to a pair consisting of the original list and the argument.
This last rule, metacomposition, is immensely powerful. It can be used to define arbitrary new functional forms, including of course the fixed forms from FP. The rule also makes it possible to compute recursive functions without a recursive definition. This is because in the application the function is applied to a pair which includes the original list operand which in turn contains as its first element the expression denoting the very same function. The method is considerably simpler than the use of the Y combinator. Williams (1982) extends the method to mutually recursive functions, even ones that are not primitive recursive.
Joy is in fact closer to FFP than any of the languages mentioned so far. Both replace abstraction by other mechanisms, both rely heavily on higher order functions, and both obey the principle that program = data. Both permit construction of first order and higher order recursive and non-recursive functions without ever using named formal parameters. An effect similar to metacomposition is achieved in Joy with the combinator x, which expects a quoted program on top of the stack and executes it without popping it off.
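For example, factorial can be computed without any recursive definition; the quotation passed to x receives itself on top of the stack on every cycle (a sketch using standard Joy combinators):
5 [ [pop null] [pop pop 1] [[dup pred] dip x *] ifte ] x
The if-part pops the quotation and tests whether the number below it is zero; the then-part replaces quotation and number by 1; the else-part duplicates and decrements the number underneath the quotation, re-executes the quotation with x, and multiplies. The result left on the stack is 120.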
One important difference is that FFP still uses application as an essential operation, whereas Joy uses composition. It appears that this makes the algebra of Joy considerably simpler.
Meertens (1989 p 72) speaks of "the need of a suitable system of combinators for making functions out of component functions without introducing extra names in the process. Composition should be the major method, and not application."
Meertens (1989 p 71) writes "The basic problem is that the basic operation of the classical combinator calculus (and also of the closely related lambda calculus) is application instead of composition. Application has not a single property. Function composition is associative and has an identity element (if one believes in the 'generic' identity function)." He develops a system of combining functions that is more suitable to formal manipulation than the classical combinators. It is worth noting that in monads the monadic composition operator is associative.
A category consists of a collection of objects and for any two objects a collection of morphisms, each having the one object as their source and the other object as their target. For any single object, the morphisms must include an identity morphism with that single object as source and target. For any object and two morphisms having it as source and target respectively, there must be a composite morphism having as source the source of one morphism and as target the target of the other. This composition of morphisms must be associative, with identity morphisms as left and right unit elements.
An object is a terminal object in a category if for each object as source there is exactly one morphism with that object as target. An object is a product object of two given objects if there are two special projection morphisms having the product as source. For any arbitrary morphism having an arbitrary object as source and either of the two given objects as target there must be a corresponding morphism having the same arbitrary object as source and the product object as target. That arbitrary morphism must then be the composition of that corresponding morphism and the appropriate projection morphism.
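In equations: writing p1 and p2 for the projections of the product A × B, every pair of morphisms f: C -> A and g: C -> B determines exactly one morphism <f,g>: C -> A × B such that:
p1 ∘ <f,g>  =  f
p2 ∘ <f,g>  =  g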
In a category with products there may also be exponential objects of a given source object and a given target object. Such an exponential object must have a special evaluation morphism having the product of the given source object and the exponential object as source and the given target object as target. For any arbitrary morphism having the product of the source object and an arbitrary object as source there must be exactly one corresponding ("curried") morphism. That arbitrary morphism must then be (essentially) the composition of the corresponding morphism and the evaluation morphism.
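In equations: writing eval: A × (A=>B) -> B for the evaluation morphism of the exponential A=>B, every morphism f: A × C -> B determines exactly one curried morphism curry(f): C -> (A=>B) such that:
eval ∘ (id × curry(f))  =  f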
A Cartesian closed category is one which has a terminal object, and for any two objects their product and exponential, together with their projection and evaluation morphisms. In the category of sets, products are Cartesian products, exponentials are functions from sets to sets, and evaluation morphisms are the application of a function to a value. In the category of logical systems, products are conjunctions, exponentials are conditionals, projections are and-elimination rules and evaluation morphisms are the modus ponens rule.
The language of categories is another functional language. If it has products, then it can deal with functions of several variables. If it has exponentials, then functions are "first class citizens". The language is therefore an alternative to the (typed) lambda calculus and to combinatory logic. Whereas the lambda calculus needs variables, the combinatory language and the categorical language do not.
Cartesian closed categories are explained for example in Barr and Wells (1990 Chapter 6) and in Poigne (1992). Barr and Wells give an example of a simple lambda expression with variables and a complicated looking categorical equivalent formula. They suggest an acceptable reformulation of the categorical formula. Both categorical versions essentially replace occurrences of variables by use of projection functions.
Could the language of categories be used for writing programs? Any lambda expression can be translated into a categorical expression, so the language of categories is expressively complete. But this does not make it a suitable language for writing programs. As it stands it is a very low-level language.
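For instance, in the style of categorical combinators, variables compile into chains of projections into the environment; under one common scheme (a sketch):
λx. x      becomes   Λ(snd)
λx. λy. x  becomes   Λ(Λ(snd ∘ fst))
All variable names are gone, but so is most of the readability.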
On the other hand, category theory has given rise to another model of computation: the CAM or Categorical Abstract Machine, described in Cousineau et al (1985), Cousineau et al (1987) and in Curien (1986). The machine language of the CAM is very elegant, but programs in that language look as inscrutable as low level programs in other machine languages. The language is of course suitable as the target language for compilation from any functional language. A very compact but comprehensive exposition of a compiler from (a mini-) ML to CAM is described in Clement et al (1986). Mauny and Suárez (1986) describe compilers from a strict and a nonstrict version of ML to an eager and a lazy version of the CAM.
The original translation from lambda expressions to categorical combinators was quadratic in the worst case, but Lins (1987) introduces a linear translation to a simplified abstract machine code. Hannan (1991) uses a variant of the CAM for generating more concrete code suitable for register-level implementation or even micro-coding on conventional architectures. An extension of ML for data-parallel computation has been implemented by Hains and Foisy (1993) to run on a distributed version of the CAM.
Combinatory languages should be seen as abstract machine languages. In contrast, Joy was designed to be a high level language to be used by the human programmer.
In all high level languages a program consists of a possibly large collection of definitions and a comparatively small main program. The collection of definitions and their interrelations can become very difficult to comprehend and maintain. Many languages provide some mechanisms for giving additional structure to the definitions of functions and their interrelations. One kind of interrelation is due to their mutual calling patterns. The main program calls functions at level one, and these call each other or functions at level two, and so on. A second structure that can be exploited is that due to common types. For example, functions which concatenate two strings and reverse a string will not call each other but belong together in an implementation of strings. A third possible device only makes sense in imperative languages because procedures have additional interrelations due to global assignable variables which might be written by one procedure and read by another.
One of the simplest but very powerful structuring mechanisms is block structure. It was already used in the earliest Lisp and in Algol 60, and later in Pascal. It has been retained in almost all their descendants. A block consists of any number of definitions followed by a body, and a definition consists of a header and a block. So definitions can contain local definitions and so on, with no intrinsic limit to the levels of nesting. Hence any definition can at least provide some hiding of information that is not needed outside. This hiding could be used even in cases where what is defined does not take any parameters at all. (Thus it could be used in context-free grammars, but apparently never is.) More importantly, if there are parameters in the header of a definition, then the bodies of any enclosed definition can access those parameters. Additionally, for an imperative language, if there are assignable variables in a block, then the bodies of any enclosed block can access these variables. Access to non-local parameters and variables is automatic via a static chain or a display. This mechanism achieves what otherwise would have to be handled by explicitly passing them as parameters.
Joy currently does not have block structure, but it would be easy to implement. Since Joy has no formal parameters and no assignable variables, block structure would provide only the benefit of information hiding.
The popular imperative C language does not have block structure in this sense, since function definitions cannot be nested, but it does address the problem of hiding information about assignable variables from function bodies that do not need this information. One such mechanism is that of own variables, or internal static variables in C terminology. Such a variable can be declared, initialised and used within a single function body and not be visible outside. The value of the variable persists between successive calls of the function. The other mechanism is that of independent compilation units. These are just files containing declarations of global variables and functions without a main program. The variables can be made invisible from outside their unit. A complete program then consists of very few variables visible everywhere, and several files each containing a collection of variables and procedures that belong together.
In Joy the first mechanism does not even make sense, because Joy as a functional language does not have any assignable variables. The second mechanism could still allow related functions to be kept together in one file. The current implementation of Joy does not use a single input file but a stack of such files. A new file can be opened and become the current input file by a directive. But no information hiding occurs through this mechanism; there is only one global namespace.
Independent compilation units in files are not without problems. One criticism is that in order to achieve independence, programs have to be spread over too many files. Another concerns security, since a simple compiler does not check type conformity between the formal parameters declared in one unit and the actual parameters in the calls of another unit. A third criticism is that such units cannot be nested. The language Modula2 overcomes these problems. A collection of declarations and definitions can be wrapped up inside a module. A single file can contain several modules. In addition to a detailed implementation module there is a short definition module containing only type information, especially about formal parameters. This interface is used to check type conformity with calls from another module. Finally, modules can be nested. A module specifies explicitly which of its identifiers are to be exported and made visible to the outside. Any others remain hidden. But the total number of identifiers exported from various modules in a program can still be very large. Moreover, if different modules were written by different programmers, the exported identifiers might clash. To avoid this problem, Modula2 allows use of qualified names which are similar to record notation. Such names consist of the name of the module together with the name of the exported identifier.
Structuring devices similar to modules would benefit any language for programming in the large. Because Joy is so weakly typed, definition modules would be almost pointless. However, implementation modules with selective export would hide utility functions of a module that are not needed outside the module. Qualified names would then be as in Modula2. A crude substitute for a hiding facility and for qualified names is already used in the Joy libraries: hidden utility functions are given distinctive names that mark them as internal.
Another structuring device is the class, which first occurred in Simula, an early descendant of Algol 60. Classes are a generalisation of record types, but in addition to fields for assignable variables there are fields for procedures and functions. Instances of a class type are called objects; they consist of instance-specific assignable variables of their type. From the outside objects are manipulated only by the procedures and functions of their class. This is the essence of object oriented programming: it emphasises hiding of implementation details so that only abstract data types are visible from the outside. Two other notions are inheritance of properties of objects from other objects, and polymorphism, the ability of functions and procedures to be re-used with actual parameters of different types. The C++ language has made these concepts extremely popular. It is being used to build extensive libraries containing re-usable components.
Since Joy does not have any assignable variables, the notions of class and object are not applicable. However, hiding, inheritance and polymorphism would still be useful. Many operators in Joy are already polymorphic, and it is possible to write libraries which retain or even extend this property.
One of the most sophisticated devices for structuring programs is to be found in the language ML. For a good introduction, see Paulson (1992 Chapter 7) and Sokolowski (1991 Chapters 8 and 9). A signature is very much like a definition module in Modula2: it specifies parameter types and result types of functions, but it does not specify their bodies. A structure is like an implementation module: it specifies the bodies of functions but does not hide implementation details. Real hiding and hence real abstraction is provided by functors, inspired by category theory. In their simplest form they simply hide implementation detail. But they can also take one or more other structures as formal parameters and produce other structures as value. The bodies of their functions can access all the functions in the formal structure parameters. Functors must be instantiated with actual structures before they can be used, and they can be instantiated several times with different actual parameters. Instantiation of functors thus resembles calling a function with actual parameters, but, like the generics of Ada, it occurs at compile time. However, the functors of ML are far more powerful than the generics of Ada. Functors are again structuring devices for programming in the large. They could be used not just for functional languages but equally well for procedural, logical or actor languages. A mathematically very mature treatment of modules in terms of category theory is given in Ehrig and Mahr (1990).