Lambda Calculus vs. Combinatory Logic vs. Joy

– a very very gentle introduction (honest! I promise!)

by Manfred von Thun

Joy is a high-level purely functional programming language which is based not on the application of functions to several arguments but on the composition of unary functions. This paper introduces the motivation for Joy by contrasting it with two other paradigms of functional programming: the lambda calculus and combinatory logic. The exposition is intended to be self-contained and does not presuppose prior acquaintance with the field.

Keywords: functional languages, lambda calculus, elimination of variables, combinatory logic, elimination of function application, category theory, function composition, high-level programming languages.


Introduction – functional languages

All natural languages and most artificial languages contain as a component a functional language. This allows expressions to be built up from individual symbols together with functional symbols. In appropriate interpretations the expressions have a value which is an individual. Here are some examples of expressions on the left together with their values after the --> arrow:

    The capital of France                               -->  Paris
    The country between Canada and Mexico               -->  U.S.A.
    The sum of two and three                            -->  five
    or:  2 + 3                                          -->  5

Even statements can be considered to belong here, provided we take predicates to be functions which yield a truth value:

    Siberia is cold                                     -->  true
    Italy is bigger than China                          -->  false

In programming languages lists are used extensively. In one notation the elements are written sequentially, enclosed in, say, square brackets [ and ]. Here are some functions for lists:

    The reverse of  [a b c]                             -->  [c b a]
    The first of  [a b c]                               -->  a
    The rest of  [a b c]                                -->  [b c]
    The concatenation of [a b c] and [1 2 3]            -->  [a b c 1 2 3]

It is highly desirable to have a uniform notation. In one common convention the function symbol is written first, followed by a parenthesised sequence of arguments or parameters separated by commas. Some of the earlier examples might be written:

    capital(France)                                     -->  Paris
    sum(2,3)                                            -->  5
    cold(Siberia)                                       -->  true
    rest([a b c])                                       -->  [b c]
    concat([a b c],[1 2 3])                             -->  [a b c 1 2 3]

Functional expressions can be nested, and this includes statements:

    capital(between(Canada,Mexico))                     -->  Washington
    product(sum(2,3),4)                                 -->  20
    first(rest([a b c]))                                -->  b
    conj(bigger(Italy,China),cold(Siberia))             -->  false

New functions can be introduced by definitions:

    maternal-grandfather(x)     =  father(mother(x))
    square(x)                   =  product(x,x)
    vixen(x)                    =  conj(fox(x),female(x))
    second(x)                   =  first(rest(x))

All the functions considered so far were first order functions. They take individuals as parameters. Sometimes one needs higher order functions, sometimes called functionals or combinators. These can take other functions as parameters. One example is the twice function which takes as a parameter a unary function and yields as a result another unary function. When the result is applied to something, the effect is the same as applying the original function two times. For example one might want to say that the paternal grandfather of William is Philip. Here is one attempt:

    twice(father(x))(William)                           -->  Philip

But this does not make sense because the expression father(x) firstly contains a free variable x, and secondly at best denotes an individual, the father of x. Something else is needed.

Another example of a second order function is the map function which takes as parameters a unary function and a list. It yields as result a list of the same length, obtained by applying the function to each member of the given list. Here is an attempt:

    map(square(x),[1 2 3])                              -->  [1 4 9]

Again the notation does not make sense because of the free x. To handle higher order functions a different mechanism is needed, but what?

The lambda calculus

In set theory the many predicates of languages are replaced by expressions denoting sets. A few new formal predicates are introduced, principally one for membership. If Y is a set, then the sentence "x is a member of Y" is generally written x ∈ Y. Expressions can be set abstractions such as:

    {x : cold(x)}
    {<x,y> : bigger(x,y)}

If these two sets are called C and B, then the two statements Siberia ∈ C and <China,Italy> ∈ B are true.

In the lambda calculus something similar occurs. Functional expressions are built from variables and constants by just two constructions.

One of these is lambda abstraction, which resembles set abstraction. If (..x..) is an expression containing a variable, then its lambda abstraction is generally written λx(..x..). To avoid Greek letters, though, I shall write Fun instead of the lambda λ. Thus Fun x (..x..) is to be pronounced “the function of one parameter x which yields the result (..x..)”. Here is an example:

    Fun x (product(x,x))

This is the function of one parameter which yields as value the product of the parameter with itself. This of course is the unary function which yields the square of its argument. Another example:

    Fun x (father(mother(x)))

This of course is the maternal grandfather function.
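
To make this concrete, here is a rough sketch in Haskell, which writes \x -> ... for what is written Fun x (...) above; the family examples are omitted, since father and mother would have to be invented:

    -- lambda abstraction in Haskell: \x -> ... plays the role of Fun x (...)
    square :: Integer -> Integer
    square = \x -> x * x          -- Fun x (product(x,x))

    -- an anonymous function can also be applied directly:
    -- ghci> (\x -> x * x) 5
    -- 25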

The other construction is application, which resembles membership in set theory. If f is a function, then f @ a is the result of applying the function to a. Ordinary functional notation is abandoned now, and the only construction used is application, written in infix notation. The earlier examples are rewritten as follows. The first is pronounced “the application of the capital function to France”, or more simply “capital applied to France”:

    capital @ France                                    -->  Paris
    between @ <Canada,Mexico>                           -->  U.S.A.
    cold @ Siberia                                      -->  true
    first @ [a b c]                                     -->  a
    first @ (rest @ (reverse @ [a b c d]))              -->  c

The two earlier examples of abstraction should now be written:

    maternal-grandfather  =  Fun x (father @ (mother @ x))
    square                =  Fun x (product @ <x,x>)

Functions of several parameters are still a nuisance because one has to write ordered pairs <x,y> or triples <x,y,z> and so on:

    sum @ <2,3>                                         -->  5
    between @ <Canada,Mexico>                           -->  U.S.A.

By a device called currying [1] all functions are taken to be unary, and one writes:

    (sum  @  2)  @  3                                   -->  5

where the subexpression:

    sum  @  2

is the curried unary function of one parameter which adds 2 to its argument. When that unary function is applied to 3 it yields 5. If application is taken to be left-associative, the entire expression can be written without parentheses as:

    sum  @  2  @  3                                     -->  5

The definition of square now looks like this:

    square                =  Fun x (product @ x @ x)
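
In Haskell, to take one modern example, currying is the default: a two-argument function is a unary function that returns another unary function. A minimal sketch follows; the name add merely stands in for sum above, which the Haskell Prelude already uses for something else:

    -- currying: add takes one argument and returns a unary function
    add :: Integer -> Integer -> Integer    -- read as Integer -> (Integer -> Integer)
    add x y = x + y

    addTwo :: Integer -> Integer            -- the partial application "sum @ 2"
    addTwo = add 2

    square :: Integer -> Integer            -- Fun x (product @ x @ x)
    square = \x -> (*) x x

    -- ghci> add 2 3        -- parsed as (add 2) 3, like sum @ 2 @ 3
    -- 5
    -- ghci> addTwo 3
    -- 5
    -- ghci> square 5
    -- 25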

Higher order functions can now be handled very naturally. Here are examples using twice:

    twice @ father  @  William                          -->  Philip
    twice @ rest  @  [a b c d e]                        -->  [c d e]
    twice @ square  @  2                                -->  16

Note that no parentheses are necessary here because application is left-associative: twice is first applied to father, and the resulting function is then applied to William. The last example might have been written without relying on the definition of square as follows:

    twice  @  Fun x (product @ x @ x)   @   2           -->  16

Of course twice can be applied to a function which has been “twiced” already:

    twice @ (twice @ rest)  @  [a b c d e f]            -->  [e f]
    twice @ (twice @ square)  @  2                      -->  65536
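
The twice combinator itself is a two-line Haskell definition, and evaluating it confirms the figures above:

    -- the twice combinator: apply a unary function two times
    twice :: (a -> a) -> a -> a
    twice f x = f (f x)

    square :: Integer -> Integer
    square x = x * x

    -- ghci> twice square 2             -- square (square 2)
    -- 16
    -- ghci> twice tail [1,2,3,4,5]     -- like twice @ rest
    -- [3,4,5]
    -- ghci> twice (twice square) 2     -- square applied four times
    -- 65536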

The other example concerned map, which might be written in two ways:

    map  @  square  @  [1 2 3]                          -->  [1 4 9]
    map  @  Fun x (product @ x @ x)  @  [1 2 3]         -->  [1 4 9]

In the first, the subexpression:

    map  @  square

is the curried unary function which, when applied to a list of numbers, produces the list of their squares. Of course map can be applied to a function which is already a mapping. The result has to be applied to a list of lists.
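
The same examples go through with Haskell's built-in map; the last line below shows a mapped mapping applied to a list of lists:

    square :: Integer -> Integer
    square x = x * x

    -- ghci> map square [1,2,3]              -- map @ square @ [1 2 3]
    -- [1,4,9]
    -- ghci> map (\x -> x * x) [1,2,3]       -- map @ Fun x (product @ x @ x) @ [1 2 3]
    -- [1,4,9]
    -- ghci> map (map square) [[1,2],[3]]    -- a mapping of a mapping, applied to a list of lists
    -- [[1,4],[9]]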

The binary application operation is the only one written in infix notation, so it leads to no confusion if it is simply left out. This is generally done in the literature. Here are some of the examples written in this style:

    capital  France                                     -->  Paris
    between  Canada  Mexico                             -->  U.S.A.
    second-last           =   Fun x  (first  (rest  (reverse  x)))
    paternal-grandfather  =   Fun x  (twice  father  x)

The lambda calculus is an immensely powerful system. [2] Its rules as a purely syntactic calculus have to do with simplifying expressions. [3] The few rules are deceptively simple but are in fact difficult to use correctly in real examples. The main difficulty arises from the variables, which get in each other’s way. Their purpose is to steer arguments into the right position. Can one do without variables, to make things easier for people or for computers, and still steer arguments into the right position? [4]

Combinatory logic

Variables can indeed be eliminated completely, provided some appropriate higher order functions or combinators similar to twice and map are introduced. [5]

One of the few immediately intuitive combinators is the W combinator. Its effect is to create a replica of the argument supplied to the function. It may be defined as:

    W  f  x         =   f  x  x

Something like this can be used for the squaring function, or for other “self”-concepts:

    product  3  3                                       -->  9
    W  product  3                                       -->  9
    distance  Northpole  Equator                        -->  10000 km
    W  distance  Northpole                              -->  0 km
    square          =   W  product
    self-destructs  =   W  destructs
    narcissistic    =   W  loves

Note that the last three definitions do not need a variable x as a formal parameter.

Another immediately intuitive combinator is the one which swaps arguments:

    C  f  x  y      =   f  y  x

For example:

    smaller         =   C  bigger
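
Both combinators are easily written down in Haskell (the Prelude calls C flip); since distance, loves and the like would have to be invented, the sketch below sticks to numbers:

    -- W duplicates the argument before handing it to the function
    w :: (a -> a -> b) -> a -> b
    w f x = f x x

    -- C swaps the two arguments (the Haskell Prelude calls this flip)
    c :: (a -> b -> r) -> b -> a -> r
    c f x y = f y x

    square :: Integer -> Integer
    square = w (*)             -- square = W product

    smaller :: Integer -> Integer -> Bool
    smaller = c (>)            -- smaller = C bigger

    -- ghci> w (*) 3
    -- 9
    -- ghci> smaller 2 5
    -- True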

Neither W nor C is in any sense basic, but they can be defined in any complete combinatory logic. [6] Most such systems use as their basis a translation scheme from the lambda calculus to a combinatory calculus which only needs two combinators, the S combinator and the K combinator. [7] These two can be used to define all other combinators one might desire, or even on their own to eliminate variables and hence lambda abstractions Fun x (..x..). Even recursion can be handled with the “paradoxical” Y combinator, which is equivalent to a (hideous) expression just in S and K.
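
The S and K combinators fit in two lines of Haskell, and the classical fact that S K K behaves as the identity combinator can be checked directly (the Y combinator, by contrast, cannot be written this directly in Haskell's type system):

    -- the two basic combinators of combinatory logic
    s :: (a -> b -> r) -> (a -> b) -> a -> r
    s f g x = f x (g x)

    k :: a -> b -> a
    k x _ = x

    -- S K K behaves as the identity:  s k k x = k x (k x) = x
    identity :: a -> a
    identity = s k k

    -- ghci> identity 42
    -- 42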

Like the lambda calculus, combinatory logic has progressed from being a curiosity to being an industry. [8][9]

So we can do without abstraction, provided we retain application together with first and higher order functions. The resulting system is simpler, but it has never been proposed as a language for people to think in.

In his Turing Award lecture Backus introduced his FP system, short for “Functional Programming system”. The system is not about programming using functions, as Lisp and its descendants are, but about programming with functionals, also known as higher order functions or combinators or, in his terminology, functional forms. These are operations on unary functions, and they constitute a rich but unfamiliar algebra. Importantly, the arguments of the functions do not play any role at all. [10] Backus acknowledges a debt to combinatory logic, but his aim was to produce a variable-free notation that is amenable to simple algebraic manipulation by people. Such ease should produce clearer and more reliable programs.

Backus also introduced another language, the FFP system, short for “Formal Functional Programming system”. [11] Both FP and FFP still use application of functions to their arguments. But application, like membership in set theory, has no useful algebraic properties. [12]

Joy

Consider a long expression, here again written explicitly with the application operator @:

    square @ (first @ (rest @ (reverse @ [1 2 3 4])))   -->  9

All the functions are unary, and unary functions can be composed. The composition of unary functions is again a unary function, and it can be applied like any other unary function. Let us write composition with an infix dot ".". Then the composition of the four functions above is:

    square . first . rest . reverse

This can be applied to the original list:

    (square . first . rest . reverse)  @  [1 2 3 4]     -->  9

One might even introduce definitions in this style:

    second          =   first . rest
    second-last     =   second . reverse

and then write:

    (square . second-last)  @  [1 2 3 4]                -->  9
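
In Haskell, where composition is the infix operator (.), these definitions can be transcribed almost literally, with head, tail and reverse in the roles of first, rest and reverse:

    square :: Integer -> Integer
    square x = x * x

    second :: [a] -> a
    second = head . tail               -- second = first . rest

    secondLast :: [a] -> a
    secondLast = second . reverse      -- second-last = second . reverse

    -- ghci> (square . head . tail . reverse) [1,2,3,4]
    -- 9
    -- ghci> (square . secondLast) [1,2,3,4]
    -- 9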

This and also the preceding definitions would not make sense with application instead of composition. Importantly, a definition can be used to textually replace the definiendum by its definiens to obtain the original long composition expression. Unlike application, composition is semantically associative – no matter how the long expression is parenthesised, it denotes the same function. Furthermore, the textual operation of compounding several expressions to make a larger one is mapped onto the semantic operation of composing the functions denoted by the expressions.

These are highly desirable properties for algebraic manipulation. The only trouble is that the resultant composition expression still has to be applied to an argument, the list [1 2 3 4]. If we could elevate that list to the status of a function, we could eliminate application entirely from the expression and write:

    square . first . rest . reverse . [1 2 3 4]         -->  9

The numeral 9 would also need to denote a function. Can this be done?

Indeed it can be. In category theory the usual constants such as numerals are taken to denote constant functions, which return the same value no matter what the argument is. The language of Cartesian closed categories [13] is an alternative to the (typed) lambda calculus and to combinatory logic. However, this language has never been entertained as a high-level language suitable for human reasoning.

In the language Joy the functions are applied to a mystery object that is never mentioned explicitly, although it may but need not enter one’s mind when reasoning about the composition expression or when manipulating it perhaps by a computer program. It would be useful if such manipulation could be expressed in the very same language. Only two changes are required for the notation. Firstly, the composition symbol dot "." is the only infix operator, so it might as well be omitted:

    square  first  rest  reverse  [1 2 3 4]             -->  9

Secondly, there is an advantage in reversing the composition notation so that we now write:

    [1 2 3 4]  reverse  rest  first  square             -->  9

Although the mystery object is never referred to in the functional expression, its nature has to be revealed when explaining the semantics of the expressions. The mystery object is in fact a list of values, commonly called a stack (because it is conceived to be organised vertically and the top elements are the most accessible). Literal numerals such as 9 or literal lists denote unary functions which return a stack that is just like the argument stack except that the number or the list has been pushed on top. Other symbols like square or reverse denote functions which expect as argument a stack whose top element is a number or a list. They return the same stack except that the top element has been replaced by its square or its reversal. Symbols for what are normally binary operations also denote unary functions from stacks to stacks except that they replace the top two elements by a new single one.
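
One way to make this reading concrete is to model every word as a function from stacks to stacks. The following is a minimal Haskell sketch, assuming an integer-only stack represented as a list with its top at the front; it only illustrates the semantics and is not how Joy is actually implemented:

    -- a minimal sketch: every word denotes a function from stacks to stacks
    type Stack = [Integer]               -- the top of the stack is the head of the list

    lit :: Integer -> Stack -> Stack     -- a literal pushes itself onto the stack
    lit n s = n : s

    squareW :: Stack -> Stack            -- replaces the top element by its square
    squareW (x : s) = x * x : s
    squareW []      = error "square: empty stack"

    addW :: Stack -> Stack               -- a binary operation replaces the top two elements
    addW (y : x : s) = x + y : s
    addW _           = error "add: stack too small"

    -- A Joy-style program "2 3 + square" is the composition of these functions.
    -- Joy reads left to right; with Haskell's (.) the rightmost function acts first:
    -- ghci> (squareW . addW . lit 3 . lit 2) []
    -- [25]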

Stacks are used routinely in computing; one of their many uses is in evaluating expressions written in postfix notation. Joy is written in what superficially looks like postfix notation, but it is not. What makes Joy unique is the large collection of combinators which minimise the need for users to introduce defined symbols.

A little tutorial

The following is a very brief introduction to Joy.

Joy provides several general recursion combinators; two of them compute linear recursion and binary recursion without the user having to introduce recursive definitions. The linear one, linrec, expects four quoted programs (a test, a base case, and two parts of the recursive case) and is used several times in the program constructed below.
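
As a rough Haskell analogue of the recursion scheme behind linrec (of the scheme only, not of the stack-based combinator itself), one might write:

    -- a sketch of the linear-recursion scheme: if the test holds, apply the base
    -- case; otherwise shrink the problem, recurse, and combine the result.
    linrec :: (a -> Bool)       -- test
           -> (a -> b)          -- base case
           -> (a -> a)          -- shrink the problem
           -> (a -> b -> b)     -- combine after the recursive call
           -> a -> b
    linrec p t r1 r2 x
      | p x       = t x
      | otherwise = r2 x (linrec p t r1 r2 (r1 x))

    factorial :: Integer -> Integer
    factorial = linrec (== 0) (const 1) (subtract 1) (*)

    -- ghci> factorial 5
    -- 120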

In the remainder of this section a small program is to be constructed which takes one sequence as parameter and returns the list of all permutations of that sequence. For example, the list of permutations of the string "abcd" is:

    ["abcd" "bacd" "bcad" "bcda" "acbd" "cabd" "cbad" "cbda"
     "acdb" "cadb" "cdab" "cdba" "abdc" "badc" "bdac" "bdca"
     "adbc" "dabc" "dbac" "dbca" "adcb" "dacb" "dcab" "dcba"]

Here is a first draft:

    1 If the sequence S has only zero or one member
    2 then it has only one permutation, so take its unit list
    3 else take the first of S and rest of S,
      recurse to construct the permutations of the rest of S
    4 insert the first of S in all positions in all these permutations

Step 4 can be elaborated like this:

    4.1 Swap the saved first of S and the list of permutations of the rest
        of S
    4.2 construct a program to insert something in all positions in a
        sequence
    4.3 using the saved first of S for the insertion, `map` this over all
        permutations of the rest of S
    4.4 flatten this list of lists to become a single list of their members

After further elaborating step 4.2 extensively and step 4.4 in a small way, the entire program becomes:

    1               [ small ]
    2               [ unitlist ]
    3               [ uncons ]
    4.1             [ swap
    4.2.1             [ swons
    4.2.2.1             [ small ]
    4.2.2.2             [ unitlist ]
    4.2.2.3             [ dup unswons [uncons] dip swons ]
    4.2.2.4             [ swap [swons] cons map cons ]
    4.2.2.5             linrec ]
    4.3               cons map
    4.4               [null] [] [uncons] [concat] linrec ]
    5               linrec.

An essentially identical program is in the Joy library under the name permlist. It is considerably shorter than the one given here because it uses two subsidiary programs insertlist and flatten which are useful elsewhere. The program given above is an example of a non-trivial program which uses the linrec combinator three times and the map combinator twice, with constructed programs as parameters on both occasions. Of course such a program can be written in lambda calculus languages such as Lisp, Scheme, ML or Miranda. The following is a version in Scheme:

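; Note: concat, unitlist and small below are not standard Scheme; they mirror the
; Joy names used above (concat appends two lists, unitlist makes a one-element
; list, and small tests for a sequence of at most one member).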
(define (permlist lis)
    (define (flatten l)
        (if (null? l)
            l
            (concat (car l) (flatten (cdr l))) ) )
    (define (insertlist item lis)
        (if (null? lis)
            (unitlist (unitlist item))
            (cons (cons item lis)
                  (map (lambda (l) (cons (car lis) l))
                       (insertlist item (cdr lis)) ) ) ) )
    (if (small lis)
        (unitlist lis)
        (flatten
            (map (lambda (l) (insertlist (car lis) l))
                 (permlist (cdr lis)) ) ) ) )

The Joy version uses 15 brackets and 25 names, a total of 40 symbols. The Scheme version uses 34 parentheses and 57 names, a total of 91 symbols.
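
For comparison only, and not counted above, the same recursion reads naturally in Haskell as well: insert the first element at every position of every permutation of the rest.

    -- insert an item at every position of a list, giving a list of lists
    insertList :: a -> [a] -> [[a]]
    insertList x []       = [[x]]
    insertList x (y : ys) = (x : y : ys) : map (y :) (insertList x ys)

    -- all permutations of a list, in the same order as the Joy and Scheme versions
    permList :: [a] -> [[a]]
    permList []       = [[]]
    permList [x]      = [[x]]
    permList (x : xs) = concatMap (insertList x) (permList xs)

    -- ghci> permList "abc"
    -- ["abc","bac","bca","acb","cab","cba"]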

Various versions of Joy have been implemented. The current implementation is written in portable C. It is planned to keep Joy small and simple for as long as possible. Eventual behemothification is irresistible, inevitable, irresponsible and irreversible. [14]


  1. Generally attributed to Curry, but freely acknowledged by him to be due to Schönfinkel. The term “Schönfinkeling” never caught on.

  2. It comes as a surprise that all computable functions can be expressed in the lambda calculus with just variables, abstraction and application, and can then be computed by reduction. However, any efficient implementation will need constants, and all practical programming languages based on the lambda calculus provide them. The best known functional programming languages are Lisp and its many earlier descendants, and the newer languages ML and Miranda. Being descendants of Lisp and ultimately the lambda calculus, they are also based on abstraction and application.

  3. Peyton Jones 1987 Chapter 2 contains a good exposition of the lambda calculus, including many extensions. The survey paper Hudak compares many features of different functional languages, with a minor emphasis on Haskell. A very elegant general introduction to modern functional programming in a non-Lisp language can be found in Hughes. A recent introduction to Miranda is in Turner. A notable variation on the lambda calculus is described in Cartwright.

  4. Brus et al write “if one wants to have a computational model for functional languages which is also close to their implementations, pure lambda calculus is not the obvious choice anymore”. They present the language Clean in which programs consist of rewrite rules (for graphs) using pattern matching extensively.

  5. Robinson remarked: “whatever can be expressed in a language based on application and abstraction as fundamental notions can be expressed in a far simpler language based on application alone.” The simpler language is combinatory logic. It is not a way of doing logic in a combinatory way, but it deals with the logic of combinators which denote higher order functions. The key idea came from Schönfinkel but was greatly expanded by Curry. The classic reference, Curry and Feys, uses the same notation as is used today. A recent short exposition of the basic combinators is given for example in Henson.

  6. The calculus of combinators can be understood without reference to its connection with the lambda calculus, as indeed it is done in many expositions. But for the present purposes it is best to keep in mind the goal of eliminating abstraction from the lambda calculus while retaining Turing completeness.

  7. Abstraction is an operation in the object language, the lambda calculus. In combinatory logic this operation is replaced by an operation in the metalanguage. This new operation is called bracket abstraction. It takes an object language variable and an object language expression as arguments and it yields a new object language expression as value. The new expression will contain some function symbols specific to combinatory logic. If all object language lambda abstractions are removed from a lambda expression by this process of metalanguage bracket abstraction, then the final result will be equivalent to the original expression. So this process should be seen as a compilation. Since all lambda calculus expressions can be compiled in this manner, the language of combinators is again Turing complete.

  8. Turner introduced some optimisations into the standard translator from lambda calculus to combinator notation. With these optimisations the size of the combinatory code was kept within an acceptable limit. Turner’s implementation method using combinators has been used to build a hardware reduction machine, the CURRY chip; see Ramsdell.

  9. Peyton Jones 1987 Chapter 16 contains a good exposition on the translation from the lambda calculus to combinators and many details of the implementation of Miranda. Hindley and Seldin provide a very complete parallel account of the lambda calculus, combinatory logic and their relationship. Robinson shows how the language of Schönfinkel and Curry can be used in the mechanisation of theorem proving in higher order logic. Fradet and Le Métayer show how to compile lambda calculus into conventional machine code. Fradet uses what are described as low level indexed combinators as a target language to implement a lambda calculus language. Impressive execution times are reported. The target code is again not intended to be read by human users. Another way of supplying arguments to their values is with director strings used in some implementations of functional languages (see for example Peyton Jones 1987 p274). But it is unlikely that a high level programming language could be designed which uses these principles.

  10. Backus gives an elaborate axiomatisation of the algebra; in Williams a smaller version is given comprising just 11 axioms. The FP system is further explained and expanded in Williams. A very useful exposition of the FP systems is found in Henson. The book also gives a very extensive bibliography. For a good exposition of the relation between the lambda calculus, combinatory logic and the FP systems of Backus see Revesz 1988 section 5.3. Givler and Kieburtz present methods for automatically and reliably transforming clear and correct but possibly inefficient FP programs into possibly obscure but efficient equivalent programs. Bellegarde presents a set of convergent rewriting rules for the algebra of FP programs but without conditionals. Whereas FP is a strict language, Dosch and Möller describe the algebraic semantics of a lenient variant of FP allowing infinite objects and using both busy and lazy evaluation. Sheeran uses a variant of FP as a VLSI design language for describing semantics and physical layout of hardware. For a critique of the FP systems, see Harland. A recent descendant of the language FP by Backus is the FL language described in Backus. Another variant is the language GRAAL which implements (“infinite”) streams using call-by-need; it is described in Bellot and Robinet.

  11. In addition to objects as in FP there are now explicit expressions. In addition to the list-forming constructor as in FP there is now a new binary constructor to form applications consisting of an operator and an operand. Operator expressions which are atoms of course denote functions which can be applied to an argument. Operator expressions which are lists must have as their first element an expression denoting a function. This last rule, metacomposition, is immensely powerful. It can be used to define arbitrary new functional forms, including of course the fixed forms from FP. The rule also makes it possible to compute recursive functions without a recursive definition. This is because in the application the function is applied to a pair which includes the original list operand which in turn contains as its first element the expression denoting the very same function. The method is considerably simpler than the use of the Y combinator. Williams extends the method to mutually recursive functions, even ones that are not primitive recursive. Joy is in fact closer to FFP than any of the languages mentioned so far. Both replace abstraction by other mechanisms, both rely heavily on higher order functions, and both obey the principle that program = data. Both permit construction of first order and higher order recursive and non-recursive functions without ever using named formal parameters. An effect similar to metacomposition is achieved in Joy with the x combinator, which expects a quoted program on top of the stack and executes it without popping it off.

  12. Meertens 1989 p 72 speaks of “the need of a suitable system of combinators for making functions out of component functions without introducing extra names in the process. Composition should be the major method, and not application.” He also writes (p 71) “The basic problem is that the basic operation of the classical combinator calculus (and also of the closely related lambda calculus) is application instead of composition. Application has not a single property. Function composition is associative and has an identity element (if one believes in the ‘generic’ identity function).” He develops a system of combining functions that is more suitable to formal manipulation than the classical combinators.

  13. Cartesian closed categories are explained for example in Barr and Wells and in Poigne. Barr and Wells give an example of a simple lambda expression with variables and a complicated looking categorical equivalent formula. Category theory has given rise to another model of computation: the CAM or Categorical Abstract Machine, described in Cousineau et al, Cousineau et al, and in Currien. The machine language of the CAM is very elegant, but programs in that language look as inscrutable as low level programs in other machine languages. The language is of course suitable as the target language for compilation from any functional language. A very compact but comprehensive exposition of a compiler from (a mini-) ML to CAM is described in Clement. Mauny and Suárez describe compilers from a strict and a nonstrict version of ML to an eager and a lazy version of the CAM. The original translation from lambda expressions to categorical combinators was quadratic in the worst case, but Lins introduces a linear translation to a simplified abstract machine code. Hannan uses a variant of the CAM for generating more concrete code suitable for register-level implementation or even micro-coding on conventional architectures. An extension of ML for data-parallel computation has been implemented by Hains and Foisy to run on a distributed version of the CAM.

  14. The unpublished paper von Thun contains an outline of various aspects of Joy and references to further working papers.