Survey of reproducing programs

by Manfred von Thun

3-MAY-05

Abstract: A program is said to be reproducing if running it produces another program which is also reproducing. A program is said to be self-reproducing if running it produces another program identical to itself. This note surveys reproducing programs, with emphasis on those in which the produced program is different from the producing program. Starting with the simplest self-reproducing program, two kinds of additions are considered: an internal state which changes with every reproduction step, and a contained program which affects the stack. There can also be interaction between the state and the stack, under control of the contained program. The contained program can also use the state as a count-down to self-destruction of the reproducing program. Some reproducing programs can call themselves recursively, and the ones constructed here differ from the ones normally used for the y-combinator in that they survive even after they return from the initial call. For convenience some operators are defined which facilitate the definitions of such recursive programs. A final discussions deals with achieving similar effects without the use of reproducing programs.

Contents:

Introduction - reproducing programs
Four basic programs
Streams or infinite lists
Interaction and termination
Reproduction for recursion
Convenience operators
Final remarks


Introduction - reproducing programs

Reproducing programs and self-reproducing programs can be written in any programming language; indeed the web offers many examples. Such programs can be very simple or quite complex, depending on whether the language makes it easy to speak about itself. This note is a survey of some techniques for writing such programs which go beyond basic reproduction, by combining reproduction with two principal additions.

The first addition is an internal state of the reproducing program. This state can change during each reproductive step. The change may depend only on the reproducing program, or it may depend also on what the programs find on the stack at each reproductive step. Examples of the first kind are various forms of streams or infinite ("lazy") lists. Examples of the second kind are programs which use their internal state to collect information about the stack as they were reproducing.

The second addition is an internal program which on each reproducing step does something to the stack at each reproductive step. This internal program is quite distinct from the reproducing part. Better known examples are those programs which eliminate explicit recursion by using the y combinator. But a similar technique can also be used for non-recursive programs.

The remainder of this note deals with a new library, a test file and its output. The library is in the file replib.joy. The test file is in reptst.joy. The output from the test program is in reptst.out. The present note consists of comments and discussions added to the last mentioned file.

The library essentially defines one single module called rep (for "reproduction"). The actual definitions occur in the place marked "…", but for ease of reading this note they are here given separately where they are being discussed. The outline of the library is as follows:

(* FILE:   replib.joy *)

"agglib" libload.

LIBRA

    _replib == true;

MODULE rep  # "reproducing programs"

PUBLIC

    ...

END;     # MODULE rep

    REPLIB == "replib.joy - for survey of reproducing programs"

END      # LIBRA

"replib  is loaded\n" putchars.

(* END   replib.joy *)

The bulk of this note consists of commented output from the test file. Each section of the test file is preceded by the relevant definitions from the module library.

Four basic programs

This section deals with four very simple reproducing programs in the library wich illustrate the basic ideas of this note. The first five definitions in the library are just utilities used throughout the library:

    duco == dup cons;
    dureco == dup rest cons;
    durereco == dup rest rest cons;

    count == [0 [succ] infra] swoncat;
    deposit == [dup [first] dip] swoncat;

    self == [duco]         duco;      exe   == [dip duco]   cons       duco;
    ints == [dureco] count dureco;    exe-c == [dip dureco] cons count dureco;

The last two lines contain definitions for the four basic programs; they are here written in a way which displays their similarities. Starting with the simple definition of self in the first line, moving to the right adds dip and cons in both programs in the right column, moving down adds count and replaces duco by dureco in both programs on the second line. What exactly these programs do is illustrated by the test file:

First, there is a definition of selfr in terms of its counterpart self in the module rep from the library:

    DEFINE 
        selfr == rep.self. 
     
    selfr. 
[[duco] duco]
     
    selfr i i i. 
[[duco] duco]

Following the definition, there is a call to selfr, which just shows the (quoted) program which is constructed by selfr. Finally, the program is called by the i combinator, which executes it, in readyness for two further executions by the i combinator. Each of the executions just produces the same quotation. So selfr produces a self-reproducing program.

The next definition in the test file defines a self-reproducing program which on every call by the i combinator will square whatever number it finds on the stack. The definition uses the program exe from the module rep in the library:

    DEFINE 
        squaring == [dup *] rep.exe. 
     
    squaring. 
[[[dup *] dip duco] [dup *] dip duco]
     
    2 squaring i pop. 
4
     
    2 squaring i i pop. 
16
     
    2 squaring i i i pop. 
256

Following the definition, the first call to squaring the quotation is constructed and the result is shown. In the other three calls the quotation is executed once, twice or thrice by the i combinator. This does not change the quotation, so it is here just popped off. What remains on the stack is the result of squaring 2 once, twice or thrice.

The next definition defines a function integers in terms of the function ints from the module rep in the library. The function produces a quotation containing an internal state which changes at every reproduction; for convenience the state can be accessed through the auxiliary function state:

    DEFINE 
        state == first first; 
        integers == rep.ints. 
     
    integers. 
[[0 [succ] infra dureco] [succ] infra dureco]
     
    integers i i i i i state. 
5
     
    integers i i i i i i. 
[[6 [succ] infra dureco] [succ] infra dureco]

The first call to integers just shows the quotation that has been produced, including its initial state 0. The second call shows the state after 5 executions by the i combinator, the third shows the quotation after 6 executions.

The next definition is an example which combines what the second and third definitions did: it contains a program which does something to the stack, and also contains an internal state which changes after each execution:

    DEFINE 
        times10-c ==  [10 *] rep.exe-c. 
     
    times10-c. 
[[0 [succ] infra [10 *] dip dureco] [succ] infra [10 *] dip dureco]
     
    3 times10-c i i i i i pop. 
300000
     
    3 times10-c i i i i i popd state. 
5

The reproducing program constructed by times10-c will multiply any number it finds on the stack by 10. The first call to times10-c shows the quotation that has been constructed; the second shows just the stack after 5 executions by i, and the third shows just the state after 5 executions by i.

Streams or infinite lists

This section deals with reproducing programs which are essentially generalisations of the integers program of the previous section. These programs implement streams or infinite or "lazy" lists. First, here are four definitions inside the module rep from the library. The function c-stream produces programs which implement infinite lists in which all members are identical. The function n-stream produces programs in which the next member is defined in terms of the previous member by means of an internal operation. Both functions have a variant "..-d" which simultaneously deposits the current member and computes the next member. So these variants can be thought of as computing the uncons function:

    c-stream == [dureco] cons dureco;
    c-stream-d == [dureco] deposit cons dureco;

    n-stream == [infra dureco] cons cons dureco;
    n-stream-d == [infra dureco] cons deposit cons dureco;

The first definition in the test file defines the infinite stream whose members are the constant 1:

    DEFINE 
        ones == 1 rep.c-stream. 
     
    ones. 
[[1 dureco] dureco]

    ones i i i i i. 
[[1 dureco] dureco]

    ones i i i i i state. 
1

The first call to ones shows the quotation that is produced, the second shows the (identical) quotation after 5 executions by the i combinator, the third call shows the unchanged state after 5 executions.

The next definition defines a stream which starts with the member 1.0 and on successive executions by the i combinator halves it:

    DEFINE 
        halving == 1.0 [2 /] rep.n-stream. 
     
    halving. 
[[1 [2 /] infra dureco] [2 /] infra dureco]
     
    halving i i i. 
[[0.125 [2 /] infra dureco] [2 /] infra dureco]
     
    halving i i i state. 
0.125

The first call to halving shows the original quotation, the second and third show the resultant quotation and the internal state after three executions by i.

The next definition in the test file defines a variant of the integers program of the previous section, with the difference that the first member can be chosen to be other than 0:

    DEFINE 
        integers-from == [succ] rep.n-stream. 
     
    42 integers-from. 
[[42 [succ] infra dureco] [succ] infra dureco]
     
    42 integers-from i i i i i state. 
47

The first call shows the quotation that is produced when the parameter is 42, the second call shows the state after five executions by the i combinator.

The next definition gives an example of a constant stream which deposits its first member on every execution:

    DEFINE 
        ones-d == 1 rep.c-stream-d. 
     
    ones-d. 
[[1 dup [first] dip dureco] dup [first] dip dureco]
     
    ones-d i i i pop. . . 
1
1
1

The first call shows the quotation, the second call shows the three instances of 1 deposited on the stack by the three executions.

The next definition shows the depositing variant of the halving stream shown earlier:

    DEFINE 
        halving-d == 1.0 [2 /] rep.n-stream-d. 
     
    halving-d. 
[[1 dup [first] dip [2 /] infra dureco] dup [first] dip [2 /] infra dureco]
     
    halving-d i i i i i pop. . . . . 
0.0625
0.125
0.25
0.5
1

As before, the first call shows the quotation, the second call shows the states that have been deposited on the stack.

The next definition is for a depositing version of a primes stream. Note how the enclosed operation for computing the next member can be arbitrarily complex:

    DEFINE 
        primes-d == 2 [succ [prime not] [succ] while] rep.n-stream-d. 
     
    primes-d. 
[[2 dup [first] dip [succ [prime not] [succ] while] infra dureco]
    dup [first] dip [succ [prime not] [succ] while] infra dureco]
     
    primes-d i i i i i pop. . . . . 
11
7
5
3
2

The output from the first call shows the quotation minimally formatted to make reading easier. The other call shows the five primes that have been deposited from the five executions by the i combinator.

In the streams so far successive members were related by a simple function which was used to compute the next member (trivially this is also true of the constant streams). But this is not always possible, namely when there is no simple function to do the job. For this reason the module rep in the library provides another kind of stream in which there are two functions: one (here called [N]) to compute an internal next value from its predecessor, and another (here called [F]) to compute the final value. Two such streams are defined, and both use the same preparatory auxiliary function:

    f-stream-prepare ==                 # s [N] [F]
        swap                            # s [F] [N]
        [dip] cons concat               # s [F [N] dip]
        cons                            # [s F [N] dip]
        [dup] infra                     # [s s F [N] dip]
        dup rest rest infra             # [Fs Ns F [N] dip]
        uncons uncons                   # Fs Ns [F [N] dip]
        [pop dup] swoncat               # Fs Ns [pop dup F [N] dip]
        [infra durereco] cons;          # Fs Ns [[pop .. dip] infra durereco]

    f-stream == f-stream-prepare cons cons durereco;
    f-stream-d == f-stream-prepare deposit cons cons durereco;

The test file defines the stream of squares ([dup *]) of even numbers ([2 +]) starting with 0:

    DEFINE 
        even-squares == 0 [2 +] [dup *] rep.f-stream. 
     
    even-squares. 
[[0 2 [pop dup dup * [2 +] dip] infra durereco]
      [pop dup dup * [2 +] dip] infra durereco]

    even-squares state. 
0
    even-squares i state. 
4
    even-squares i i state. 
16
    even-squares i i i state. 
36
    even-squares i i i i state. 
64

The first call shows the quotation slightly formatted. The other calls show the squares of the first five even numbers.

The next definition is for a depositing stream in which the internal state is a list of two elements: a power of 10 produced by [10 *] and its logarithm base 10. The remainder of the quotation constructs the list:

    DEFINE 
        ten-powers-log10 == 
             1  [10 * ] [[] cons [dup log10] infra]  rep.f-stream-d. 
     
    ten-powers-log10. 
[[[0 1] 10 dup [first] dip [pop dup [] cons [dup log10]
            infra [10 *] dip] infra durereco]
           dup [first] dip [pop dup [] cons [dup log10]
            infra [10 *] dip] infra durereco]
     
    ten-powers-log10 i i i i i i pop. . . . . . 
[5 100000]
[4 10000]
[3 1000]
[2 100]
[1 10]
[0 1]

The first call shows the quotation that is produced, here formatted to extend over four lines. The second call shows the five lists that have been deposited from the five executions by i.

A different and more complete treatment of infinite lists in Joy is in the library lazlib.joy, with testfile laztst.joy and output laztst.out. Another, more theoretical treatment closer to the present note is in jp-reprod.html.

Interaction and termination

The previous section dealt with streams, reproducing programs which contain an internal state which can change by successive executions. These were generalisations of the integers program which was one of the four basic programs. Another of the basic reproducing programs provided a facility for an internal program which on every reproduction could affect the stack below. This section deals with reproducing programs in which there is some kind of interaction between the state of the reproducing program and the stack below. (This is more general than those versions of streams which merely deposit the internal state.)

The rep module in the library contains a useful definition for constructing programs which on every reproduction step allow two-way interaction between the state and the stack:

    inter ==                            #  s [P]
        [dip cons dureco] cons          #  s [[P] dip cons dureco]
        [uncons] swoncat                #  s [uncons [P] dip cons dureco]
        cons dureco;

The definition of inter is used in the following three definitions in the test file:

    DEFINE 
        accu-list == [] [cons] rep.inter; 
        accu-sum == 0 [+] rep.inter; 
        accu-product-list == [] [[*] dip cons] rep.inter. 
     
    accu-list. 
[[[] uncons [cons] dip cons dureco]
     uncons [cons] dip cons dureco]
     
    1 2 3 4 5 accu-list i i i i i state. 
[1 2 3 4 5]
     
    accu-sum. 
[[0 uncons [+] dip cons dureco]
    uncons [+] dip cons dureco]
     
    1 2 3 4 5 accu-sum i i i i i state. 
15

    accu-product-list. 
[[[] uncons [[*] dip cons] dip cons dureco]
     uncons [[*] dip cons] dip cons dureco]
     
    1 10 2 100 3 1000 4 10000 accu-product-list i i i i state. 
[10 200 3000 40000]

Each of the three definitions is used twice: once to show the (slightly formatted) quotation that it produces, and once with its state after a number of executions by i. The first, accu-list, just accumulates whatever single items it finds and enters them into its state which is a list. The second, accu-sum, accumulates single into its state by adding them to its initial value 0. The third accu-product list accumulates in a list the products of pairs of numbers.

The remainder of this section deals with reproducing programs which have a limited life span. This is implemented by a downwards counter; when that reaches 0 the program no longer does what it did before but degenerates into the bland self-reproducing program selfr. Such reproducing programs terminate their activity. The definition below is from the library:

    exe-t ==                            # N [P] => [[N [I] [T] [E] ifte] ..]
        [dip dureco] cons               # N [[P] dip dureco]
        [[pred] infra] swoncat          # N [[pred] infra [P] dip dureco]
        [ifte] cons                     # N [[E] ifte]
        [pop [[duco] duco]] swons       # N [[pop [[duco] duco] [E] ifte]
        [first null] swons              # N [[first null] [T] [E] ifte]
        cons dureco;                    # [[N [I] [T] [E] ifte] ..]

The definition of exe-t in the library is used in each of these two definitions in the test file: The first will add numbers on the stack up to three times and then terminate, the second will do the same up to four times:

    DEFINE 
        max-3-adds == 3 [+] rep.exe-t; 
        max-4-adds == 4 [+] rep.exe-t. 
     
              max-3-adds. 
[[3 [first null] [pop [[duco] duco]] [[pred] infra [+] dip dureco] ifte] [first null] [pop [[duco] duco]] [[pred] infra [+] dip dureco] ifte]
     
          2 1 max-3-adds i pop. 
3
        3 2 1 max-3-adds i i pop. 
6
      4 3 2 1 max-3-adds i i i pop. 
10
    5 4 3 2 1 max-3-adds i i i i pop. . 
10
5
    5 4 3 2 1 max-4-adds i i i i pop. 
15

The first call to max-3-adds just shows the reproducing program that has been constructed, and the first three executions by the i combinator show the results of up to three additions. The fourth shows that max-3-adds wil not perform the fourth addition, and the last shows that max-4-adds does.

Reproduction for recursion

The programs in the previous sections all did their specific tasks before their actual reproduction. The programs in the remaining sections do this in the reverse order: first they reproduce, then they do their specific tasks. This has the consequence that the task may include a recursive call to themselves. The simplest programs of this kind essentially perform the y combinator that is well known in the literature. The more elaborate programs perform various additional tasks.

The four definitions below are from the rep module in the library. The simplest is fix, which is for implementing anonymous recursion. The second adds a counter as an internal state; the counter is incremented on every call whether direct or recursive. The third is more complex, it takes as parameters a program [P] for recursion, a state s and a program [I] for interaction. The fourth is a special case of the third: the state is an empty list and the interaction program simply conses a copy of the top of stack into that list to record a trace of calls:

    fix == [duco] swoncat duco;
    fix-c == [0 [succ] infra dureco] swoncat dureco;
    fix-i ==                            # [P] s [I]
        [dip cons dureco] cons          # [P] s [[I] dip cons dureco]
        [uncons] swoncat                # [P] s [uncons [I] dip cons dureco]
        cons                            # [P] [s uncons [I] dip cons dureco]
        swoncat                         # [s uncons [I] dip cons dureco P]
        dureco;
    fix-a == [] [[cons] unary] fix-i;

In the test file there are the following two definitions for the factorial function in terms of the fix constructor. The first is for the usual form like the y combinator, where the program only reproduces on the way down during recursion and is deleted (by one of the pops) when the recursion bottoms out. Consequently the reproducing program is no longer in the way of the multiplication on the way up. In the second version the reproducing program survives all the way, and after the final result has been produced it is still available for another computation:

    DEFINE 
       fact0 == [[pop null] [pop pop 1  ] [[dup pred] dip i  *     ] ifte]; 
       fact  == [[pop null] [[pop 1] dip] [[dup pred] dip i [*] dip] ifte]. 
     
    6 fact0 rep.fix i. 
720

    6 fact  rep.fix i. . 
[[duco [pop null] [[pop 1] dip] [[dup pred] dip i [*] dip] ifte]
  duco [pop null] [[pop 1] dip] [[dup pred] dip i [*] dip] ifte]
720
     
    3 fact  rep.fix i i. . 
[[duco [pop null] [[pop 1] dip] [[dup pred] dip i [*] dip] ifte]
  duco [pop null] [[pop 1] dip] [[dup pred] dip i [*] dip] ifte]
720

The first example shows the conventional calculation of factorial(6). The second shows the same with the surviving program (slightly formatted) still on the stack. The third shows the calculation of factorial(factorial(3)). The program that survived the first call by the i combinator is used again for a second call by the i combinator.

In the test file the first three definitions below give variants of the reproducing factorial program. The first is just the plain version already seen above. The second also has a counter for the calls, and the third also has an accumulator for the parameters on the stack. The final two definitions are just utilities for showing the counter or the accumulator:

    DEFINE 
        fact-fix   == fact rep.fix; 
        fact-fix-c == fact rep.fix-c; 
        fact-fix-a == fact rep.fix-a; 
     
        steps == "steps: " putchars state putln; 
        trace == "trace: " putchars state putln. 
     
    3 4 5  fact-fix   i swap put i swap put i swap putln pop. 
120 24 6 
     
    3 4 5  fact-fix-c i swap put i swap put i swap putln steps. 
120 24 6 
steps: 15 

    3 4 5  fact-fix-a i swap put i swap put i swap putln trace. 
120 24 6 
trace: [0 1 2 3 0 1 2 3 4 0 1 2 3 4 5] 

The call to fact-fix just gives the factorials for three numbers, 3, 4 and 5. The call to fact-fix-c also gives the count of all calls made, and the third also gives a trace of the parameters.

Also in the test file are the following definitions for computing the Fibonacci function. This function can be implemented very efficiently to run in linear time, but it is often defined using binary recursion, running in exponential time, to illustrate one or the other matter. This latter form is often called the "naive Fibonacci", and this explains the leading "n" in the names below. The first definition just defines the quotation parameter for the other three, similar to the preceding three factorial functions:

    DEFINE 
        nfib == 
            [ [ pop small ] 
              [ [pop 1] dip ] 
              [ [pred dup pred] dip 
                dip swap i 
                [+] dip ] 
              ifte ]; 
     
        nfib-fix == nfib rep.fix; 
        nfib-fix-c == nfib rep.fix-c; 
        nfib-fix-a == nfib rep.fix-a. 
     
    nfib-fix. 
[[duco [pop small] [[pop 1] dip]
     [[pred dup pred] dip dip swap i [+] dip] ifte]
  duco [pop small] [[pop 1] dip]
     [[pred dup pred] dip dip swap i [+] dip] ifte]

    6 nfib-fix i pop. 
13

    6 nfib-fix-c i swap putln steps. 
13 
steps: 25 

    6 nfib-fix-a i swap putln trace. 
13 
trace: [0 1 2 1 0 1 2 3 4 1 0 1 2 3 0 1 2 1 0 1 2 3 4 5 6] 
     
    0 __settracegc. 

    30 nfib-fix-c i swap putln steps. 
1346269 
steps: 2692537 
     

The first call, to nfib-fix, just shows the reproducing program, sligtly formatted. The next three show the results of calls with the parameter 13. Finally, there is a call with the parameter 30 to show the very large number of reproducing steps. Because these steps required many garbage collections, the usual trace information has been switched off beforehand.

Convenience operators

The two functions factorial and Fibonacci in the preceding section were chosen because they are well known examples of linear and binary recursion. In both cases three versions were given, in both cases based on the core quotation containing three quotations and the ifte combinator. Much of the complexity of the core quotations is due to the fact that these are not recursive definitions but use a reproducing program. It would be convenient if there were more natural ways of writing the core quotation. This can indeed be done, by two program constructors whose program parameters are identical to those used by the linrec and binrec combinators that are built into Joy. Both constructors use a common and quite complex feature which here is called _expand. (The leading underscore in the name is just to indicate that it is not a function likely to be wanted elsewhere.) Here is the definition, in the rep module of the library. Following it are the definitions of the linear and binary constructors. As may be seen, the two definitions only differ in how often the internal recursion occurs:

    _expand ==                          # [I] [T] [R1] [R2] [r]  ==>
                          # [ [pop I] [[T] dip] [[R1] dip r [R2] dip] ifte ]
        swap                            # [I] [T] [R1] [r] [R2]
        [dip] cons                      #  .. .. .. .. [[R2] dip]
        concat                          #  .. .. .. [r [R2] dip]
        [dip] swoncat                   #  .. .. .. [dip r [R2] dip]
        cons                            #  .. .. [[R1] dip r [R2] dip]
        [ [dip] cons                    #  .. [[T] dip] ..
          [ [pop] swoncat] dip ]        #  [pop [I]] .. ..
        dip
        [ifte] cons cons cons;          # [ .. .. .. ifte ]

    linear == [i] _expand;
    binary == [dip swap i] _expand

In the test file the constructors linear and binary are used to define the more readable versions of three core quotations, one for factorial, one for the length of sequences, and one for the naive Fibonacci. The syntactic similarity of the parameters to the parameters for the inbuilt linrec and binrec combinators should be noted. These are then used to construct the three reproducing programs, the one for length with an accumulator, the other two plain vanilla:

    DEFINE 
        fact-lin == [null] [pop 1] [dup pred] [*] rep.linear; 
        length-lin == [null] [pop 0] [rest] [succ] rep.linear; 
        nfib-bin == [small] [pop 1] [pred dup pred] [+] rep.binary; 
     
        fact-fix == fact-lin rep.fix; 
        length-fix-a == length-lin rep.fix-a; 
        nfib-fix == nfib-bin rep.fix. 
     
    4 fact-fix i pop. 
24

    [2 5 3 7 6] [a b c] length-fix-a i swap put i swap putln trace. 
3 5 
trace: [[] [6] [7 6] [3 7 6] [5 3 7 6] [2 5 3 7 6] [] [c] [b c] [a b c]] 

    6 nfib-fix i pop. 
13

    6 nfib-bin [] [[dup put] dip] rep.fix-i i newline pop. 
6 5 4 3 2 1 0 1 2 1 0 3 2 1 0 1 4 3 2 1 0 1 2 1 0 
13

In the calls, the reproducing factorial program is tested just once, the length program twice in succession on two lists, with a trace of what the accumulated parameters were. The Fibonacci program is called twice with the parameter 6. The second version uses the interactive fix-i constructor from the library to construct a reproducing program which writes ([dup put]) the current parameter to the output. This gives the trace of the parameters in the reverse order than they would be given if an accumulator had been used.

Also in the test file are a readable definition of the core quotation for quicksort, followed by a definition of a reproducing version with a counter of the calls. The third definition is for tests which will show the counter:

    DEFINE 
        qsort-bin == [small] [] [uncons [>] split] [enconcat] rep.binary; 
        qsort-fix-c == qsort-bin rep.fix-c; 
        qtest == qsort-fix-c i state. 
     
    [5 10 9 14 7 18 1 4 15 3 20 19 8 11 2 6 12 13 16 17] qtest. . 
29
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
     
    [10 5 3 2 4 1 8 7 9 6 15 13 12 14 11 18 17 19 16 20] qtest. . 
25
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
     
    [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20] qtest. . 
39
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
     
    [20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1] qtest. . 
39
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
     
    "You can also sort strings, of course" qtest. . 
55
"      ,Yaaccefgilnnooooorrrsssssttuu"

In all the calls except the last, qtest is used on a list of the first 20 positive integers. In each case the same sorted list is produced, but the number of calls needed to produce the result depends a great deal on the distribution in the original list. In the first call the original is more or less random, and 29 calls are required. In the second call the original has been carefully constructed to minimise calls to 25. In the last two calls the original lists are strictly ascending or descending, requiring 39 calls both. This is a familiar feature of the quicksort algorithm. The very last call to quicksort shows that it also sorts strings.

Final remarks

How efficient are reproducing programs? To start, here again is the first, the selfreproducing program:

        [[duco] duco]  i   ==>   [[duco] duco]

The steps required for one reproductions are:

  1. Execute the i combinator to activate the outer quotation
  2. Push the inner quotation [duco]
  3. Execute the duco (from the outer quotation)
  4. This expands to duplicating what has just been pushed,
  5. followed by consing the two together.

Hence a total of five steps. This could be optimised by inlining the duco:

        [[dup cons] dup cons]  i   ==>   [[dup cons] dup cons]   

In this way step 3 is eliminated, and only four steps are required. For the more complex programs in this note the actual reproducing steps are much the same: 1. execute the i combinator, 2. push the inner quotation, a step which is independent of the size of the inner quotation, then 3. execute duco or dureco or durereco, in 2 or 3 or 4 further steps. Again step 3. could be eliminated by inlining. By turning the three operators duco, dureco and durereco into Joy primitives, each reproduction would be executed in three steps: the i combinator, the push for the internal quotation, and then the new primitive. So reproduction of itself is not an intrinsically expensive operation.

Of course the more complex programs did other things apart from just reproducing, so they are longer in the first place. But half their actual length is due to their internal quotation, which always has to be pushed – in one simple step. The remainder of their length is due to two sources: doing their intended job, and doing some additional work to fit in with the implementation by reproduction. The part due to the intended job varies with the nature of the job, of course. And the additional work is probably much the same with any other implementation.

But it is worth considering which of the several things done in this note by reproduction could also be done in other ways. Most of the reproducing programs discussed here added either an internal state or an internal program or both to a simple reproducing program. Can their effects be achieved by other means? Clearly the internal state could be replaced by an item on the stack, and the internal program which does the principal job could well be an ordinary one which does not have to reproduce. Now the question does arise whether extra jobs such as keeping a count of the number of calls or maintaining an accumulator can be done in a clean manner. If the programs are designed that way from the start they clearly can. However, some of the constructions for reproducing programs in this note have been more flexible than that: they supplied the basis for an ordinary program, such as for sorting, together with several ways of executing it: Either plain vanilla, or with an execution count, or with an accumulator.

So in some respects these constructions behaved like combinators, in that various basic programs could be mixed with various ways of executing them. There seems to be a superficial difference: ordinary combinators do not change the actual code inside the quotation, whereas these constructions seem to do just that. But firstly, this difference would be more than superficial, and secondly, this is not what actually occurs. This needs some elaboration.

Let X be one of the reproducing recursive programs of the previous section, for factorial, Fibonacci or sorting, together with the various constructions that enabled plain vanilla execution, or execution with a call count or with a trace. They all consisted of a basic program, say basic-X, specific to one of the three functions, and then the constructions that created three varieties, say do-X, do-counting-X and do-tracing-X, each to be called by the i combinator. For the counting and the tracing versions the basic code was substantially modified, but mostly this involved replacing a simple quotation [Q] by a quotation [[Q] dip]. So [Q] was not touched at all. A few other modifications involved replacing a quotation [Q] by [pop Q], but again without changing [Q]. Therefore the only difference between ordinary combinators and the constructions here was that the result of the construction was to be called by the i combinator – something which is not true of ordinary combinators.

It is probably true that any of the effects achieved by reproducing programs can also be achieved by other means. Instead of making the state a part of the reproducing program, the state could be an additional item on the stack – and as such it does not need to be extracted from, and later re-incorporated into the reproducing program at every step. The additional code required to produce a call count or to trace or anything similar can be incorporated mechanically into just about any ordinary recursive Joy program. For such features in non-recursive programs there is also the less often used x combinator, which might have been defined by dup i.

In conclusion, although this survey might not have unearthed any techniques dramatically different from more conventional ones, the topic of reproducing programs is of some intrinsic interest. After all, many woes of computer security are precisely due to malicious reproducing programs that infect our systems.