by Manfred von Thun
NOTE: Some of the definitions in here have been included in the matrix/vector library, but under quite different systematic names.
A matrix is a rectangular collection of items, most often numbers. Two matrices M1 and M2 can be added just in case both have the same number of rows and both have the same number of columns. In that case their sum M3 has that same number of rows and that same number of columns. Each element in the result matrix M3 is just the arithmetic sum of the corresponding elements of the other two matrices M1 and M2. Two matrices M1 and M2 can be multiplied just in case M1 has as many columns as M2 has rows. In that case their product M3 has as many rows as M1 and as many columns as M2. In detail, if M1 has I rows and J columns, and M2 has J rows and K columns, then the element in row i and column k of M3 is computed by multiplying the J pairs of corresponding elements of row i of M1 and column k of M2 and then adding the products.
In Joy a matrix of I rows and J columns can be represented as a list of I lists each of J elements. The remainder of this note deals with matrix addition and matrix multiplication in Joy. The first sections deal with such operators just for numerical matrices. In Joy0 this means just integer, in Joy1 this means integer or float. The sections which then follow deal with general addition and multiplication combinators suitable for matrices built from other datatypes.
How can we write a program num-matadd to add two matrices of numbers giving a third? We want:
M1 M2 num-matadd ==> M3
As a first step consider element-wise addition of two vectors of the
same length, represented in Joy just as a list. The result vector has
that same length. Some languages have a zipwith combinator, but in Joy
it is just called map2
. Whereas map
takes one
list parameter, map2
takes two. Both combinators also take
a quotation parameter which determines how the elements of the parameter
list(s) are to be used to form the elements of the result list. For
vector addition the elements have to be added, so the required program
could be defined by:
num-vectadd == [+] map2
Now to add two matrices of numbers we just have to use
map2
again to add the rows as vectors:
M1 M2 [num-vectadd] map2 ==> M3
Hence the following definition:
num-matadd == [num-vectadd] map2
For later purposes it is useful to "inline" the definition of num-vectadd, giving:
num-matadd == [[+] map2] map2
How can we write a program matmul to multiply two matrices to give a third? This is a little harder. We want:
M1 M2 num-matmul ==> M3
Notice that M3 has exactly as many rows as M1, and hence the Joy lists M1 and M3 are of the same length. Each of the I component lists of M1 is mapped to a corresponding list of M3. The details of the mapping of course also depend on M2. So the mapping function will make use of M2. As a first step we may write:
M1 [M2 P] map ==> M3
where P is to be determined later. Even then, M2 is not really supposed to be part of the program, because M2 will already be on the stack. But it is easy enough to take it off the stack and put it in front of P:
M1 M2 [P] cons map ==> M3
P must be program which encounters M2 on the stack and below that a list of J numbers left behind from the map. P must use the K columns of M2 to produce a list of K numbers for M3. This looks like another case of map, with Q to be determined later: P == [Q] map. By substitution we have:
M1 M2 [[Q] map] cons map ==> M3
Two things are not quite right. Firstly, M2 is a list of J rows of K numbers, but the new map just introduced needs a list of K columns of J numbers. So it needs the transpose of M2:
M1 M2 transpose [[Q] map] cons map ==> M3
Secondly, the new map will consume each of the K columns from the
transposed M2 to produce a list of K numbers for M3, but it will also
leave behind the list of J numbers from M1 which the earlier, outer map
left behind. This list of J numbers needs to be removed after the new
map has finished. But the list to be removed is not on top of the stack
but just below. So it must be removed by popd
, which could
be defined by popd == [pop] dip
:
M1 M2 transpose [[Q] map popd] cons map ==> M3
Program Q will find two lists of length J on the stack: a row list
from M1 and a column list from M2. It must multiply corresponding
numbers from the two lists and then add their products. This operation
is useful elsewhere, it is called scalar product. One way to compute it
is to produce a list of products first, and then add the elements of
that list. The list of products is similar to vector addition defined
earlier, but it uses multiplication. So the list of products is produced
by [*] map2
. The resulting list can then be added with the
sum
operator. The sum
operator works like
this: starting with 0, add to that all the elements of the list. This is
called "folding" the list, using the binary +
operator.
These are the definitions:
sum == 0 [+] fold
scalar-product == [*] map sum
So the entire matrix multiplication problem now is simply:
M1 M2 transpose [[scalar-product] map popd] cons map ==> M3
So we may define:
num-matmul == transpose [[scalar-product] map popd] cons map
For later purposes it is again useful to inline scalar-product and then sum. The new definition is here written over several lines for comparison with later variations:
num-matmul == transpose
[[[*] map2 0 [+] fold] map popd] cons
map
The addition and multiplication operators of the preceding sections
only work for numerical matrices. But the only four parts in the
definitions that depend on that are [+] inside num-matadd
,
and [*], 0 and [+] inside num-matmul
. However, matrices
could have other kinds of things as elements. For example, one might
have logical matrices in which the elements are the truth values
true
and false
. In that case one would want
two further definitions with just the four parts replaced by [or],
[and], false and [or]. Or one might want two operators for different
datatypes again. In fact, for matrix addition and multiplication the
three matrices M1 M2 M3 might have up to three different kinds of
elements. It would be useful to have general matrix manipulation
combinator that abstract from details.
How can we write a general matrix addition combinator? It should satisfy:
M1 M2 [+] gen-matadd ==> M3
which behaves just like the numerical addition operator, and with
[or] instead of [+] behaves like a logical addition operator? In either
case it has to transform the quotation [+] into
[[+] map2] map2
, or the quotation [or] into
[[or] map2] map2
. This is easy enough, in the numerical
case it is:
M1 M2 [+] [map2] cons map2 ==> M3
So we may define a general matrix addition combinator gen-matadd by:
gen-matadd == [map2] cons map2
Now matrix addition for numbers or for truth values can be defined as:
num-matadd == [+] gen-matadd
log-matadd == [or] gen-matadd
In Joy truth values and sets are treated in a very similar way, and collectively they are called Boolean values. For truth values the or-operator produces the disjunction, and for sets the or-operator produces the union. So the above logical matrix addition operator would work equally well for matrices in which the elements are sets. Because of that it is better to call it a Boolean matrix addition operator, defined as:
bool-matadd == [or] gen-matadd
As always, instead of actually having the definition one could just as easily use the right hand side directly.
A general matrix multiplication combinator is a little harder. Let Q be any program for combining a row from M1 and a column from M2 to form an element for M3. In the numerical case the Q would have been the scalar product program. Then a combinator gen-matmul would be used like this:
M1 M2 [Q] gen-matmul ==> M3
Now M2 needs to be transposed as before, but it is below the [Q], so
gen-matmul must use dip
:
M1 M2 [Q] [transpose] dip S ==> M3
S must construct [[Q] map popd] and then use that with T:
M1 M2 [Q] [transpose] dip [map popd] cons T ==> M3
But T is just as before, T == cons map:
M1 M2 [Q] [transpose] dip [map popd] cons cons map ==> M3
So we can define the general combinator:
gen-matmul == [transpose] dip
[map popd] cons cons
map
Now the multiplication of numerical and logical matrices can be defined just in terms of the corresponding scalar products:
num-matmul == [ [*] map2 0 [+] fold] gen-matmul
log-matmul == [[and] map2 false [or] fold] gen-matmul
Compared with the earlier, explicit version of
num-matmul
, this version must execute one extra
dip
and one extra cons
in
gen-matmul
. But this is negligible compared with the amount
of real work done later by either version, especially the I*J*K
numerical multiplications (in [*]) and numerical additions (in [+])
required for any numerical matrix multiplication program. Exactly the
same is true for the logical version.
There is really nothing one could do to improve the addition
combinator gen-matadd
. But as the two examples of
applications of the gen-matmul
combinator show, they will
all have to use the map2
and the fold
combinator. It would be cleaner to have the map2
and the
fold
as part of the general combinator, so that users only
have to include what is specific to the datatype of the matrices. The
specific parts are a binary operator [B1] (which is [*] or [and] in the
examples), and also a zero element Z2 and a binary operator [B2] (which
are 0 and [+] or false and [or] in the examples). The value Z2 has to be
the zero element for the binary operation in [B2]. As
gen-matmul
stands, one has to provide one quotation
parameter, in the form:
[ [B1] map2 Z2 [B2] fold ] gen-matmul
It would be cleaner if one could provide just what is really needed:
[B1] Z2 [B2] gen-matmul
The required change to gen-matmul
is quite simple, all
that is needed is a preliminary program P which changes the three simple
parameters to the one complicated parameter:
[B1] Z2 [B2] P ==> [ [B1] map2 Z2 [B2] fold ]
The preliminary program can first use [fold] cons cons
,
and this will at least change Z2 [B2] into [Z2 [B2] fold]
.
Following that some other changes C are needed:
[B1] Z2 [B2] [fold] cons cons C ==> [ [B1] map2 Z2 [B2] fold ]
The first parameter [B1] has to be changed to
[[B1] map2]
, but this has to be done with dip
just below the last constructed quotation. It is done by
[[map2] cons] dip
:
[B1] Z2 [B2] [fold] cons cons [[map2] cons] dip F ==> [ [B1] map2 Z2 [B2] fold ]
where the finalisation F simply has to concatenate the two previous constructions:
[B1] Z2 [B2] [fold] cons cons [[map2] cons] dip concat
==> [B1] map2 Z2 [B2] fold ]
The above program becomes a new first line in the otherwise unchanged
program for the gen-matmul
combinator:
gen-matmul == [fold] cons cons [[map2] cons] dip concat
[transpose] dip
[map popd] cons cons
map
Now the multiplication operators for numerical and logical matrices can be given by the more perspicuous definitions:
num-matmul == [*] 0 [+] gen-matmul
log-matmul == [and] false [or] gen-matmul
The two definitions just above will be suitable for many purposes.
Like all definitions so far, they even work for degenerate matrices with
no rows and no columns. Such degenerate matrices are of course
represented as [], and their arithmetic or logical product is again [].
For other datatypes it is often easy to give the appropriate definition.
For example one might need an operation for the multiplication of
matrices in which the elements are sets. In Joy set intersection is just
and
, whereas set union is just or
. The zero
element for union is {}. So a suitable definition is:
set-matmul == [and] {} [or] gen-matmul
Comparing logical matrix multiplication and set matrix
multiplication, the two are almost identical except for the different
zero elements false
and {}. This difference has the
consequence that whereas the and-operator and the or-operator work
equally for logical and set operands, we cannot define a single matrix
multiplication operator that works equally for logical and set
operations. That would be unfortunate, although one could first do a
very simple test on the first element in the first row of M2 to
determine whether it is a truth value or a set.
But there are other cases where there could not be such a simple device. Consider the problem of multiplying second order matrices in which the elements themselves are numerical matrices of compatible rows and columns but where the number of rows and columns are not known in advance. This means that the zero element for the addition of the submatrices is also not known in advance, and hence in the definition below the ??? is not known in advance:
num-matmul2 == [num-matmul] ??? [num-matadd] gen-matmul
What is particularly annoying is that ??? is the only part that needs to be known for the definition to work.
There would be other cases in which the same problem arises. They
will all involve the datatype of the Z2 zero element for the binary [B2]
operation, where that datatype is of indeterminate size, shape or
structure. There are no such restrictions on [B1] and [B2]. It is true
that a suitable Z2 can always be constructed from two sample elements
from M1 and M2 by applying the [B1] operation, and then making a
suitable Z2 from that. All this would have to be encoded in a very
complex version of gen-matmul
.
But there is a simpler solution. Consider again the versions of the numerical matrix multiplication operators of sections 1 and 2. They required, in some way or another, a program to compute a scalar product:
[ [*] map2 0 [+] fold ]
In all but the degenerate case the list produced by map2
will not be null. So in the normal case the list can be taken apart with
unswons
. That will leave the first element of that list,
followed by the rest of the list. The rest can now be folded by using
not 0 as the initial value of the accumulator, but the first element of
the list. In other words, the above program fragment could be replaced
by:
[ [*] map2 unswons [+] fold ]
So it would be possible to have just [*] and [+] as the parameters to
the general matrix multiplication combinator, and let the combinator
supply map2
, unswons
and fold
.
This would imply changing the first line of the last version by a
program P which does the conversion:
[B1] [B2] P ==> [ B1] map2 unswons [B2] fold ]
The conversion is quite similar to the earlier one. But now only one
parameter, [B1]
has to be consed into [fold]
,
then [B1]
has to be consed, through dip
, into
[map2 unswons]
, and then these results concatenated. So the
required program P is:
[fold] cons [[map2 unswons] cons] dip concat
This program replaces the first line of the previous definition of the general matrix multiplication combinator:
gen-matmul == [fold] cons [[map2 unswons] cons] dip concat
[transpose] dip
[map popd] cons cons
map
Now the multiplication operators can be defined quite simply. The first is for numerical matrices. In Joy0 this means just integer, in Joy1 this means integer or float. The second is for Boolean matrices, of either truth values or sets. The third is for second order matrices of numerical matrices. The fourth is for second order matrices of Boolean matrices:
num-matmul == [*] [+] gen-matmul
bool-matmul == [and] [or] gen-matmul
num-matmul2 == [num-matmul] [num-matadd] gen-matmul
bool-matmul2 == [bool-matmul] [bool-matadd] gen-matmul
It is easy to see how multiplication of third order matrices would be defined. But it is doubtful whether even second order matrices arise.
It could be useful to have a single matrix multiplication operator
which can handle at least numerical and Boolean matrices. To do so, it
must inspect the type of the elements. If they are numerical it must
behave like num-matmul
above, and if they are Boolean it
must behave like bool-matmul
above. If the type is neither,
an error must be reported and the execution aborted. As a first draft,
the following is useful:
dup first first
(* branching on the type,
push [*] and [+], or push [and] and [or],
or report error and abort *)
gen-matmul
The branching looks like a three-way branch which could be handled by
two ifte
combinators, one inside another, or by a single
cond
combinator. But in Joy1 the numerical and the Boolean
cases both split into two: the numerical types are integer and float,
and the Boolean cases are logical and set. (In Joy0 the only numerical
type is integer.) So each of the two correct situations can arise in two
ways: when the element type is integer or float, and when the element
type is logical or set. The predicates numerical
and
boolean
test for that. The cond
combinator
takes the two normal cases and as its default the error condition:
[ [ [numerical] pop [*] [+] ]
[ [boolean] pop [and] [or] ]
[ "unknown operands\n" putchars abort ] ]
cond
Alternatively, the branching could be based on those four cases, with
a fifth, default case for the error situation. Since this branching is
on the basis of types, the opcase
operator is most
suitable:
[ [ 0 pop [*] [+] ]
[ 0.0 pop [*] [+] ]
[ true pop [and] [or] ]
[ {} pop [and] [or] ]
[ "unknown operand\n" putchars abort] ]
opcase
It may be a matter of taste whether the version with the
cond
combinator or the version with the opcase
combinator is preferable. Either of the two could be used as the
insertion in the earlier skeleton.
But before that, it is useful to remember at this point that the
gen-matmul
combinator will fail for degenerate matrices.
But that is easily fixed right here: the product of two degenerate
matrices is just the degenerate matrix []. So the earlier skeleton
should be wrapped inside an ifte
combinator: if the top
matrix is degenerate, pop
it off and return the other
matrix (which should be degenerate, too). Otherwise, proceed with the
ordinary version:
poly-matmul ==
[ null ]
[ pop ]
[ dup first first
[ [ [numerical] pop [*] [+] ]
[ [boolean] pop [and] [or] ]
[ "unknown operand type\n" putchars abort] ]
cond
gen-matmul ]
ifte
This definition of the polymorphic matrix multiplication operator is typical of how generality can be achieved in Joy without the use of object oriented concepts. The extra computation needed for such polymorphism might seem a waste, but it is essential for writing useful general libraries.
[MYNOTES: Some of the definitions in here have been included in the matrix/vector library, but under quite different systematic names.]