Practical and theoretical aspects of software development

## The first thing you should know about matrices.

When you look at NumPy, Python’s numerical library, the first object you will encounter is the ndarray, or n-dimensional array. Also known as a matrix, right? But wait: NumPy also has a separate matrix class, so that can’t be right. Worse, matrix is not nearly as general as ndarray: it allows only two dimensions. Is that really good enough for real-world data?

Based on my experience in higher education, I’m confident that over 90% of those who have seen matrices missed the point. If you don’t know why a matrix must have exactly two dimensions, then you don’t know what a matrix really is. (So read on.)

Matrices are used in such a variety of ways that it’s easy to miss the essential connection. And it doesn’t help that you encounter them backwards. First you see them, then you learn how easily you can add or subtract them, or multiply them by scalars. Eventually, you see that multiplication isn’t quite so obvious. But you get the hang of it, without recognizing that this multiplication is their very reason for existence. As a first application, you solve equations by row-reduction, which seems like a bookkeeping trick. After that, it’s probably time to move on to another topic …

Let’s try to re-do this right. We need a definition:

An (m × n) matrix is a representation of a linear function from n-dimensional space to m-dimensional space.

Forget about the way it looks; we need to think about what a matrix does. A matrix is a function that takes as its input a vector with n dimensions and outputs a vector with m dimensions.

Consider an example:

$\begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 2 \cdot 0 + 4 \cdot 2 \\ 0 \cdot 1 + 1 \cdot 0 - 1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 9 \\ -2 \end{bmatrix}$

The matrix on the left, along with this way of multiplying a matrix and a vector, is a function that takes the input vector $(1,0,2)$ to the output vector $(9,-2)$. Moreover, the set of all $2 \times 3$ arrays corresponds exactly with the set of all linear functions that take three-dimensional vectors to two-dimensional vectors.
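Here is a minimal NumPy sketch of that same product, using a plain ndarray and the `@` operator. The variable names `A`, `x`, and `y` are my own choices for illustration.

```python
import numpy as np

# The 2 x 3 matrix from the example above.
A = np.array([[1, 2, 4],
              [0, 1, -1]])

# A three-dimensional input vector.
x = np.array([1, 0, 2])

# The @ operator computes the matrix-vector product.
y = A @ x

print(A.shape)     # (2, 3): two output dimensions, three input dimensions
print(y.tolist())  # [9, -2]
```

Read the shape as (output dimension, input dimension), and the array starts to look like a function signature.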

In symbols, linear means that $A (\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A \mathbf{y}$ and $cA\mathbf{x} = A(c \mathbf{x})$ where $A$ is a matrix, $\mathbf{x}, \mathbf{y}$ are vectors, and $c$ is a scalar. In words, a function is linear if the function of a sum is the sum of the function values, and if scaling the input scales the output by the same factor. In the one-by-one case, the linear functions are simply the functions that multiply by a constant. In higher dimensions, linear functions are far more interesting and non-trivial.
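If you want to see both properties hold numerically, a quick check along these lines should do. The vectors, the scalar, and the helper `square` are hypothetical choices of mine, included only to contrast a matrix with a function that is not linear.

```python
import numpy as np

A = np.array([[1, 2, 4],
              [0, 1, -1]])
x = np.array([1, 0, 2])
y = np.array([3, -1, 5])
c = 7

# Additivity: A(x + y) equals Ax + Ay.
additive = np.array_equal(A @ (x + y), A @ x + A @ y)

# Homogeneity: c(Ax) equals A(cx).
homogeneous = np.array_equal(c * (A @ x), A @ (c * x))


def square(v):
    """Hypothetical non-linear function: square each component."""
    return v ** 2


# Squaring fails additivity, so no matrix can represent it.
square_additive = np.array_equal(square(x + y), square(x) + square(y))

print(additive, homogeneous, square_additive)  # True True False
```

A single passing check proves nothing in general, of course, but a single failing one (as with `square`) is enough to rule a function out.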

Matrices would be functions with or without our concern for linearity. For every input, there is a single output, which is all that is required for functionhood. The importance of linearity here is that the vast majority of vector functions cannot be represented by matrices, since they are non-linear. To see how we find a matrix from a linear transformation (a linear vector function), let’s consider an example.

Obtaining a matrix to represent a linear transformation:

Let $T$ be a linear transformation from three-dimensional space to two-dimensional space, and suppose we know the values of $T$ on the standard unit vectors:

$T \left(\begin{smallmatrix} 1 \\ 0 \\ 0 \end{smallmatrix}\right) = \left(\begin{smallmatrix} 3 \\ 1 \end{smallmatrix}\right)$, $T \left(\begin{smallmatrix} 0 \\ 1 \\ 0 \end{smallmatrix}\right) = \left(\begin{smallmatrix} 2 \\ 0 \end{smallmatrix}\right)$, and $T \left(\begin{smallmatrix} 0 \\ 0 \\ 1 \end{smallmatrix}\right) = \left(\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right)$.

Then to determine $T \left(\begin{smallmatrix} 3 \\ 4 \\ 5 \end{smallmatrix}\right)$, using the linearity of $T$, we have:

$T \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = 3 \cdot T \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 4 \cdot T \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 5 \cdot T \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} =3 \begin{bmatrix} 3 \\ 1 \end{bmatrix} + 4 \begin{bmatrix} 2 \\ 0 \end{bmatrix} + 5 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$
$= \begin{bmatrix} 3 \cdot 3 + 4 \cdot 2 + 5 \cdot 0 \\ 3 \cdot 1 + 4 \cdot 0 + 5 \cdot 1 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$

Notice in particular that the columns of the matrix that represents $T$ are the values of $T$ applied to the standard unit vectors. To check that a matrix $A$ represents $T$, you could multiply $A$ by each of the standard unit vectors and compare the results with the known values of $T$.
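The same construction can be sketched in NumPy. The function `T` below is a hypothetical stand-in that I wrote to agree with the three values given above; the point is that the matrix is recovered purely from what `T` does to the standard unit vectors.

```python
import numpy as np


def T(v):
    """Hypothetical linear map matching the values of T given above."""
    x, y, z = v
    return np.array([3 * x + 2 * y, x + z])


# The rows of the identity matrix are the standard unit vectors.
basis = np.eye(3, dtype=int)

# Stack the images of the unit vectors side by side as columns.
A = np.column_stack([T(e) for e in basis])

v = np.array([3, 4, 5])

print(A.tolist())        # [[3, 2, 0], [1, 0, 1]]
print((A @ v).tolist())  # [17, 8]
print(T(v).tolist())     # [17, 8]
```

Multiplying `A` by each unit vector picks out the corresponding column, which is exactly the check described above.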

What’s it all mean?

Why are matrices two-dimensional? Because a function has an input and an output, or a domain and a range. The number of rows is the dimension of the output vector, and the number of columns is the dimension of the input vector.

Why should we care that matrices are functions? It helps us recognize that matrices are the things that do something, rather than the data that we do something with. Also, thinking of matrices as functions helps us make sense of the various matrix operations:

• Matrix multiplication is function composition, which is highly noncommutative.
• Matrix addition and subtraction are just the addition and subtraction of functions: easy.
• Inverting matrices should be hard, and sometimes impossible. After all, not all functions have inverses.
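
Each of these three points can be checked with a short NumPy sketch. The matrices `A`, `B`, and `P` are small examples I picked for illustration: a shear, a coordinate swap, and a projection that throws information away and so cannot be undone.

```python
import numpy as np

A = np.array([[1, 2],
              [0, 1]])   # a shear
B = np.array([[0, 1],
              [1, 0]])   # swaps the two coordinates
x = np.array([3, 4])

# Multiplication is composition: (AB)x equals A(Bx).
composition_ok = np.array_equal((A @ B) @ x, A @ (B @ x))

# The order of composition matters: AB differs from BA here.
commutes = np.array_equal(A @ B, B @ A)

# Addition is addition of functions: (A + B)x equals Ax + Bx.
addition_ok = np.array_equal((A + B) @ x, A @ x + B @ x)

# A projection onto the first coordinate has no inverse.
P = np.array([[1, 0],
              [0, 0]])
try:
    np.linalg.inv(P)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False

print(composition_ok, commutes, addition_ok, invertible)
# True False True False
```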

Finally, thinking of matrices as functions makes it clear that if you have a $(100 \times 100 \times 100)$ cube of data, you do not have a matrix. You should not wonder how to multiply it, or if it has an inverse. Don’t try to fit it into NumPy’s matrix class.

What you do have is a million 3-dimensional vectors … which might be the input of a three-column matrix function.
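
As a rough sketch of that idea, here is one possible reading, and it is an assumption on my part: treat the index $(i, j, k)$ of each cell in the cube as a 3-dimensional vector, then apply a three-column matrix to all of them at once. The matrix `A` is just the $2 \times 3$ example from earlier, reused for illustration.

```python
import numpy as np

n = 100

# Assumption: each cell's index (i, j, k) is treated as a 3-vector.
# np.indices gives shape (3, n, n, n); flatten to one row per cell.
points = np.indices((n, n, n)).reshape(3, -1).T   # shape (n**3, 3)

# A three-column matrix: a linear function from 3 dimensions to 2.
A = np.array([[1, 2, 4],
              [0, 1, -1]])

# Apply the function to every vector at once. Each row of `points`
# is an input, so we multiply by the transpose of A.
images = points @ A.T                             # shape (n**3, 2)

print(points.shape)  # (1000000, 3)
print(images.shape)  # (1000000, 2)
```

Notice that the data lives in an ndarray of vectors while the matrix stays a small two-dimensional object that acts on them.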

Written by Eric Wilson

September 15, 2011 at 9:52 pm

Posted in didactics
