Chapter 2B: Linear Algebra (Matrices)
Learning Objectives
By the end of this lecture, you will be able to:
- multiply a matrix with a vector (and use
glMatrix vecN.transformMatN ),
- multiply a matrix with another matrix (and use
glMatrix matN.multiply ),
- invert a 2x2 matrix (by hand) and 3x3 matrices with
glMatrix matN.invert ,
- interpolate parameters defined at chosen frames with either lines or cubic curves,
- apply de Casteljau's algorithm to evaluate and render a cubic Bézier curve,
- compute the transpose of a matrix.
|
 |
Our topic today builds off of what we introduced in Chapter 2A. Specifically, our goal is to talk about matrices and the various matrix operations we need for computer graphics. In particular, we'll talk about matrix-vector and matrix-matrix multiplication, as well as matrix inverses and the transpose.
Instead of developing all of this in an abstract context, let's motivate this by an actual computer graphics example. In animation, an artist will typically use keyframes to specify the different poses of scene objects at different points in time. It would be inefficient for an artist to specify the poses at every frame. Instead they specify them at selected frames and then figure out what the scene should look like at "in-between" frames (historically, "inbetweeners" were people hired to draw these in-between frames).
We're going to use a computer to be our inbetweener. But, in order to do so, we need to mathematically specify how the scene parameters (locations, poses, colors) change. We're working in a space where our conventional x-axis is time ($t$) and the y-axis is some parameter we're trying to model. We'll assume this parameter is the height of the ball, denoted by $h$.
Linear interpolation
Let's start off by fitting a straight line to define how parameters change between pairs of keyframes. So we are given two pairs $(t_0, h_0)$ and $(t_1, h_1)$. Assuming the motion varies linearly with time, the height of the ball can be represented as:
$$
h(t) = a t + b,
$$
where $a$ and $b$ need to be determined. We can plug in the two values we know (defined at the two keyframes):
$$
\begin{align*}
h_0 & = a t_0 + b \\
h_1 & = a t_1 + b.
\end{align*}
$$
We could substitute one equation into the other to solve for $a$ and $b$ but this will be harder to apply to more general curves. This is where vector and matrix notation comes in. We'll represent the left-hand-side of the equation as some vector $\vec{h}$ (which has two components, $h_0$ and $h_1$). We can also introduce the vector $\vec{t}$ with components $t_0$ and $t_1$. Our equation can now be written as:
$$
\vec{h} = \left[\begin{array}{c}h_0 \\
h_1\end{array}\right] = a \left[\begin{array}{c}t_0 \\
t_1\end{array}\right] + b \left[\begin{array}{c}1 \\
1\end{array}\right] = a \vec{t} + b\vec{1},
$$
where we have written $\vec{1}$ to be the vector where both components are $1$. Written in this way, we can say that $\vec{h}$ is a linear combination of the vectors $\vec{t}$ and $\vec{1}$, where the weights on these vectors are $a$ and $b$, respectively.
We still need to solve for $a$ and $b$ and this is where a matrix will come into play. Let's stack $a$ and $b$ into its own vector called $\vec{c}$ (you can take the letter $c$ to denote "coefficients"). Our equation can be written as:
$$
\vec{h} = \left[\begin{array}{c}h_0 \\
h_1\end{array}\right] = a \left[\begin{array}{c}t_0 \\
t_1\end{array}\right] + b \left[\begin{array}{c}1 \\
1\end{array}\right] = a \vec{t} + b\vec{1} = \left[\begin{array}{cc} t_0 & 1 \\
t_1 & 1 \end{array}\right] \left[\begin{array}{c} a \\
b \end{array}\right] = \mathbf{A} \vec{c}.
$$
The thing with 2 rows and 2 columns ($\mathbf{A}$) is called a matrix - in this course, we will only ever work with square matrices where # rows = # columns. The product $\mathbf{A}\vec{c}$ is called a matrix-vector multiplication, and we can think of the result (which is a vector) as a linear combination of the columns of $\mathbf{A}$, with the weights on each column being the components of $\vec{c}$. The number of columns of $\mathbf{A}$ needs to be equal to the number of components of $\vec{c}$.
To solve for $\vec{c}$ (which would give us $a$ and $b$), we can multiply both sides of this equation by the inverse of $\mathbf{A}$, denoted by $\mathbf{A}^{-1}$:
$$
\mathbf{A}^{-1} \vec{h} = \mathbf{A}^{-1} \mathbf{A} \vec{c} = \mathbf{I}\ \vec{c} = \vec{c}.
$$
The inverse of a matrix $\mathbf{A}$ has the property that $\mathbf{A}^{-1}\mathbf{A}$ is equal to a special matrix called the identity matrix $\mathbf{I}$, where there is a $1$ in the diagonal entries and zero in the off-diagonal entries. Think of this as multiplying a scalar $a$ by $\frac{1}{a}$: $a\ a^{-1} = 1$. For a 2x2 matrix $\mathbf{A}$, there is an analytic formula for the inverse:
$$
\mathbf{A} = \left[\begin{array}{cc}a_{0,0} & a_{0,1} \\
a_{1,0} & a_{1,1}\end{array}\right], \quad \mathbf{A}^{-1} = \frac{1}{a_{0,0}a_{1,1} - a_{0,1}a_{1,0}}\left[\begin{array}{cc}\phantom{-}a_{1,1} & -a_{0,1} \\
-a_{1,0} & \phantom{-}a_{0,0}\end{array}\right].
$$
Before we confirm that $\mathbf{A}^{-1}\mathbf{A}$ results in the identity matrix $\mathbf{I}$, it's worth commenting on how we compute $\mathbf{A}^{-1}\mathbf{A}$ in the first place. In general for two matrices $\mathbf{A}$ and $\mathbf{B}$ (which we will assume are both square $n\times n$ matrices), the matrix-matrix multiplication $\mathbf{A}\mathbf{B} = \mathbf{C}$ results in an $n \times n$ matrix $\mathbf{C}$ in which each column of $\mathbf{C}$ is the result of the matrix-vector product between $\mathbf{A}$ and each column of $\mathbf{B}$. For 2x2 matrices this looks like:
$$
\mathbf{C} = \mathbf{A}\mathbf{B} = \left[\begin{array}{c}
\boxed{\phantom{---} \vec{a}_0\phantom{---}} \\
\boxed{\phantom{---} \vec{a}_1\phantom{---}}\end{array}\right]
\left[\begin{array}{c}
\boxed{\phantom{0}\\
\vec{b}_0\\
}\end{array}
\begin{array}{c}
\boxed{\phantom{0}\\
\vec{b}_1\\
}\end{array}
\right] =
\left[\begin{array}{c}
\boxed{\phantom{0}\\
\mathbf{A}\vec{b}_0\\
}\end{array}
\begin{array}{c}
\boxed{\phantom{0}\\
\mathbf{A}\vec{b}_1\\
}\end{array}
\right]
$$
A particular entry $c_{i,j}$ (in the $i$-th row and $j$-th column of $\mathbf{C}$) can also be viewed as the dot product between the $i$-th row of $\mathbf{A}$ and the $j$-th column of $\mathbf{B}$: $c_{i,j} = \vec{a}_i \cdot \vec{b}_j$. This can also be visualized as:
$$
\mathbf{C} = \mathbf{A}\mathbf{B} = \left[\begin{array}{c}
\boxed{\phantom{---} \vec{a}_0\phantom{---}} \\
\boxed{\phantom{---} \vec{a}_1\phantom{---}}\end{array}\right]
\left[\begin{array}{c}
\boxed{\phantom{0}\\
\vec{b}_0\\
}\end{array}
\begin{array}{c}
\boxed{\phantom{0}\\
\vec{b}_1\\
}\end{array}
\right] =
\left[\begin{array}{cc} \boxed{\vec{a}_0 \cdot \vec{b}_0} & \boxed{\vec{a}_0 \cdot \vec{b}_1} \\
\boxed{\vec{a}_1 \cdot \vec{b}_0} & \boxed{\vec{a}_1 \cdot \vec{b}_1} \end{array}\right]
$$
In general, $\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}$, but equality may hold for special matrices.
Let's now compute what $\mathbf{A}^{-1}\mathbf{A}$ is equal to (using the formula for the inverse):
$$
\mathbf{A}^{-1}\mathbf{A} = \frac{1}{a_{0,0}a_{1,1} - a_{0,1}a_{1,0}} \left[\begin{array}{c}
\boxed{\phantom{-}a_{1,1} \ \ -a_{0,1}}\\
\boxed{-a_{1,0} \ \ \ \phantom{-}a_{0,0}}\end{array}\right]
\left[\begin{array}{c}
\boxed{a_{0,0}\\
a_{1,0}}
\end{array}
\begin{array}{c}
\boxed{a_{0,1}\\
a_{1,1}}
\end{array}\right] = \frac{1}{a_{0,0}a_{1,1} - a_{0,1}a_{1,0}} \left[\begin{array}{cc}
\boxed{a_{1,1}a_{0,0} - a_{0,1}a_{1,0}} & \boxed{a_{1,1}(-a_{0,1} + a_{0,1})} \\
\boxed{a_{0,0}(-a_{1,0} + a_{1,0})} & \boxed{-a_{1,0}a_{0,1} + a_{0,0}a_{11}}\end{array}\right] = \left[\begin{array}{cc} 1 & 0 \\
0 & 1\end{array}\right],
$$
which is the identity matrix. All of the concepts here extend to more dimensions, but it just gets more tedious to compute the inverse (we'll use glMatrix
to do this for us, anyway).
Example
At $t = 0.5$ seconds, let the ball have a height of $0.25$ (meters) and at $t = 0.75$ seconds, the height of the ball is $2$ meters. Using linear interpolation, determine an expression for the height as a function of $t$.
Plugging the values into the equation we derived earlier, we have:
$$
\begin{align*}
0.25 & = a (0.5)\phantom{7} + b \\
2 & = a (0.75) + b,
\end{align*}
$$
which leads to the system of equations (written with matrix and vector notation):
$$
\left[\begin{array}{cc}0.5 & 1 \\
0.75 & 1\end{array}\right] \left[\begin{array}{c} a \\
b \end{array}\right] = \left[\begin{array}{c} 0.25 \\
2 \end{array}\right] \quad \rightarrow \quad \left[\begin{array}{c} a \\
b \end{array}\right] = \frac{1}{(0.5 - 0.75)}\left[\begin{array}{cc}1 & -1 \\
-0.75 & 0.5\end{array}\right]\left[\begin{array}{c} 0.25 \\
2 \end{array}\right] = -4 \left[\begin{array}{c} -\frac{7}{4} \\
\phantom{-}\frac{13}{16} \end{array}\right] = \left[\begin{array}{c} \phantom{-}7 \\
-\frac{13}{4}\end{array}\right],
$$
meaning $a = 7$ and $b = -3.25$. We can also calculate this using glMatrix
, using the mat2.invert
function (inverse), mat2.multiply
(matrix-matrix multiplication) and vec2.transformMat2
(matrix-vector multiplication):
let rhs = vec2.fromValues(0.25, 2);
let A = mat2.fromValues(0.5, 0.75, 1, 1);
let Ainv = mat2.invert(mat2.create(), A);
let c = vec2.transformMat2(vec2.create(), rhs, Ainv); // note that the vector is first
console.log(c); // [7, -3.25]
// we can also check that A * Ainv is equal to the identity matrix
let I = mat2.multiply(mat2.create(), A, Ainv);
console.log(I); // [1, 0, 0, 1] (again printed in column-major order)
Note that when using fromValues
for a matrix, the parameters are passed in COLUMN-MAJOR ORDER. In other words, imagine reading the entries with an outer for
-loop over the columns (from left-to-right) and then an inner for
-loop over the column rows (top-to-bottom). This is another common source of bugs so please be mindful that glMatrix
assumes entries are stored in column-major order.
Although it appears to be more work to do it this way, the concepts extend to larger matrices.
Putting the pieces together for an animation tool
Remember, our job is to put together the tools an artist might need. Let's use the math we derived above and make our own computer-based inbetweener using linear interpolation. In the example below, the pink dots denote points we can move around. Originally, this example is set to use Linear Interpolation to determine the height of the ball in between consecutive keys. Click and drag the pink dots to adjust the motion of the ball.
One of the main issues with this method (Linear Interpolation) is that the motion looks kind of choppy. We can make it smoother by computing a higher-order curve. A quadratic curve would allow us to control the slope of the curve and a cubic curve would allow us to control the curvature. We'll use a cubic curve since cubic curves are commonly used in practice.
Representing the height as a cubic function of time means we will now use the following representation:
$$
h(t) = a\ t^3 + b\ t^2 + c\ t + d,
$$
for some coefficients $a$, $b$, $c$ and $d$. To determine these four coefficients, we need four equations, and these four equations can come from forcing our curve to go through four known keys. Note that the animation above actually has two curves: one for the downards motion of the ball and another for the upwards motion.
Example
Let's focus on the first curve (the downwards motion of the ball until it hits the floor), and suppose the artist has specified the following keys for the motion of the ball:
time |
height |
0 |
1 |
0.25 |
0.25 |
0.5 |
0.125 |
1 |
0 |
Determine the equation of the curve that passes through all of these points and evaluate the resulting curve at $t = 0.75$. Open up the Console (right-click on this page and click Inspect) to do all the operations. Use glMatrix
to do all the calculations (which is already loaded in this webpage). The solution procedure will be very similar to what we did for linear interpolation, but you'll need the vec4
and mat4
namespaces instead of the vec2
and mat2
namespaces we used. Specifically, you'll need vec4.fromValues
, mat4.create()
(or maybe mat4.fromValues
), mat4.invert
and vec4.transformMat4
. When you set up the 4x4 matrix, remember that glMatrix
uses column-major order.
You can (roughly) verify your solution by setting the mode to Cubic Interpolation in the dropdown above.
Representing the motion with Bezier curves.
Try to click and drag the red dots again (leaving the mode as Cubic Interpolation). Notice that the curve wiggles a lot and makes it hard to prevent the ball from going below the floor, which doesn't make any physical sense.
Because of this, interpolation is not really used in practice. In fact, Bézier curves are more common because they provide a more intuitive way to control the motion of the ball. With cubic Bézier curves, we still control four points, but we'll interpret the two interior points a bit differently. Instead of forcing the curve to pass through these two interior (pink) points, we'll force the tangent (or slope) of the curve to align with the line connecting the exterior (black) points with the interior (pink) points. This means that the dashed lines in the example above will graze the curve at the endpoints.
The four points defining each curve segment are called control points and the nice thing about Bézier curves is that the entire curve is contained within the convex hull of the control points. If you imagine wrapping an elastic band around the control points, this means the curve is entirely contained within the elastic band.
There is also an efficient algorithm called de Casteljau's algorithm that can be used to evaluate and render Bézier curves.
de Casteljau's algorithm
Given a set of control points for a Bézier curve, we may want to render the curve. One option is to split up our interval into a uniformly spaced set of $t$ values, evaluate the basis and connect the dots. However, we can do better by employing another property of Bézier curves. Consider the control polygon formed by the points $\vec q_0,\ \vec q_1,\ \vec q_2$ and $\vec q_3$, which control a Bézier curve parametrized in the interval $[0,1]$. To evaluate the curve at a point $t = 0.5$:
- Take the midpoint of the 3 segments defining your control polygon: $\vec m_0,\ \vec m_1,\ \vec m_2$ (in red).
- Connect the three midpoints to form two new segments (dashed red lines).
- Compute the midpoints of these two segments: $\vec r_0$ and $\vec r_1$ (in gold).
- Connect the two midpoints to form a single new segment (dashed gold line, a bit hard to see).
- Compute the midpoint of this new segment: this is the Bézier curve evaluated at $t = 0.5$ (gray point).
This procedure is known as de Casteljau's algorithm. Effectively, this produces two new Bézier curves on either side of the point at $t = 0.5$. Need a more accurate rendering of your Bézier curve? Keep applying the above steps on the Bézier curves produced at every level of de Casteljau's algorithm. If you need to evaluate the Bézier curve at any $t$, divide the segments into fractions of $t$ and $1-t$ instead of a fraction of $1/2$ (i.e. at the midpoint). For example, $\vec m_0 = \vec q_0 + t (\vec q_1 - \vec q_0)$, and so on.
Rendering a Bézier curve recursively.
We can apply de Casteljau's algorithm recursively to render a Bézier curve with a specified "depth." That is, we start by evaluating the Bézier curve (with de Casteljau's algorithm) at $t = 0.5$ which produces two new Bézier curves. The first Bézier curve is defined by the control points: ${ \vec q_0,\vec m_0,\vec r_0,\vec p(0.5) }$. The second Bézier curve is defined by the control points: ${ \vec p(0.5),\vec r_1,\vec m_2,\vec q_3 }$. We then recurse on these two Bézier curves, decrementing our depth until a depth of zero is reached. When we reach a depth of zero (the base case of our recursion), then we just draw straight lines connecting the control points.
A few other matrix operations.
This is everything we need for now, but one other matrix operation we'll need later is called the transpose. The transpose of a matrix with $n$ rows and $n$ columns swaps the $i$-th row with the $i$-th column ($0 \le i \lt n$) and is denoted with a $^T$ superscript. For example, the transpose of a 2x2 matrix $\mathbf{A}$ is denoted by $\mathbf{A}^T$ and is computed as:
$$
\mathbf{A} = \left[\begin{array}{cc}a_{0,0} & a_{0,1} \\
a_{1,0} & a_{1,1}\end{array}\right], \quad \mathbf{A}^T = \left[\begin{array}{cc}a_{0,0} & a_{1,0} \\
a_{0,1} & a_{1,1}\end{array}\right].
$$
Although we won't use it much in our course, it's good to know that matrix addition and subtraction also exist. These operations follow the same idea as with vector addition and subtraction: the entries of the resulting matrix from the addition (subtraction) are the entry-wise addition (subtraction) of the entries in the involved matrices.
If you stick to the convention that vectors are always understood to be column vectors, then we technically should have written the rows in the matrix-matrix multiplication section as a transpose, e.g. $\vec{a}_0^T$ and $\vec{a}_1^T$, but it was mentioned that they represented rows.
Final note!
Notice that the ball kind of sqashes and stretches when it reaches the floor. This is one of John Lasseter's Principles of Traditional Animation applied to 3D Computer Animation, written in 1987. If you're curious, see the paper and think about these principles while watching the 1986 Pixar short called Luxo Jr. :)