13 February, 2016

Linear Algebra

While linear algebra is, strictly speaking, a subject of algebra and not of analysis or
differential equations (which are, along with probability theory, the basic mathematical
tools used for studying quantitative finance), the theory of vector spaces is so fundamen-
tal to the modern formulation of either, that it is necessary to study it before proceeding
too far. Indeed, the theory has proven itself to be perhaps the most influential in unifying
significant branches of modern mathematics, much as algebra did for classical mathemat-
ics centuries ago. We thus begin Part I: Review of Undergraduate Mathematics with the
subject of linear algebra, which we divide into two further sub-parts, one covering the
general finite-dimensional theory including that of vector spaces, inner product spaces,
dual spaces, and linear transformations on these spaces, and the other a more detailed
analysis of linear operators, focusing in particular on general canonical representations
and spectral analysis of specific types of commonly encountered operators (e.g., normal
operators). The infinite-dimensional theory (i.e., functional analysis) will be postponed
to a later part, and any reference to linear algebra here implies the finite-dimensional
case, unless the contrary is stated.
We develop the theory in close accordance with the classic textbook [1]. The reader is
assumed to have studied linear algebra previously; accordingly, and also in keeping with
the main objective of quantitative finance, several elements of the theory are omitted,
many concepts are left implicit, and the proofs of many theorems are left as exercises. The
development of the theory will be mostly pedagogical, but at the end we shall summarize
many of the results into one of the most remarkable theorems in all of linear algebra: the
invertible matrix theorem (essentially a very long list of surprising equivalent conditions
for a matrix to be, among many other things, invertible, and hence for a linear operator on a
finite-dimensional vector or inner product space to be invertible). An analogous theorem
holds for linear transformations between spaces not necessarily of the same dimension,
though of course with fewer equivalences.
1. Linear Systems, Matrices and Elementary Row Operations
Before reviewing finite-dimensional vector spaces and their transformations, it is
helpful to first review linear systems of equations and matrices, both as a source of examples
later and, more practically, because almost all concrete calculations arising from vector
spaces require solving a system of linear equations, a task that matrices help organize
and execute. More importantly, matrices provide a concrete realization of linear
transformations on finite-dimensional vector spaces and make much of the corresponding
theory of these operators possible, as we shall see.
Theorem 1.1. Let A and B be m × n matrices. If the linear systems Ax = b and By = c
are equivalent, then x = y.
Remark. In words, if two systems of linear equations are equivalent (whether they are
homogeneous or inhomogeneous), then the solution sets are the same.
Proof. . . .
Theorem 1.2. Let A, B, and C be matrices with dimensions such that the products
(AB)C and A(BC) are defined. Then A(BC) = (AB)C; that is, matrix multiplication
is associative.
Proof. . . .
Theorem 1.3. Elementary row operations are invertible and their inverses are of the
same type.
Proof. . . .
Theorem I.4. Let A be an m × n matrix, e an elementary row operation, I the m × m
identity matrix (i.e., I_ij = δ_ij), and define E := e(I). Then
e(A) = e(I)A = EA.
Corollary. A is row-equivalent to B if and only if B = P A, where P = E_1 . . . E_k for
some finite sequence of elementary matrices E_1, . . . , E_k.
Proof. . . .
Theorem 1.4. Let A be an m × n matrix. There exists a unique row-reduced (echelon)
matrix R that is row-equivalent to A.
Proof. . . .
Theorem 1.5. Let A and B be m × n matrices. If A is row-equivalent to B, then the
corresponding homogeneous linear systems Ax = 0 and By = 0 are equivalent and x = y.
Remarks. Applying Theorem 1.5, we see that Rx = 0 has the same solution set
as Ax = 0, where R is the row-reduced (echelon) form of A obtained by applying a
particular sequence of elementary operations (cf. Theorem 1.4). If we apply this same
sequence of elementary operations to the column vector b in the inhomogeneous system
Ax = b, then A ↦ R and b ↦ P b, and we see from Theorem 1.1 (with B = R and
c = P b) that Ax = b and Rx = P b have the same solution set. In particular, we have a
robust way of solving both homogeneous and inhomogeneous systems of linear
equations. This method is colloquially known as Gaussian elimination; a computational
sketch is given after the proof below.
As we will see in later sections when we discuss linear transformations, the converse also
holds: if Ax = 0 and Bx = 0 have the same solution set, then A is row-equivalent to B.
Proof. . . .
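The following is a minimal computational sketch of the Gaussian elimination procedure just described: we row-reduce the augmented matrix [A | b] using the three elementary row operations (swap, scale, eliminate). The example is not taken from [1]; the function name gaussian_solve, the use of NumPy, and the sample data are our own illustrative choices, and the sketch assumes A is square and invertible.

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve the square system Ax = b by row-reducing the augmented
    matrix [A | b] with partial pivoting (a sketch; assumes A invertible)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[0]
    M = np.hstack([A, b])               # augmented matrix [A | b]
    for j in range(n):
        # pick the pivot row with the largest entry in column j
        p = j + np.argmax(np.abs(M[j:, j]))
        M[[j, p]] = M[[p, j]]           # elementary operation: swap rows
        M[j] = M[j] / M[j, j]           # elementary operation: scale pivot row
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]  # elementary operation: eliminate column j
    return M[:, -1]                     # M is now [I | x]; last column is the solution

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = gaussian_solve(A, b)
print(x, np.allclose(np.dot(A, x), b))  # the solution and a consistency check
```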
We conclude our discussion of m × n systems, before moving on to n × n systems, with
one of the most fundamental theorems in linear algebra (its importance being revealed
when we discuss linear transformations).
Theorem 1.6 (Fundamental Theorem of Homogeneous Linear Systems). Let A be an
m × n matrix with m < n. Then the corresponding homogeneous linear system Ax = 0
has a non-trivial solution x ≠ 0.
Proof. . . .
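As a quick numerical illustration of Theorem 1.6 (our own example, not from [1]): when m < n the rank of A is at most m < n, so the rows of V^T from a singular value decomposition beyond the rank give non-trivial solutions of Ax = 0. The particular 2 × 3 matrix below is an arbitrary illustrative choice.

```python
import numpy as np

# A 2 x 3 homogeneous system (m < n): Theorem 1.6 guarantees a nonzero solution.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Rows of Vh beyond rank(A) span the null space; since m < n, at least one
# such row exists, giving a non-trivial solution of Ax = 0.
_, _, Vh = np.linalg.svd(A)
x = Vh[-1]

print(x)                       # a non-trivial null vector
print(np.allclose(A @ x, 0))   # True: Ax = 0
```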
We now move on to discuss n × n matrices and their corresponding systems of linear
equations. The theory that follows is simpler and more extensive than the theory for
m × n matrices and their systems. The reason for this is largely the concept
of invertibility (together with finite dimensionality), which only makes sense for n × n
matrices because of the definition of matrix multiplication given earlier.
Theorem 1.7. Let A be an invertible n × n matrix. Then A^{-1} is an invertible n × n
matrix and
(A^{-1})^{-1} = A.
Furthermore, if A = A_1 . . . A_k and each of the A_j (j = 1, 2, . . . , k) is invertible, then so
is A and
A^{-1} = A_k^{-1} . . . A_1^{-1}.
Remark. The converse to the second assertion is also true and will be proved as a corollary
to Theorem 1.8.
Proof. . . .
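A quick numerical sanity check of the product formula in Theorem 1.7 (an illustrative sketch of our own, using NumPy; the random factors are invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three random 4 x 4 factors (almost surely invertible), as in Theorem 1.7 with k = 3.
A1, A2, A3 = (rng.standard_normal((4, 4)) for _ in range(3))
A = A1 @ A2 @ A3

# (A1 A2 A3)^{-1} should equal A3^{-1} A2^{-1} A1^{-1}, with the order reversed.
lhs = np.linalg.inv(A)
rhs = np.linalg.inv(A3) @ np.linalg.inv(A2) @ np.linalg.inv(A1)
print(np.allclose(lhs, rhs))   # True (up to floating-point error)
```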
Theorem 1.8 (Invertible Matrix Theorem Part I). Let A be an n × n matrix. Then the
following are equivalent:
(1) A is invertible;
(2) A is row-equivalent to the identity matrix I;
(3) A is the product of P and I, P being a product of elementary matrices:
A = P I = E_1 . . . E_k I;
(4) The corresponding homogeneous system Ax = 0 has only the trivial solution
x = 0;
(5) The corresponding inhomogeneous system Ax = b has the unique solution x = A^{-1} b
for every n × 1 column matrix b.
Corollaries. The following corollaries follow from this theorem:
(1) If BA = I = AC, then B = C = A^{-1}. In particular, an n × n matrix is either
invertible or has no right or left inverse at all (equivalently, an n × n matrix
cannot have unequal left and right inverses);
(2) If P = E_1 . . . E_k and P A = I, then A^{-1} = P = P I. In particular, if A is reduced
to I by a sequence of elementary row operations, then the result of applying the same
sequence to I is A^{-1};
(3) If A = A_1 . . . A_k is invertible, then each A_j is invertible (j = 1, . . . , k). This is the
converse to Theorem 1.7. In particular, invertibility is preserved under products
when each factor is invertible, and destroyed if just one factor is non-invertible;
(4) For m × n matrices A and B, A is row-equivalent to B if and only if there exists an invertible
m × m matrix P such that B = P A. This is essentially a weaker statement
than that of Theorem 1.4.
Proof. . . .
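Corollary (2) suggests the standard computational recipe for A^{-1}: augment A with I and apply the same sequence of elementary row operations that reduces A to I. The following is a minimal sketch of that recipe; the function name inverse_by_row_reduction and the example matrix are our own illustrative choices, and the sketch assumes A is invertible.

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Reduce the augmented matrix [A | I] to [I | A^{-1}] by elementary row
    operations (a sketch; assumes A is invertible, cf. corollary (2))."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])       # [A | I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))
        M[[j, p]] = M[[p, j]]           # swap rows
        M[j] /= M[j, j]                 # scale pivot row
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]  # eliminate column j elsewhere
    return M[:, n:]                     # right half is A^{-1}

A = np.array([[4.0, 7.0], [2.0, 6.0]])
print(np.allclose(inverse_by_row_reduction(A) @ A, np.eye(2)))  # True
```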
2. Structure of Vector Spaces: Bases, Dimension, and Coordinates
Definition (Vector Space). A vector (or linear) space (X, F, +, ·) consists of a set X,
whose elements we refer to as vectors, an algebraic field F, whose elements we refer to
as scalars, a binary operator + : X × X → X, which we call addition, and a binary
operator · : F × X → X, which we call scalar multiplication, where the operations obey
the following vector space axioms for arbitrary x, y, z ∈ X and c, d ∈ F:
(1) (Closure) x + c · y ∈ X
(2) (Associativity) (x + y) + z = x + (y + z)
(3) (Commutativity) x + y = y + x
(4) (Additive Identity 0 ∈ X) x + 0 = x
(5) (Additive Inverse −x ∈ X) x + (−x) = 0
(6) (Compatibility of Scalar Multiplication with Field Multiplication) (cd) · x =
c · (d · x)
(7) (Compatibility with Field Identity) 1 · x = x
(8) (Distribution over Vector Addition) c · (x + y) = c · x + c · y
(9) (Distribution over Scalar Addition) (c + d) · x = c · x + d · x
A subset S ⊆ X is called a (vector or linear) subspace of X if (S, F, +, ·) is itself a vector
space. Because of axiom #6, we will not explicitly use the notation · for scalar multiplication;
whether a product refers to scalar multiplication or field multiplication
will be clear from context. Moreover, we use the same notation 1 and 0 for the special identity
elements in both F and X; again, which one is being referred to should be obvious from
the context.
It is customary to only refer to the set of vectors X when talking about a vector space;
in other words, we do not ordinarily explicitly state the field and binary operations +
and ·, even though technically all of these must be specified in order for there to be a
vector space. This practice is observed throughout mathematics: e.g., when talking about
measure spaces or topological spaces, we do not usually explicitly reference the underlying
sigma-algebras, topologies, measures, etc. (except in cases where that particular
component of the space is specifically under consideration or subject to change;
it will often be the case in linear algebra that the field must be explicitly stated, as
certain results depend on whether one is working over R or C, or more generally, over a field
that is algebraically closed or of a particular characteristic).
The above axioms imply all of the usual properties one is accustomed to using when
working with vectors in the Euclidean vector space R^n. For instance, one can prove
simple facts such as (−1)x = −x and 0x = 0, and thus, if cx = 0 with c ≠ 0, then x = 0.
Finally, it is clear that in evaluating whether a given subset S of X is itself a vector
space, one only needs to verify that S is closed under the vector space operations (i.e.,
s + cr ∈ S for every s, r ∈ S and c ∈ F).
Definition (Linear Span / Independence). Let 𝒳 = {x_1, . . . , x_n} ⊆ X be a collection of
vectors in X. We call the set
S_𝒳 := {x ∈ X : x = c_1 x_1 + . . . + c_n x_n, c_j ∈ F (j = 1, . . . , n)}
the (linear) span of 𝒳. If the equation
c_1 x_1 + . . . + c_n x_n = 0
has only the trivial solution c_1 = . . . = c_n = 0, then 𝒳 is said to be a linearly
independent subset of X, or simply independent. Otherwise 𝒳 is said to be dependent.
It is clear that the span of any set S := S_𝒳 is actually a subspace of X, regardless of
whether the spanning set 𝒳 is independent or dependent. If 𝒳 is independent, then 𝒳
is called a minimal spanning set, or basis, of S, and |𝒳| (the cardinality of 𝒳) is called the
dimension of S. Every vector in S can be uniquely expressed as a linear combination of
vectors in 𝒳. If 𝒳 is dependent, then this representation is not unique; however, any
such spanning set can be reduced to a linearly independent set by eliminating certain
dependent vectors without affecting the set's span, and hence to a minimal spanning set or
basis. Conversely, every subspace S of X has a basis (note that the basis of a vector space
is far from unique; it is the number of elements that make up a basis which is unique,
i.e. a set that simultaneously spans S and is linearly independent necessarily has
cardinality equal to the dimension of S). We record these important facts in the following
fundamental theorem:
Theorem 2.1 (Fundamental Basis and Dimension Theorem). . . .
Proof. . . .
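In coordinates, the definitions above admit a simple computational test, sketched below with our own illustrative choice of vectors in R^4: a finite set of vectors is independent exactly when the matrix having them as columns has rank equal to the number of vectors, and in any case the rank equals the dimension of their span.

```python
import numpy as np

# Columns of V are the vectors x_1, x_2, x_3 of a candidate spanning set in R^4.
V = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0],
              [2.0, 1.0, 3.0]])

n = V.shape[1]
r = np.linalg.matrix_rank(V)

# The set is independent exactly when c_1 x_1 + ... + c_n x_n = 0 forces all
# c_j = 0, i.e. when the rank equals the number of vectors; in either case the
# rank is the dimension of the span.
print("independent:", r == n)   # False here: x_3 = x_1 + x_2
print("dim span:", r)           # 2
```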
3. Linear Transformations, Field Isomorphism, and Duality
Definition (Linear Transformation). If X and Y are two vector spaces over the same
field F, then a map T : X → Y is said to be a linear transformation from X to Y if
T preserves the vector space structure; in particular, for each x, y ∈ X and c ∈ F the
following holds:
T (cx + y) = cT x + T y.
If Y = X, then T is called a (linear) operator on X.
If x = ∑_{j=1}^{n} c_j x_j is a linear combination of vectors in X, then successive application of
the defining property of T implies
y = T x = T(∑_{j=1}^{n} c_j x_j) = ∑_{j=1}^{n} c_j T x_j,
so that T also preserves linear combinations. As we shall see, this fact allows us to
characterize linear transformations based solely on their action on a finite set of
vectors in X; equivalently, every “rule” that assigns a vector in Y to each vector in such
a set can be extended to a unique linear transformation T.
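The following sketch illustrates extension by linearity in the concrete setting X = R^3, Y = R^2 with the standard basis. It is our own illustrative example (the names T_e and T are hypothetical): prescribing T on the basis vectors determines T on all of X, and the resulting map coincides with multiplication by the matrix whose columns are the images of the basis vectors.

```python
import numpy as np

# Prescribe T on the standard basis e_1, e_2, e_3 of R^3 (an arbitrary "rule").
T_e = {0: np.array([1.0, 0.0]),   # T e_1
       1: np.array([1.0, 1.0]),   # T e_2
       2: np.array([0.0, 2.0])}   # T e_3

def T(x):
    """Extend the rule by linearity: T(sum c_j e_j) = sum c_j T(e_j)."""
    return sum(c * T_e[j] for j, c in enumerate(x))

# The same map as matrix multiplication: columns of A are the images T(e_j).
A = np.column_stack([T_e[j] for j in range(3)])

x = np.array([2.0, -1.0, 3.0])
print(T(x), A @ x)               # identical results
print(np.allclose(T(x), A @ x))  # True: the linear extension is unique
```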
Theorem 3.1. . . .
Proof. . . .
Theorem 3.2 (Rank-Nullity Theorem). . . .
Proof. . . .
Theorem 3.3. . . .
Proof. . . .
Theorem 3.4. . . .
Proof. . . .
Theorem 3.5. . . .
Proof. . . .
Theorem 3.6 (Invertible Linear Transformation Theorem). . . .
Proof. . . .
Theorem 3.7 (Riesz Representation Theorem for the dual space X^∗). . . .
Proof. . . .
Theorem 3.8 (Subspace Annihilator Theorem). . . .
Corollaries. Theorem 3.8 implies the following corollaries:
(1) . . .
(2) . . .
(3) . . .
(4) . . .
Proof. . . .
Theorem 3.9 (Riesz Representation Theorem for the double dual space X^{∗∗}). . . .
Corollaries. Theorem 3.9 implies the following corollaries:
(1) . . .
(2) . . .
(3) . . .
(4) . . .
Proof. . . .
Theorem 3.10. . . .
Proof. . . .
Theorem 3.11. . . .
Corollaries. Theorem 3.11 implies the following corollaries:
(1) . . .
(2) . . .
(3) . . .
(4) . . .
Proof. . . .
Theorem 3.12. . . .
Proof. . . .
Theorem 3.13 (Rank-Nullity Theorem for the Adjoint Transformation T^∗). . . .
Theorem 3.14 (Matrix Representation Theorem and Vector Space Field Isomorphism).
. . .
Corollaries. Theorem 3.14 implies the following corollaries:
(1) . . .
(2) . . .
(3) . . .
(4) . . .
Proof. . . .
Theorem 3.15 (Matrix Representation Theorem for the Adjoint Transformation T^∗). . . .
Proof. . . .
Theorem 3.16 (Invertible Matrix Theorem Part II). . . .
Proof. . . .
4. Inner Product Spaces and Orthogonality
5. Summary
References
1. K. Hoffman and R. Kunze, Linear Algebra, 2nd Edition, Prentice-Hall, Inc. (1971).