Reading:
When does a solution $x$ exist for $Ax=b$?
If $A$ is a matrix of data, what parts of that data are most important?
Key info: column spaces, null spaces, eigenvectors and singular vectors
A vector space is a set of vectors with three additional properties (that not all sets of vectors have).
Start with a set of vectors
$$ S = \{\mathbf v_1, \mathbf v_2, ..., \mathbf v_n\} $$
A vector space is a new set consisting of all possible linear combinations of the vectors in $S$.
This is called the span of a set of vectors:
\begin{align} V &= Span(S) \\ &= \{\alpha_1 \mathbf v_1 + \alpha_2 \mathbf v_2 + ... + \alpha_n \mathbf v_n \text{ for all } \alpha_1,\alpha_2,...,\alpha_n \in \mathbf R\} \end{align}
If the vectors in $S$ are linearly independent, they form a basis for $V$.
The dimension of a vector space is the cardinality of (all) its bases.
Treat a matrix as a set of vectors defined by its columns, and define the vector space spanned by this set:
$$ \mathbf A \rightarrow S = \{ \mathbf a_1, \mathbf a_2, ..., \mathbf a_n \} $$
$$ \text{Column space, a.k.a. } C(\mathbf A) = Span(S) $$
Using the definition of Span we can write:
\begin{align} Span(S) &= \{\alpha_1 \mathbf a_1 + \alpha_2 \mathbf a_2 + ... + \alpha_n \mathbf a_n \text{ for all } \alpha_1,\alpha_2,...,\alpha_n \in \mathbf R\} \\ &= \{ \mathbf A \boldsymbol\alpha \text{ for all } \boldsymbol\alpha \in \mathbf R^n\} \end{align}
where we define $\boldsymbol\alpha$ as the vector with elements $\alpha_i$.
Consider a $3 \times 2$ matrix $A$ multiplied by a $2 \times 1$ vector $x$
Column space: the set of $Ax$ for all possible $x$
Solution: the $x$ for which $Ax=b$.
...so when does $x$ exist? (and when does $x$ not exist?)
A solution $x$ exists iff $b$ is in the column space of $A$. (draw it).
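A quick NumPy way (my own sketch, not from the notes) to test this condition: $b$ is in the column space exactly when appending $b$ to $A$ does not raise the rank.

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b_in = np.array([2., 3., 5.])     # = 2*col1 + 3*col2, so it lies in C(A)
b_out = np.array([2., 3., 6.])    # not a combination of the columns

for b in (b_in, b_out):
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    print(rank_A == rank_Ab)      # True exactly when b is in the column space
```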
Consider column spaces formed by all possible linear combinations of $n$ (linearly independent) columns. What shape are they?
A set of vectors is linearly dependent if one can be written as a linear combination of the others.
What happens for the previous subspace cases?
A linearly independent set of vectors which span the same space as the columns of $A$
Algorithm for finding basis given set of columns by construction:
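A minimal NumPy sketch of one such construction (the standard greedy left-to-right pass, which may differ in detail from the lecture's version): keep a column only if it increases the rank of the columns kept so far.

```python
import numpy as np

def column_basis(A):
    """Greedily select columns of A that form a basis for its column space."""
    kept = []  # indices of columns kept so far
    for j in range(A.shape[1]):
        candidate = A[:, kept + [j]]
        # Keep column j only if it is independent of the columns already kept
        if np.linalg.matrix_rank(candidate) > len(kept):
            kept.append(j)
    return A[:, kept], kept

# Third column is the sum of the first two, so only two columns are kept
A = np.array([[1., 2., 3.],
              [4., 5., 9.],
              [7., 8., 15.]])
C, idx = column_basis(A)
print(idx)   # [0, 1]
print(C)
```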
The dimension of a space is the cardinality of a (every) basis
The rank of a matrix is the dimension of its column space
Consider the basis $C$ for the column space of $A$
Every column in $A$ can be written as a linear combination of the columns in $C$
Express these relationships as matrices, with the linear combination coefficients as columns in a matrix $R$
Our first factorization of $A$
where $A=CR$. How do these numbers relate?
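A small NumPy sketch (my own illustration) that builds $C$ from the independent columns and recovers $R$ by least squares, then checks $A = CR$; the shapes make the dimension question concrete.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 9.],
              [7., 8., 15.]])   # third column = first + second, so rank r = 2

C = A[:, [0, 1]]                            # independent columns form C (m x r)
R, *_ = np.linalg.lstsq(C, A, rcond=None)   # coefficients so that C @ R = A (r x n)

print(R)                          # approximately [[1, 0, 1], [0, 1, 1]]
print(np.allclose(A, C @ R))      # True: A (m x n) = C (m x r) @ R (r x n)
```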
Three "methods" now
What is the equation for this? I.e. $C_{i,j} = ?$
Derive equation for this from previous version
All columns are multiples of $u$.
All rows are multiples of $v^T$
$2 \times 2$ example on page 10
Consider rank-1 matrices as components, some tiny some large
For data science we usually want the most important info in the data, i.e. most important components
Common theme: factor the matrix $A = CR$ and look at the rank-1 components $c_kr_k^T$
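A quick NumPy check (my own example) that the product $CR$ really is the sum of rank-1 components $c_k r_k^T$, one outer product per column of $C$ and row of $R$:

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.standard_normal((5, 3))   # columns c_k
R = rng.standard_normal((3, 4))   # rows r_k^T

# Sum of rank-1 pieces c_k r_k^T equals the full product C @ R
A_from_pieces = sum(np.outer(C[:, k], R[k, :]) for k in range(3))
print(np.allclose(A_from_pieces, C @ R))   # True
```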
The "big picture" of linear algebra
What are the four fundamental subspaces of $A$, and what are their dimensions?
$r$ independent equations $Ax=0$ (where $A$ is $m \times n$) have $n-r$ independent solutions
(Rank-nullity theorem)
Dim of rowspace = dim of column space = $r$
Dim of nullspace = $n-r$
Dim of left nullspace = $m-r$
Note asymmetry of the left vs. right nullspaces
Row of incidence matrix describes an edge
A linearly dependent set of rows (i.e. a solution to $A^Ty = 0$) forms a loop
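A tiny NumPy illustration (my own, using the standard directed-triangle example) of a loop showing up as a dependence among the rows:

```python
import numpy as np

# Incidence matrix of a directed triangle: edges 1->2, 2->3, 3->1
A = np.array([[-1.,  1.,  0.],   # edge 1->2
              [ 0., -1.,  1.],   # edge 2->3
              [ 1.,  0., -1.]])  # edge 3->1

y = np.array([1., 1., 1.])       # traverse the three edges in order: a loop
print(np.allclose(A.T @ y, 0.0)) # True: dependent rows <=> a solution of A^T y = 0
```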
Recall that we can compute the rank as the number of independent columns, so consider the columns of a sum or product to verify the facts above
Most fundamental problem of linear algebra: solve $Ax=b$
Assume square $A$ is $n \times n$ and $x$ is $n \times 1$, so $n$ equations and $n$ unknowns.
Upper triangular $\mathbf U$ (a.k.a. $\mathbf R$)
Lower triangular $\mathbf L$. What is $\mathbf L^T$?
Elimination: $[A | b] = [LU | b] \rightarrow [U|L^{-1}b] = [U|c] $
Back substitution: solve $Ux = c$ (easy since $U$ is triangular) to get $x = U^{-1}c = U^{-1}L^{-1}b = A^{-1}b$
Basically, all of these take advantage of how easy it is to work with triangular matrices
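A short sketch of the elimination-plus-back-substitution pipeline (assuming SciPy is available; the matrix is just a made-up example):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[ 2.,  1., 1.],
              [ 4., -6., 0.],
              [-2.,  7., 2.]])
b = np.array([5., -2., 9.])

lu, piv = lu_factor(A)        # elimination: PA = LU, stored compactly
x = lu_solve((lu, piv), b)    # forward substitution with L, then back substitution with U

print(np.allclose(A @ x, b))  # True: x solves Ax = b
```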
Orthogonal subspaces: $R$ and $N$ such that $u^Tv=0$ for every $u\in R$ and every $v \in N$
Example: rowspace and nullspace of a matrix
Nullspace: $Ax=0$ directly implies $\text{row}_i\cdot x = 0$ for every row $i$ of $A$
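A numerical illustration (my own sketch) of this orthogonality, taking a nullspace basis from the right singular vectors:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])        # rank 1

# Nullspace basis from the SVD: right singular vectors beyond the rank
_, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-12)               # numerical rank (here 1)
null_basis = Vt[r:, :].T            # columns span N(A), dimension n - r = 2

print(np.allclose(A @ null_basis, 0.0))      # Ax = 0 for each nullspace vector...
print(np.allclose(A[0] @ null_basis, 0.0))   # ...so each row is orthogonal to N(A)
```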
where $q_i$ is the $i^{th}$ column of $Q$
So $Q$ doesn't change the length of a vector
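A quick numerical check (a NumPy sketch of my own) that an orthogonal $Q$ preserves lengths:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random orthogonal matrix
x = rng.standard_normal(4)

print(np.allclose(Q.T @ Q, np.eye(4)))                         # Q^T Q = I
print(np.allclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))   # ||Qx|| = ||x||
```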
$Pb$ is the orthogonal projection of $b$ onto the column space of $P$
Recall from physics the orthogonal unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$; use them as columns of $P$
Key property of projection matrix: $P^2 = P$
Exercise: prove this
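To make $P^2 = P$ concrete, here is a small numerical sketch (my own construction, assuming the standard form $P = QQ^T$ for a matrix $Q$ with orthonormal columns spanning the subspace):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2))    # a 2-dimensional column space in R^5
Q, _ = np.linalg.qr(A)             # orthonormal basis for that column space
P = Q @ Q.T                        # orthogonal projection onto C(A)

b = rng.standard_normal(5)
print(np.allclose(P @ P, P))                # P^2 = P: projecting twice changes nothing
print(np.allclose(Q.T @ (b - P @ b), 0.0))  # the error b - Pb is orthogonal to the subspace
```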
Rotates a vector by $\theta$
Reflects a vector about line at angle $\frac{1}{2}\theta$
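For reference (the matrices themselves are not spelled out above; these are the standard $2\times 2$ forms):
$$Q_{\text{rot}} = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}, \qquad Q_{\text{ref}} = \begin{pmatrix}\cos\theta & \sin\theta\\ \sin\theta & -\cos\theta\end{pmatrix}$$
The first rotates by $\theta$, the second reflects about the line at angle $\frac{1}{2}\theta$; both are orthogonal.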
An orthonormal basis $\{q_1, ..., q_n\}$ can be viewed as axes of a coordinate system in $\mathbb R^n$
The coordinates within this system of a vector $v$ are the coefficients $c_i = q_i^T v$
Suppose $q_i$ are the rows of a matrix $Q$ (which is therefore an orthogonal matrix)
How do we compute $c_i$ given $v$ and $Q$?
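A tiny NumPy sketch (my own) for the case where the $q_i$ are the rows of $Q$: all coordinates at once are $c = Qv$, and $v$ is recovered as $Q^T c = \sum_i c_i q_i$.

```python
import numpy as np

rng = np.random.default_rng(3)
# Orthonormal rows: take the transpose of an orthogonal matrix from QR
Q = np.linalg.qr(rng.standard_normal((3, 3)))[0].T
v = np.array([1., 2., 3.])

c = Q @ v                       # c_i = q_i^T v, all coordinates at once
v_back = Q.T @ c                # reconstruct v = sum_i c_i q_i
print(np.allclose(v, v_back))   # True
```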
Reflection matrices $Q = H_n$
Defined as the identity matrix minus twice the projection onto a given unit vector $u$... basically flip this component
Ex: $u = \frac{1}{\sqrt{n}}ones(n,1)$, $H_n = I - 2uu^T = I - \frac{2}{n}\,ones(n,n)$
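A NumPy check (my own sketch) of this example reflection:

```python
import numpy as np

n = 4
u = np.ones((n, 1)) / np.sqrt(n)           # unit vector
H = np.eye(n) - 2 * (u @ u.T)              # = I - (2/n) * ones(n, n)

print(np.allclose(H, np.eye(n) - 2.0 / n * np.ones((n, n))))  # same matrix
print(np.allclose(H @ u, -u))              # the component along u is flipped
print(np.allclose(H @ H.T, np.eye(n)))     # H is orthogonal (and symmetric, so H^2 = I)
```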
Exercise: find eigenvectors and eigenvalues for $A^k$ in terms of those from $A$
Exercise: find eigenvectors and eigenvalues for $A^{-1}$ in terms of those from $A$
Eigenvalues are $i$ and $-i$ for eigenvectors $\begin{pmatrix}1\\-i\end{pmatrix}$ and $\begin{pmatrix}1\\i\end{pmatrix}$. Test this.
Note that for complex vectors we define the inner product as $u^Hv$, conjugate transpose.
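Assuming the matrix in question is the 90° rotation $Q = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$ (the usual example with these eigenpairs; the matrix itself isn't shown here), a NumPy test:

```python
import numpy as np

Q = np.array([[0., -1.],
              [1.,  0.]])              # rotation by 90 degrees
x1 = np.array([1., -1j])
x2 = np.array([1.,  1j])

print(np.allclose(Q @ x1,  1j * x1))   # eigenvalue  i
print(np.allclose(Q @ x2, -1j * x2))   # eigenvalue -i
print(np.vdot(x1, x2))                 # complex inner product x1^H x2 = 0: orthogonal
```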
Consider $A_s = A+sI$
What are the eigenvalues and eigenvectors of $A_s$ if we know $Ax=\lambda x$?
Consider $A_B = BAB^{-1}$, where $B$ is an invertible matrix
$A_B$ and $A$ are called similar matrices
How do the eigenvalues of similar matrices relate?
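A quick NumPy experiment (my own) showing that similar matrices share eigenvalues, while the eigenvectors are mapped by $B$:

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])           # eigenvalues 5 and 2
B = np.array([[1., 2.],
              [0., 1.]])           # any invertible matrix
A_B = B @ A @ np.linalg.inv(B)

print(np.sort(np.linalg.eigvals(A)))     # [2. 5.]
print(np.sort(np.linalg.eigvals(A_B)))   # [2. 5.] -- same eigenvalues
# If Ax = lambda x, then A_B (Bx) = lambda (Bx): eigenvectors are mapped by B
```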
Can completely describe $\mathbf D$ with a vector $\mathbf d$ whose elements are $d_i = D_{i,i}$
Hence we write "$\mathbf D = \text{diag}(\mathbf d)$" and "$\mathbf d = \text{diag}(\mathbf D)$"
Relate to Hadamard product of vectors $\mathbf D \mathbf v = \mathbf d \odot \mathbf v$.
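A one-line NumPy check (my own), where `*` on arrays is exactly the Hadamard (elementwise) product:

```python
import numpy as np

d = np.array([2., -1., 3.])
v = np.array([1., 4., 5.])
D = np.diag(d)                     # and d = np.diag(D) recovers the vector

print(np.allclose(D @ v, d * v))   # True: Dv equals the elementwise product
```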
Suppose we have $n$ independent eigenvectors for $Ax_i=\lambda_i x_i$, $i=1,...,n$
Combine these into the matrix decomposition $A = X\Lambda X^{-1}$
What are the properties of $X$ and $\Lambda$?
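A NumPy sketch (my own example) that builds the decomposition and answers the question numerically: $X$ holds the eigenvectors as columns (invertible when they are independent) and $\Lambda$ is diagonal with the eigenvalues.

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])
lam, X = np.linalg.eig(A)     # columns of X are eigenvectors, lam holds eigenvalues
Lambda = np.diag(lam)

print(np.allclose(A, X @ Lambda @ np.linalg.inv(X)))   # A = X Lambda X^{-1}
print(np.allclose(A @ X, X @ Lambda))                  # column by column: A x_i = lambda_i x_i
```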
The diagonalization formula can also be run in reverse: use it to construct a matrix with specified eigenvalues and eigenvectors.
Example:
Let $\lambda_1 = -1$ and $\lambda_2 = 2$ be the eigenvalues corresponding to the eigenvectors ${\bf u}_1 = \begin{bmatrix}5\\3\end{bmatrix}$ and ${\bf u}_2 = \begin{bmatrix}3\\2\end{bmatrix}$.
Find a matrix $\bf A$ which gives these eigenvalues and eigenvectors.
We start by constructing $\bf U$ and $\boldsymbol\Lambda$:
$${\bf U} = \left[ \begin{matrix} {\bf u}_1 & {\bf u}_2 \end{matrix} \right] = \left[ \begin{matrix} 5 & 3 \\ 3 & 2 \end{matrix} \right]~~~~\text{and}~~~~{\boldsymbol\Lambda} = \left[ \begin{matrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{matrix} \right] = \left[ \begin{matrix} -1 & 0 \\ 0 & 2 \end{matrix} \right]$$
Thus
$${\bf U}^{-1} = \left[ \begin{matrix} 2 & -3 \\ -3 & 5 \end{matrix} \right]$$
This gives
$${\bf A} = {\bf U}\boldsymbol\Lambda{\bf U}^{-1} = \left[ \begin{matrix} 5 & 3 \\ 3 & 2 \end{matrix} \right]\left[ \begin{matrix} -1 & 0 \\ 0 & 2 \end{matrix} \right]\left[ \begin{matrix} 2 & -3 \\ -3 & 5 \end{matrix} \right] = \left[ \begin{matrix} -28 & 45 \\ -18 & 29 \end{matrix} \right]$$
Check that these satisfy the eigenvector equation.
Hence $\lambda_1$ and $\lambda_2$ are eigenvalues of $\bf A$ associated with eigenvectors ${\bf u}_1$ and ${\bf u}_2$
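A quick NumPy confirmation of this example (same numbers as above):

```python
import numpy as np

A = np.array([[-28., 45.],
              [-18., 29.]])
u1 = np.array([5., 3.])
u2 = np.array([3., 2.])

print(np.allclose(A @ u1, -1 * u1))   # A u1 = -1 * u1
print(np.allclose(A @ u2,  2 * u2))   # A u2 =  2 * u2
```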
Suppose we DON'T have $n$ independent eigenvectors for $Ax_i=\lambda_i x_i$, $i=1,...,n$
$$A = \begin{pmatrix}3 &1\\0 & 3\end{pmatrix}$$
Compute the eigenvalues and eigenvectors
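What NumPy reports for this matrix (my own sketch): the eigenvalue 3 is repeated, and the two returned "eigenvectors" are numerically parallel, so there is only one independent eigenvector and $A$ cannot be diagonalized.

```python
import numpy as np

A = np.array([[3., 1.],
              [0., 3.]])
lam, X = np.linalg.eig(A)

print(lam)   # [3. 3.] -- repeated eigenvalue
print(X)     # both columns are (numerically) multiples of (1, 0): only one true eigenvector
```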
Properties
Therefore we choose orthonormal eigenvectors
For every real symmetric matrix $S$ we can decompose as
$$S = Q \Lambda Q^T$$
where $Q$ is orthogonal (orthonormal columns) and $\Lambda$ is diagonal and real.
Exercise: test symmetry of $Q \Lambda Q^T$
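A NumPy check of the spectral decomposition on a small symmetric matrix (my own example), using `eigh`, which is designed for symmetric matrices and returns an orthogonal $Q$:

```python
import numpy as np

S = np.array([[2., 1.],
              [1., 2.]])                         # symmetric
lam, Q = np.linalg.eigh(S)                       # real eigenvalues, orthonormal eigenvectors

print(lam)                                       # [1. 3.] -- real
print(np.allclose(Q.T @ Q, np.eye(2)))           # Q is orthogonal
print(np.allclose(S, Q @ np.diag(lam) @ Q.T))    # S = Q Lambda Q^T
```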
Use it to:
"Energy-based definition" - Positive definite matrix $S$ has $x^TSx>0$ for all $x\ne 0$
multiply both sides by $s^T$
Exercise: prove $S_1+S_2$ is PD if both $S_1$ and $S_2$ are PD
A symmetric Positive definite matrix $S$ can be factored as
$$S=AA^T$$
where $A$ has independent columns.
There are many possible choices for $A$
Consider $A=Q\sqrt{\Lambda}Q^T = A^T$ (a matrix "square root")
$S = LDL^T$, where $D$ is diagonal and $L$ is the lower-triangular factor from $LU$ elimination (then $A = L\sqrt{D}$ works).
Algorithms take advantage of symmetry, so they are roughly twice as fast as $LU$.
Same handy uses as $LU$ otherwise.
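A NumPy sketch (my own example) of one concrete choice of $A$: the Cholesky factor, which NumPy returns as a lower-triangular matrix with $S = AA^T$.

```python
import numpy as np

S = np.array([[4., 2.],
              [2., 3.]])                   # symmetric positive definite
A = np.linalg.cholesky(S)                  # lower triangular Cholesky factor

print(A)
print(np.allclose(S, A @ A.T))             # S = A A^T
print(np.all(np.linalg.eigvalsh(S) > 0))   # PD check: all eigenvalues positive
```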
Surface described by $x^TSx$ is elliptical and has a unique minimum when $S$ is PD
Axes described by eigenvectors and eigenvalues. Consider diagonalization as coordinate transform.
(particularly non-square real matrices)
$$A_{tall} = \begin{pmatrix}2 &0\\1 & 2\\5 &0\end{pmatrix}, \,\, A_{fat} = \begin{pmatrix}1 &3 &1\\2 & 0 & 1\end{pmatrix} $$
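These non-square matrices are where the singular vectors mentioned in the reading goals come in; a NumPy sketch of my own applying the SVD to both, which works regardless of shape:

```python
import numpy as np

A_tall = np.array([[2., 0.],
                   [1., 2.],
                   [5., 0.]])
A_fat = np.array([[1., 3., 1.],
                  [2., 0., 1.]])

for A in (A_tall, A_fat):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T
    print(A.shape, s)                                  # singular values, largest first
    print(np.allclose(A, U @ np.diag(s) @ Vt))         # reconstruction check
```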