Linear Algebra
From charlesreid1
Intro
Basic idea behind linear algebra: solving large number of linear algebraic equations simultaneously.
$ a_{i,1} x_1 + a_{i,2} x_2 + a_{i,3} x_3 + \dots + a_{i,N} x_{N} = b_{N} \qquad i=1 \dots M $
There are N unknowns and M equations, forming a linear system from matrices:
$ \mathbf{A \cdot x} = \mathbf{b} $
where lowercase letters denote vectors, and the dot represents multiplication of two matrices or a matrix and a vector, or dot product of two vectors.
Nonsingular vs Singular
We are not guaranteed to find a solution to a given linear system Ax=b
- row degeneracy - one or more of the M equations is a linear combination of other equations
- column degeneracy - one or more of the N variables are interchangeable and cannot be distinguished (are non-unique)
- for square matrices, these two conditions are equivalent
Having singular system of equations means we can't find good solutions.
Other things preventing good solution:
- Equations being equivalent to within roundoff error (approximately degenerate)
- Accumulation of roundoff error during solution can swamp solution
Many linear equation solving packages dedicate a good chunk of their code to solving these problems
Sense of scale:
- Linear equations with 20-50 unknowns are routine, can almost always be solved in single precision with straightforward routines
- Linear equations with up to 1,000 unknowns require double precision
- Larger sets of thousands or millions of equations can be solved with coefficients are sparse (mostly zero) by using specialty sparse methods
Tasks of Computational Linear Algebra
Beyond simply solving Ax=b for the unknowns x, we may want to do other things:
- Solve more than one matrix equation, $ \mathbf{A \cdot x}_j = \mathbf{b}_j $, where the matrix A does not change but unknown/RHS vectors do change
- Calculating the inverse of the matrix
- Calculating the determinant of a square matrix
- Calculating the condition number of a matrix
If M < N (row degeneracy), or if M = N and equations are degenerate, there are fewer equations than unknowns. If fewer equations than unknowns:
- Either no solution vector, or
- More than one solution vector
- If more than one solution, solution space consists of particular solution added to linear combination of N-M vectors (which are in the nullspace of the matrix A)
- SVD can be used to solve this problem, by find these vectors (SVD finds the nullspace of a matrix A)
If M > N (column degeneracy), there are fewer unknowns than equations. If fewer unknowns than equations:
- Think least squares - more data points than unknowns
- Looking for best fit/compromise solution
- Linear least-squares problem can be used to solve overdetermined linear system
Linear Algebra Software
Software:
- Blas is short for "basic linear algebra subroutines" and provides the linear algebra operations that compose Lapack
- Lapack is the most common linear algebra package (Linpack is old and was replaced by Lapack)
- ScaLapack is version of Lapack for parallel architectures
- Lapack routines divided by precision, algorithm, simplifications (tridiag., sparse, dense, etc.)
- Atlas is version of Lapack that automatically tunes itself to run as fast as possible for your system's architecture when it is compiled
Direct vs Iterative:
- Major dividing line between routines
- Direct - algorithms that execute in a predictable number of operations
- Iterative - attempt to converge on the desired answer to within a desired tolerance, taking as many steps as necessary
Gauss-Jordan Elimination
Classic technique for solving linear equations: combine equations to cancel out unknowns, turning A into the identity matrix
Gauss-Jordan elimination produces the solution of the equations and the matrix inverse $ \mathbf{A}^{-1} $ - this is an expensive extra thing to produce, not always desirable to have it
Deficiencies:
- All right hand side values must be stored and manipulated at once (not feasible for very large systems of equations)
- Gauss-Jordan is three times slower than alternative techniques that do not produce the matrix inverse
Advantages:
- Straightforward
- Solid
- Pivoting
Pivoting:
- Pivoting is just interchanging rows (partial pivoting) or rows and columns (full pivoting) to put desired elements on matrix diagonals
- Partial pivoting is easier, full pivoting requires keeping track of permutations, and un-doing these when solution vector achieved
Storage requirements of method:
- The matrix inverse of A is gradually built up in A, as original A is overwritten
- Likewise solution vector x can gradually replace right-hand side vector b (solution vector is overwritten one row element at a time, each time an element is overwritten it will not be used again)
On the subject of row vs. column elimination strategy:
- if we perform row operations, row transformations build right to left, so right hand side is transformed at each stage from one vector to another
- if we perform column operations, the column operations build from left to right, so we have to remember each transformation at each stage, and apply them all in reverse order at the end
- this means column operations are much less useful
Gaussian Elimination with Back-Substitution
Like Gauss-Jordan, Gaussian elimination utilizes pivoting to eliminate elements from the matrix A and simplify it, but doesn't go as far as simplifying it to the identity matrix - simplifies it to an upper-triangular matrix
Triangular matrices make back-substitution easy - solving first equation involves one unknown, solving second equation involves only one new unknown, etc.
Represents midpoint between full elimination (Gauss-Jordan) and triangular decomposition (LU, QR)
Advantage of Gaussian elimination with back-substitution over Gauss-Jordan elimination is faster in terms of number of operations:
- Inner loops of Gauss-Jordan elimination have one subtraction and one multiplication, executed $ N^3 $ and $ N^2 m $ times (where m is number of right hand sides)
- Inner loops in Gaussian elimination are only executed $ \dfrac{1}{3} N^3 $ times and $ \dfrac{1}{2} N^2 m $ times
- Back-substitution of RHS is $ \dfrac{1}{2} N^2 $ executions of similar loop (one mult plus one subt)
- Gaussian elimination has a factor of three advantage over Gauss-Jordan, which erodes as m approaches N (number of RHS approaches number of equations/unknowns)
Calculation of inverse matrix is equivalent to the case of m = N (basically solving Ax=b, and letting b be each column of an identity matrix)
- Gaussian elimination and back-subst. requires $ \dfrac{1}{3} N^3 $ operations for matrix reduction, $ \dfrac{1}{2}N^3 $ for RHS manipulation, and $ \dfrac{1}{2} N^3 $ for N back-substitutions, for a total of $ \dfrac{4}{3} N^3 $ operations
- Gauss-Jordan only requires $ N^3 $ operations for matrix inversion, so it seems to win out
- However, because all but one of RHS vector elements are 0, the operations count for Gaussian elimination is actually lower
- Gauss-Jordan matrix inversion costs $ \dfrac{1}{6} N^3 $ operations
Note: the 1/6 comes from the fact that the matrix multiplication step normally requires 8 operations, but due to the sparseness of the RHS, we are only performing 1 of the 8 operations. Hence, $ \dfrac{1}{8} \times \dfrac{4}{3} = \dfrac{1}{6} $
LU Decomposition
This is based on decomposing the matrix A into two matrices, L and U:
$ \mathbf{A} = \mathbf{L \cdot U} $
L is a lower triangular matrix, U is an upper triangular matrix.
To solve the linear system:
$ \mathbf{A \cdot x} = \mathbf{b} $
we proceed in two steps:
Equation 1 is solved by forward substitution:
$ \mathbf{L \cdot y} = \mathbf{b} $
and the second equation is solved by back substitution:
$ \mathbf{U \cdot x} = \mathbf{y} $
The advantage is that solving a linear system with a triangular matrix is much easier.
The back and forward substitutions execute $ N^2 $ operations. If there are N RHS vectors (i.e., if we're explicitly inverting a matrix), and we take into account the reduced number of operations due to all but one element in each RHS being zero, the total number of operations is:
For forward substitution, the operation count is: $ \dfrac{1}{6} N^3 $
For back substitution, the operation count is: $ \dfrac{1}{2} N^3 $
BUT, there is also the cost of performing the decomposition itself, which is not free. This can be done with Crout's Algorithm.
Adding the cost of the LU decomposition itself brings the total cost of inverting a matrix to $ N^3 $, which is the same complexity class as Gaussian elimination. This highlights an important fact: LU decomposition is simply a modified version of Gaussian elimination.
LU Decomposition Advantages
We mention above that LU decomposition does not significantly improve on the cost of solving a linear system using Gaussian elimination. The advantage of LU decomposition is that, once A has been decomposed into L and U, it remains unchanged and can be used to solve as many RHS as desired, one at a time. (Gauss-Jordan and Gaussian elimination both depend on manipulating the left and right sides simultaneously.)
Crout's Decomposition Algorithm
To decompose the matrix A into the matrices L and U, there are $ N^2 + N $ equations that relate the components of L, U, and A. Crout's Algorithm provides a way to solve for the components by rearranging the equations.
Crout's Algorithm does not have any extra storage requirements - it works by changing the matrix A into L and U in-place.
Pivoting is essential for Crout's Algorithm to be stable. Partial pivoting can be implemented efficiently, and partial pivoting is enough to keep the algorithm stable. The implementation requires a loop over three indices, so there are six possible permutations of the indices i, j, and k. There is is a particular permutation that works best for row-based storage: kij or ikj. Pivoting is easiest with kij indexing. This method of pivoting is called "outer product Gaussian elimination" (Golub and Van Loan).
Overall, LU decomposition using Crout's algorithm can be solved for a single RHS vector by executing $ \dfrac{1}{3} N^3 $ operations in inner loops (one multiply and one add per operation). This is a factor of 3 better than the $ N^3 $ cost of Gauss-Jordan elimination.
To invert a matrix using LU decomposition, the forward and back substitution steps should be included, making the total cost $ \left( \dfrac{1}{2} + \dfrac{1}{3} + \dfrac{1}{6} \right) N^3 = N^3 $ which is the same as Gauss Jordan elimination.
Pseudocode
Pseudocode to illustrate how to solve a matrix system Ax=b using LU decomposition and a class called "LUdcmp" is given below:
const Int n = ... MatrixDouble a(n,n); VectorDouble b(n), x(n); // initialize a, b, and x LUdcmp alu(a); alu.solve(b,x);
Determinant
To calculate a determinant of an LU decomposed matrix, take the product of the diagonal elements of the decomposed matrix LU. (Recall Crout's Algorithm computes this decomposed matrix place and replaces elements of A with elements of the decomposed matrix).
The only complication here is that Crout's algorithm performs a rowwise permutation of the matrix. If the number of row permutations was even (positive parity), the determinant is the product of each of the diagonals. If the number of row permutations was odd (negative parity), the determinant is the opposite of the product of each of the diagonals.
Complex Matrix Systems
If matrix A is real and RHS vector is complex, matrix equation becomes:
$ \mathbf{A} \cdot \left( \mathbf{x} + i \mathbf{y} \right) = \left( \mathbf{b} + i \mathbf{d} \right) $
In this case:
- Perform LU decomposition as usual
- Back substitute real component of RHS vector to get real part of solution
- Back substitute complex component of RHS vector to get complex part of solution
If the matrix A is complex, the matrix equation becomes:
$ (\mathbf{A} + i \mathbf{C} \right) \cdot \left( \mathbf{x} + i \mathbf{y} \right) = \left( \mathbf{b} + i \mathbf{d} \right) $
If A is complex, two options:
- Rewrite decomposition routine to use complex arithmetic (more complicated code, fewer operations)
- Solve real and imaginary parts of matrix separately (utilizes same code, costs twice as many operations/storage space)
The second approach writes the real and imaginary linear systems separately, leading to a matrix-vector system of size 2x2, 2x1, and 2x1, where each element is a matrix. The 2x2 matrix stores A and C twice. Thus, the method costs twice as much storage. It also costs twice as much time (complex multiplication would use 4 operations per multiply, but matrix multiplication uses 8 operations per multiply).