Linear Algebra Basics
Linear Algebra is a branch of mathematics that is widely applied in science and engineering. It is the study of linear sets of equations and their transformation properties.
- A scalar (n) is a real number, it is a 0D array and its rank is 0 e.g. 5
- A vector (x) is a 1D array, it has a rank of 1 e.g. [1,2]
- A matrix (X) is a 2D array, that has a rank of 2 e.g. [[1,2], [3,4]]
- A tensor is a general term for 3 or more dimensions array e.g. [ [[1,2], [3,4]], [[5,6,7], [8,9,10]] ]
Vectors
A vector is a quantity that has both magnitude and direction, it can be represented as a list of numbers, given in small letters as . Two equal length vectors can be added, subtracted, multiplied and divided as , where to give .
We can calculate the sum of the multiplied elements of two vectors called the dot product. In general, for vectors , the inner product is . In this case, .
Two vectors given as columnar matrices , can be multiplied to produce a matrix . The outer product of two vectors . Let , the outer product is given by .
Vectors can also be multiplied by real numbers or quantities that have a magnitude only called scalars. A number can be multiplied by the vector such that to give .
Linear Combination and Independence
A Linear combination of two or more vectors is a vector obtained from the sum of the vectors multiplied by scalar values. Given two vectors , we can get , for scalar values .
Given and the linear combination , we can get the scalar multiples using the RREF (move towards an identity matrix by swapping, multipling and adding rows).
. By (R2 - R1), (-3R1+R2), and (R2/2), we get . Since , , thus .
Vectors are linear independent if none of them can be expressed as a linear combination of the others. In this case , with . If the only way to to get requires all , we have linearly independent vectors. To find if a = (1,3) and b = (2,5) are linear independent.
. By (-3R1+ R2), . Since and , the two vectors are linearly independent.
Vector space
A vector space is a set of vectors and all their possible combinations. Elements of are a set of real numbers, such as . Given a set of vectors , the span is the subspace composed of all possible linear combinations of the vectors in that set, which is . The standard basis , can write every vector in for example, . In general, given .
Another set is a spanning set of but not a basis for . Having both (1,0) and (2,0) in the set creates linear dependence, thus cannot be a basis for .
Finding the basis of a subspace
Given , the subspace V is the column matrix of the matrix . The reduced echelon is . The first two columns are pivot columns, the basis V is . The number of linearly independent columns of a matrix is the rank of the matrix.
Decomposition of a vector by the basis vectors
Decompose vector by basis vectors .
We use the coefficients for x, y to form the equation , that is . The , we conclude that .
Angle between two vectors
Given two vectors , we can find the angle between them.
Orthogonal basis
A basis is orthogonal if the vectors that form it are perpendicular , their dot product is zero. Given , then ;
In cases where the dot product is zero (0) and the length of each vector is one (1), it is called orthonormal basis. Given , then . In addition, the lengths of .
Vector norm (length/magnitude)
For the p-norm of is
- -norm also called the taxicab/manhattan distance which is a sum of the absolute value of vector elements, the p = 1. Given then .
- -norm also called the Euclidean distance is a straight line, the p = 2. Given , then .
- -norm also called the Chebyshev/max distance is the maximum value. Given , then .
Cross Product
Two vectors can be multiplied to produce another vector that is perpendicular to both of the two vectors. In general, with , the new vector . Given .
Matrices
A matrix is a rectangular array of numbers arranged into columns and rows. The matrix A has m rows and n columns. Each entry of A becomes and . For a matrix , then
.
Zero Matrix
A null matrix, has a zero in all entries.
def zeros_matrix(i=None, j=None):
"""
A matrix filled with zeros.
params:
i : number of rows
j: number of columns
return: list of lists
"""
if j == None:
j = i
M=[]
#Create rows
while len(M) < i:
M.append([])
#Enter values for each row
while len(M[-1]) < j:
M[-1].append(0)
return M
In []: zeros_matrix(3)
Out []: [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
Identity Matrix
It is a square matrix in which all elements of the principal diagonal running from upper left to lower right are ones and all other elements are zero.
def eye(n=None):
"""
Create an identity matrix for a square matrix
shorter method: [[float(i==j) for i in range(n)] for j in range(n)]
params
n: number of rows and columns
return: list of lists
"""
if (n == None):
matrix = []
else:
matrix = zeros_matrix(n)
for i in range(n):
matrix[i][i] = 1
return matrix
In []: eye(3)
Out []: [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Copying a Matrix
To prevent unexpected bugs, it is a good practice to make a copy of a matrix before using it.
def copy_matrix(M):
"""
Creates a copy of the matrix provided
params:
M: a matrix / list of lists
return: A list of lists
"""
NM = []
for i in range(len(M)):
temp = []
for j in range(len(M[0])):
temp.append(M[i][j])
NM.append(temp)
return NM
def test_matrix(M):
for x in range(len(M)):
M[x][x] = 1.0
return M
matrix = [[0]*2]*2
cmatrix = copy_matrix(matrix)
In []: test_matrix(matrix) #python 3.7
Out []: [[1,1],[1,1]]
In []: test_matrix(cmatrix)
Out []: [[1,0],[0,1]]
Matrix Transformation
Matrix Addition
Matrices of the same shape can be added together . The sum of and is given by:
def shape(M=None):
"""
Give the size i.e. the number of rows and columns
params:
M: A matrix that has rows and columns
return: a list containing [rows,columns]
"""
return [len(M), len(M[0])]
def matrix_addition(MA,MB):
"""
Add two matrices of the same shape
params:
MA: a matrix
MB: a matrix
return: matrix
"""
A = copy_matrix(MA)
B = copy_matrix(MB)
if shape(A) != shape(B):
raise ArithmeticError("Matrices are not of the same size.")
C = [];
for i in range(len(A)):
row_vals = []
for j in range(len(B[0])):
if ( type(A[i][j]) == int or type(A[i][j]) == float) and ( type(B[i][j]) == int or type(B[i][j]) == float):
#For matrix subtraction change sign to (-)
row_vals.append(A[i][j] + B[i][j])
else:
raise TypeError("All Matrix values must be numbers.")
C.append(row_vals)
return C
In []: matrix_addition([[1,2],[2,3]], [[4,5],[6,7]])
Out []: [[5,7,],[8,10]]
Given two matrices of unequal dimensions, we can get their direct sum as follows:
Matrix Multiplication
Matrix can be multiplied by a scalar value to give . can also be multiplied (dot product) by a matrix as long as the columns of are equal to the rows .
- Associativity : A(BC) = (AB)C
- Not Commutative :
In hardamard product, we multiply two square matrices similar to addition (A * B).
def matrix_multiply(MA, MB):
"""
Multiply two matrices (dot product), Aij X Bmn => j == m
shorter method: [[sum((av * bv) for av, bv in zip(a, b)) for b in zip(*MB)] for a in MA ]
params:
MA: a matrix
MB: a matrix
return: matrix
"""
A = copy_matrix(MA)
B = copy_matrix(MB)
if not shape(A)[1] == shape(B)[0]:
raise TypeError("Columns of A are not equal to rows of B")
new_matrix = []
for i in range(len(A)):
cols_vals = []
for n in range(len(B[0])):
s = 0
for m in range(len(B)):
s += A[i][m] * B[m][n]
cols_vals.append(s)
new_matrix.append(cols_vals)
#new_matrix(i*n)
return new_matrix
In []: matrix_multiply([[1,2],[4,5]], [[7],[9]])
Out []: [[25], [73]]
Determinants
The determinant of a matrix is the multiplicative change from tranforming a space with the matrix. We check the determinant while looking for an inverse of a square matrix. A square matrix that has a determinant of zero is singular and it has no inverse. For a matrix the determinant . In matrices larger than 2x2, we use cofactor expansion to get the determinants.
def recursive_determinant(RM, multi=1):
"""
The determinant of a square matrix using cofactor
Cofactor is created by an element of a matrix e.g. m[0][1] when it excludes m[i][1] and m[0][j]
params:
RM: a matrix/ list of lists
multi: a multiplier number
returns: the total for the levels of recursion
"""
width = len(RM)
if width == 1:
return multi * RM[0][0]
else:
sign = -1
total = 0
#Starts here
for i in range(width):
m = []
#loop removes first row and exclude values from the ith column all the way down till we have 1 row, select the kth
#multi is the (ith) matrix[0][i] before the last 2 rows, sign*matrix[0][i]: ith value when two rows are remaining
#m is kth value when only 1 row remains
for j in range(1, width):
temp = []
for k in range(width):
if k != i:
temp.append(RM[j][k])
m.append(temp)
sign *= -1
total += multi * recursive_determinant(m, sign * RM[0][i])
return total
In []: rmx = [[5,2,3,16],[5,6,7,8],[9,10,11,12],[13,14,15,4]]
In []: recursive_determinant(rmx)
Out []: 192
There is a faster way of finding the determinant by reducing the square matrix into an upper triangle matrix as explained here.
def faster_determinant(MF):
"""
The upper triangle method.
params:
A: a square matrix
return: a number
"""
matrix = copy_matrix(MF)
n = len(matrix)
for d in range(n):
for i in range(d+1, n):
if matrix[d][d] == 0:
matrix[d][d] = 1.8e-18
row_scaler = matrix[i][d] / matrix[d][d]
for j in range(n):
matrix[i][j] = matrix[i][j] - row_scaler * matrix[d][j]
product = 1.0
#determinant is the product of diagonals
for i in range(n):
product *= matrix[i][i]
return product
In []: rmx = [[5,2,3,16],[5,6,7,8],[9,10,11,12],[13,14,15,4]]
In []: faster_determinant(rmx)
Out []: 192
Inverse of a matrix
Similar to an inverse of a number (), an inverse of a matrix multiplied by the matrix yields 1. The inverse of a matrix will satisfy the equation . For a matrix to be invertible, it must be a square and its determinant must not be zero.
def matrix_inverse(M):
"""
Find inverse by transforming the matrix into an identity matrix:
1. Scale the each row by its diagonal value
2. Subtract (diagonal column value * diagonal row values) from each row except the diagonal row
params:
A: a square matrix
return a matrix
"""
n = len(M)
IM = eye(n)
matrix = copy_matrix(M)
imatrix = copy_matrix(IM)
assert shape(matrix)[0] == shape(matrix)[1], "Make sure the matrix is squared"
assert faster_determinant(matrix) != 0, "The matrix is singular, it has no inverse."
indices = list(range(n))
for d in range(n):
ds = 1.0 / matrix[d][d]
#diagonal scaler
for j in range(n):
matrix[d][j] *= ds
imatrix[d][j] *= ds
#row scaler
for i in range(n):
if i != d:
rs = matrix[i][d]
for j in range(n):
matrix[i][j] = matrix[i][j] - rs * matrix[d][j]
imatrix[i][j] = imatrix[i][j] - rs * imatrix[d][j]
return imatrix
In []: aa= [[3,7],[2,5]]
In []: matrix_inverse(aa)
Out []: [[ 5., -7.],[-2., 3.]]
Matrix Division
To divide matrices we multiply them by an inverse. Lets consider where as and . To find ;
In []: A = [[13,26],[39,13]]
In []: B_inv = matrix_inverse([[7,4],[2,3]])
In []: matrix_multiply(A, B_inv)
Out []: [[-1,10],[7,-5]]
Solving a System of Linear Equations
The matrix solution is represented as , therefore .
In []: A_inv = matrix_inverse([[5,3,7],[9,2,6],[4,2,5]])
In []: matrix_multiply(A_inv, [[74], [73], [53]])
Out []: [[3.0], [8.0], [5.0]]
Transpose
We pivot matrices to enable multiplication and division, this is done by switching columns by rows . From to .
def transpose(MT=None):
"""
Create a transpose of a matrix ( Mij => Mji)
shorter method: [list(x) for x in zip(*MT)]
params:
matrix : a value, list or list of lists
return: list of lists
"""
matrix = copy_matrix(MT)
if matrix == None:
matrix = [[]]
elif not isinstance(matrix, list):
matrix = [[matrix]]
elif not isinstance(matrix[0], list):
matrix = [matrix]
new_matrix = []
for j in range(len(matrix[0])):
new_vals = []
for i in range(len(matrix)):
new_vals.append(matrix[i][j])
new_matrix.append(new_vals)
return new_matrix
In []: transpose([1,2])
Out []: [[1],[2]]
In []: tranpose([[1,2,3],[4,5,6]])
Out []: [[1,4],[2,5],[3,6]]
Special Matrices
Square Matrix
Rectangular Matrix
Diagonal Matrix
Anti-diagonal Matrix
Symmetric Matrix
Scalar Matrix
Pivoting
In Gaussian elimination, the pivot element is required to not be zero. Given , we swap to produce . It is also desired to have the pivot element with a large absolute value to reduce the effect of rounding off error. Given , after swapping we get . In a partial pivot, we select the entry with the largest absolute value from the column of the matrix that is currently being considered as the pivot element.
def pivot(m):
"""
Rearrange row values, the larger value becomes the pivot element
params:
m: matrix
returns: a matrix containing 0,1's
"""
n = len(m)
Id = [[float(i==j) for i in range(n)] for j in range(n)]
#get the max value in each, if not at pivot position, swap.
for j in range(n):
row = max( range(j, n), key=lambda i: abs(m[i][j]) )
if j != row:
Id[j], Id[row] = Id[row], Id[j]
return Id
In []: vals = [[5,3,8],[6,4,5],[2,11,9]]
In []: pivot(vals)
Out []: [[0,1,0],[0,0,1],[1,0,0]]
In []: matrix_multiply(vals, pivot(vals))
Out []: [[8,5,3],[5,6,4],[9,2,11]]
MATRIX FACTORIZATION
Matrix decompositions reduce a matrix into constituent parts that make it easier to do complex operations. They can be used to calculate determinants, inverse and in solving systems of linear equations.
Lower and Upper Triangle Matrix Decomposition (LU)
A non-singular square matrix is decomposed in the form . To prevent division by zero error we pivot, such that .
LU decomposition using Crout's algorithm
The diagonals of a U matrix in Crout decomposition are 1, in Doolittle algorithm the diagonals of L matrix are all 1.
To calculate the upper values:
We derive the formula for U
To calculate the lower values:
We derive the formula for L
def lup(M):
"""
Perform a Lower and Upper Decomposition.
params:
M: a nxn matrix
return: 3 nxn matrices; Lower, Upper and Pivot.
"""
n = len(M)
L = [[0.0]*n for i in range(n)]
U = [[0.0]*n for i in range(n)]
P = pivot(M)
PM = matrix_multiply(P,M)
for j in range(n):
L[j][j] = 1.0
for i in range(j+1):
s1 = sum(U[k][j] * L[i][k] for k in range(i))
U[i][j] = PM[i][j] - s1
for i in range(j, n):
s2 = sum(U[k][j] * L[i][k] for k in range(j))
L[i][j] = (PM[i][j] - s2) / U[j][j]
return L,U,P
In []: lup([2,3,5],[2,4,7],[6,8,0]])
Out []: (
[[1.0,0.0,0.0],[0.3,1.0,0.0],[0.3,4.0,1.0]], #L
[[6.0,8.0,0.0],[0.0,0.3,5.0],[0.0,0.0,-13.0]], #U
[[0.0,0.0,1.0],[1.0,0.0,0.0],[0.0,1.0,0.0]] #P
)
Solving a Linear System using LU decomposition
Given an , and . The , P is a permutation matrix that swaps rows of A. A linear system is , since , then . We can shift the parenthesis such that , lets create a d such that . Finally, we find x using .
Solving for d using forward substitution
def forward_substitution(l, b):
"""
Calculating the value of d given Ld = b
params:
L: a non-singular nxn matrix
b: a vector
return: a vector
"""
n = len(l)
for i in range(0, n-1):
for j in range(i+1, n):
b[j] -= l[j][i] * b[i]
return b
Solving for x using back substitution
def back_substitution(u, d):
"""
Calculating the value of x given Ux = d
params:
U: a non-singular, nxn matrix
d: a vector
return: a vector
"""
n = len(u)
#the upper matrix is upside down
#start with the last row and move upwards
for i in range(n)[::-1]:
d[i] /= u[i][i]
for j in range(0, i):
d[j] -= u[j][i] * d[i]
return d
In []: l,u,p = lup([[3,-0.1,-0.2],[0.1,7,-0.3],[0.3,-0.2,10]])
In []: b = [7.85, -19.3, 71.4]
In []: d = forward_substitution(l, b)
In []: back_substitution(u, d)
Out []: [3.0, -2.5, 7.0] #x1, x2, x3