Linear Algebra Basics

2020, Mar 07

Linear Algebra is a branch of mathematics that is widely applied in science and engineering. It is the study of systems of linear equations and their transformation properties.

  • A scalar (n) is a single real number; it is a 0D array with rank 0, e.g. 5.
  • A vector (x) is a 1D array with rank 1, e.g. [1,2].
  • A matrix (X) is a 2D array with rank 2, e.g. [[1,2], [3,4]].
  • A tensor is the general term for an array with 3 or more dimensions, e.g. [ [[1,2], [3,4]], [[5,6], [7,8]] ].

Vectors

A vector is a quantity that has both magnitude and direction. It can be represented as a list of numbers, written in lower-case letters as $v_1, v_2, ..., v_n$. Two vectors of equal length, $a = [1,2]$ and $b = [3,4]$, can be added, subtracted, multiplied and divided element-wise; for example $c = a + b$, where $c[i] = a[i] + b[i]$, gives $c = [4,6]$.

We can also compute the sum of the element-wise products of two vectors, called the dot (inner) product. In general, for vectors $a, b \in \mathbb{R}^n$, the inner product is $a'b := \sum_{i=1}^n a_i b_i$. In this case, $a \cdot b = (1 \times 3 + 2 \times 4) = 11$.
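As a quick sketch of the element-wise addition and dot product above, here is a minimal pure-Python version (the helper names vector_add and dot are made up for this example):

def vector_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def dot(a, b):
    """Dot (inner) product: sum of the element-wise products."""
    return sum(x * y for x, y in zip(a, b))

In []: vector_add([1, 2], [3, 4])
Out []: [4, 6]
In []: dot([1, 2], [3, 4])
Out []: 11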

Two vectors given as column matrices $u_m, v_n$ can be multiplied to produce a matrix $A_{mn}$. The outer product of two vectors is $u \otimes v = uv^T = A$. Let $u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$; the outer product is given by $uv^T = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} = \begin{bmatrix} u_1 v_1 & u_1 v_2 & u_1 v_3 \\ u_2 v_1 & u_2 v_2 & u_2 v_3 \end{bmatrix}$.
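A minimal sketch of the outer product as a list of lists (the helper name outer is assumed for this example):

def outer(u, v):
    """Outer product u v^T: a len(u) x len(v) matrix of pairwise products."""
    return [[ui * vj for vj in v] for ui in u]

In []: outer([1, 2], [3, 4, 5])
Out []: [[3, 4, 5], [6, 8, 10]]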

Vectors can also be multiplied by scalars, i.e. real numbers or quantities that have a magnitude only. A scalar $\gamma = 5$ can be multiplied by the vector $a$ such that $\gamma a = \begin{bmatrix} \gamma a_1 \\ \gamma a_2 \end{bmatrix}$, giving $[5,10]$.

Linear Combination and Independence

A linear combination of two or more vectors is a vector obtained by summing the vectors after multiplying each by a scalar. Given two vectors $v_1 = (2,3)$ and $v_2 = (4,5)$, we can form $v_3 = c_1 v_1 + c_2 v_2$ for the scalar values $c_1 = 2$ and $c_2 = 3$.

$$c_1 v_1 + c_2 v_2 = 2 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + 3 \begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 2 \times 2 + 3 \times 4 \\ 2 \times 3 + 3 \times 5 \end{bmatrix} = \begin{bmatrix} 16 \\ 21 \end{bmatrix}$$
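A minimal sketch of a linear combination of vectors, assuming the hypothetical helper linear_combination:

def linear_combination(coeffs, vectors):
    """Sum of c_i * v_i for matching scalar coefficients and vectors."""
    n = len(vectors[0])
    result = [0] * n
    for c, v in zip(coeffs, vectors):
        for i in range(n):
            result[i] += c * v[i]
    return result

In []: linear_combination([2, 3], [[2, 3], [4, 5]])
Out []: [16, 21]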

Given $v_1, v_2$ and the linear combination $v_3$, we can recover the scalar multiples using the RREF (reduce towards an identity matrix by swapping, scaling and adding rows).

$\begin{bmatrix} v_1 & v_2 & | & v_3 \\ 2 & 4 & | & 16 \\ 3 & 5 & | & 21 \end{bmatrix}$. Replacing R1 with (R2 - R1), then R2 with (R2 - 3R1), and finally dividing R2 by 2, we get $\begin{bmatrix} c_1 & c_2 & & \\ 1 & 1 & | & 5 \\ 0 & 1 & | & 3 \end{bmatrix}$. Since $c_2 = 3$ and $c_1 + 3 = 5$, thus $c_1 = 2$.

Vectors are linearly independent if none of them can be expressed as a linear combination of the others. In that case $c_1 v_1 + c_2 v_2 = 0 = (0,0)$ only when $c_1 = c_2 = 0$. If the only way to get $0 = \sum_{i=1}^{k} c_i v_i$ requires all $c_1, ..., c_k = 0$, we have linearly independent vectors. Let us check whether a = (1,3) and b = (2,5) are linearly independent.

$\begin{bmatrix} a & b & | & \\ 1 & 2 & | & 0 \\ 3 & 5 & | & 0 \end{bmatrix}$. Replacing R2 with (R2 - 3R1) and negating it, we get $\begin{bmatrix} c_1 & c_2 & | & \\ 1 & 2 & | & 0 \\ 0 & 1 & | & 0 \end{bmatrix}$. Since $c_2 = 0$ and hence $c_1 = 0$, the two vectors are linearly independent.

Vector space

A vector space is a set of vectors together with all their possible linear combinations. Elements of $\mathbb{R}^n$ are n-tuples of real numbers, such as $x = \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbb{R}^2$. Given a set of vectors $V = \left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}$, the span is the subspace composed of all possible linear combinations of the vectors in that set, which here is $\mathbb{R}^2$. The standard basis $\left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}$ can write every vector in $\mathbb{R}^2$, for example $\begin{bmatrix} a \\ b \end{bmatrix} = a \cdot \begin{bmatrix} 1 \\ 0 \end{bmatrix} + b \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. In general, given $V = \{ v_1, v_2, ..., v_n \}$, $span(V) = \{ \alpha_1 v_1 + \alpha_2 v_2 + ... + \alpha_n v_n \}$.

Another set $\left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \end{bmatrix} \right\}$ is a spanning set of $\mathbb{R}^2$ but not a basis for $\mathbb{R}^2$. Having both (1,0) and (2,0) in the set creates linear dependence, so it cannot be a basis for $\mathbb{R}^2$.

Finding the basis of a subspace

Given $V = Span\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 7 \end{pmatrix} \right\}$, the subspace V is the column space of the matrix $A = \begin{pmatrix} 1 & 3 & -1 \\ 0 & 1 & 7 \end{pmatrix}$. The reduced row echelon form is $\begin{pmatrix} 1 & 0 & -22 \\ 0 & 1 & 7 \end{pmatrix}$. The first two columns are pivot columns, so a basis of V is $\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 1 \end{pmatrix} \right\}$. The number of linearly independent columns of a matrix is the rank of the matrix.
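A minimal Gauss-Jordan sketch that computes a reduced row echelon form and reports the pivot columns; the helper name rref and its (matrix, pivot columns) return convention are assumptions for this example:

def rref(M, tol=1e-12):
    """Reduced row echelon form via Gauss-Jordan elimination; returns (R, pivot_columns)."""
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    pivot_cols = []
    for col in range(cols):
        if pivot_row >= rows:
            break
        #pick the row (at or below pivot_row) with the largest entry in this column
        best = max(range(pivot_row, rows), key=lambda r: abs(A[r][col]))
        if abs(A[best][col]) < tol:
            continue        #no pivot in this column
        A[pivot_row], A[best] = A[best], A[pivot_row]
        p = A[pivot_row][col]
        A[pivot_row] = [v / p for v in A[pivot_row]]    #scale the pivot to 1
        for r in range(rows):
            if r != pivot_row:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[pivot_row])]
        pivot_cols.append(col)
        pivot_row += 1
    return A, pivot_cols

In []: rref([[1, 3, -1], [0, 1, 7]])
Out []: ([[1.0, 0.0, -22.0], [0.0, 1.0, 7.0]], [0, 1])  #rank = 2 pivot columns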

Decomposition of a vector by the basis vectors

Decompose the vector $b = (10, 30)$ by the basis vectors $p = (2, 2)$ and $q = (-1, 4)$.

We use the coefficients x, y to form the equation $x\vec{p} + y\vec{q} = \vec{b}$, that is $\begin{cases} 2x - y = 10 \\ 2x + 4y = 30 \end{cases}$. Subtracting the first equation from the second gives $5y = 20$, so $y = 4$ and $x = 7$; we conclude that $\vec{b} = 7\vec{p} + 4\vec{q}$.
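A small sketch that recovers the coefficients of a 2D decomposition with Cramer's rule (the helper name decompose_2d is hypothetical):

def decompose_2d(p, q, b):
    """Solve x*p + y*q = b for a 2D basis {p, q} using Cramer's rule."""
    det = p[0] * q[1] - q[0] * p[1]    #determinant of the matrix with columns p, q
    if det == 0:
        raise ValueError("p and q are linearly dependent")
    x = (b[0] * q[1] - q[0] * b[1]) / det
    y = (p[0] * b[1] - b[0] * p[1]) / det
    return x, y

In []: decompose_2d([2, 2], [-1, 4], [10, 30])
Out []: (7.0, 4.0)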

Angle between two vectors

Given two vectors $\vec{a} = (-3, 4)$ and $\vec{b} = (5, 12)$, we can find the angle between them.

$$\cos \alpha = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| \cdot |\vec{b}|} = \frac{-3(5) + 4(12)}{\sqrt{(-3)^2 + 4^2} \cdot \sqrt{5^2 + 12^2}} = \frac{33}{65}, \quad \alpha \approx 59.5^\circ$$
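A minimal sketch of the same calculation using Python's math module (the helper name angle_between is an assumption):

import math

def angle_between(a, b):
    """Angle (in degrees) between two vectors via the dot-product formula."""
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return math.degrees(math.acos(dot / (mag_a * mag_b)))

In []: round(angle_between([-3, 4], [5, 12]), 1)
Out []: 59.5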

Orthogonal basis

A basis $V = \{ v_1, v_2 \}$ is orthogonal if the vectors that form it are perpendicular $(90^\circ)$, i.e. their dot product is zero. Given $v_1 = \{4; -2\}$ and $v_2 = \{5; 10\}$, then $v_1 \cdot v_2 = 4 \cdot 5 + (-2) \cdot 10 = 0$.

When the dot product is zero and the length of each vector is one, the basis is called an orthonormal basis. Given $v_1 = \{1; 0\}$ and $v_2 = \{0; -1\}$, then $v_1 \cdot v_2 = 1 \cdot 0 + 0 \cdot (-1) = 0$. In addition, the lengths are $|\vec{v_1}| = \sqrt{1^2 + 0^2} = 1$ and $|\vec{v_2}| = \sqrt{0^2 + (-1)^2} = 1$.
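A minimal sketch of an orthonormality check, assuming the hypothetical helpers dot, norm and is_orthonormal:

import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def is_orthonormal(vectors, tol=1e-9):
    for i, u in enumerate(vectors):
        if abs(norm(u) - 1) > tol:       #each vector must have unit length
            return False
        for v in vectors[i + 1:]:
            if abs(dot(u, v)) > tol:     #each pair must be perpendicular
                return False
    return True

In []: is_orthonormal([[1, 0], [0, -1]])
Out []: True
In []: is_orthonormal([[4, -2], [5, 10]])   #orthogonal but not unit length
Out []: False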

Vector norm (length/magnitude)

For $p \ge 1$, the p-norm of $v = (v_1, ..., v_n)$ is $\| v \|_p := \left( \sum_{i=1}^{n} |v_i|^p \right)^{1/p}$. The most common cases are listed below (a small sketch computing them follows the list).

  • $l^1$-norm, also called the taxicab/Manhattan distance, is the sum of the absolute values of the vector elements (p = 1). Given $v = [1,2,3]$, then $\|v\|_1 = 1 + 2 + 3 = 6$.
  • $l^2$-norm, also called the Euclidean distance, is the straight-line length (p = 2). Given $v = [1,2,3]$, then $\|v\|_2 = \sqrt{1^2 + 2^2 + 3^2} \approx 3.74$.
  • $l^\infty$-norm, also called the Chebyshev/max distance, is the maximum absolute value. Given $v = [1,2,3]$, then $\|v\|_\infty = \max(1,2,3) = 3$.
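A minimal sketch of the three norms, assuming the hypothetical helpers p_norm and max_norm:

def p_norm(v, p=2):
    """p-norm of a vector: (sum |v_i|^p)^(1/p)."""
    return sum(abs(x) ** p for x in v) ** (1 / p)

def max_norm(v):
    """l-infinity (Chebyshev) norm: the largest absolute element."""
    return max(abs(x) for x in v)

In []: v = [1, 2, 3]
In []: [p_norm(v, 1), round(p_norm(v, 2), 2), max_norm(v)]
Out []: [6.0, 3.74, 3]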

Cross Product

Two 3D vectors can be multiplied to produce another vector that is perpendicular to both of them. In general, with $a = (a_x, a_y, a_z)$ and $b = (b_x, b_y, b_z)$, the new vector is $c = (a_y b_z - a_z b_y, \; a_z b_x - a_x b_z, \; a_x b_y - a_y b_x)$. Given $(1,2,3) \times (4,5,6) = (2 \cdot 6 - 3 \cdot 5, \; 3 \cdot 4 - 1 \cdot 6, \; 1 \cdot 5 - 2 \cdot 4) = (-3, 6, -3)$.
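A minimal sketch of the 3D cross product (the helper name cross is assumed for this example):

def cross(a, b):
    """Cross product of two 3D vectors: a vector perpendicular to both."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

In []: cross([1, 2, 3], [4, 5, 6])
Out []: [-3, 6, -3]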

Matrices

A matrix is a rectangular array of numbers arranged into rows and columns. The matrix A has m rows and n columns; each entry of A is written $a_{ij}$, where $i = 1...m$ and $j = 1...n$. For a matrix $A \in \mathbb{R}^{m \times n}$,

$$A := \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}, \quad a_{ij} \in \mathbb{R}$$

Zero Matrix

A null (zero) matrix has a zero in all entries.

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
def zeros_matrix(i=None, j=None):
    """
    A matrix filled with zeros.
    
    params:
        i : number of rows
        j: number of columns
    return: list of lists
    
    """
    
    if j is None:
        j = i
        
    M=[]
    #Create rows
    while len(M) < i:
        M.append([])
        #Enter values for each row 
        while len(M[-1]) < j:
            M[-1].append(0)
    return M
In []: zeros_matrix(3)

Out []: [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

Identity Matrix

It is a square matrix in which all elements of the principal diagonal running from upper left to lower right are ones and all other elements are zero.

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
def eye(n=None):
    """
    Create an identity matrix (a square matrix with ones on the main diagonal)
    shorter method: [[float(i==j) for i in range(n)] for j in range(n)]
    params
        n: number of rows and columns
    return: list of lists   
    """
    if n is None:
        return []
    matrix = zeros_matrix(n)
    for i in range(n):
        matrix[i][i] = 1
    return matrix  
In []: eye(3)

Out []: [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

Copying a Matrix

Because Python lists are passed by reference, a function that modifies a matrix in place also changes the caller's matrix. To prevent such unexpected bugs, it is good practice to make a copy of a matrix before using it.

def copy_matrix(M):
    """
    Creates a copy of the matrix provided
    params: 
        M: a matrix / list of lists
    return: A list of lists
    """
    NM = []
    for i in range(len(M)):
        temp = []
        for j in range(len(M[0])):
            temp.append(M[i][j])
        NM.append(temp)
        
    return NM

def test_matrix(M):
	for x in range(len(M)):
		M[x][x] = 1.0
	return M

matrix = [[0]*2]*2    #[[0]*2]*2 repeats one row object, so both rows are the same list
cmatrix = copy_matrix(matrix)
In []: test_matrix(matrix)
Out []: [[1.0, 1.0], [1.0, 1.0]]  #aliased rows: setting one diagonal entry fills both rows
In []: test_matrix(cmatrix)
Out []: [[1.0, 0], [0, 1.0]]

Matrix Transformation

Matrix Addition

Matrices of the same shape can be added together: $A_{ij} + B_{ij} = C_{ij}$. The sum of $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{m \times n}$ is given by:

$$A + B = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$

def shape(M=None):
    """
    Give the size i.e. the number of rows and columns
    params: 
        M: A matrix that has rows and columns
    return: a list containing [rows,columns]
    """
    return [len(M), len(M[0])]

def matrix_addition(MA,MB):
    """
    Add two matrices of the same shape
    params: 
        MA: a matrix
        MB: a matrix
        
    return: matrix
    """
    A = copy_matrix(MA)
    B = copy_matrix(MB)

    if shape(A) != shape(B):
        raise ArithmeticError("Matrices are not of the same size.")
        
    C = [];
    for i in range(len(A)):
        row_vals = []
        for j in range(len(B[0])):
            if ( type(A[i][j]) == int or type(A[i][j]) == float) and ( type(B[i][j]) == int or type(B[i][j]) == float):
            	#For matrix subtraction change sign to (-)
                row_vals.append(A[i][j] + B[i][j])
            else: 
                raise TypeError("All Matrix values must be numbers.")
        C.append(row_vals)
    return C
In []: matrix_addition([[1,2],[2,3]], [[4,5],[6,7]])
Out []: [[5, 7], [8, 10]]

Given two matrices $A_{23} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ and $B_{22} = \begin{bmatrix} 7 & 8 \\ 9 & 10 \end{bmatrix}$ of unequal dimensions, we can get their direct sum as follows:

$$A \oplus B = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} = \begin{bmatrix} a_{11} & \cdots & a_{1n} & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & b_{11} & \cdots & b_{1j} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & b_{i1} & \cdots & b_{ij} \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 0 & 0 \\ 4 & 5 & 6 & 0 & 0 \\ 0 & 0 & 0 & 7 & 8 \\ 0 & 0 & 0 & 9 & 10 \end{bmatrix}$$
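A minimal sketch of the direct sum, placing A in the top-left block and B in the bottom-right block (the helper name direct_sum is an assumption):

def direct_sum(A, B):
    """Block-diagonal matrix with A in the top-left and B in the bottom-right."""
    cols_a = len(A[0])
    cols_b = len(B[0])
    result = []
    for row in A:
        result.append(row + [0] * cols_b)      #A row padded with zeros on the right
    for row in B:
        result.append([0] * cols_a + row)      #zeros on the left followed by the B row
    return result

In []: direct_sum([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10]])
Out []: [[1, 2, 3, 0, 0], [4, 5, 6, 0, 0], [0, 0, 0, 7, 8], [0, 0, 0, 9, 10]]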

Matrix Multiplication

Matrix $A$ can be multiplied by a scalar value $\alpha$ to give $\alpha (A_{ij})$. $A$ can also be multiplied by a matrix $B$ (the matrix or dot product) as long as the number of columns of $A$ equals the number of rows of $B$.

$$c_{ij} := \sum_{l=1}^n a_{il} b_{lj}$$

  • Associative: $A(BC) = (AB)C$
  • Not commutative: $A \cdot B \neq B \cdot A$

In the Hadamard product, we multiply two matrices of the same shape element-wise, similar to addition ($A \circ B$).
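A minimal sketch of the Hadamard (element-wise) product, assuming the hypothetical helper hadamard:

def hadamard(A, B):
    """Element-wise product of two matrices with the same shape."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ArithmeticError("Matrices are not of the same size.")
    return [[A[i][j] * B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

In []: hadamard([[1, 2], [3, 4]], [[5, 6], [7, 8]])
Out []: [[5, 12], [21, 32]]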

def matrix_multiply(MA, MB):
    """
    Multiply two matrices (dot product), Aij X Bmn => j == m
    shorter method: [[sum((av * bv) for av, bv in zip(a, b)) for b in zip(*MB)] for a in MA ]
    params: 
        MA: a matrix
        MB: a matrix
    return: matrix
    """
    A = copy_matrix(MA)
    B = copy_matrix(MB)

    if not shape(A)[1] == shape(B)[0]:
        raise TypeError("Columns of A are not equal to rows of B")
        
    new_matrix = []
    for i in range(len(A)):
        cols_vals = []
        for n in range(len(B[0])):            
            s = 0
            for m in range(len(B)):
                s += A[i][m] * B[m][n]                
            cols_vals.append(s)            
        new_matrix.append(cols_vals)
        
    #new_matrix(i*n) 
    return new_matrix   
    
In []: matrix_multiply([[1,2],[4,5]], [[7],[9]])
Out []: [[25], [73]]

Determinants

The determinant of a matrix is the factor by which the matrix scales areas or volumes when it transforms a space. We check the determinant while looking for the inverse of a square matrix: a square matrix with a determinant of zero is singular and has no inverse. For a matrix $A_{22} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ the determinant is $|A| = a_{11} a_{22} - a_{12} a_{21}$. For matrices larger than 2x2, we use cofactor expansion to get the determinant.

def recursive_determinant(RM, multi=1):
        """
        The determinant of a square matrix using cofactor
        Cofactor is created by an element of a matrix e.g. m[0][1] when it excludes m[i][1] and m[0][j]
        params:
            RM: a matrix/ list of lists
            multi: a multiplier number
        returns: the total for the levels of recursion
        """
        
        width = len(RM)
        if width == 1:
            return multi * RM[0][0]
        else:
            sign = -1
            total = 0
        #Starts here
        for i in range(width):
            m = []
            #build the minor m by dropping the first row and the ith column of RM
            #sign alternates +/- across the columns of the first row (cofactor signs)
            #multi carries the cofactor factor accumulated from the enclosing recursion
            for j in range(1, width):
                temp = []
                for k in range(width):
                    if k != i:
                        temp.append(RM[j][k])  
                m.append(temp)
                
            sign *= -1            
            total += multi * recursive_determinant(m, sign * RM[0][i])
            
        return total
In []: rmx = [[5,2,3,16],[5,6,7,8],[9,10,11,12],[13,14,15,4]]
In []: recursive_determinant(rmx)
Out []: 192

There is a faster way of finding the determinant by reducing the square matrix into an upper triangle matrix as explained here.

def faster_determinant(MF):
    """
    The upper triangle method.

    params: 
        MF: a square matrix
    return: a number
    """
    matrix = copy_matrix(MF)
    n = len(matrix)
    
    for d in range(n):
        for i in range(d+1, n):
            if matrix[d][d] == 0:
                #avoid division by zero with a tiny value
                matrix[d][d] = 1.8e-18
                
            row_scaler = matrix[i][d] / matrix[d][d]
    
            for j in range(n):
                matrix[i][j] = matrix[i][j] - row_scaler * matrix[d][j]
    
    product = 1.0
    
    #determinant is the product of diagonals
    for i in range(n):
        product *= matrix[i][i]
    return product
In []: rmx = [[5,2,3,16],[5,6,7,8],[9,10,11,12],[13,14,15,4]]
In []: faster_determinant(rmx)
Out []: 192

Inverse of a matrix

Similar to the inverse of a number ($5 \times 1/5 = 1$), a matrix multiplied by its inverse yields the identity matrix. The inverse of a matrix satisfies the equation $A A^{-1} = I$. For a matrix to be invertible, it must be square and its determinant must not be zero.

def matrix_inverse(M):
    """
    Find the inverse by transforming the matrix into an identity matrix:
        1. Scale each row by its diagonal value
        2. From every other row, subtract (its entry in the diagonal column * the scaled diagonal row)
    params: 
        M: a square matrix
    return: a matrix
    """
   
    n = len(M)
    IM = eye(n)
    
    matrix = copy_matrix(M)
    imatrix = copy_matrix(IM)
    
    
    assert shape(matrix)[0] == shape(matrix)[1], "Make sure the matrix is squared"
    assert faster_determinant(matrix) != 0, "The matrix is singular, it has no inverse."
   
    indices = list(range(n))
    for d in range(n):
        ds = 1.0 / matrix[d][d]
        #diagonal scaler 
        for j in range(n): 
            matrix[d][j] *= ds
            imatrix[d][j] *= ds
    
        #row scaler    
        for i in range(n):
            if i != d:
                rs = matrix[i][d]  
                for j in range(n):
                    matrix[i][j] = matrix[i][j] - rs * matrix[d][j]
                    imatrix[i][j] = imatrix[i][j] - rs * imatrix[d][j]
    return imatrix
In []: aa= [[3,7],[2,5]]
In []: matrix_inverse(aa)
Out []: [[ 5., -7.],[-2.,  3.]]

Matrix Division

To divide matrices, we multiply by an inverse. Let's consider $A \div B$, where $A_{2 \times 2} = \begin{bmatrix} 13 & 26 \\ 39 & 13 \end{bmatrix}$ and $B_{2 \times 2} = \begin{bmatrix} 7 & 4 \\ 2 & 3 \end{bmatrix}$. To find $A \times B^{-1}$:

In []: A = [[13,26],[39,13]]
In []: B_inv = matrix_inverse([[7,4],[2,3]])
In []: matrix_multiply(A, B_inv)
Out []: [[-1,10],[7,-5]]

Solving a System of Linear Equations

The matrix form of the system is $Ax = b$, therefore $x = A^{-1}b$.

  1. $5a + 3b + 7c = 74$
  2. $9a + 2b + 6c = 73$
  3. $4a + 2b + 5c = 53$
$$A = \begin{bmatrix} 5 & 3 & 7 \\ 9 & 2 & 6 \\ 4 & 2 & 5 \end{bmatrix}, \quad x = \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \quad b = \begin{bmatrix} 74 \\ 73 \\ 53 \end{bmatrix}$$
In []: A_inv = matrix_inverse([[5,3,7],[9,2,6],[4,2,5]])
In []: matrix_multiply(A_inv, [[74], [73], [53]])
Out []: [[3.0], [8.0], [5.0]]

Transpose

The transpose swaps rows and columns, $A_{ij} = A^T_{ji}$; we often transpose matrices to make their shapes compatible for multiplication and division. From $A_{3 \times 2} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}$ to $A^T_{2 \times 3} = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}$.

def transpose(MT=None):
    """
    Create a transpose of a matrix ( Mij =>  Mji)
    shorter method: [list(x) for x in  zip(*MT)]
    params:
        MT : a value, list or list of lists
    return: list of lists
    """
    if MT is None:
        matrix = [[]]
    elif not isinstance(MT, list):
        #a single value becomes a 1x1 matrix
        matrix = [[MT]]
    elif not isinstance(MT[0], list):
        #a flat vector becomes a 1xn matrix
        matrix = [MT]
    else:
        matrix = copy_matrix(MT)
    
    new_matrix = []
    for j in range(len(matrix[0])):
        new_vals = []
        
        for i in range(len(matrix)):
            new_vals.append(matrix[i][j])
        new_matrix.append(new_vals)
        
    return new_matrix
In []: transpose([1,2])
Out []: [[1],[2]]

In []: transpose([[1,2,3],[4,5,6]])
Out []: [[1,4],[2,5],[3,6]]

Special Matrices

Square Matrix

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

Rectangular Matrix

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$

Diagonal Matrix

$$\begin{bmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix}$$

Anti-diagonal Matrix

$$\begin{bmatrix} 0 & 0 & 3 \\ 0 & 4 & 0 \\ 5 & 0 & 0 \end{bmatrix}$$

Symmetric Matrix

$$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 5 \\ 3 & 5 & 1 \end{bmatrix}$$

Scalar Matrix

$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
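Assuming the transpose function defined above, a small sketch that checks two of these matrix types (the names is_symmetric and is_diagonal are hypothetical):

def is_symmetric(M):
    """A matrix is symmetric when it equals its own transpose."""
    return M == transpose(M)

def is_diagonal(M):
    """A square matrix is diagonal when every off-diagonal entry is zero."""
    return all(M[i][j] == 0
               for i in range(len(M))
               for j in range(len(M[0])) if i != j)

In []: is_symmetric([[1, 2, 3], [2, 1, 5], [3, 5, 1]])
Out []: True
In []: is_diagonal([[3, 0, 0], [0, 4, 0], [0, 0, 5]])
Out []: True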

Pivoting

In Gaussian elimination, the pivot element is required to be non-zero. Given $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \\ 6 & 7 & 8 \end{bmatrix}$, we swap rows to produce $\begin{bmatrix} 1 & 2 & 3 \\ 6 & 7 & 8 \\ 4 & 0 & 5 \end{bmatrix}$. It is also desirable for the pivot element to have a large absolute value, to reduce the effect of round-off error. Given $\begin{bmatrix} 0.003 & 59.14 \\ 5.291 & -6.13 \end{bmatrix}$, after swapping we get $\begin{bmatrix} 5.291 & -6.13 \\ 0.003 & 59.14 \end{bmatrix}$. In partial pivoting, we select the entry with the largest absolute value in the column currently being considered as the pivot element.

def pivot(m):
    """
    Build a permutation matrix that reorders rows so that, in each column,
    the entry with the largest absolute value moves into the pivot position
    params:
        m: matrix
    returns: a permutation matrix (lists of 0.0 and 1.0)
    """
    n = len(m)
    Id = [[float(i==j) for i in range(n)] for j in range(n)]
    #get the max value in each, if not at pivot position, swap.
    for j in range(n):
        row = max( range(j, n), key=lambda i: abs(m[i][j]) )
     
        if j != row:
            Id[j], Id[row] = Id[row], Id[j]
    return Id
In []: vals = [[5,3,8],[6,4,5],[2,11,9]]

In []: pivot(vals)
Out []: [[0,1,0],[0,0,1],[1,0,0]]

In []: matrix_multiply(vals, pivot(vals))
Out []: [[8,5,3],[5,6,4],[9,2,11]]

MATRIX FACTORIZATION

Matrix decompositions break a matrix into constituent parts that make complex operations easier. They can be used to calculate determinants and inverses and to solve systems of linear equations.

Lower and Upper Triangle Matrix Decomposition (LU)

A non-singular square matrix can be decomposed in the form $A = LU$. To prevent division-by-zero errors we pivot first, so that $PA = LU$.

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}$$

LU decomposition using Doolittle's algorithm

In Crout's decomposition the diagonal entries of the U matrix are all 1, while in the Doolittle algorithm the diagonal entries of the L matrix are all 1. The derivation and code below set the diagonal of L to 1, following the Doolittle convention.

To calculate the upper values:

$$u_{11} = a_{11}, \quad u_{12} = a_{12}, \quad u_{13} = a_{13}$$
$$u_{22} = a_{22} - u_{12} l_{21}, \quad u_{23} = a_{23} - u_{13} l_{21}$$
$$u_{33} = a_{33} - (u_{13} l_{31} + u_{23} l_{32})$$

We derive the formula for U

$$u_{ij} = a_{ij} - \sum_{k=1}^{i-1} u_{kj} l_{ik}$$

To calculate the lower values:

$$l_{21} = \frac{1}{u_{11}} a_{21}, \quad l_{31} = \frac{1}{u_{11}} a_{31}$$
$$l_{32} = \frac{1}{u_{22}} (a_{32} - u_{12} l_{31})$$

We derive the formula for L

$$l_{ij} = \frac{1}{u_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} u_{kj} l_{ik} \right)$$
def lup(M):
    """
    Perform a Lower and Upper Decomposition.
    params:
        M: a nxn matrix
    return: 3 nxn matrices; Lower, Upper and Pivot.
    """
    n = len(M)
    L = [[0.0]*n for i in range(n)]
    U = [[0.0]*n for i in range(n)]
    P = pivot(M)
    PM = matrix_multiply(P,M)
    for j in range(n):
        L[j][j] = 1.0
        
        for i in range(j+1):
            s1 = sum(U[k][j] * L[i][k] for k in range(i))
            U[i][j] = PM[i][j] - s1
        for i in range(j, n):
            s2 = sum(U[k][j] * L[i][k] for k in range(j))
            L[i][j] = (PM[i][j] - s2) / U[j][j]
    return L,U,P
In []: lup([[2,3,5],[2,4,7],[6,8,0]])
Out []: (
	[[1.0,0.0,0.0],[0.3,1.0,0.0],[0.3,4.0,1.0]],  #L
	[[6.0,8.0,0.0],[0.0,0.3,5.0],[0.0,0.0,-13.0]],  #U
	[[0.0,0.0,1.0],[1.0,0.0,0.0],[0.0,1.0,0.0]]     #P
	)

Solving a Linear System using LU decomposition

Given $L = \begin{bmatrix} 1.0 & 0.0 & 0.0 \\ 0.03 & 1.0 & 0.0 \\ 0.1 & -0.03 & 1.0 \end{bmatrix}$, $U = \begin{bmatrix} 3.0 & -0.1 & -0.2 \\ 0.0 & 7.0 & -0.3 \\ 0.0 & 0.0 & 10.0 \end{bmatrix}$ (values rounded) and $b = \begin{bmatrix} 7.85 \\ -19.3 \\ 71.4 \end{bmatrix}$. In $PA = LU$, P is a permutation matrix that swaps rows of A. The linear system is $Ax = b$; since $A = LU$, then $(LU)x = b$. We can shift the parentheses so that $L(Ux) = b$; letting $d = Ux$, we first solve $Ld = b$ for d. Finally, we find x from $Ux = d$.

Solving for d using forward substitution

def forward_substitution(l, b):
    """
    Calculating the value of d given Ld = b
    params:
        l: a lower-triangular nxn matrix with 1s on the diagonal
        b: a vector (modified in place)
    return: a vector d such that Ld = b
    """
    n = len(l)
    for i in range(0, n-1):
        for j in range(i+1, n):
            b[j] -= l[j][i] * b[i]
    return b

Solving for x using back substitution

def back_substitution(u, d):
    """
    Calculating the value of x given Ux = d
    params: 
        u: a non-singular upper-triangular nxn matrix
        d: a vector (modified in place)
    return: a vector x such that Ux = d
    """
    n = len(u)
    #the upper matrix is upside down
    #start with the last row and move upwards
    for i in range(n)[::-1]:
        d[i] /= u[i][i]
        for j in range(0, i):
            d[j] -= u[j][i] * d[i]   
    return d
In []: l,u,p = lup([[3,-0.1,-0.2],[0.1,7,-0.3],[0.3,-0.2,10]])
In []: b = [7.85, -19.3, 71.4]
In []: d = forward_substitution(l, b)
In []: back_substitution(u, d)
Out []: [3.0, -2.5, 7.0]  #x1, x2, x3

FURTHER READING