# LU decomposition

## Contents

# LU decomposition#

## Motivation#

The Gaussian elimination solves a matrix system with one RHS vector \(\pmb{b}\). We often have to deal with problems where we have multiple RHS vectors, all with the same matrix \(A\).

We could call the same code (e.g. `upper_triangle`

and `back_substitution`

from Gaussian elimination notebook) multiple times to solve all of these corresponding linear systems, but note that as the elimination algorithm is actually performing operations based upon (the same) \(A\) each time, we would actually be wasting time repeating exactly the same operations - this is therefore clearly not an efficient solution to the problem.

We could easily generalise our Gaussian elimination/back substitution algorithms to include multiple RHS column vectors in the augmented system, perform the same sequence of row operations (but now only once) to transform the matrix to upper-triangular form, and then perform back substitution on each of the transformed RHS vectors from the augmented system - cf. the use of Gaussian elimination to compute the inverse to a matrix by placing the identity on the right of the augmented system.

However, it is often the case that each RHS vector depends on the solutions to the matrix systems obtained from some or all of the earlier RHS vectors, and so this generalisation would not work in this case. For example, our discrete system could be of the form

where \(\pmb{x}^{n+1}\) is the numerical solution of an ordinary or partial differential equation at time level \(n+1\), the RHS is a function of the the solution at the previous time level \(n\) (and possibly other things such as forcing terms, represented by the \(\ldots\) above. \(A\) is then a discretisation of the differential equation written in the form that it maps the old solution to the new one. Time stepping this problem invovles solving the same linear system multiple times with different RHSs.

## Theory#

To deal with this situation efficiently we decompose or factorise the matrix \(A\) in such a way that it is cheap to compute a new solution vector \(\pmb{x}\) for any given RHS vector \(\pmb{b}\). This decomposition involves a lower- and an upper-triangular matrix, hence the name **LU decomposition**. These matrices encode the steps conducted in Gaussian elimination, so we don’t have to explicilty conduct all of the operations again and again.

Mathematically, let’s assume that we have already found/constructed a lower-triangular matrix (\(L\) - where all entries above the diagonal are zero) and an upper-triangular matrix (\(U\) - where all entries below the diagonal are zero) such that we can write

In this case the matrix system we need to solve for \(\pmb{x}\) becomes

Notice that the matrix-vector product \(U\pmb{x}\) is itself a vector, let’s call it \(\pmb{c}\) for the time-being (i.e. \(\pmb{c}=U\pmb{x}\)).

The above system then reads

where \(L\) is a matrix and \(\pmb{c}\) is an unknown.

As \(L\) is in the lower-triangular form, we can use forward substitution (generalising the back subsitution algorithm/code) to very easily find \(\pmb{c}\) in relatively few operations (we don’t need to call the entire Gaussian elimination algorithm).

Once we know \(\pmb{c}\) we then solve the second linear system

where now we can use the fact that \(U\) is upper-triangular to use our back substitution algorithm again very efficiently to give the solution \(\pmb{x}\) we require.

So for a given \(\pmb{b}\) we can find the corresponding \(\pmb{x}\) very efficiently, we can therefore do this repeatedly as each new \(\pmb{b}\) is given to us.

Our challenge is therefore to find the matrices \(L\) and \(U\) allowing us to perform the decomposition \(A=LU\).

## Doolittle algorithm#

Recall the comment above on the \(L\) and \(U\) matrices encoding the steps taken in Gaussian elimination. Let’s see how this works through the development of the so-called **Doolittle algorithm**.

Let’s consider an example matrix:

The first step of Gaussian elimination is to set the sub-diagonal elements in the first column to zero by subtracting multiples of the first row from each of the subsequent rows.

For this example, this requires the row operations

or mathematically, and for each element of the matrix (remembering that we are adding rows together - while one of the entries of a row will end up being zero, this also has the consequence of updating the rest of the values in that row, hence the iteration over \(j\) below):

Notice that we can also write these exact operations on elements in terms of multiplication by a carefully chosen lower-triangular matrix where the non-zero’s below the diagonal are restricted to a single column, e.g. for the example above

The lower-triangular matrix (let’s call this one \(L_0\)) is thus encoding the first step in Gaussian elimination.

The next step involves taking the second row of the updated matrix as the new pivot (we will ignore partial pivoting for simplicity), and subtracting multiples of this row from those below in order to set the zeros below the diagonal in the second column to zero.

Note that this step can be achieved here with the multiplication by the following lower-triangular matrix (call this one \(L_1\))

and finally for this example to get rid the of the leading 14 on the final row:

where this lower triangular matrix we will call \(L_2\), and the RHS matrix is now in upper-triangular form as we expect from Gaussian elimination (call this \(U\)).

In summary, the above operations (starting from \(A\), first multipling on the left by \(L_0\), the multiplying the result on the left by \(L_1\) and so on to arrive at an upper trianglualr matrix \(U\)) can be written as

Note that these special lower-triangular matrices \(L_0, \ldots\) are all examples of what is known as *atomic* lower-triangular matrices: a special form of unitriangular matrix - the diagonals are unity, where the off-diagonal entries are all zero apart from in a single column.

Notice that for these special, simple matrices, their inverse is simply the original with the sign of those off-diagnonals changed:

### Exercise#

Convince yourselves of the following facts:

the above statement on what the inverse of the special \(L\) matrices is - check on a small example.

the multiplication of arbitrary lower-triangular square matrices is also lower-triangular.

\(L_2(L_1(L_0A)) = U \implies A = L_0^{-1}(L_1^{-1}(L_2^{-1}U))\)

and hence that \(A=LU\) where \(U\) is the upper-triangular matrix found at the end of Guassian elimination, and where \(L\) is the following matrix

\[L = L_0^{-1}L_1^{-1}L_2^{-1}.\]finally, compute this product of these lower-triangular matrices to show that

\[\begin{split}L = \begin{bmatrix} {\color{black}1} & {\color{black}0} & {\color{black}0} & {\color{black}0}\\ {\color{black}{1}} & {\color{black}1} & {\color{black}0} & {\color{black}0}\\ {\color{black}{4}} & {\color{black}7} & {\color{black}1} & {\color{black}0}\\ {\color{black}{5}} & {\color{black}8} & {\color{black}2} & {\color{black}1} \end{bmatrix} \end{split}\]i.e. that the multiplication of these individual atomic matrices (importantly in this order) simply merges the entries from the non-zero columns of each atomic matrix, and hence is both lower-triangular, as well as trivial to compute.

```
import numpy as np
from scipy import linalg
from pprint import pprint
# as above: A matrix,
A = np.array([[5., 7. , 5., 9.],
[5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
# lower triangular matrices,
L0 = np.array([[1,0,0,0],[-1,1,0,0],[-4,0,1,0],[-5,0,0,1]])
L1 = np.array([[1,0,0,0],[0,1,0,0],[0,-7,1,0],[0,-8,0,1]])
L2 = np.array([[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,-2,1]])
# and their inverse matrices
L0_ = linalg.inv(L0)
L1_ = linalg.inv(L1)
L2_ = linalg.inv(L2)
# our base matrix
pprint(A)
# ad: The multiplication of arbitrary lower-triangular
# square matrices is also lower-triangular
print("Question 1:")
pprint(np.dot(L0, L1))
pprint(np.dot(L2, L1))
# ad: L2(L1(L0 A)) = U
print("Question 2:")
U = np.dot(L2, np.dot(L1, np.dot(L0, A)))
pprint(U)
# ad: A = L0^-1(L1^-1(L2^-1 U))
print("Question 3:")
A = np.dot(L0_, np.dot(L1_, np.dot(L2_, U)))
pprint(A)
# ad: L = L0^-1 L1^-1 L2^-1
print("Question 4:")
L = np.dot(L0_,np.dot(L1_,L2_))
pprint(L)
# ad: A = LU
# numpy.allclose takes two numpy arrays and checks
# whether all elements are identical to some tolerance
print("Question 5:")
print(np.allclose(A, np.dot(L, U)))
```

```
array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
Question 1:
array([[ 1, 0, 0, 0],
[-1, 1, 0, 0],
[-4, -7, 1, 0],
[-5, -8, 0, 1]])
array([[ 1, 0, 0, 0],
[ 0, 1, 0, 0],
[ 0, -7, 1, 0],
[ 0, 6, -2, 1]])
Question 2:
array([[5., 7., 5., 9.],
[0., 7., 2., 1.],
[0., 0., 7., 5.],
[0., 0., 0., 4.]])
Question 3:
array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
Question 4:
array([[1., 0., 0., 0.],
[1., 1., 0., 0.],
[4., 7., 1., 0.],
[5., 8., 2., 1.]])
Question 5:
True
```

## Implementation#

We can build an LU code easily from Gaussian elimination code. The final \(U\) matrix we need here is as was already constructed through Gaussian elimination, the entries of \(L\) we need are simply the \({A_{ik}}/{A_{kk}}\) multipliers we computed as part of the elimination, but threw away previously.

For a given pivot row \(k\), for each of these multipliers (for every row below the pivot), as we compute them we know that we are going to transform the augmented matrix in order to achieve a new zero below the diagonal - we can store each multiplier in this position before moving on to the following row, we implicitly know that the diagonals of \(L\) will be unity and so don’t need to store these (and noting that we don’t actually have a space for them anyway!). We then move on to the following pivot row, replacing the zeros in the corresponding column we are zero’ing, but again using the now spare space to store the multipliers.

For example, for the case above

Starting from Gaussian elimination code we will compute the LU decomposition of a matrix. First, we will store \(L\) and \(U\) in two different matrices:

```
def LU_dec(A):
# Upper triangular matrix contains gaussian elimination result
# We won't change A in-place but create a local copy
A = A.copy()
m, n = A.shape
# Lower triangular matrix has identity diagonal and scaling factors
# Start from the identity:
L = np.identity(n)
# Loop over each pivot row.
for k in range(n):
# Loop over each equation below the pivot row.
for i in range(k+1, n):
# Define the scaling factor outside the innermost
# loop otherwise its value gets changed.
s = (A[i,k]/A[k,k])
for j in range(k, n):
A[i,j] = A[i,j] - s*A[k,j]
# Scaling factors make up the lower matrix
L[i,k] = s
# A now is the upper triangular matrix U
return L, A
```

Test the code:

```
A = np.array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91. ,55., 67.]])
L, U = LU_dec(A)
pprint(A)
pprint(L)
pprint(U)
print(np.allclose(np.dot(L,U), A))
```

```
array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
array([[1., 0., 0., 0.],
[1., 1., 0., 0.],
[4., 7., 1., 0.],
[5., 8., 2., 1.]])
array([[5., 7., 5., 9.],
[0., 7., 2., 1.],
[0., 0., 7., 5.],
[0., 0., 0., 4.]])
True
```

## Partial pivoting#

Problems can occur in a numerical implementation of our algorithm if \(A_{kk}\) divided through by in the Gaussian elimination and/or back substitution algorithms might be zero (or close to zero).

Using Gaussian elimination as an example, let’s again consider the algorithm mid-way working on an arbitrary matrix system, i.e. assume that the first \(k\) rows have already been transformed into upper-triangular form, while the equations/rows below are not yet in this form:

We have drawn the horizontal dashed line one row higher, as we are not going to blindly asssume that it is wise to take the current row \(k\) as the pivot row, and \(A_{kk}\) as the so-called pivot element.

**Partial pivoting** selects the best pivot (row or element) as the one where the \(A_{ik}\) (for \(i\ge k\)) value is largest (relative to the other values of components in its own row \(i\)), and then swaps this row with the current \(k\) row.

To generalise our codes above we would simply need to search for this row, and perform the row swap operation.

Python’s `scipy.linalg`

library has its own implementation of the LU decomposition, that uses partial pivoting.

Example of partial pivoting

Suppose that we have the matrix \(\pmb{A}\) from before

and then we do some timeskipping and skip to partway through

Now, magic happens for the sake of illustrating the partial pivoting - “magic magic poof poof and magically \(7\) becomes \(0\)”:

Suppose that the matrix storing our unknowns is

And the matrix storing the results of our equations, after all of the Gaussian elimination previously that have modified the matrix \(\pmb{b}\)

Writing this in equation this would be

Now, we are going to use equation 2 and equation 3

The coefficient of the second unknown in the second equation is 0. The coefficient of the second unknown in the third equation is 49. Thus, I will divide the coefficient of the 3rd equation with the coefficient of the 2nd equation, and then I would multiply the coefficient of the 3rd equation by the quotient. As such, I will divide \(49/0\), and then, I will get an error since you cannot divide by \(0\). It does not neccesarily have to be \(0\) to give errors, even if it is very close to \(0\), it will still give an error.

Thus what can I do?

Well, I need two equations. I cannot use the equation 2 because there is a \(0\). Then, I will use equation 3 and equation 4. Therefore, let’s swap the position of equation 4 and equation 2.

Remember when you swap rows in matrix \(\pmb{A}\), you must also swap rows in the matrix \(\pmb{b}\). Matrix \(\pmb{x}\) does not have to swapped since the order of the variables has not changed.

The matrix storing our unknowns remains

And the matrix storing the results of our equations

Writing this in equation this would be

Inspecting this equation, we see that the equation is essentially the same as it was before the swap, so we can say that our action to swap it is a valid action. I have rewritten the equation before the swap as reference

Now let’s continue, after we finished swapping, and we will focus only on getting to the upper triangle form, and only on matrix \(\pmb{A}\). The matrix \(\pmb{b}\) will also change accordingly as detailed in the previous lecture, but let’s focus on matrix \(\pmb{A}\):

We will not encounter the problem with dividing by zero anymore, at least for this matrix.

This matrix was nice and only required one swap. But if your matrix has lots of \(0\) or very small numbers, then maybe you might need more swaps. You might also want to nondimensionalize that matrix. That is, of course assuming that the equations contain enough information to actually allow you to solve the equation. You could use the determinant, or the Gauss-Jordan elimination to check if there is actually a solution. You should always check the solvability of an equation before solving it, since trying to solve an equation which cannot be solved is wasted effort.

You could use conditionals to control the swapping or the skipping of rows. Although the example here used \(0\), it could be a very small number too, and not neccesary \(0\) that could cause the problem, so maybe it would be nice to compare the conditional to a certain threshold to decided whether to swap or not.

Example partial pivoting using `scipy.linalg`

:

```
import scipy.linalg as sl
A = np.array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91. ,55., 67.]])
print("A =", A)
P, L, U = sl.lu(A)
# P here is a 'permutation matrix' that performs
# swaps based upon partial pivoting
print("P =", P)
# The lower-triangular matrix
print(L)
# The upper-triangular matrix
print(U)
# Double check that P*L*U does indeed equal A
print("Does P*L*U = A?", np.allclose(P @ L @ U, A))
```

```
A = [[ 5. 7. 5. 9.]
[ 5. 14. 7. 10.]
[20. 77. 41. 48.]
[25. 91. 55. 67.]]
P = [[0. 1. 0. 0.]
[0. 0. 0. 1.]
[0. 0. 1. 0.]
[1. 0. 0. 0.]]
[[ 1. 0. 0. 0. ]
[ 0.2 1. 0. 0. ]
[ 0.8 -0.375 1. 0. ]
[ 0.2 0.375 0.33333333 1. ]]
[[ 25. 91. 55. 67. ]
[ 0. -11.2 -6. -4.4 ]
[ 0. 0. -5.25 -7.25 ]
[ 0. 0. 0. 0.66666667]]
Does P*L*U = A? True
```

Looking at the form of \(P\) above, we can re-order the rows in advance and consider the LU decomposition of the matrix where \(P=I\), as below. As we haven’t bothered implementing pivoting ourselves, check that your LU implementation recreates the \(A\), \(L\) and \(U\) below.

```
A = np.array([[25. ,91. ,55. ,67.],
[ 5., 7., 5., 9.],
[20., 77., 41., 48.],
[ 5., 14., 7., 10.]])
print("A =", A)
P, L, U = sl.lu(A)
# P here is a 'permutation matrix' that performs
# swaps based upon partial pivoting
print("P =", P)
# The lower-triangular matrix
print(L)
# The upper-triangular matrix
print(U)
# Double check that P*L*U does indeed equal A
print("Does P*L*U = A?", np.allclose(P @ L @ U, A))
```

```
A = [[25. 91. 55. 67.]
[ 5. 7. 5. 9.]
[20. 77. 41. 48.]
[ 5. 14. 7. 10.]]
P = [[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
[[ 1. 0. 0. 0. ]
[ 0.2 1. 0. 0. ]
[ 0.8 -0.375 1. 0. ]
[ 0.2 0.375 0.33333333 1. ]]
[[ 25. 91. 55. 67. ]
[ 0. -11.2 -6. -4.4 ]
[ 0. 0. -5.25 -7.25 ]
[ 0. 0. 0. 0.66666667]]
Does P*L*U = A? True
```

### Implementation#

See below the code implementation to partial pivoting:

```
# This function is not necessary but makes the LU_dec_pp function less cluttered
# Making general operations into functions is good way of improving readability
# and reducing your workload later!
def swap_rows(A,j,k):
B = np.copy(A[j,:])
C = np.copy(A[k,:])
# second label, BUT NOT CREATING A 2ND ARRAY.
# Try it for yourself if you want.
A[k,:] = B
A[j,:] = C
return A
# A function to perform LU decomposition with partial pivoting
def LU_dec_pp(A):
m, n = A.shape
A = A.copy() # we won't modify in place but create local copy
P_ = np.identity(m)
L = np.identity(m)
for k in range(m):
j = np.argmax(abs(A[k:,k]))
# Find the index of the largest ABSOLUTE value. np.argmax will return
# the index of the largest element in an array
j+= k # A[1+2,2] = A[3,2] = 3!
A = swap_rows(A,j,k)
P_ = swap_rows(P_,j,k)
for j in range(k+1,m):
s = A[j,k]/A[k,k]
A[j,k:] -= A[k,k:]*s
L[j,k] = s
U = A
return P_.T, L, U
A = np.array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91. ,55., 67.]])
P, L, U = LU_dec_pp(A)
pprint(A)
pprint(L)
pprint(U)
pprint(np.dot(P, np.dot(L,U)))
print("Have we succeeded in creating partial pivoting?", np.allclose(np.dot(P, np.dot(L,U)), A))
```

```
array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
array([[ 1. , 0. , 0. , 0. ],
[ 0.2 , 1. , 0. , 0. ],
[ 0.8 , -0.375 , 1. , 0. ],
[ 0.2 , 0.375 , 0.33333333, 1. ]])
array([[ 25. , 91. , 55. , 67. ],
[ 0. , -11.2 , -6. , -4.4 ],
[ 0. , 0. , -5.25 , -7.25 ],
[ 0. , 0. , 0. , 0.66666667]])
array([[ 5., 7., 5., 9.],
[ 5., 14., 7., 10.],
[20., 77., 41., 48.],
[25., 91., 55., 67.]])
Have we succeeded in creating partial pivoting? True
```