{"cells": [{"cell_type": "markdown", "metadata": {"tags": ["module-nm"]}, "source": ["(nm_linear_algebra_intro)=\n", "# Linear algebra introduction\n", "[Numerical Methods](module-nm) \n", "\n", "## Linear (matrix) systems\n", "\n", "We can re-write a system of simultaneous (linear) equations in a matrix form. For example, let's consider:\n", "\n", "$$\\begin{eqnarray*}\n", " 2x + 3y &=& 7 \\\\\n", " x - 4y &=& 3\n", "\\end{eqnarray*}$$\n", "\n", "It can be rewritten in a matrix form:\n", "\n", "$$\n", "\\left(\n", " \\begin{array}{rr}\n", " 2 & 3 \\\\\n", " 1 & -4\n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y\n", " \\end{array}\n", "\\right) = \\left(\n", " \\begin{array}{c}\n", " 7 \\\\\n", " 3\n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "We understand that this system always has the form of \n", "\n", "$$\n", "\\left(\n", " \\begin{array}{rr}\n", " a & b \\\\\n", " c & d \n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y\n", " \\end{array}\n", "\\right) = \\left(\n", " \\begin{array}{c}\n", " e \\\\\n", " f\n", " \\end{array}\n", "\\right),\n", "$$\n", "\n", "where $a,b,c,d,e,f$ are arbitrary constants.\n", "\n", "Let's call the matrix which stores the coefficients of our system of linear equations to be $A$\n", "\n", "$$\n", "A=\n", "\\left(\n", " \\begin{array}{rr}\n", " a & b \\\\\n", " c & d\n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "and the matrix that contains our variables to be $\\mathbf{x}$\n", "\n", "$$\n", "\\mathbf{x}=\n", "\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y\n", " \\end{array}\n", "\\right).\n", "$$\n", "\n", "The matrix that contains the results of our system of linear equation will be called $\\mathbf{b}$\n", "\n", "$$\n", "\\mathbf{b}=\n", "\\left(\n", " \\begin{array}{c}\n", " e \\\\\n", " f\n", " \\end{array}\n", "\\right).\n", "$$\n", "\n", "This system of equations can be represented as the matrix equation\n", "\n", "$$A\\pmb{x}=\\pmb{b}.$$\n", "\n", "More generally, consider an arbitrary system of $n$ linear equations for $n$ unknowns\n", "\n", "$$\n", "\\begin{eqnarray*}\n", " A_{11}x_1 + A_{12}x_2 + \\dots + A_{1n}x_n &=& b_1 \\\\\\\\\\\\ \n", " A_{21}x_1 + A_{22}x_2 + \\dots + A_{2n}x_n &=& b_2 \\\\\\\\\\\\ \n", " \\vdots &=& \\vdots \\\\\\\\\\\\ \n", " A_{n1}x_1 + A_{n2}x_2 + \\dots + A_{nn}x_n &=& b_n\n", "\\end{eqnarray*}\n", "$$\n", "\n", "where $A_{ij}$ are the constant coefficients of the linear system, $x_j$ are the unknown variables, and $b_i$\n", "are the terms on the right hand side (RHS). Here the index $i$ is referring to the equation number\n", "(the row in the matrix below), with the index $j$ referring to the component of the unknown\n", "vector $\\pmb{x}$ (the column of the matrix).\n", "\n", "This system of equations can be represented as the matrix equation $A\\pmb{x}=\\pmb{b}$:\n", "\n", "$$\n", "\\left(\n", " \\begin{array}{cccc}\n", " A_{11} & A_{12} & \\dots & A_{1n} \\\\\\\\\\\\\n", " A_{21} & A_{22} & \\dots & A_{2n} \\\\\\\\\\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots \\\\\\\\\\\\\n", " A_{n1} & A_{n2} & \\dots & A_{nn} \\\\\\\\\\\\\n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x_1 \\\\\\\\\\\\\n", " x_2 \\\\\\\\\\\\\n", " \\vdots \\\\\\\\\\\\\n", " x_n \\\\\\\\\\\\\n", " \\end{array}\n", " \\right) = \\left(\n", " \\begin{array}{c}\n", " b_1 \\\\\\\\\\\\\n", " b_2 \\\\\\\\\\\\\n", " \\vdots \\\\\\\\\\\\\n", " b_n \\\\\\\\\\\\\n", " \\end{array}\n", " \\right)\n", "$$\n", "\n", "\n", "We can easily solve the above $2 \\times 2$ example of two equations and two unknowns using substitution (e.g. multiply the second equation by 2 and subtract the first equation from the resulting equation to eliminate $x$ and hence allowing us to find $y$, then we could compute $x$ from the first equation). We find:\n", "\n", "$$ x=\\frac{37}{11}, \\quad y=\\frac{1}{11}.$$\n", "\n", "```{margin} Note\n", "Cases where the matrix is non-square, i.e. of shape $m \\times n$ where $m\\ne n$ correspond to the over- or under-determined systems where you have more or less equations than unknowns. \n", "```\n", "\n", "Example systems of $3\\times 3$ are a little more complicated but doable. In this notebook, we consider the case of $n\\times n$, where $n$ could be billions (e.g. in AI or machine learning).\n", "\n", "## Matrices in Python\n", "\n", "We can use `numpy.arrays` to store matrices. The convention for one-dimensional vectors is to call them column vectors and have shape $n \\times 1$. We can extend to higher dimensions through the introduction of matrices as two-dimensional arrays (more generally vectors and matrices are just two examples of {ref}`tensors `). \n", "\n", "We use subscript indices to identify each component of the array or matrix, i.e. we can identify each component of the vector $\\pmb{v}$ by $v_i$, and each component of the matrix $A$ by $A_{ij}$. \n", "\n", "The dimension or shape of a vector/matrix is the number of rows and columns it posesses, i.e. $n \\times 1$ and $m \\times n$ for the examples above. Here is an example of how we can extend our use of the `numpy.array` to two dimensions in order to define a matrix $A$."]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[10. 2. 1.]\n", " [ 6. 5. 4.]\n", " [ 1. 4. 7.]]\n"]}], "source": ["import numpy as np\n", "\n", "A = np.array([[10., 2., 1.],\n", " [6., 5., 4.],\n", " [1., 4., 7.]])\n", "\n", "print(A)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check total size of the array storing matrix $A$. It will be $3\\times3=9$:"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["9\n"]}], "source": ["print(np.size(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check the number of dimensions of matrix $A$:"]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["2\n"]}], "source": ["print(np.ndim(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check the shape of the matrix $A$:"]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["(3, 3)\n"]}], "source": ["print(np.shape(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Transpose matrix $A$:"]}, {"cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[10. 6. 1.]\n", " [ 2. 5. 4.]\n", " [ 1. 4. 7.]]\n"]}], "source": ["print(A.T)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Get the inverse of matrix $A$:"]}, {"cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[ 0.14285714 -0.07518797 0.02255639]\n", " [-0.28571429 0.51879699 -0.2556391 ]\n", " [ 0.14285714 -0.28571429 0.28571429]]\n"]}], "source": ["import scipy.linalg as sl\n", "\n", "print(sl.inv(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Get the determinant of matrix $A$:"]}, {"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["133.00000000000003\n"]}], "source": ["print(sl.det(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["````{margin}\n", "Normal `*` operator does operations element-wise, which we do not want!!!\n", "```python\n", "\n", "print(A*sl.inv(A))\n", "```\n", "[[ 1.42857143 -0.15037594 0.02255639]\n", " [-1.71428571 2.59398496 -1.02255639]\n", " [ 0.14285714 -1.14285714 2. ]]\n", "\n", "````\n", "\n", "Multiply $A$ with its inverse using the `@` matrix multiplication operator. Note that due to roundoff errors the off diagonal values are not exactly zero:"]}, {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[ 1.00000000e+00 -1.66533454e-16 0.00000000e+00]\n", " [ 1.11022302e-16 1.00000000e+00 2.22044605e-16]\n", " [-2.77555756e-17 -1.66533454e-16 1.00000000e+00]]\n"]}], "source": ["print(A @ sl.inv(A))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Another way of multiplying matrices is to use `np.dot` function:"]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[ 1.00000000e+00 -1.66533454e-16 0.00000000e+00]\n", " [ 1.11022302e-16 1.00000000e+00 2.22044605e-16]\n", " [-2.77555756e-17 -1.66533454e-16 1.00000000e+00]]\n", "\n", "\n", "[[ 1.00000000e+00 -1.66533454e-16 0.00000000e+00]\n", " [ 1.11022302e-16 1.00000000e+00 2.22044605e-16]\n", " [-2.77555756e-17 -1.66533454e-16 1.00000000e+00]]\n"]}], "source": ["print(np.dot(A, sl.inv(A)))\n", "print(\"\\n\")\n", "print(A.dot(sl.inv(A)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Initialise vector and matrix of zeros:"]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[0. 0. 0.]\n", "\n", "\n", "[[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]\n"]}], "source": ["print(np.zeros(3))\n", "print(\"\\n\")\n", "print(np.zeros((3,3)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Initialise identity matrix:"]}, {"cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[1. 0. 0.]\n", " [0. 1. 0.]\n", " [0. 0. 1.]]\n"]}], "source": ["print(np.eye(3))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Matrix objects\n", "\n", "Note that NumPy has a matrix object. We can cast the above two-dimensional arrays into matrix objects and then the star operator does yield the expected matrix product:"]}, {"cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["\n", "\n"]}], "source": ["A = np.array([[10., 2., 1.],\n", " [6., 5., 4.],\n", " [1., 4., 7.]])\n", "\n", "print(type(A))\n", "print(type(np.mat(A)))"]}, {"cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[[ 1.00000000e+00 -1.66533454e-16 0.00000000e+00]\n", " [ 1.11022302e-16 1.00000000e+00 2.22044605e-16]\n", " [-2.77555756e-17 -1.66533454e-16 1.00000000e+00]]\n"]}], "source": ["print(np.mat(A)*np.mat(sl.inv(A)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Slicing\n", "We can use slicing to extract components of matrices:"]}, {"cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["2.0\n", "[10. 2. 1.]\n", "[1. 4. 7.]\n", "[2. 5. 4.]\n", "[[5. 4.]\n", " [4. 7.]]\n"]}], "source": ["# Single entry, first row, second column\n", "print(A[0,1])\n", "\n", "# First row\n", "print(A[0,:])\n", "\n", "# last row\n", "print(A[-1,:])\n", "\n", "# Second column\n", "print(A[:,1])\n", "\n", "# Extract a 2x2 sub-matrix\n", "print(A[1:3,1:3])"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Exercises\n", "\n", "### Solving a linear system\n", "\n", "Let's quickly consider the $2 \\times 2$ case from the beginning of the notebook that we claimed the solution for to be\n", "\n", "$$x=\\frac{37}{11} \\quad\\text{and}\\quad y=\\frac{1}{11}.$$\n", "\n", "To solve the matrix equation \n", "\n", "$$ A\\pmb{x}=\\pmb{b}$$\n", "\n", "we can simply multiply both sides by the inverse of the matrix $A$ (if $A$ is [invertible](https://en.wikipedia.org/wiki/Invertible_matrix)):\n", "\n", "$$\n", "\\begin{align}\n", "A\\pmb{x} & = \\pmb{b}\\\\\\\\\\\\\n", "\\implies A^{-1}A\\pmb{x} & = A^{-1}\\pmb{b}\\\\\\\\\\\\\n", "\\implies I\\pmb{x} & = A^{-1}\\pmb{b}\\\\\\\\\\\\\n", "\\implies \\pmb{x} & = A^{-1}\\pmb{b}\n", "\\end{align}\n", "$$\n", "\n", "so we can find the solution $\\pmb{x}$ by multiplying the inverse of $A$ with the RHS vector $\\pmb{b}$."]}, {"cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Det A = -11.0\n", "A^-1 @ b = [3.36363636 0.09090909]\n"]}], "source": ["A = np.array([[2., 3.],\n", " [1., -4.]])\n", "\n", "# Check first whether the determinant of A is non-zero\n", "print(\"Det A = \", sl.det(A))\n", "\n", "b = np.array([7., 3.])\n", "\n", "# Compute A inverse and multiply by b\n", "print(\"A^-1 @ b =\", sl.inv(A) @ b)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["We can solve the system using `scipy.linalg.solve`:"]}, {"cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["A^-1 @ b = [3.36363636 0.09090909]\n"]}], "source": ["print(\"A^-1 @ b =\", sl.solve(A,b))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check if the solutions match:"]}, {"cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["True\n"]}], "source": ["print(np.allclose(np.array([37./11., 1./11.]), sl.solve(A,b)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Matrix multiplication\n", "\n", "\n", "Let \n", "\n", "$$\n", "A = \\left(\n", " \\begin{array}{ccc}\n", " 1 & 2 & 3 \\\\\n", " 4 & 5 & 6 \\\\\n", " 7 & 8 & 9 \n", " \\end{array}\n", "\\right)\n", "\\mathrm{\\quad\\quad and \\quad\\quad}\n", "b = \\left(\n", " \\begin{array}{c}\n", " 2 \\\\\n", " 4 \\\\\n", " 6 \n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "We will store $A$ and $b$ in NumPy arrays. We will create NumPy array $I$ containing the identity matrix $I_3$ and perform $A = A+I$. Then we will substitute third column of $A$ with $b$. We will solve $Ax=b$."]}, {"cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["A = [[1 2 3]\n", " [4 5 6]\n", " [7 8 9]]\n", "b = [2 4 6]\n", "Size of A: 9 and shape of A: (3, 3)\n", "Size of b: 3 and shape of b: (3,)\n", "I = [[1. 0. 0.]\n", " [0. 1. 0.]\n", " [0. 0. 1.]]\n", "A = [[ 2. 2. 3.]\n", " [ 4. 6. 6.]\n", " [ 7. 8. 10.]]\n", "A = [[2. 2. 2.]\n", " [4. 6. 4.]\n", " [7. 8. 6.]]\n", "x = [3.80647894e-17 7.77156117e-17 1.00000000e+00]\n"]}], "source": ["A = np.array([[1, 2, 3],\n", " [4, 5, 6],\n", " [7, 8, 9]])\n", "b = np.array([2, 4, 6])\n", "print(\"A =\", A)\n", "print(\"b = \",b)\n", "\n", "print(\"Size of A: \", A.size,\" and shape of A: \",A.shape)\n", "print(\"Size of b: \", b.size,\" and shape of b: \",b.shape)\n", "\n", "I = np.eye(3)\n", "print(\"I = \",I)\n", "A = A + I\n", "print(\"A = \",A)\n", "\n", "A[:, 2] = b\n", "print(\"A = \",A)\n", "\n", "x = sl.solve(A,b)\n", "print(\"x = \", x)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Matrix properties\n", "\n", "Consider $N$ linear equations in $N$ unknowns, $A\\pmb{x}=\\pmb{b}$.\n", "\n", "this system has a unique solution provided that the determinant of $A$, $\\det(A)$, is non-zero. In this case the matrix is said to be non-singular.\n", "\n", "If $\\det(A)=0$ (with $A$ then termed a singular matrix), then the linear system does not have a unique solution, it may have either infinite or no solutions.\n", "\n", "For example, consider\n", "\n", "$$\n", "\\left(\n", " \\begin{array}{rr}\n", " 2 & 3 \\\\\n", " 4 & 6 \n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y\n", " \\end{array}\n", "\\right) = \\left(\n", " \\begin{array}{c}\n", " 4 \\\\\n", " 8 \\\\\n", " \\end{array}\n", "\\right).\n", "$$\n", "\n", "The second equation is simply twice the first, and hence a solution to the first equation is also automatically a solution to the second equation.\n", "\n", "We only have one linearly-independent equation, and our problem is under-constrained - we effectively only have one eqution for two unknowns with infinitely many possibly solutions.\n", "\n", "If we replaced the RHS vector with $(4,7)^T$, then the two equations would be contradictory - in this case we have no solutions.\n", "\n", "Note that a set of vectors where one can be written as a linear sum of the others are termed linearly-dependent. When this is not the case the vectors are termed linearly-independent.\n", "\n", "```{admonition} The following properties of a square $n\\times n$ matrix are equivalent:\n", "\n", "* $\\det(A)\\ne 0\\implies$ A is non-singular\n", "* the columns of $A$ are linearly independent\n", "* the rows of $A$ are linearly independent\n", "* the columns of $A$ span $n$-dimensional space (we can reach any point in $\\mathbb{R}^N$ through a linear combination of these vectors)\n", "* $A$ is invertible, i.e. there exists a matrix $A^{-1}$ such that $A^{-1}A = A A^{-1}=I$\n", "* the matrix system $A\\pmb{x}=\\pmb{b}$ has a unique solution for every vector $b$\n", "\n", "```"]}, {"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": []}], "metadata": {"celltoolbar": "Tags", "kernelspec": {"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.8"}}, "nbformat": 4, "nbformat_minor": 2}