{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "15d8d79c", "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_127386/425525964.py:2: DeprecationWarning: `set_matplotlib_formats` is deprecated since IPython 7.23, directly use `matplotlib_inline.backend_inline.set_matplotlib_formats()`\n", " set_matplotlib_formats('svg')\n" ] } ], "source": [ "from IPython.display import set_matplotlib_formats\n", "set_matplotlib_formats('svg')" ] }, { "cell_type": "markdown", "id": "6d4be5f9", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "**Programming Course** - ***Master 1 PSL - Science et Génie des Matériaux / Énergie*** \n", "\n", "---------------\n", "\n", "# Introduction to Numpy\n", "\n", "**Basile Marchand (Centre des Matériaux- Mines ParisTech / CNRS / PSL University)**\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "fe6b2d03", "metadata": { "lang": "en" }, "source": [ "## Numpy" ] }, { "cell_type": "markdown", "id": "153c00e7", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "NumPy is a Python module for working with multidimensional arrays. Python has no native multidimensional array type, and therefore even less a native notion of matrix. \n", "\n", "It is therefore necessary to use a dedicated module, which is not part of the standard Python library. **The** recommended module for multidimensional array manipulation (including matrices) is **NumPy**. \n", "\n", "As evidence of how widely recognized and how efficient this module is, it is used by almost all the other scientific modules available in Python. The secret of NumPy is that, for performance reasons, its core is not written in Python but mainly in C." ] }, { "cell_type": "markdown", "id": "f4c1f97b", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "The module is imported in the usual way: \n", "\n", "```python\n", "import numpy\n", "```\n", "\n", "However, for convenience you will almost always see it imported under the alias `np`:\n", "\n", "```python\n", "import numpy as np\n", "```" ] }, { "cell_type": "markdown", "id": "95d7a78e", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "The basic object in NumPy, the one we will manipulate throughout, is the **np.ndarray**. An **np.ndarray** is a multidimensional array whose elements all have the same type (you can't mix integers, floats and strings in the same **np.ndarray**, for example). 
The *rank* of an **np.ndarray** is its number of dimensions (the `ndim` attribute): \n", "* *rank 1*: array with 1 dimension, i.e. a single row of M elements\n", "* *rank 2*: array with 2 dimensions, i.e. N rows and M columns\n", "* *rank 3*: array with 3 dimensions (a box in space) \n", "* etc. \n", "\n", "The *shape* of the array is a tuple giving the size of the array along each of its dimensions. For example:\n", "* A row vector of size **N** corresponds to an **array** with rank=1 and shape=(N,)\n", "* A column vector of size **N** corresponds to an **array** with rank=2 and shape=(N,1)\n", "* A rectangular matrix **NxM** corresponds to an **array** with rank=2 and shape=(N,M)\n", "* A cubic array **NxNxN** corresponds to an **array** with rank=3 and shape=(N,N,N)" ] }, { "cell_type": "markdown", "id": "55fe395a", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### Creating an **array**\n", "\n", "An `np.ndarray` is defined from a set of values by using `np.array` in the following way:" ] }, { "cell_type": "code", "execution_count": 2, "id": "1be161c0", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a 3x3 matrix : \n", "[[1 2 3]\n", " [4 5 6]\n", " [7 8 9]]\n", "a column vector : \n", "[[1]\n", " [2]\n", " [3]]\n", "a row vector : \n", "[1 2 3]\n", "a 3 dimensional array :\n", "[[[ 1 2 3]\n", " [ 2 5 6]]\n", "\n", " [[11 12 13]\n", " [14 15 16]]]\n" ] } ], "source": [ "import numpy as np\n", "a_matrix_3_3 = np.array([[1,2,3], [4,5,6], [7,8,9]])\n", "print(f\"a 3x3 matrix : \\n{a_matrix_3_3}\")\n", "a_vector_column = np.array([[1,], [2,], [3,]]) \n", "print(f\"a column vector : \\n{a_vector_column}\")\n", "a_vector_line = np.array([1,2,3])\n", "print(f\"a row vector : \\n{a_vector_line}\")\n", "an_array_3_dimension = np.array( [[[1,2,3],[2,5,6]], 
 [[11,12,13],[14,15,16]]])\n", "print(f\"a 3 dimensional array :\\n{an_array_3_dimension}\")" ] }, { "cell_type": "markdown", "id": "33e161bc", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "To get the rank and the shape of a NumPy **array**, use its `ndim` and `shape` attributes:" ] }, { "cell_type": "code", "execution_count": 3, "id": "9f6e185d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape = (3, 1)\n", "rank = 2\n" ] } ], "source": [ "shape = a_vector_column.shape \n", "rank = a_vector_column.ndim \n", "print(\"shape = {}\".format(shape))\n", "print(\"rank = {}\".format(rank))" ] }, { "cell_type": "markdown", "id": "67dc3d49", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Moreover, to get the total number of elements contained in an `np.ndarray`, just access its `size` attribute. For example:" ] }, { "cell_type": "code", "execution_count": 4, "id": "fb445a90", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "size = 3\n" ] } ], "source": [ "nElement = a_vector_column.size\n", "print(f\"size = {nElement}\")" ] }, { "cell_type": "markdown", "id": "03972a99", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "NumPy provides a number of functions to create and initialize an **array**: \n", "* `np.zeros`, which creates an array containing only zeros.\n", "* `np.zeros_like`, which builds an array of zeros having the same shape as another array given as input.\n", "* `np.ones`, which creates an array containing only ones.\n", "* `np.eye`, which creates an identity matrix.\n", "* `np.random.rand`, which creates an array of random values drawn uniformly in [0, 1).\n", "\n", "Below are examples of how to use each of these functions." 
] }, { "cell_type": "code", "execution_count": 5, "id": "827b29b4", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "np.zeros\n", "[[0. 0. 0. 0.]\n", " [0. 0. 0. 0.]]\n", "np.ones\n", "[[1.]\n", " [1.]\n", " [1.]\n", " [1.]\n", " [1.]]\n", "np.zeros_like\n", "[[0. 0. 0.]\n", " [0. 0. 0.]]\n", "np.eye\n", "[[1. 0. 0. 0.]\n", " [0. 1. 0. 0.]\n", " [0. 0. 1. 0.]\n", " [0. 0. 0. 1.]]\n", "np.random.rand\n", "[[0.32115229 0.49376662 0.69948405 0.2243675 0.50560415]\n", " [0.8781221 0.28372339 0.9988285 0.09940989 0.83454571]\n", " [0.09895151 0.08313705 0.00458943 0.63326481 0.86652427]]\n" ] } ], "source": [ "print(\"np.zeros\")\n", "print(np.zeros((2,4)))\n", "print(\"np.ones\")\n", "print(np.ones((5,1)))\n", "print(\"np.zeros_like\")\n", "m = np.ones((2,3))\n", "print(np.zeros_like(m))\n", "print(\"np.eye\")\n", "print(np.eye(4))\n", "print(\"np.random.rand\")\n", "print(np.random.rand(3,5))" ] }, { "cell_type": "markdown", "id": "b9039feb", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "### A word about `np.matrix`\n", "\n", "NumPy also provides an object of type `matrix`. At first sight it is tempting to think that it is the ideal tool for the linear algebra targeted in prep classes. Well no, it only looks like a good idea! Don't use `np.matrix`: it redefines some operators (`*` becomes the matrix product), it is always 2-dimensional, and the NumPy documentation itself no longer recommends it, so it mostly introduces subtle bugs into your code." 
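, "\n", "To make this warning concrete, here is a minimal sketch (the variable names are illustrative and not taken from the course) of two behaviours of `np.matrix` that differ from `np.ndarray`: `*` becomes the matrix product, and indexing never drops below two dimensions:\n", "\n", "
```python
import numpy as np

# Same data stored as a regular array and as a (discouraged) np.matrix
arr = np.array([[1, 2], [3, 4]])
mat = np.matrix([[1, 2], [3, 4]])

elementwise = arr * arr   # ndarray: * is element-wise
matrix_prod = mat * mat   # np.matrix: * is the matrix product

row_arr = arr[0]          # ndarray: a row is a rank-1 array
row_mat = mat[0]          # np.matrix: a row stays rank 2

print(elementwise)
print(matrix_prod)
print(row_arr.shape, row_mat.shape)
```
\n", "\n", "With plain `np.ndarray` objects the matrix product is written explicitly with `@`, so the same expression never changes meaning depending on the type of its operands."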
] }, { "cell_type": "markdown", "id": "73117642", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "### A word about what the C code behind NumPy imposes on us" ] }, { "cell_type": "code", "execution_count": 6, "id": "1d850478", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array = [0.33876323 0.22514646 0.1477305 0.06744456 0.3757398 0.5961389\n", " 0.70859954 0.85553278 0.78602837 0.26099642]\n" ] } ], "source": [ "array = np.random.rand(10)\n", "print(f\"array = {array}\")" ] }, { "cell_type": "code", "execution_count": 7, "id": "433cf3da", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array = [10. 0.22514646 0.1477305 0.06744456 0.3757398 0.5961389\n", " 0.70859954 0.85553278 0.78602837 0.26099642]\n" ] } ], "source": [ "array[0] = int(10)\n", "print(f\"array = {array}\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "8dd512c2", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "could not convert string to float: 'Hello'\n" ] } ], "source": [ "try:\n", " array[0] = \"Hello\"\n", "except Exception as e: \n", " print(e.args[0])" ] }, { "cell_type": "markdown", "id": "bf41fc14", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "And yes, `np.ndarray` objects are not like Python lists: they are homogeneous containers. You can't store values of different types in them; `numpy` will always try to convert what you give it into the type of the array." ] }, { "cell_type": "markdown", "id": "071f0590", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "This behavior may seem strange, given the dynamically typed nature of Python! But remember that the core of NumPy is not written in Python but mainly in C, and C is a statically typed language. 
This is the price to pay for performance! So each `np.ndarray` is associated with a type. To know the type of its elements, just access the `dtype` attribute. For example:" ] }, { "cell_type": "code", "execution_count": 9, "id": "1a0453eb", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "dtype('float64')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array.dtype" ] }, { "cell_type": "markdown", "id": "8271c633", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "You can see that the type of the values the array can contain is `float64`, which corresponds to a double precision float (coded on 64 bits). So every element that we put in the array is converted to `float64`. If this conversion is not possible we get an error!" ] }, { "cell_type": "markdown", "id": "9d3e77ca", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "It is possible to change the element type of an `np.ndarray`: just use the `astype` method, which returns a converted copy. 
For example, to convert the array `array`, which contains only `float64` values, into an array containing `int32` values, proceed as follows:" ] }, { "cell_type": "code", "execution_count": 10, "id": "26ab5566", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "arrayInt = [10 0 0 0 0 0 0 0 0 0]\n" ] } ], "source": [ "arrayInt = array.astype(np.int32)\n", "print(f\"arrayInt = {arrayInt}\")" ] }, { "cell_type": "code", "execution_count": 11, "id": "d8fdbb54", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "dtype('int32')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arrayInt.dtype" ] }, { "cell_type": "markdown", "id": "f0f8b39b", "metadata": { "lang": "en", "slideshow": { "slide_type": "fragment" } }, "source": [ "You will notice that most of the values become `0`. This is because converting a `float64` to an integer simply truncates the fractional part!" ] }, { "cell_type": "markdown", "id": "db3e88c7", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Of course it is possible, when creating an `np.ndarray`, to specify the element type you want, which bypasses NumPy's type deduction mechanism. For example, let us first create an array from a list containing only integers, without specifying a type." ] }, { "cell_type": "code", "execution_count": 12, "id": "eb799e35", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "type = int64\n" ] } ], "source": [ "array_no_type = np.array([1,2,3,4])\n", "print(f\"type = {array_no_type.dtype}\")" ] }, { "cell_type": "markdown", "id": "11ce6348", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "NumPy automatically deduces an integer type, here `int64` (the default integer type on most 64-bit platforms)." 
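, "\n", "As an aside on the `astype` conversion seen a little earlier: since it truncates toward zero, values usually need to be rounded first when the nearest integer is wanted. A minimal sketch, assuming illustrative values and names that are not part of the course:\n", "\n", "
```python
import numpy as np

values = np.array([0.2, 0.7, 1.5, -0.7, 9.99])

# astype drops the fractional part (truncation toward zero)
truncated = values.astype(np.int32)

# np.rint rounds to the nearest integer (ties go to the even one) before converting
rounded = np.rint(values).astype(np.int32)

print(truncated)
print(rounded)
```
\n", "\n", "Which of the two is appropriate depends on the application; the point is only that `astype` alone never rounds."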
] }, { "cell_type": "markdown", "id": "2abc26ee", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "But what if I want `float64`? The quick and dirty solution is to put decimal points in the list that I provide as input, for example:" ] }, { "cell_type": "code", "execution_count": 13, "id": "546f7ae8", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "dtype('float64')" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array_no_type = np.array([1.,2.,3.,4.])\n", "array_no_type.dtype" ] }, { "cell_type": "markdown", "id": "8c305054", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "As a side remark, if I put a decimal point on a single element of the list, the first one for example, NumPy will still choose `float64`. When given a heterogeneous list, NumPy picks the most general type able to represent all the elements, in this case `float64`." ] }, { "cell_type": "code", "execution_count": 14, "id": "f10ff44b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "dtype('float64')" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array_no_type = np.array([1.,2,3,4])\n", "array_no_type.dtype" ] }, { "cell_type": "markdown", "id": "1caba495", "metadata": { "lang": "en", "slideshow": { "slide_type": "fragment" } }, "source": [ "The other, slightly more elegant, solution is to specify the type of the `np.ndarray` via the optional `dtype` argument of `np.array`. 
For example:" ] }, { "cell_type": "code", "execution_count": 15, "id": "c4665507", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "dtype('float64')" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array_typed = np.array([1,2,3,4], dtype=np.float64)\n", "array_typed.dtype" ] }, { "cell_type": "code", "execution_count": 16, "id": "854a5ecb", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([10.6, 2. , 3. , 4. ])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array_typed[0] = 10.6\n", "array_typed" ] }, { "cell_type": "markdown", "id": "ec8e8c56", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### Mathematical operations and vectorization" ] }, { "cell_type": "markdown", "id": "47459b4f", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "NumPy allows you to create multidimensional arrays, as we have just seen. But once an array is filled with data, we need to be able to process these data. Of course NumPy is there for that too!" ] }, { "cell_type": "markdown", "id": "e8713e5c", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "To start with, the basic operations `+`, `-`, `*`, `/` are all available in NumPy." ] }, { "cell_type": "markdown", "id": "a274578c", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "There are two cases to consider: \n", "\n", "1. Operation between two `np.ndarray`: **the operations are element-wise, including `*`**\n", "2. 
Operation between an `np.ndarray` and a number" ] }, { "cell_type": "markdown", "id": "9dd42244", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "For example:" ] }, { "cell_type": "code", "execution_count": 17, "id": "bb2169bb", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = np.array([[1,2,3],[4,5,6]], dtype=np.float64)\n", "b = np.array([[1,2,3],[4,5,6]], dtype=np.float64)" ] }, { "cell_type": "code", "execution_count": 18, "id": "2e3b7d86", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 2., 4., 6.],\n", " [ 8., 10., 12.]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a + b " ] }, { "cell_type": "code", "execution_count": 19, "id": "89276ba8", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0.],\n", " [0., 0., 0.]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a - b " ] }, { "cell_type": "code", "execution_count": 20, "id": "856c49d5", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1., 4., 9.],\n", " [16., 25., 36.]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a * b" ] }, { "cell_type": "code", "execution_count": 21, "id": "8b5c47e3", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 1.],\n", " [1., 1., 1.]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a / b " ] }, { "cell_type": "markdown", "id": "6a1391fa", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "**Broadcasting** \n", "\n", "NumPy for basic operations has a behavior that may seem strange to you when the two `np.ndarray` do not have matching 
`shape`. This is called broadcasting! If I add an array of shape `(2,3)` and an array of shape `(3,)`, one would logically expect it to fail. But in reality:" ] }, { "cell_type": "code", "execution_count": 22, "id": "8215daf7", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "c = np.array([1,2,3], dtype=np.float64)" ] }, { "cell_type": "code", "execution_count": 23, "id": "8dcc7b57", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a=array([[1., 2., 3.],\n", " [4., 5., 6.]])\n", "c=array([1., 2., 3.])\n" ] }, { "data": { "text/plain": [ "array([[2., 4., 6.],\n", " [5., 7., 9.]])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(f\"{a=}\")\n", "print(f\"{c=}\")\n", "a + c" ] }, { "cell_type": "markdown", "id": "91006fcd", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "NumPy behaves as if the array `c=np.array([1,2,3])` had been replaced by `np.array([[1,2,3], [1,2,3]])` (without actually copying the data). The rule is that shapes are compared starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. This behavior works for all basic operations." ] }, { "cell_type": "code", "execution_count": 24, "id": "1aaa18a4", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([[1. , 1. , 1. ],\n", " [4. , 2.5, 2. 
]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a / c" ] }, { "cell_type": "code", "execution_count": 25, "id": "c64ca254", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "d = np.array([[1.,], [2.,]])" ] }, { "cell_type": "code", "execution_count": 26, "id": "558baf95", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[2., 3., 4.],\n", " [6., 7., 8.]])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a + d" ] }, { "cell_type": "markdown", "id": "23688daf", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "It is this `broadcasting` that also allows us to do the basic operations between an `np.ndarray` and a number. For example:" ] }, { "cell_type": "code", "execution_count": 27, "id": "13dc12d3", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 2., 4., 6.],\n", " [ 8., 10., 12.]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2. * a " ] }, { "cell_type": "code", "execution_count": 28, "id": "a5d23f09", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[3., 4., 5.],\n", " [6., 7., 8.]])" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 + a" ] }, { "cell_type": "code", "execution_count": 29, "id": "6bb903fb", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[2. , 1. 
, 0.66666667],\n", " [0.5 , 0.4 , 0.33333333]])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 / a " ] }, { "cell_type": "code", "execution_count": 30, "id": "77ce5e74", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.5, 1. , 1.5],\n", " [2. , 2.5, 3. ]])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a / 2. " ] }, { "cell_type": "markdown", "id": "5f8ab35a", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "**The special case of the matrix product**" ] }, { "cell_type": "markdown", "id": "627b87ce", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "The question you are probably asking yourself is: can NumPy compute a matrix product the way we teach it to our prep school students?" ] }, { "cell_type": "markdown", "id": "524a25ef", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "Don't worry, the answer is YES! It's just that the matrix product between two `np.ndarray` of compatible sizes is not written with the `*` operator but with `np.dot` or `@`." 
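, "\n", "The difference is easiest to see on small matrices whose products can be checked by hand. The sketch below uses illustrative names (`p`, `q`) that are not part of the course:\n", "\n", "
```python
import numpy as np

p = np.array([[1, 2], [3, 4]])
q = np.array([[0, 1], [1, 0]])

elementwise = p * q        # term-by-term product
matmul_op = p @ q          # matrix product with the @ operator
matmul_dot = np.dot(p, q)  # same matrix product with np.dot

print(elementwise)
print(matmul_op)
print(matmul_dot)
```
\n", "\n", "Here `q` swaps the two columns of `p` under the matrix product, whereas the element-wise product simply zeroes the diagonal."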
] }, { "cell_type": "markdown", "id": "5d4492f2", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "For example:" ] }, { "cell_type": "code", "execution_count": 31, "id": "1539dd4c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = np.random.rand(4,2)\n", "b = np.random.rand(2,5)" ] }, { "cell_type": "code", "execution_count": 32, "id": "b3dcc935", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.20908365, 0.31136876, 0.86303555, 0.31845931, 0.2879681 ],\n", " [0.71982047, 0.50972344, 0.84335463, 0.76717688, 0.4179944 ],\n", " [0.45265308, 0.44934967, 1.01784862, 0.55785474, 0.39422503],\n", " [0.59022932, 0.54158917, 1.1594244 , 0.70144803, 0.46882983]])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a @ b " ] }, { "cell_type": "code", "execution_count": 33, "id": "fa75c5ce", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.20908365, 0.31136876, 0.86303555, 0.31845931, 0.2879681 ],\n", " [0.71982047, 0.50972344, 0.84335463, 0.76717688, 0.4179944 ],\n", " [0.45265308, 0.44934967, 1.01784862, 0.55785474, 0.39422503],\n", " [0.59022932, 0.54158917, 1.1594244 , 0.70144803, 0.46882983]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(a, b)" ] }, { "cell_type": "markdown", "id": "61e78796", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "In the same way, a matrix-vector product is nothing more than the product of an $N\\times M$ array by an $M\\times 1$ array, and we proceed as follows:" ] }, { "cell_type": "code", "execution_count": 34, "id": "cf7d1abc", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.36897367],\n", " [0.93679964],\n", " [0.66550232],\n", " 
 [0.8414754 ]])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "v = np.random.rand(2,1)\n", "\n", "a@v" ] }, { "cell_type": "markdown", "id": "ca2586e9", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "**The transpose of an array**\n", "\n", "Another essential element of matrix calculation is the transpose. Of course, NumPy has this covered. To compute the transpose of an `np.ndarray`, proceed as follows:" ] }, { "cell_type": "code", "execution_count": 35, "id": "d3052706", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.96491959, 0.89512764, 0.47323461, 0.63153701],\n", " [0.65014771, 0.85114634, 0.84049982, 0.61118797]])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.random.rand(2,4)\n", "a" ] }, { "cell_type": "code", "execution_count": 36, "id": "cd44419e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.96491959, 0.65014771],\n", " [0.89512764, 0.85114634],\n", " [0.47323461, 0.84049982],\n", " [0.63153701, 0.61118797]])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b1 = a.T\n", "b1" ] }, { "cell_type": "code", "execution_count": 37, "id": "b0e96303", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.96491959, 0.65014771],\n", " [0.89512764, 0.85114634],\n", " [0.47323461, 0.84049982],\n", " [0.63153701, 0.61118797]])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b2 = np.transpose(a)\n", "b2 " ] }, { "cell_type": "markdown", "id": "7ecb517d", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Beware: the transposition operation only has an effect on `np.ndarray` of rank greater than or equal to 
2. For example, the transpose of a \"row vector\" (rank 1) does not give a column vector:" ] }, { "cell_type": "code", "execution_count": 38, "id": "516ab9d0", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "v = [0.06016877 0.9835177 0.33126502 0.82118331]\n", "vt = [0.06016877 0.9835177 0.33126502 0.82118331]\n" ] } ], "source": [ "v = np.random.rand(4)\n", "print(f\"v = {v}\")\n", "vt = v.T\n", "print(f\"vt = {vt}\")" ] }, { "cell_type": "markdown", "id": "a7dfc93a", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### More complex operations" ] }, { "cell_type": "markdown", "id": "45152912", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Of course the operations `+`, `-`, `*`, `/` are not the only ones available. All the classical mathematical functions are defined in NumPy. \n", "\n", "* `np.cos`, `np.sin`, `np.tan`\n", "* `np.arccos`, `np.arcsin`, `np.arctan`\n", "* `np.degrees`, `np.radians`\n", "* `np.exp`, `np.log`\n", "\n", "The advantage of these functions, all of which have counterparts in Python's `math` module, is that they operate element-wise on `np.ndarray`." ] }, { "cell_type": "markdown", "id": "822322c5", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "For example, suppose we want to evaluate the function $\\sin x$ on $[0, 2\\pi]$." 
] }, { "cell_type": "markdown", "id": "6c3c9570", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "In basic Python we would do something like this" ] }, { "cell_type": "code", "execution_count": 39, "id": "4af6d681", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2024-10-17T18:16:22.068577\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "
" ] }, "metadata": { "filenames": { "image/svg+xml": "/home/marchand/Documents/ENSEIGNEMENT/python-for-engineers/notebooks/_build/jupyter_execute/06_numpy_78_0.svg" } }, "output_type": "display_data" } ], "source": [ "import math\n", "nStep = 100\n", "x = [ 2*math.pi*i/nStep for i in range(nStep+1)]\n", "y = [ math.sin(x_i) for x_i in x]\n", "\n", "import matplotlib.pyplot as plt \n", "plt.plot(x,y)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "f4250830", "metadata": { "lang": "en", "slideshow": { "slide_type": "subslide" } }, "source": [ "While using NumPy we can directly write:" ] }, { "cell_type": "code", "execution_count": 40, "id": "2575f61c", "metadata": { "lang": "en", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2024-10-17T18:16:22.133540\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n" ], "text/plain": [ "
" ] }, "metadata": { "filenames": { "image/svg+xml": "/home/marchand/Documents/ENSEIGNEMENT/python-for-engineers/notebooks/_build/jupyter_execute/06_numpy_80_0.svg" } }, "output_type": "display_data" } ], "source": [ "xNumpy = np.linspace(0, 2*np.pi, nStep)\n", "yNumpy = np.sin(xNumpy)\n", "plt.plot(xNumpy,yNumpy)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "175ebaaa", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "There are two advantages to the NumPy approach: \n", "\n", "1. It is simpler to write and more pleasant to read afterwards \n", "2. It is much faster" ] }, { "cell_type": "code", "execution_count": 41, "id": "4883fbe7", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.02 µs ± 63.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "%timeit [math.sin(x_i) for x_i in x]" ] }, { "cell_type": "code", "execution_count": 42, "id": "3e4538a1", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "325 ns ± 3.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit np.sin(xNumpy)" ] }, { "cell_type": "markdown", "id": "62d723b6", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "With the timings shown above (5.02 µs versus 325 ns), there is a factor of about 15 between the pure Python version and the NumPy version, and I can assure you that the gap gets much larger when we move on to real problems!" ] }, { "cell_type": "markdown", "id": "2d865a3b", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "You may wonder why it is so much faster. It is simply because on one side the loop runs in the Python interpreter, while on the other side the loop runs inside NumPy, that is, in compiled C code."
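, "\n", "\n", "A minimal sketch of how this gap can be measured outside a notebook, using the standard `timeit` module. The array size `n` and the number of repetitions are arbitrary choices made for illustration, not values from this course:\n", "\n", "```python\n", "import math\n", "import timeit\n", "\n", "import numpy as np\n", "\n", "n = 10_000  # assumed size, chosen only for illustration\n", "x_list = [2 * math.pi * i / n for i in range(n + 1)]\n", "x_arr = np.array(x_list)\n", "\n", "# Total time for 20 evaluations of each version\n", "t_python = timeit.timeit(lambda: [math.sin(v) for v in x_list], number=20)\n", "t_numpy = timeit.timeit(lambda: np.sin(x_arr), number=20)\n", "\n", "# Both versions compute the same values\n", "same = np.allclose([math.sin(v) for v in x_list], np.sin(x_arr))\n", "print(f'same values: {same}')\n", "print(f'Python loop: {t_python:.4f} s, NumPy: {t_numpy:.4f} s')\n", "```\n", "\n", "The exact ratio depends on the machine and on the array size."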
 ] }, { "cell_type": "markdown", "id": "f6e149c9", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "Broadly speaking, what lies behind all this is that NumPy arrays are stored contiguously in memory: for an array of floats, the data is essentially a C `double*` buffer. Compiled code can therefore traverse the whole array and apply a function to every element very efficiently. A Python list, by contrast, stores references to separately allocated objects, so a Python loop spends much of its time following indirections." ] }, { "cell_type": "markdown", "id": "f651c5f6", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "__The basic rule to remember is that when manipulating NumPy arrays you should **never** write explicit loops__." ] }, { "cell_type": "markdown", "id": "7296cc36", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "If you want to apply your own function to an `np.ndarray`, you can vectorize it with the `np.vectorize` function." ] }, { "cell_type": "code", "execution_count": 43, "id": "0dae6f19", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def my_function(x):\n", "    if x < 0.5:\n", "        return x\n", "    else:\n", "        return -x" ] }, { "cell_type": "markdown", "id": "11b89624", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Without vectorization you would have to do something like:" ] }, { "cell_type": "code", "execution_count": 44, "id": "68a21f55", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "data = np.random.rand(10,20,30)" ] }, { "cell_type": "code", "execution_count": 45, "id": "a8ed1d38", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.25 ms ± 26.2 µs per loop (mean ± std. dev. 
of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "%%timeit\n", "for i,x in enumerate(data):\n", "    for j, y in enumerate(x):\n", "        for k, z in enumerate(y):\n", "            data[i,j,k] = my_function(z)" ] }, { "cell_type": "markdown", "id": "0037e1d3", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "If we vectorize `my_function` instead, this rather ugly triple loop reduces to something much nicer:" ] }, { "cell_type": "code", "execution_count": 46, "id": "2eaedd1a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "382 µs ± 6.35 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "my_function_vect = np.vectorize(my_function)\n", "\n", "%timeit my_function_vect(data)" ] }, { "cell_type": "markdown", "id": "a8cd08db", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "So we observe a noticeable gain at runtime (about a factor of 3 here) and, most importantly, the code is much more pleasant to read. Keep in mind that `np.vectorize` still calls the Python function once per element, so it remains far slower than a native NumPy function." ] }, { "cell_type": "markdown", "id": "ff12110a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Array manipulation" ] }, { "cell_type": "markdown", "id": "8a8743df", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "So far we have seen how to define an `np.ndarray` and how to use these arrays to carry out more or less complex evaluations. This is useful, but it does not cover every need: in many cases we must be able to access particular values of an array."
 ] }, { "cell_type": "markdown", "id": "938ae772", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Values in a NumPy `np.ndarray` are accessed in the same spirit as the elements of a list, except that an `np.ndarray` is multidimensional, so one index must be given per dimension. \n", "\n", "> **Caution:**  \n", "> As for lists and tuples, indices start at **0**." ] }, { "cell_type": "code", "execution_count": 47, "id": "c7f3da2d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The array :\n", " an_array=array([[ 1, 2, 3, 4, 5],\n", " [ 6, 7, 8, 9, 10],\n", " [11, 12, 13, 14, 15]])\n" ] } ], "source": [ "an_array = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])\n", "print(f\"The array :\\n {an_array=}\")" ] }, { "cell_type": "markdown", "id": "29925794", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Elements of an `np.ndarray` are accessed with the `[]` operator, as for a list. The subtlety is that the `[]` operator of an `np.ndarray` accepts several indices, one per dimension."
 ] }, { "cell_type": "code", "execution_count": 48, "id": "c0f2aad4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Element 1,2 : 8\n" ] } ], "source": [ "a_12 = an_array[1,2]\n", "print(f\"Element 1,2 : {a_12}\")" ] }, { "cell_type": "markdown", "id": "2e015990", "metadata": { "lang": "en", "slideshow": { "slide_type": "fragment" } }, "source": [ "Negative indices can also be used to access values from the end:" ] }, { "cell_type": "code", "execution_count": 49, "id": "e1f28b53", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Element -1,-1 : 15\n" ] } ], "source": [ "a_24 = an_array[-1,-1]\n", "print(f\"Element -1,-1 : {a_24}\")" ] }, { "cell_type": "markdown", "id": "a4187b97", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "In addition, as for lists, we can use slicing. As a reminder, the notation is of the form (the element at index `stop` is excluded): \n", "\n", "```\n", "start:stop:step\n", "```" ] }, { "cell_type": "markdown", "id": "424f7667", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "For example, to extract the first row of the matrix `an_array`, we can proceed as follows:" ] }, { "cell_type": "code", "execution_count": 50, "id": "fc4197df", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Line_0 : [1 2 3 4 5]\n" ] } ], "source": [ "line_0 = an_array[0,:]\n", "print(f\"Line_0 : {line_0}\")" ] }, { "cell_type": "markdown", "id": "0159955b", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "We can then use this notation to extract a subarray:" ] }, { "cell_type": "code", "execution_count": 51, "id": "b571f53c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": 
"stream", "text": [ "[[ 7 8 9 10]\n", " [12 13 14 15]]\n" ] } ], "source": [ "sub_array = an_array[1:,1:]\n", "print(sub_array)" ] }, { "cell_type": "code", "execution_count": 52, "id": "b69cfa8d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4 5]\n" ] } ], "source": [ "sub_array = an_array[0,:]\n", "print(sub_array)" ] }, { "cell_type": "code", "execution_count": 53, "id": "e59ebf1a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1 3 5]\n", " [11 13 15]]\n" ] } ], "source": [ "sub_array = an_array[::2,::2]\n", "print(sub_array)" ] }, { "cell_type": "markdown", "id": "73a6a067", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "The subarray we obtain is a bit special: it is called a view. What is special about it? An example will speak for itself:" ] }, { "cell_type": "code", "execution_count": 54, "id": "2f6ea3a4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 2, 3, 4, 5],\n", " [ 6, 7, 8, 9, 10],\n", " [11, 12, 13, 14, 15]])" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "an_array" ] }, { "cell_type": "code", "execution_count": 55, "id": "c90dd19e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 3, 5],\n", " [11, 13, 15]])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sub_array" ] }, { "cell_type": "code", "execution_count": 56, "id": "81c8479a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 3, 5],\n", " [11, 13, 15]])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sub_array[0,0] = 10\n", "sub_array" ] }, { "cell_type": 
"code", "execution_count": 57, "id": "d53336e6", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 2, 3, 4, 5],\n", " [ 6, 7, 8, 9, 10],\n", " [11, 12, 13, 14, 15]])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "an_array" ] }, { "cell_type": "markdown", "id": "e384986e", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "And here is the catch (or the feature): since the subarray is only a view, modifying a value in the view also modifies the corresponding entry in the original array." ] }, { "cell_type": "markdown", "id": "6cab9c23", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "So be careful with subarrays. Views are very practical, and in terms of computational cost they allow quite elegant optimizations, but you must always keep in mind that you are working on a view. If you need an independent array, call `.copy()` on the subarray." ] }, { "cell_type": "markdown", "id": "495a542e", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "**A remark on extracting subarrays:**" ] }, { "cell_type": "markdown", "id": "c5c93681", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "We have seen how to extract a subarray with slices. However, many applications need a subarray, often non-contiguous, defined only by a list of row indices and a list of column indices. If we pass the two lists directly, the result is not the expected subarray: NumPy pairs the indices element by element, so indexing with `[0,2]` and `[1,4]` returns only the two entries at positions `(0,1)` and `(2,4)`."
 ] }, { "cell_type": "code", "execution_count": 58, "id": "2a1ee689", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The complete matrix : \n", "[[ 1 2 3 4 5]\n", " [ 6 7 8 9 10]\n", " [11 12 13 14 15]]\n", "The submatrix by the wrong approach : \n", "[ 2 15]\n" ] } ], "source": [ "matrix_a = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])\n", "print(\"The complete matrix : \\n{}\".format(matrix_a))\n", "idx_i = [0,2]\n", "idx_j = [1,4]\n", "sub_matrix = matrix_a[idx_i, idx_j]\n", "print(\"The submatrix by the wrong approach : \\n{}\".format(sub_matrix))" ] }, { "cell_type": "markdown", "id": "5ae62f11", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "To get the desired result, use the function `np.ix_`. From two lists of indices, it builds the index arrays (the **mask** of desired values) that select every combination of the given rows and columns." ] }, { "cell_type": "code", "execution_count": 59, "id": "28e192dc", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The complete matrix : \n", "[[ 1 2 3 4 5]\n", " [ 6 7 8 9 10]\n", " [11 12 13 14 15]]\n", "mask : (array([[0],\n", " [2]]), array([[1, 4]]))\n", "The submatrix by np.ix_ : \n", "[[ 2 5]\n", " [12 15]]\n" ] } ], "source": [ "matrix_a = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])\n", "print(f\"The complete matrix : \\n{matrix_a}\")\n", "idx_i = [0,2]\n", "idx_j = [1,4]\n", "\n", "mask = np.ix_(idx_i, idx_j)\n", "print(f\"mask : {mask}\")\n", "\n", "sub_matrix = matrix_a[mask]\n", "print(\"The submatrix by np.ix_ : \\n{}\".format(sub_matrix))" ] }, { "cell_type": "markdown", "id": "787721a4", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "We have just seen that subarrays are easy to extract. The same indexing can of course be used to insert values block by block into an array of 
greater dimension. For example:" ] }, { "cell_type": "code", "execution_count": 60, "id": "f45a4481", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Big array : \n", "[[0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]]\n", "Little array : \n", "[[1. 0. 0.]\n", " [0. 1. 0.]\n", " [0. 0. 1.]]\n", "Big array after insertion : \n", "[[0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0.]\n", " [1. 0. 0. 0. 0. 0.]\n", " [0. 1. 0. 0. 0. 0.]\n", " [0. 0. 1. 0. 0. 0.]]\n" ] } ], "source": [ "big_array = np.zeros((6,6))\n", "little_array = np.eye(3)\n", "print(f\"Big array : \\n{big_array}\")\n", "print(f\"Little array : \\n{little_array}\")\n", "big_array[3:,0:3] = little_array\n", "print(f\"Big array after insertion : \\n{big_array}\")" ] }, { "cell_type": "code", "execution_count": 61, "id": "f34b140f", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "little_array = [[0.74497814 0.96188788]\n", " [0.31989114 0.76613268]]\n", "Big array after insertion: \n", "[[0. 0. 0. 0. 0. 0. ]\n", " [0. 0.74497814 0. 0.96188788 0. 0. ]\n", " [0. 0. 0. 0. 0. 0. ]\n", " [1. 0.31989114 0. 0.76613268 0. 0. ]\n", " [0. 1. 0. 0. 0. 0. ]\n", " [0. 0. 1. 0. 0. 0. ]]\n" ] } ], "source": [ "little_array = np.random.rand(2,2)\n", "print(f\"little_array = {little_array}\")\n", "big_array[np.ix_([1,3],[1,3])] = little_array\n", "print(f\"Big array after insertion: \\n{big_array}\")" ] }, { "cell_type": "markdown", "id": "6b379cc2", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Among the other possible manipulations of NumPy arrays is the `reshape` operation, which changes the shape of an array. 
For example:" ] }, { "cell_type": "code", "execution_count": 62, "id": "494f7d03", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Array before reshape (2, 3) : \n", "[[1 2 3]\n", " [4 5 6]]\n", "Array after reshape (6, 1) : \n", "[[1]\n", " [2]\n", " [3]\n", " [4]\n", " [5]\n", " [6]]\n", "Array after reshape (6,) : \n", "[1 2 3 4 5 6]\n" ] } ], "source": [ "array_1 = np.array([[1,2,3],[4,5,6]])\n", "print(\"Array before reshape {} : \\n{}\".format( array_1.shape, array_1))\n", "\n", "array_2 = array_1.reshape((6,1))\n", "print(\"Array after reshape {} : \\n{}\".format( array_2.shape, array_2))\n", "\n", "array_3 = array_1.reshape((6,))\n", "print(\"Array after reshape {} : \\n{}\".format( array_3.shape, array_3))" ] }, { "cell_type": "markdown", "id": "98c10579", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "> **Caution:**  \n", "> For the reshape operation to work, the total number of elements must be preserved. In other words, the product of the sizes along all dimensions must be the same before and after the `reshape`.\n", "\n", "> *Hint:*  \n", "> To make things simpler, you can leave one of the sizes free during the reshape operation. This size is then automatically deduced from the others so that the number of elements is conserved. To do this, simply give a size of **-1** to the dimension left free."
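, "\n", "\n", "As a small illustration of these two remarks (the array `a` and the target shapes are arbitrary examples, not taken from the course), a reshape that preserves the number of elements succeeds, while one that does not raises a `ValueError`:\n", "\n", "```python\n", "import numpy as np\n", "\n", "a = np.arange(6).reshape((2, 3))  # 2 * 3 = 6 elements\n", "\n", "b = a.reshape((3, -1))  # the free size is deduced: 6 / 3 = 2\n", "print(b.shape)\n", "\n", "try:\n", "    a.reshape((4, 2))  # 4 * 2 = 8 elements, not 6\n", "    reshape_failed = False\n", "except ValueError as err:\n", "    reshape_failed = True\n", "    print(f'reshape failed: {err}')\n", "```"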
 ] }, { "cell_type": "code", "execution_count": 63, "id": "d6446614", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "After the reshape((-1,1)) : \n", "[[1]\n", " [2]\n", " [3]\n", " [4]\n", " [5]\n", " [6]]\n" ] } ], "source": [ "column_vector = array_1.reshape((-1,1))\n", "print(\"After the reshape((-1,1)) : \\n{}\".format(column_vector))" ] }, { "cell_type": "markdown", "id": "2db60c03", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### Boolean operations and mask" ] }, { "cell_type": "markdown", "id": "357cb8fb", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "A key concept of NumPy that lets us avoid `for` loops when processing data is the mask, which is closely tied to boolean operations." ] }, { "cell_type": "markdown", "id": "54ed2d92", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "What is a `mask`? It is an `np.ndarray` that contains only booleans. A `mask` lets us isolate parts of an `np.ndarray` and thus apply different treatments to different elements of an array."
] }, { "cell_type": "markdown", "id": "99e3da88", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Because an example is always more meaningful than long sentences:" ] }, { "cell_type": "code", "execution_count": 64, "id": "eb25ea1b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.26247964, 0.79349128, 0.84861139],\n", " [0.26215106, 0.73349506, 0.42576842],\n", " [0.11822719, 0.54511227, 0.85190047],\n", " [0.4651002 , 0.72352484, 0.21725721],\n", " [0.5168999 , 0.63788268, 0.30173818],\n", " [0.20590138, 0.08705489, 0.600157 ],\n", " [0.44186687, 0.57743075, 0.77760956],\n", " [0.30906544, 0.74842288, 0.43980225],\n", " [0.42470675, 0.68425167, 0.76921278],\n", " [0.68285291, 0.16446949, 0.54787821]])" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = np.random.rand(10,3)\n", "data" ] }, { "cell_type": "markdown", "id": "cb54cd84", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "We can create a `mask` corresponding to values strictly less than `0.5`." 
 ] }, { "cell_type": "code", "execution_count": 65, "id": "4d1e7ac0", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ True, False, False],\n", " [ True, False, True],\n", " [ True, False, False],\n", " [ True, False, True],\n", " [False, False, True],\n", " [ True, True, False],\n", " [ True, False, False],\n", " [ True, False, True],\n", " [ True, False, False],\n", " [False, True, False]])" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask = data < 0.5 \n", "mask " ] }, { "cell_type": "markdown", "id": "8e65566c", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "If we apply the `mask` to the data array, we get only the values whose corresponding entry in the `mask` is `True`." ] }, { "cell_type": "code", "execution_count": 66, "id": "554115e0", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.26247964, 0.26215106, 0.42576842, 0.11822719, 0.4651002 ,\n", " 0.21725721, 0.30173818, 0.20590138, 0.08705489, 0.44186687,\n", " 0.30906544, 0.43980225, 0.42470675, 0.16446949])" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[ mask ]" ] }, { "cell_type": "markdown", "id": "7963098c", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "The benefit is that we can then apply a particular treatment to these values only. For example:" ] }, { "cell_type": "code", "execution_count": 67, "id": "74ba98c2", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0.79349128, 0.84861139],\n", " [0. , 0.73349506, 0. ],\n", " [0. , 0.54511227, 0.85190047],\n", " [0. , 0.72352484, 0. ],\n", " [0.5168999 , 0.63788268, 0. ],\n", " [0. , 0. , 0.600157 ],\n", " [0. , 0.57743075, 0.77760956],\n", " [0. , 0.74842288, 0. ],\n", " [0. 
, 0.68425167, 0.76921278],\n", " [0.68285291, 0. , 0.54787821]])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[ mask ] = 0. \n", "data " ] }, { "cell_type": "markdown", "id": "8e5644da", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "The construction of a mask can involve as many complex operations as you like. For example :" ] }, { "cell_type": "code", "execution_count": 68, "id": "6d39065f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[3.42065649e-01 3.79190020e-01 2.06768802e-02]\n", " [5.63780120e-01 1.77303636e-01 5.98042679e-01]\n", " [9.96992503e-01 2.44551246e-02 7.79373688e-01]\n", " [5.69269353e-01 9.22078174e-01 8.93067687e-01]\n", " [9.74979570e-01 6.28905177e-01 8.28622572e-01]\n", " [2.82161884e-01 4.64519276e-01 1.14707865e-01]\n", " [1.30366641e-02 1.27843138e-01 1.14276123e-01]\n", " [2.46338528e-01 5.74544277e-01 6.70873707e-01]\n", " [9.56943634e-01 4.06732067e-04 1.44626191e-01]\n", " [6.51341333e-01 1.12734280e-01 4.93465605e-01]]\n" ] }, { "data": { "text/plain": [ "array([[False, False, True],\n", " [False, True, False],\n", " [False, True, False],\n", " [False, False, False],\n", " [False, False, False],\n", " [ True, False, True],\n", " [ True, True, True],\n", " [ True, False, False],\n", " [False, True, True],\n", " [False, True, False]])" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = np.random.rand(10,3)\n", "print(data)\n", "mask_0_03 = np.logical_and(data > 0., data < 0.3) \n", "mask_0_03 " ] }, { "cell_type": "code", "execution_count": 69, "id": "fc61f8b3", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0.55243662 0.01665614 0.22686973]\n", " [0.15980204 0.34271069 0.23228992]\n", " [0.78365911 0.04686651 0.72485694]\n", " 
[0.39202019 0.08579119 0.45621999]\n", " [0.46387074 0.0521096 0.5878552 ]\n", " [0.34875793 0.67408042 0.79227723]\n", " [0.17101679 0.42152956 0.57819604]\n", " [0.4984478 0.60257654 0.30474801]\n", " [0.16923732 0.43786364 0.3427004 ]\n", " [0.3713409 0.21744822 0.94073762]]\n" ] }, { "data": { "text/plain": [ "array([[False, True, True],\n", " [ True, False, True],\n", " [ True, True, True],\n", " [False, True, False],\n", " [False, True, False],\n", " [False, False, True],\n", " [ True, False, False],\n", " [False, False, False],\n", " [ True, False, False],\n", " [False, True, True]])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = np.random.rand(10,3)\n", "print(data)\n", "mask_inf03_or_sup07 = np.logical_or(data<0.3, data>0.7)\n", "mask_inf03_or_sup07" ] }, { "cell_type": "markdown", "id": "3bc7cf94", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "And there is also the negation of a `mask`:" ] }, { "cell_type": "code", "execution_count": 70, "id": "5941f778", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[False True True]\n", " [ True False True]\n", " [ True True True]\n", " [False True False]\n", " [False True False]\n", " [False False True]\n", " [ True False False]\n", " [False False False]\n", " [ True False False]\n", " [False True True]]\n" ] }, { "data": { "text/plain": [ "array([[ True, False, False],\n", " [False, True, False],\n", " [False, False, False],\n", " [ True, False, True],\n", " [ True, False, True],\n", " [ True, True, False],\n", " [False, True, True],\n", " [ True, True, True],\n", " [False, True, True],\n", " [ True, False, False]])" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(mask_inf03_or_sup07)\n", "np.logical_not(mask_inf03_or_sup07)" ] }, { "cell_type": "markdown", "id": "21181ccc", "metadata": { 
"lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### Reduction operation" ] }, { "cell_type": "markdown", "id": "fd01dee8", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "We saw at the beginning that NumPy defines a number of mathematical functions that process all the entries of an array at once." ] }, { "cell_type": "markdown", "id": "e206dd67", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "Similarly, NumPy provides functions, called reduction functions, that compute global quantities over an `np.ndarray`." ] }, { "cell_type": "markdown", "id": "8fd40fef", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "For example, to compute the mean of a rank-1 `np.ndarray`, you might be tempted to write:" ] }, { "cell_type": "code", "execution_count": 71, "id": "13ef5313", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "values = [0.31598379 0.27267081 0.28320475 0.49398502 0.14069128 0.78680327\n", " 0.39569586 0.29835225 0.37049427 0.62018582]\n" ] } ], "source": [ "values = np.random.rand(10)\n", "print(f\"values = {values}\")" ] }, { "cell_type": "code", "execution_count": 72, "id": "3ca14eff", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.39780671141150425\n" ] } ], "source": [ "m = 0\n", "for x in values:\n", "    m += x\n", "m /= values.size\n", "print(m)" ] }, { "cell_type": "markdown", "id": "eb3609a8", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "This is not optimal. NumPy provides the `np.mean` function, which is used as follows:" ] }, { "cell_type": "code", "execution_count": 73, "id": "c98cb447", "metadata": { "slideshow": { "slide_type": 
"fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.3978067114115042" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(values)" ] }, { "cell_type": "markdown", "id": "88370464", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "In the same vein, here is a non-exhaustive list of reduction functions available in NumPy: \n", "\n", "* `np.sum`\n", "* `np.mean`\n", "* `np.std`\n", "* `np.var`\n", "* `np.max`\n", "* `np.min`\n", "* `np.argmax`\n", "* `np.argmin`\n", "\n", "The names are fairly self-explanatory." ] }, { "cell_type": "markdown", "id": "27d7e02f", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "There is just one subtlety to know about these reduction operations: they work on an `np.ndarray` of any rank. For example:" ] }, { "cell_type": "code", "execution_count": 74, "id": "b82bc3b0", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.30810789, 0.70869817, 0.54926571],\n", " [0.7067641 , 0.53767856, 0.34357478],\n", " [0.73523682, 0.76711646, 0.65686177],\n", " [0.70115931, 0.32946581, 0.57882714]])" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = np.random.rand(4,3)\n", "data" ] }, { "cell_type": "markdown", "id": "a3b1600d", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "If we call `np.max` as is, for example, it returns the maximum value over the entire array."
] }, { "cell_type": "code", "execution_count": 75, "id": "a10628ac", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.7671164618589807" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.max(data)" ] }, { "cell_type": "markdown", "id": "6bdc1560", "metadata": { "lang": "en", "slideshow": { "slide_type": "subslide" } }, "source": [ "But this may not be the behavior you want. For example you want the max of each column:" ] }, { "cell_type": "code", "execution_count": 76, "id": "c746a027", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.73523682, 0.76711646, 0.65686177])" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.max(data, axis=0)" ] }, { "cell_type": "markdown", "id": "70803ae5", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "Or the max of each row :" ] }, { "cell_type": "code", "execution_count": 77, "id": "7e6be75f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.70869817, 0.7067641 , 0.76711646, 0.70115931])" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.max(data, axis=1)" ] }, { "cell_type": "markdown", "id": "3dbede7f", "metadata": { "lang": "fr", "slideshow": { "slide_type": "fragment" } }, "source": [ "So you can see that with the `axis` argument you can control the behavior of the reduction functions so that they are not applied globally but more specifically." ] }, { "cell_type": "markdown", "id": "6bbd80b1", "metadata": { "lang": "fr", "slideshow": { "slide_type": "slide" } }, "source": [ "### Linear algebra\n", "\n", "In addition to the usual operations and Boolean operations NumPy implements a number of linear algebra functions. 
Since NumPy is the Python module for multi-dimensional arrays, and thus in particular for matrices and vectors, these linear algebra functions are essential. They are provided by the sub-module `numpy.linalg`." ] }, { "cell_type": "code", "execution_count": 78, "id": "8f85b1f1", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import numpy.linalg as npl" ] }, { "cell_type": "markdown", "id": "6ea31bea", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "First of all there are the functions `norm`, `cond` and `det` which, as their names suggest, compute the norm (the Frobenius norm by default), the condition number and the determinant of a 2-dimensional array, respectively." ] }, { "cell_type": "code", "execution_count": 79, "id": "2b960a46", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A = \n", "[[0.63578288 0.4890433 0.6526613 0.38883037 0.48724553]\n", " [0.09657889 0.0414308 0.9992744 0.89205416 0.03983972]\n", " [0.13692696 0.57453119 0.41126829 0.56209284 0.25703495]\n", " [0.04151461 0.90597196 0.88450923 0.43199426 0.0353019 ]\n", " [0.459214 0.88529945 0.34389409 0.81062336 0.3691441 ]]\n", "||A|| = 2.8042618057694018\n", "cond(A) = 24.89866805731778\n", "det(A) = -0.06613703135818148\n" ] } ], "source": [ "array_2d = np.random.rand(5,5)\n", "norm_array = npl.norm( array_2d )\n", "cond_array = npl.cond( array_2d )\n", "det_array = npl.det( array_2d )\n", "\n", "\n", "print(\"A = \\n{}\".format(array_2d))\n", "print(\"||A|| = {}\".format(norm_array))\n", "print(\"cond(A) = {}\".format(cond_array))\n", "print(\"det(A) = {}\".format(det_array))" ] }, { "cell_type": "markdown", "id": "169afee9", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Then there are the functions for matrix decomposition and for solving linear systems:\n", "* 
`solve( A, b )`, which solves the linear system $A \\cdot x = b$.\n", "* `inv( A )`, which computes the inverse $A^{-1}$.\n", "* `pinv( A )`, which computes the pseudo-inverse of the matrix $A$.\n", "* `svd( A )`, which computes the singular value decomposition of a matrix.\n", "* `eig( A )`, which computes the eigenvalues and eigenvectors of a square matrix." ] }, { "cell_type": "code", "execution_count": 80, "id": "53afd6dd", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "rhs = \n", "[[0.08218834]\n", " [0.17107379]\n", " [0.33032306]\n", " [0.96546186]\n", " [0.44907668]]\n", "Solution x = \n", "[[-0.06893817]\n", " [ 0.90035213]\n", " [ 0.24775929]\n", " [-0.0793133 ]\n", " [-0.91361865]]\n", "A.x-rhs = \n", "[[ 2.77555756e-17]\n", " [ 2.77555756e-17]\n", " [-5.55111512e-17]\n", " [ 1.11022302e-16]\n", " [-5.55111512e-17]]\n", "inv(A)*A = \n", "[[ 1.00000000e+00 9.76013054e-17 2.78331610e-17 1.07976544e-16\n", " -6.87173596e-18]\n", " [ 1.00579014e-16 1.00000000e+00 2.46551642e-16 2.59569975e-16\n", " 6.18972334e-17]\n", " [-6.85372616e-17 -8.09746441e-17 1.00000000e+00 -2.10309925e-16\n", " -1.18075429e-16]\n", " [ 1.09223255e-16 4.67743688e-18 -4.40225887e-17 1.00000000e+00\n", " -9.82693733e-18]\n", " [-8.58493461e-17 -4.50417431e-16 -6.19466951e-16 -2.65699854e-16\n", " 1.00000000e+00]]\n" ] } ], "source": [ "rhs = np.random.rand(5,1)\n", "print(\"rhs = \\n{}\".format(rhs))\n", "x = npl.solve( array_2d, rhs )\n", "print(\"Solution x = \\n{}\".format(x))\n", "verif = array_2d @ x - rhs\n", "print(\"A.x-rhs = \\n{}\".format(verif))\n", "array_inv = npl.inv( array_2d )\n", "verif = array_inv.dot( array_2d )\n", "print(\"inv(A)*A = \\n{}\".format( verif ) )" ] }, { "cell_type": "markdown", "id": "a265a8da", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "### Input-output with NumPy" ] }, { "cell_type": "markdown", "id": "5a8e6ef9", "metadata": { "lang": "fr", "slideshow": 
{ "slide_type": "fragment" } }, "source": [ "In addition to providing functionality for creating and manipulating arrays and for linear algebra, NumPy makes file input/output (I/O) simpler than it is in plain Python.\n", "\n", "Among the various I/O functions that NumPy offers, the three that will probably be most useful to you are:\n", "* `loadtxt`, which loads the contents of a well-formatted text file (a csv file, for example) directly into a NumPy array.\n", "* `savetxt`, which saves the contents of a NumPy array to a text file.\n", "* `genfromtxt`, which is similar to `loadtxt` except that the data file may contain holes (missing data), which are then automatically replaced by a user-specified value." ] }, { "cell_type": "markdown", "id": "1feaa975", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Here is an extract of a text file containing tensile test acquisition data." ] }, { "cell_type": "code", "execution_count": 81, "id": "548657f5", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "head: cannot open 'data/curves/data.txt' for reading: No such file or directory\r\n" ] } ], "source": [ "!head data/curves/data.txt" ] }, { "cell_type": "markdown", "id": "a96b28c6", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To load this data, a first solution would be to parse the file by hand using `open`, `read` and finally the string method `split`. However, NumPy provides the `loadtxt` function, which is far more convenient. 
For example, the previous data can be loaded with a single command:" ] }, { "cell_type": "code", "execution_count": 82, "id": "aa751347", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "FileNotFoundError", "evalue": "data/curves/data.txt not found.", "output_type": "error", "traceback": [ "FileNotFoundError: data/curves/data.txt not found." ] } ], "source": [ "data_from_file = np.loadtxt(\"data/curves/data.txt\", comments=\"#\")" ] }, { "cell_type": "code", "execution_count": null, "id": "9fabedcf", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "print(\"Shape: {} \".format(data_from_file.shape))\n", "print( data_from_file[:10,:])" ] }, { "cell_type": "markdown", "id": "116a4f02", "metadata": { "lang": "fr", "slideshow": { "slide_type": "subslide" } }, "source": [ "Note that we specified the optional argument `comments`, which tells NumPy which character introduces a comment, so that the corresponding lines are ignored. 
If the first lines do not start with a specific character (a comment character), it is still possible to ignore them with the optional argument `skiprows`, which gives the number of lines to skip at the beginning of the file. For this file, an equivalent call to `loadtxt` would be:" ] }, { "cell_type": "code", "execution_count": null, "id": "49ce4d62", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "data_from_file = np.loadtxt(\"data/curves/data.txt\", skiprows=5)" ] }, { "cell_type": "code", "execution_count": null, "id": "134096a8", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "print(\"Shape: {} \".format(data_from_file.shape))\n", "print( data_from_file[:10,:])" ] } ], "metadata": { "jupytext": { "text_representation": { "extension": ".md", "format_name": "myst", "format_version": 0.13, "jupytext_version": "1.14.1" } }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" }, "source_map": [ 14, 23, 37, 41, 49, 63, 77, 83, 99, 103, 114, 118, 127, 138, 156, 162, 166, 175, 184, 195, 199, 203, 212, 216, 220, 229, 237, 241, 245, 254, 258, 262, 271, 275, 284, 288, 297, 306, 310, 314, 318, 325, 329, 338, 346, 354, 362, 370, 376, 384, 394, 398, 406, 414, 422, 426, 434, 442, 450, 458, 462, 466, 470, 474, 483, 491, 499, 503, 513, 519, 528, 537, 546, 550, 561, 565, 576, 580, 584, 599, 603, 615, 622, 630, 638, 642, 646, 650, 654, 658, 670, 674, 682, 694, 698, 708, 712, 716, 720, 727, 736, 740, 749, 753, 762, 770, 774, 783, 787, 796, 805, 814, 818, 826, 834, 843, 851, 855, 859, 863, 867, 880, 884, 901, 905, 918, 929, 933, 948, 956, 965, 969, 973, 977, 981, 990, 994, 1003, 1007, 1015, 1019, 1028, 1032, 1043, 1054, 1058, 
1067, 1071, 1075, 1079, 1083, 1092, 1104, 1108, 1116, 1132, 1136, 1145, 1149, 1157, 1161, 1169, 1173, 1181, 1185, 1191, 1199, 1203, 1220, 1229, 1245, 1249, 1258, 1262, 1270, 1274, 1282, 1291, 1295, 1303 ] }, "nbformat": 4, "nbformat_minor": 5 }