{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
\n", "void ufunc_loop(void **args, int *dimensions, int *steps, void *data)\n", "{\n", " /* int8 output = elementwise_function(int8 input_1, int8 \n", " * input_2)\n", " *\n", " * This function must compute the ufunc for many values at \n", " * once, in the way shown below.\n", " */\n", " char *input_1 = (char*)args[0];\n", " char *input_2 = (char*)args[1];\n", " char *output = (char*)args[2];\n", " int i;\n", " for (i = 0; i < dimensions[0]; ++i) {\n", " *output = elementwise_function(*input_1, *input_2);\n", " input_1 += steps[0];\n", " input_2 += steps[1];\n", " output += steps[2];\n", " }\n", "}\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Universal functions/4\n", "\n", "Examples:\n", "```python\n", ">>> a = np.array([a1,a2,...,an]) \n", "```\n", "```python\n", ">>> b = 1.5*a + 2 # * + / - all ufuncs \n", "```\n", "* all the elements of “a” are multiplied by 1.5; result stored in a __temporary array__\n", "* All the elements of the temp array are increased by 2; result stored in another __temporary array__\n", "* b becomes a reference to the __second__ temp array\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Universal functions (3)\n", "* What happens when ufunc has to operate on two arrays of __different dimensions__?\n", "* To be efficient, element-wise operations are required\n", "* To operate element-by-element broadcasting is needed.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](files/images/broadcasting1.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](files/images/broadcasting2.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting/1\n", "* Broadcasting allows to deal in a meaningful way with inputs that do not have the same shape.\n", "* After having applied the broadcasting rules the shape of all arrays must match\n", "* let's see when different shapes are compatible from the broadcasting standpoint\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting/2\n", "\n", "Broadcasting follows these rules:\n", "\n", "__1.__ Prepend 1 until all the arrays dimension matches\n", "\n", "```\n", "a.shape (2,2,3,6) # a.ndim = 4 \n", "b.shape (3,6) # b.ndim = 2 \n", "\n", "1 → b.ndim = 4\n", "1 → b.shape (1,1,3,6)\n", "```\n", " \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "__2.__ Arrays with a size of 1 along a particular dimension act like the array with the __largest size__ along that dimension.\n", " * The values of the array are assumed to be __the same along the broadcasted dimension__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting/3\n", "* 1d-ndarrays are always broadcastable\n", "\n", "```python\n", "a.shape → (3) b.shape -> (4,3); \n", "a.shape → (1,3) # first rule, n. of dim matches\n", "a.shape → (4,3) # second rule\n", "c = a*b \n", "c.shape → (4,3) # as expected \n", "```\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", "# another example\n", "a.shape → (2,1,3) \n", "b.shape → (2,4,1); \n", "c = a*b c.shape → (2,4,3) # second rule\n", "\n", "# Arrays not broadcastable\n", "a.shape → (2,5,3)\n", "b.shape → (2,4,1); \n", "c = a*b # shape mismatch\n", "```\n", " \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting/4\n", "* If needed it is possible to add a trailing dimension to the shape of a `ndarray`. This is a convenient way of taking the outer product (or any other outer operation) of two arrays\n", "```python\n", "a.shape → (3) b.shape → (4); \n", "c = a[:,np.newaxis] # c.shape → (3,1)\n", "c * b \n", "c.shape → (3,4) \n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Universal functions: performance\n", "* Iterations over array should be avoided because they are not efficient. Performance using array arithmetics is better than using python loops.\n", " \n", "* ~ 18x speedup (standard across NumPy)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Using ufunc\n", "x = np.arange(10**7)\n", "%timeit y = x*3" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Using for loop\n", "%timeit y = [x*3 for x in range(10**7)] " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Efficiency note\n", "The operation\n", "```python\n", ">>> b = 2*a + 2\n", "```\n", "can be done with in-place operations\n", "```python\n", ">>> b = a # b is a reference to a\n", ">>> b *= 2\n", ">>> b += 2\n", "```\n", "More efficient and doesn't require storing temp array" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "These operations affect array “a” as well; we can avoid this with\n", "```python\n", ">>> b = a.copy()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Other in-place operations\n", "```python\n", ">>> b += a \n", ">>> b *= a \n", ">>> b **= a\n", ">>> b -= a\n", ">>> b /= a\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Efficiency note/2:\n", "* Let's consider the function\n", "```python\n", ">>> def myfunc(x): # if x is an array\n", " if x<0: # x<0 becomes a bool array\n", " return 0\n", " else: \n", " return sin(x)\n", "```\n", "\n", "* “if x<0” results in a boolean array and __cannot__ be evaluated by the if statement" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Efficiency note (3): Vectorization\n", "We can vectorize our function using the function where\n", "```python\n", ">>> def vecmyfunc(x):\n", " y = np.where(x<0, 0, np.sin(x))\n", " return y # now returns an array\n", "```\n", "`np.where`\n", "```python\n", "np.where(condition, x, y) # condition → bool array \n", " # condition is true → x \n", " # condition is false → y \n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## NumPy universal functions/1\n", "* There are more than 60 ufuncs\n", "* Some `ufuncs` are called automatically: i.e. np.multiply(x,y) is called internally when a*b is typed\n", "* NumPy offers trigonometric functions, their inverse counterparts, and hyperbolic versions as well as the exponential and logarithmic functions. Here, a few examples:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", ">>> b = np.sin(a)\n", ">>> b = np.arcsin(a)\n", ">>> b = np.sinh(a)\n", ">>> b = a**2.5 # exponential\n", ">>> b = np.log(a)\n", ">>> b = np.exp(a)\n", ">>> b = np.sqrt(a)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## NumPy universal functions/2\n", "* Comparison functions: __greater, less, equal, logical_and/_or/_xor/_nor, maximum, minimum, ...__\n", "```python\n", ">>> a = np.array([2,0,3,5])\n", ">>> b = np.array([1,1,1,6])\n", ">>> np.maximum(a,b)\n", "array([2, 1, 3, 6])\n", "```\n", "* Floating point functions: __floor, ceil, isreal, iscomplex,...__\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Exercise 5 \n", "![](files/images/inertia_t.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## Exercise 5: universal functions\n", "* Calculate the molecular center of mass\n", "$$\n", "xcm=\\frac{1}{MW}\\sum_ix_i m_i \\;\\;\\;\\;\\;\\;\\; \n", "ycm=\\frac{1}{MW}\\sum_iy_i m_i \\;\\;\\;\\:\\;\\;\\; \n", "zcm=\\frac{1}{MW}\\sum_iz_i m_i\n", "$$ \n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Calculate the moment of inertia tensor\n", "$$ \n", "\\left[ \\begin{array}{ccc}\n", "I_{11} & I_{12} & I_{13} \\\\\n", "I_{21} & I_{22} & I_{23} \\\\\n", "I_{31} & I_{32} & I_{33} \\end{array} \\right] \n", "$$\n", "whose elemets can be calculated as:\n", "$$\n", "I_{11} = I_{xx} = \\sum_i m_i(y_i^2+z_i^2) \\;\\;\\;\\;\\;\\;\\;\n", "I_{11} = I_{xx} = \\sum_i m_i(y_i^2+z_i^2) \\;\\;\\;\\;\\;\\;\\;\n", "I_{11} = I_{xx} = \\sum_i m_i(y_i^2+z_i^2) \n", "$$\n", "$$\n", "I_{12} = I_{xy} = -\\sum_i m_ix_iy_i \\;\\;\\;\\;\\;\\;\\;\n", "I_{12} = I_{xy} = -\\sum_i m_ix_iz_i \\;\\;\\;\\;\\;\\;\\;\n", "I_{12} = I_{xy} = -\\sum_i m_iy_iz_i\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# molecular center of mass\n", "MW = arr['PA'].sum() \n", "xcm = 1./MW * (arr['PA']*arr['x']).sum()\n", "ycm = 1./MW * (arr['PA']*arr['y']).sum()\n", "zcm = 1./MW * (arr['PA']*arr['z']).sum()\n", "print(xcm,ycm,zcm)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# moment of inertia tensor\n", "Ixx = (arr['PA']*(arr['y']**2 + arr['z']**2)).sum()\n", "Iyy = (arr['PA']*(arr['x']**2 + arr['z']**2)).sum()\n", "Izz = (arr['PA']*(arr['x']**2 + arr['y']**2)).sum()\n", "Ixy = -(arr['PA']*arr['x']*arr['y']).sum()\n", "Ixz = -(arr['PA']*arr['x']*arr['z']).sum()\n", "Iyz = -(arr['PA']*arr['y']*arr['z']).sum()\n", "\n", "TI = np.array([[Ixx, Ixy, Ixz],[Ixy, Iyy, Iyz], [Ixz, Iyz, Izz]])\n", "print(TI)\n", "mTI = np.matrix(TI)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Standard classes inheriting from ndarray\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Matrix objects/1\n", "`Matrix` objects inherit from ndarray classes\n", "* Matrix can be created using a Matlab-style notation\n", "```python\n", ">>> a = np.matrix('1 2; 3 4')\n", ">>> print a\n", "[[1 2]\n", " [3 4]]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* Matrix are always __two-dimensional__ objects\n", "* Matrix objects over-ride multiplication to be __matrix-multiplication__, and over-ride power to be __matrix raised to a power__\n", "* Matrix objects have higher __priority__ than ndarrays. Mixed operations result therefore in a matrix object " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Matrix objects/2\n", "\n", "* Matrix objects have special attributes:\n", " * `.T` returns the transpose of the matrix\n", " * `.H` returns the conjugate transpose of the matrix\n", " * `.I` returns the inverse of the matrix\n", " * `.A` returns a view of the data of the matrix as a 2-d array \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Basic modules" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Basic modules: linear algebra/1\n", "\n", "Contains functions to solve __linear systems__, finding the __inverse__, the __determinant__ of a matrix, and computing __eigenvalues__ and __eigenvectors__\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# example\n", "\n", "A = np.zeros((10,10)) # arrays initialization\n", "x = np.arange(10)/2.0\n", "for i in range(10):\n", " for j in range(10):\n", " A[i,j] = 2.0 + float(i+1)/float(j+i+1)\n", "b = np.dot(A, x)\n", "y = np.linalg.solve(A, b) # A*y=b → y=x \n", "print(x)\n", "print(y)\n", "np.allclose(y,x, 10**-3) # True, mind the tolerance! " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Basic modules: linear algebra/2\n", "* Eigenvalues can also be computed:\n", "\n", "```python\n", "# eigenvalues only:\n", ">>> A_eigenvalues = np.linalg.eigvals(A)\n", "\n", "# eigenvalues and eigenvectors:\n", ">>> A_eigenvalues, A_eigenvectors = np.linalg.eig(A)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Exercise 6: matrices and linalg\n", "\n", "1. From the array TI of the last exercise build a matrix (__mTI__), calculate eigenvalues (__e__) and eigenvectors (__Ev__) and prove that: \n", "$\n", "Ev^T mTI \\; Ev = \n", "\\left[ \\begin{array}{ccc}\n", "e_1 & 0 & 0 \\\\\n", "0 & e_2 & 0 \\\\\n", "0 & 0 & e_3 \\end{array} \\right] \n", "$\n", "\n", "2. Now, for example, you can rotate all the atoms coordinates by multiplying them by the __Ev__ matrix" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "mTI = np.matrix(TI)\n", "ev, Ev = np.linalg.eig(mTI)\n", "print(ev)\n", "Ev.T * mTI * Ev" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Basic modules: random numbers\n", "* Calling the standard random module of Python in a loop to generate a sequence of random number is inefficient. Instead:\n", "\n", "```python\n", ">>> np.random.seed(100)\n", ">>> x = np.random.random(4)\n", "array([ 0.89132195, 0.20920212, 0.18532822, \n", "0.10837689])\n", ">>> y = np.random.uniform(-1, 1, n) # n uniform numbers in interval (-1,1)\n", "``` " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* This module provides more general distributions like the normal distribution\n", "\n", "```python\n", ">>> mean = 0.0; stdev = 1.0\n", ">>> u = np.random.normal(mean, stdev, n)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Exercise 7: random numbers\n", "* Given a game in which you win back 10 times your bet if the sum of 4 dice is less than 10, determine (brute force, i.e. 1000000 trows) whether it is convenient to play the game or not\n", "\n", "* hint: use `np.random.randint`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numpy: good to know" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Masked arrays (1)\n", "\n", "```python\n", ">>> x = np.arange(20).reshape(4,5) # 2d array \n", ">>> x[2:,2:] = 2 # assignment\n", "array([[ 0, 1, 2, 3, 4],\n", " [ 5, 6, 7, 8, 9],\n", " [10, 11, 2, 2, 2],\n", " [15, 16, 2, 2, 2]])\n", ">>> x[x<5] # x<5 bool array \n", "array([0, 1, 2, 3, 2, 2, 2, 2, 2, 2]) # 1d array\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* What if we want to maintain the original shape of the array\n", "* We can use masked arrays: the __shape of the array is kept__\n", "* Invalid data can be flagged and ignored in math computations\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "x = np.arange(20).reshape(4,5)\n", "x[2:,2:] = 2\n", "import numpy.ma as ma\n", "y = ma.masked_greater(x, 4)\n", "y" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Meshgrids/1\n", "* Given a two-dimensional grid of points, namely $a_{i,j}=(x i , y j)$ we want to calculate for each point of the grid the values of a function $f(a_{ij})=f(x_i,y_j)$\n", "\n", "* Suppose $f = \\frac{x}{2} + y$\n", "\n", "* $f\\;$ is a `ufunc` (implicitly uses ufuncs: `add`, `divide`) so it operates element-by-element)\n", "\n", "![](files/images/mesh.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# wrong\n", "def f(x,y): \n", " return x/2 + y\n", "\n", "x = np.linspace(0,4,5)\n", "y = x.copy()\n", "\n", "values = f(x,y)\n", "values" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Meshgrids/2\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "x = np.linspace(0,4,5)\n", "y = x.copy()\n", "xv,yv = np.meshgrid(x,y)\n", "print(xv)\n", "print(yv)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](files/images/mesh2.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "values=f(xv,yv)\n", "values" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Bibliography\n", "* http://docs.scipy.org/doc/\n", "* Guide to NumPy (Travis E. Oliphant)\n", "* NumPy User Guide\n", "* Numpy Reference Guide\n", "* Python Scripting for Computational Science (HansPetter Langtangen)\n" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 1 }