{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 2D Wave Equation Parallel Computation\n", "Ehsan Kamalinejad 10/12/2015\n", "\n", "In this tutorial, we will solve a wave equation numerically in a 2D domain using parallel computation. A problem in parallel computing usually has two important aspects:\n", "1. **The theoretical aspcet.** To find a parallizable solution for a given problem, one needs to find a way to break the problem into some sub-problems that are easier to solve and then find a way of combining the sub-solutions to the solution of the original problem.\n", "2. **The practical aspect.** The practical aspects of a parallel computational problem deals with the technologies available to distribute the computation between several cores/machines. The practical aspect is usually dependent on the particular language/OS/machine that the problem is solved on. In this tutorial we will use Python Language.\n", "\n", "As you might have guessed, the more interesting (and usually more challenging) part of a parallel computational problem is the theoretical part that deals with dividing the problem into sub-problems and assembling the sub-solutions back to a complete solution. \n", "\n", "The idea of this tutorial is to demonstrate these steps in an important example of the 2D wave equation. After studying the theoretical part, we will use some modern tools from Python to solve the problem efficiently in parallel. Note that on a personal computer with only a few processing cores, we will not get a significant speed boost by parallelization but this is just a demonstration and the same steps with small modifications can be applied to solve the problem on a cluster or on a GPU where the speed boost will be significant." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Theory\n", "\n", "We are considering the following problem called the *wave partial differential equation (PDE)*:\n", "\n", "$$\\frac{\\partial^2u}{\\partial t^2} = c^2 \\Delta u$$\n", "\n", "where $u$ is the displacement, $t$ is time, $c$ is the speed of propagation of the wave and $\\Delta$ is the *Laplacian* or simply the second derivative of $u$ with respect to the spatial variables.\n", "\n", "The wave equation is one of the most important equations in physics and it describes many phenomena such as vibrations of a string on a musical instrument, waves on the surface of the water, and electromagnetic waves (including visible light). Its complex counterpart describes the Schrödinger equation which is the fundamental equation of physics for describing quantum mechanics.\n", "\n", "It is clear how important it is to find the solution of such a fundamental equation. Of course, we desire to find a solution that is accurate and efficient. In particular, it is desirable to have **a parameter that controls accuracy**. And for efficiency, it would be great to be able to **solve the problem in parallel**.\n", "\n", "### The wave equation\n", "\n", "A time dependent evolution equation usually has three parts to it:\n", "1. The main partial differential equation (PDE) which is the core of the equation and it describes how the system evolves in time within the domain of definition.\n", "2. The boundary condition (BC) which defines how the system is interacting with the environment through its boundary. For example, the string of a musical instrument has fixed boundary conditions at the two ends or the water wave is reflected at the boundary regions.\n", "3. The initial condition (IC) which describes what was the initial state of the system. For example, when you pull the string of a guitar and you let it go, the displacement from the balance position is the initial condition of the problem.\n", "\n", "Our PDE will be the wave equation of course. We are going to study the 2D wave equation on a rectangular domain $[0,1]\\times [0,1]$. And we will use fixed boundary conditions. And we are going to use two spikes as the initial conditions. You can use exactly the same method for different initial conditions. With small modifications you can implement different boundary conditions. Changing the domain to non-rectangular shapes will be more challenging and we will briefly talk about it later.\n", "\n", "\n", "$$PDE: ~ ~ ~ ~ ~ ~ \\frac{du(x,y,t)}{dt}=\\Delta u(x,y,t) ~ ~ ~ ~ ~ ~ t∈(0,∞) ~ , ~ x∈(0,1) ~ , ~ y∈(0,1)$$\n", " \n", "$$BC: ~ ~ ~ ~ ~ ~ u(0,y,t)=0 ~ , ~ u(x,0,t)=0 ~ , ~ u(1,y,t)=0 ~ , ~ u(x,1,t)=0$$\n", "\n", "$$IC: ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ u(x,y,0)=e^y$$\n", "\n", "For more detailed discussion about the wave equation and its derivation from some simple physics principles study chapter 16 of [Farlow's book](https://www.amazon.com/Differential-Equations-Scientists-Engineers-Mathematics/dp/048667620X/ref=sr_1_1?s=books&ie=UTF8&qid=1476071165&sr=1-1&keywords=partial+differential+equations).\n", "\n", "### Finite Difference Vs Spectral\n", "\n", "In scientific computing, the easiest numerical solution for a PDE is usually given by a [*finite difference method*]( https://en.wikipedia.org/wiki/Finite_difference_method). In finite difference, the space-time domain is discretized into smaller sub-domains, the problem is solved in each discrete sub-domain and the information is passed on to the next sub-domain. Accuracy can be controlled by the resolution of the sub-domains. This method is the most standard method of solving PDEs numerically but it has its weaknesses. The important weakness for us is that because of dependency of each sub-domain to its previous sub-domain, this method is not (directly) parallelizable. This is a huge disadvantage specially because in the modern scientific computing we love to do heavy computations in parallel. As we will see, we use a [spectral method]( https://en.wikipedia.org/wiki/Spectral_method) that is easily parallelizable and it also enjoys great accuracy.\n", "\n", "### Separation of Variables (Fourier)\n", "\n", "The idea of separation of variables is simple and very effective. Also it is fully compatible with parallelization. This method can be summarized in these steps:\n", "\n", "1. Find the solution of the problem changing the initial condition to be a very simple function or a family of such simple functions. Under [suitable conditions]( https://en.wikipedia.org/wiki/Basis_function) this family of functions is called a basis. We will use Fourier (sine and cosine) as the basis functions.\n", "2. Rewrite the initial condition as the linear combination (it can be infinite sum) of the basis functions.\n", "3. By [superposition principle](https://en.wikipedia.org/wiki/Superposition_principle) for linear equations, if the initial condition is a particular linear combination of a basis then the solution of the problem is the same linear combination of the sub-solutions for the basis functions.\n", "\n", "In other words, if $sol_i(X,t)$ solves the equation with initial condition $b_i(X)$ and also $u(X,0) = \\sum_i a_i b_i(X)$, then we have the solution of the original problem to be $u(X,t)= \\sum_i a_i sol_i(X,t)$. Hence, one can solve the problem on much simpler initial conditions. Then put the solutions back together as a linear combination. \n", "\n", "If you remember, at the beginning of the tutorial we said that, for parallelization, we need to be able to break the problem into simpler sub-problems and then combine the sub-solutions back to the solution of the original problem. It should be now clear why separation of variables is so compatible with parallelization.\n", "\n", "The simple functions that we use for the building blocks of the sub-solutions are Fourier functions (sine and cosine). To do the first step, we need to solve the special initial conditions of the form $u(x,y,0)=\\sin( m \\pi x) \\sin( n \\pi y)$. This is a straight forward multivariable calculus problem. If you are interested in the details, you can refer to chapter 5 of [Farlow's book](https://www.amazon.com/Differential-Equations-Scientists-Engineers-Mathematics/dp/048667620X/ref=sr_1_1?s=books&ie=UTF8&qid=1476071165&sr=1-1&keywords=partial+differential+equations). This calculation will lead into sub-solutions of the form:\n", "\n", "$$u(x,y,t) = \\sin( m \\pi x)~ \\sin( n \\pi y)~ \\cos(\\sqrt{m^2+n^2} \\pi t)$$\n", "\n", "Now we need to find the linear combination of Sine Fourier functions that give us back the actual initial condition. After that, based on superposition principle, we would be able to use these coefficients to stick together the sub-solutions into the full solution of the original problem.\n", "\n", "To find the coefficients, we can use an integration trick that is described in chapter 10 of [Farlow's book](https://www.amazon.com/Differential-Equations-Scientists-Engineers-Mathematics/dp/048667620X/ref=sr_1_1?s=books&ie=UTF8&qid=1476071165&sr=1-1&keywords=partial+differential+equations). To be accurate, this is more than just a trick. It has its roots in the fact that Fourier functions create an orthogonal basis for the Hilbert functional space $L^2([0,1]\\times [0,1])$. But that discussion is beyond the scope of this tutorial. In short, after some integration tricks we can see that the coefficients can be calculated using the following expression:\n", "\n", "$$A_{m,n} = \\int_0^1 \\int_0^1 \\sin( m \\pi x)~ \\sin( n \\pi y)~ initial(x,y)~ dx dy$$\n", "\n", "Assembling the sub-solutions, superposition principle gives us the final solution:\n", "\n", "$$u(x,y,t)=\\sum_{m=1}^{\\infty} \\sum_{n=1}^{\\infty} A_{m,n} ~\\sin( m \\pi x)~ \\sin( n \\pi y)~ \\cos(\\sqrt{m^2+n^2} \\pi t)$$\n", "\n", "This function is the unique solution of the given wave equation. The uniqueness is implied by the fundamental theorem of linear equations. You can directly check that the solution solves the PDE, the boundary condition and the initial condition. Hence it is the solution that we were searching for. But there is more to it. It is possible (using functional space theory) to prove that assuming smoothness of the initial condition the elements of the series get smaller further in the series. This means that we can cut the series at some finite term and have a good approximation of the solution. In fact, the number of the terms in the series is the accuracy parameter that we were searching for. Hence we have a solution with all of the desirable ingredients:\n", "\n", "1. A solution made out of combining sub-solutions that each can be solved easily and efficiently. \n", "2. A set of sub-solutions that are independent so they can be found in parallel.\n", "3. An accuracy parameter that can be tuned to get the required accuracy.\n", "\n", "The second half of the tutorial deals with implementing this parallel approach in Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computations\n", "\n", "There are several tools for parallelization in Python for example see [here](https://developer.nvidia.com/anaconda-accelerate) for using CUDA GPU computations in Python. In this tutorial, we use two tools that are easy to set up and do not require a specific hardware (such as a CUDA GPU):\n", "\n", "1. We use [ipyparallel](http://ipyparallel.readthedocs.io/en/stable/index.html) library for parallel computing. \n", "2. We also use [numba](http://numba.pydata.org/) library which boosts math operations significantly.\n", "\n", "Although these tools do not require an extra hardware to work but if you have access such hardware (for example a CUDA GPU), you would be able to enhance these methods even more." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ipyparallel Library\n", "\n", "Using ipyparallel library one can create multiple CPU threads to solve an appropriate algorithm in parallel. To install, you can simply run a pip installation in a command:\n", "\n", "pip install ipyparallel\n", "\n", "After the installation is done, run the following command in a terminal to start a cluster:\n", "\n", "ipcluster start -n 4\n", "\n", "Note that -n 4 determines the number of cores accesible for computation. Also, if you are using a Jupyter notebook, the termial should have the same python kernel as your notebook.\n", "\n", "Don't forget to stop the cluster when you're done.\n", "\n", "ipcluster stop\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this part, we calculate the Fourier coefficients. Note that this calculation is independent of the grid size used later for visualization. These coefficients only depend on the shape of the region and on the initial function. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import ipyparallel\n", "\n", "clients = ipyparallel.Client()\n", "\n", "# check that the clients are connected\n", "clients.ids" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy\n", "\n", "### Initial Condition ###\n", "\n", "# defining the initial function\n", "def initial(x,y):\n", " f = numpy.exp(-((x-(1.0/4))**2+(y-(1.0/2))**2)/0.001) + numpy.exp(-((x-(3.0/4))**2+(y-(1.0/2))**2)/0.001)\n", " return f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we calculate the Fourier coefficients of the initial condition. Note that this can be a slow caculation depending on the number of terms of the series you want to keep. Also since the terms in the loops are independent from each other, there are trivial ways to do this calculation in parallel. But because this is not the heaviest part in the calculation, we do not bother parallelizing this section." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# import dblquad for integral calculations\n", "from scipy.integrate import dblquad\n", "# import tnrange to show percentage completed\n", "from tqdm import tnrange\n", "\n", "### Fourier Calculations ###\n", "\n", "# the integrand for calculating the Fourier coefficients\n", "def integrand(x,y,m,n):\n", " g = (numpy.sin(m*numpy.pi*x))*(numpy.sin(n*numpy.pi*y))*initial(x,y)\n", " return g\n", "\n", "# calculate the Fourier coefficients for Amn\n", "def fourier_Coef(n,m):\n", " coef = dblquad( lambda x, y: integrand(x,y,n,m), 0, 1, lambda y: 0, lambda y: 1)\n", " return coef\n", "\n", "# number of the terms in the Fourier expansion\n", "# these are our accuracy parameter\n", "M = 50\n", "N = 50\n", "\n", "# the Fourier coefficients\n", "A = numpy.zeros((M,N))\n", "for m in tnrange(M, desc='coefficients calculation'):\n", " for n in range(N):\n", " A[m,n],_ = fourier_Coef(n,m)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "### Creating a 2D Grid ###\n", "\n", "# note that a big grid size will make the visualization section slow\n", "grid_size = 60\n", "x = numpy.linspace(0,1,grid_size,endpoint=True)\n", "y = numpy.linspace(0,1,grid_size,endpoint=True)\n", "[xx, yy] = numpy.meshgrid(x,y)\n", "\n", "# create a time vector\n", "final_t = 2\n", "num_steps = 480\n", "t = numpy.linspace(0,final_t,num_steps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### numba: jit complied code in Python\n", "\n", "Numba is a library that allows for very fast calculations in python. If the functions you use in your calculations are compatible with Numba, your performance will be similar to C and C++ performance. \n", "\n", ">\"With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.\"\n", "\n", "For more info about the Numba library look [here](http://numba.pydata.org/)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from numba import jit\n", "\n", "### Calculating the Solution ###\n", "\n", "import numba\n", "\n", "# function to calculate the solution\n", "@numba.jit\n", "def sol(x,y,t):\n", " U = 0\n", " for m in range(M):\n", " for n in range(N):\n", " U += A[m,n]*(numpy.sin(m*numpy.pi*x))\\\n", " *(numpy.sin(n*numpy.pi*y))\\\n", " *(numpy.cos(numpy.pi*numpy.sqrt(m**2+n**2)*t))\n", " return U \n", " \n", "# helper function to calculate the solution vector\n", "def vectorized(step):\n", " return sol(xx,yy,t[step])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# assigning clients for parallel computation\n", "dv = clients[:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we are pushing the libraries to the cluster. Note that you will need to use the full names of the libraries here, otherwise it will not work." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "importing numpy on engine(s)\n", "importing numba on engine(s)\n" ] } ], "source": [ "with dv.sync_imports():\n", " import numpy\n", " import numba" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# pushing the data to the cluster\n", "dv.push(dict(M=50,N=50, xx = xx, yy=yy, t=t, A=A, sol=sol))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# solving the problem in parallel on the cluster\n", "U_vec = numpy.zeros((grid_size,grid_size,num_steps))\n", "U_vec = dv.map_sync(vectorized,range(num_steps))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we will create different frames of the evolution of the wave. To save the video, you will need a ffmpeg encoder installed on your machine. If you do not have the encoder, you can use the FuncAnimation together with inline plotting but the quality and the framerate will be very bad and because of that we skip using inline plotting method." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[MoviePy] >>>> Building video parallel_wave_2D_v1.mp4\n", "[MoviePy] Writing video parallel_wave_2D_v1.mp4\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|███████████████████████████████████████████████████████████████████████████████▊| 480/481 [18:01<00:02, 2.31s/it]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[MoviePy] Done.\n", "[MoviePy] >>>> Video ready: parallel_wave_2D_v1.mp4 \n", "\n" ] } ], "source": [ "### visualization ###\n", "\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "from moviepy.editor import VideoClip\n", "from moviepy.video.io.bindings import mplfig_to_npimage\n", "from matplotlib.colors import LightSource\n", "\n", "# create a standard figure\n", "fig = plt.figure(figsize=(8,6), dpi=240, facecolor='white')\n", "ax = fig.add_subplot(1, 1, 1, projection='3d')\n", "\n", "# setting the parameters of the output video\n", "fps = 60\n", "duration = num_steps/fps\n", "\n", "# create the frames\n", "step = 0\n", "def make_frame(t):\n", " global step, num_steps\n", " ax.clear()\n", " # create a light source for shading\n", " ls = LightSource(azdeg=315, altdeg=45)\n", " # shade the surface\n", " rgb = ls.shade(U_vec[step], cmap=plt.cm.cool, vert_exag=1, blend_mode='hsv')\n", " # use the plot_surface and the shading \n", " surf = ax.plot_surface(xx, yy, U_vec[step], rstride=1, cstride=1,\n", " linewidth=0, antialiased=True, facecolors=rgb,\n", " alpha=0.85)\n", " ax.set_zlim3d(-0.3, 0.3)\n", " ax.set_xticks([])\n", " ax.set_yticks([])\n", " ax.set_zticks([])\n", " if step < num_steps-1:\n", " step += 1\n", " return mplfig_to_npimage(fig)\n", "\n", "# create the video\n", "animation = VideoClip(make_frame, duration = duration)\n", "animation.write_videofile(\"parallel_wave_2D.mp4\", fps=fps)" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }