First, if you have not read our previous post that used the Wolfram Model as a test, you might want to read that page.

In an effort to further explore the benefits of Numba we decided to use a new code that implements floating point operations. Here, we will compare the speeds of Numba, Python, and clever implementations of NumPy. We mostly followed the Julia set example from the book High Performance Python: Practical Performant Programming for Humans.

The Julia set algorithm:

  • Create a 2D array with real numbers on the x-axis and imaginary numbers on the y-axis
  • Iterate through the array and apply a function to each location. We used the function z2 + c where z was the complex number corresponding to the location in the array, and c is a complex constant.
  • Keep doing this until a maximum number of iterations are reached, or the value of a location in the array gets too large.
  • Plotting the absolute value of the array makes plots like those below.

This slideshow requires JavaScript.

Below, we include and compare four versions of the code:

  • the raw Python
  • NumPy
  • NumPy with Numba’s jit
  • Numba with njit, parallel=True

The raw Python takes a naive approach of iterating through an array and individually checking and calculating each location. The NumPy version uses NumPy operations to do this much more quickly.

Raw Python Code

Note that this uses loops.


import time #timing
import numpy as np #arrays
import numba as nb #speed
x_dim = 1000
y_dim = 1000
x_min = 1.8
x_max = 1.8
y_min = 1.8j
y_max = 1.8j
z = np.zeros((y_dim,x_dim),dtype='complex128')
for l in range(y_dim):
z[l] = np.linspace(x_min,x_max,x_dim) np.linspace(y_min,y_max,y_dim)[l]
def julia_raw_python(c,z):
it = 0
max_iter = 100
while(it < max_iter):
#These will be expensive!
for y in range(y_dim):
for x in range(x_dim):
if abs(z[y][x]) < 10: #Runaway condition
z[y][x] = z[y][x]**2 + c #square and add c everywhere
it += 1
return z
start = time.perf_counter()
julia_raw_python(.4+.6j,z) #arbitrary choice of c
end = time.perf_counter()
print(endstart)

view raw

Julia-Python.py

hosted with ❤ by GitHub

Raw NumPy Code

Now, we use “clever” NumPy, rather than loops.


import time #timing
import numpy as np #arrays
import numba as nb #speed
x_dim = 1000
y_dim = 1000
x_min = 1.8
x_max = 1.8
y_min = 1.8j
y_max = 1.8j
z = np.zeros((y_dim,x_dim),dtype='complex128')
for l in range(y_dim):
z[l] = np.linspace(x_min,x_max,x_dim) np.linspace(y_min,y_max,y_dim)[l]
def julia_numpy(c,z):
it = 0
max_iter = 100
while(it < max_iter):
z[np.absolute(z) < 10] = z[np.absolute(z) < 10]**2 + c #the logic in [] replaces our if statement. This line
it += 1 #updates the whole matrix at once, no need for loops!
return z
start = time.perf_counter()
julia_numpy(.4+.6j,z) #arbitrary choice of c
end = time.perf_counter()
print(endstart)

view raw

Julia-Numpy.py

hosted with ❤ by GitHub

Python with njit and prange

Back to the loops, but adding Numba.


import time
import numpy as np
from numba import njit, prange
x_dim = 1000
y_dim = 1000
x_min = 1.8
x_max = 1.8
y_min = 1.8j
y_max = 1.8j
z = np.zeros((y_dim,x_dim),dtype='complex128')
for l in range(y_dim):
z[l] = np.linspace(x_min,x_max,x_dim) np.linspace(y_min,y_max,y_dim)[l]
@njit(parallel = True)
def julia_raw_python_numba(c,z):
it = 0
max_iter = 100
while(it < max_iter):
#These will be expensive!
for y in prange(y_dim): #special loops for Numba! range -> prange
for x in prange(x_dim):
if abs(z[y][x]) < 10: #Runaway condition
z[y][x] = z[y][x]**2 + c #square and add c everywhere
it += 1
return z
start = time.perf_counter()
julia_raw_python_numba(.4+.6j,z)
end = time.perf_counter()
print(endstart)

Numpy with jit

Copy of “clever” NumPy, with Numba (jit).


import time
import numpy as np
from numba import jit
x_dim = 1000
y_dim = 1000
x_min = 1.8
x_max = 1.8
y_min = 1.8j
y_max = 1.8j
z = np.zeros((y_dim,x_dim),dtype='complex128')
for l in range(y_dim):
z[l] = np.linspace(x_min,x_max,x_dim) np.linspace(y_min,y_max,y_dim)[l]
@jit()
def julia_numpy_numba(c,z):
it = 0
max_iter = 100
while(it < max_iter):
z[np.absolute(z) < 10] = z[np.absolute(z) < 10]**2 + c #the logic in [] replaces our if statement. This line
it += 1 #updates the whole matrix at once, no need for loops!
return z
start = time.perf_counter()
julia_numpy_numba(.4+.6j,z)
end = time.perf_counter()
print(endstart)

We ran these codes on a desktop with:

  • Xeon® Processor E5-1660 v4 (20M Cache, 3.2-3.6 GHz) 8C/16T 140W
  • Linux CentOS
  • 4*32GB 2Rx4 4G x 72-Bit PC4-2400 CL17 Registered w/Parity 288-Pin DIMM (128GB Total)
  • 2*GeForce GTX 1080 Ti Founders Edition (PNY) 11GB GDDR5X – 960GB PM863a SATA 6Gb/s 2.5″ SSD, 1,366 TBW ( OS and Scratch ) 1.92TB PM863a SATA 6Gb/s 2.5″ SSD, 2,773 TBW
  • RAVE Proprietary Liquid Cooling Kit

The times used in the graph below are the minimum times each code took for 100 trials to run with varying array sizes. (Size is the edge length of the Julia set.)

Unknown-21

As you can see, using NumPy alone can speed up the Julia set calculation by a little over an order of magnitude; applying Numba to NumPy had no effect (as expected).  Numba gave speeds 10x faster than the NumPy, illustrating the advantage Numba brings.

Credits:

Data:

All of the data produced can be found on David’s GitHub: https://github.com/DavidButts/Julia-Timing-Data