## Wednesday, June 13, 2018

### Profile your code in Jupyter notebook/lab

In the previous post, we discussed using a profiler to find out where your code is slow, but that approach had to be run from the command line. Today, we will look at how to profile code in a Jupyter notebook. Note that if you haven’t installed ‘line_profiler’, install it first:

    pip install line_profiler
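
If you are not sure whether the package is already present, a quick check from Python can save a reinstall. The following is only a minimal sketch: `is_installed` is a hypothetical helper name, and it relies on the standard library's `importlib.util.find_spec`, which looks a module up without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("line_profiler"):
    print("line_profiler is available")
else:
    print("line_profiler is missing; run: pip install line_profiler")
```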

Let’s first define some functions that calculate something arbitrary. There are three functions, each calling the next:

    def square_the_value(x, y):

        a = add_1000_times(x, y)

        return a**2

    def add_1000_times(x, y):
        z = 0
        for i in range(1000):
            z += x
            for j in range(1000):
                z += y

        return z

    def calculate_my_value(x, y):

        a = x + y
        b = x - y

        print(square_the_value(a, b))

    calculate_my_value(1, 2)

994009000000
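
The printed value can be checked by hand, which also shows where the work is hiding. The sketch below assumes the inner loop is nested inside the outer one (the printed result is only consistent with that reading), so `x` is added 1,000 times and `y` is added 1,000 × 1,000 times. `add_1000_times_closed_form` is a hypothetical name introduced here for the comparison.

```python
def add_1000_times(x, y):
    # Loop version as in the post; the nesting of the inner loop is assumed.
    z = 0
    for i in range(1000):
        z += x
        for j in range(1000):
            z += y
    return z


def add_1000_times_closed_form(x, y):
    # x is added 1000 times, y is added 1000 * 1000 times.
    return 1000 * x + 1000 * 1000 * y


a, b = 1 + 2, 1 - 2  # the values that calculate_my_value(1, 2) passes on
assert add_1000_times(a, b) == add_1000_times_closed_form(a, b) == -997000
print(add_1000_times_closed_form(a, b) ** 2)  # prints 994009000000
```

The one million additions in the inner loop are what the profiler should end up pointing at.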

Now we want to find out which parts of the code run fast and which run slowly. We can use line_profiler to do the job. First, we need to load the extension:

    %load_ext line_profiler

Let’s profile the top-level function that we run. We use ‘%lprun’, which runs line_profiler; the ‘-f’ flag tells it which function or method to profile, and calculate_my_value(1, 2) is the statement we actually want to execute:

    %lprun -f calculate_my_value calculate_my_value(1, 2)

994009000000

Timer unit: 1e-06 s

Total time: 0.295409 s
Function: calculate_my_value at line 16

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
16                                           def calculate_my_value(x, y):
17
18         1          3.0      3.0      0.0      a = x + y
19         1          1.0      1.0      0.0      b = x - y
20
21         1     295405.0 295405.0    100.0      print(square_the_value(a, b))

Now we can see that line_profiler gives us the time spent on each line and the percentage of the total time that line accounts for. The last line used almost all of the time. We can continue by profiling what that last line calls, the square_the_value function:

    %lprun -f square_the_value calculate_my_value(1, 2)

994009000000

Timer unit: 1e-06 s

Total time: 0.39605 s
Function: square_the_value at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
1                                           def square_the_value(x, y):
2
3         1     396048.0 396048.0    100.0      a = add_1000_times(x, y)
4
5         1          2.0      2.0      0.0      return a**2
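
The same drill-down can also be done outside the notebook magic, using line_profiler's Python API. This is a rough sketch rather than the workflow used in this post: it assumes `LineProfiler` with its `add_function`, `runcall`, and `print_stats` methods, registers two of the functions at once, and calls `square_the_value(3, -1)` directly instead of going through `calculate_my_value`. `profile_chain` is a hypothetical helper, and the import is guarded so the code still runs when line_profiler is not installed.

```python
import io

try:
    from line_profiler import LineProfiler
except ImportError:  # line_profiler is optional for this sketch
    LineProfiler = None


def add_1000_times(x, y):
    z = 0
    for i in range(1000):
        z += x
        for j in range(1000):
            z += y
    return z


def square_the_value(x, y):
    a = add_1000_times(x, y)
    return a**2


def profile_chain(x, y):
    """Hypothetical helper: return (result, report text or None)."""
    if LineProfiler is None:
        return square_the_value(x, y), None
    profiler = LineProfiler()
    # Register every function whose lines should be timed.
    profiler.add_function(add_1000_times)
    profiler.add_function(square_the_value)
    result = profiler.runcall(square_the_value, x, y)
    buffer = io.StringIO()
    profiler.print_stats(stream=buffer)
    return result, buffer.getvalue()


result, report = profile_chain(3, -1)
print(result)
if report is not None:
    print(report)
```

Registering several functions in one run avoids repeating the statement once per function, at the cost of a longer report.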

Similarly, we can profile the add_1000_times function to figure out which line really takes all the time:

    %lprun -f add_1000_times calculate_my_value(1, 2)

994009000000

Timer unit: 1e-06 s

Total time: 0.829793 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================