Why python recursions are too slow?
Benchmarking
I am a big fan of micro benchmarking. In this article, we will compare the performance of tail end recursive
Fibonacci(40) in C, Go, Python3 and javascript.
Note - A language isn't fast or slow, it's the compiler/interpreter which is responsible for the performance of a language. So for ease of understanding, we will be considering the most popular engines used for compiling/interpreting a particular language.
Let's jump right in
Python
import time
import sys
def fib(n):
if n <= 1:
return n
else:
return fib(n-1) + fib(n-2)
if __name__ == "__main__":
number = int(sys.argv[1])
start_time = time.time()
fib(number)
time_taken = time.time() - start_time
print(f"Time taken = {time_taken}s")
27 seconds!
Damn that is quite some time for calculating fibonacci(40)
, but we don't have the execution time of other languages to compare. So let's first find out how much time would a compiled program like C and Go take.
Go
package main
import (
"fmt"
"os"
"strconv"
"time"
)
func Fibo(n int) int {
if n <= 1 {
return n
}
return Fibo(n-1) + Fibo(n-2)
}
func main() {
number := os.Args[1]
i, _ := strconv.Atoi(number)
start := time.Now()
Fibo(i)
fmt.Println("Time take -", time.Since(start))
}
C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int fibonacci(int n)
{
if (n == 0 || n == 1)
return n;
else
return (fibonacci(n - 1) + fibonacci(n - 2));
}
int f(int);
int main(int argc, char *argv[])
{
int n, res;
double time_spent = 0.0;
n = atoi(argv[1]);
clock_t begin = clock();
res = fibonacci(n);
clock_t end = clock();
time_spent += (double)(end - begin) / CLOCKS_PER_SEC;
printf("The elapsed time is %f seconds", time_spent);
return 0;
}
So compiled languages like golang
and C
evaluates Fibonacci(40) in about 500-700ms
. We cannot expect an interpreted language to be as fast as compiled one. So let's take another interpreted language(javascript
) and find the execution time for Fibonacci(40)
Javascript
function fib(n) {
if (n <= 1) {
return n;
} else {
return fib(n-1) + fib(n-2);
}
}
number = process.argv[2];
console.time('fibonacci');
fib(number);
console.timeEnd('fibonacci');
Well as a python
developer, this made me cry. How come javascript
is able to evaluate the same thing in under 1s.
Here is the summary of runtime comparison for 4 languages.
Language | Runtime (milliseconds) |
C | 660 |
Go | 590 |
Javascript | 970 |
Python | 27200 |
First of all, I am surprised that Go
is performing better than C
. But that's irrelevant to our current discussion so let's forget C
and Go
for now and focus on comparing python
with javascript
.
Both of them are interpreted language, both are dynamically typed so the only thing that could make the difference is the compiler/interpreter. Javascript
is executed on V8
engine and python is interpreted with CPython
compiler. The major difference between V8
and CPython
is that V8 has a JIT compiler
whereas CPython
doesn't.
A quick refresher of JIT Compiler.
JIT
(Just in time) compilers are a way of improving the performance of an interpreted programs. JIT
compiler compiles code during execution rather than prior to execution. It determine the frequently executed code and compiles them to machine code during runtime.
CPython
executes a code in 2 steps. In step 1, It converts python code into bytecode and in step 2, it interprets the bytecode line by line and converts that to machine code.
V8
compiles javascript directly into machine code. So if we could compare javascript with python without the JIT compiler, Python would be faster. But this is not the case, for certain types of programs like recursions, Python would be slower because it doesn't pre-compile for future use.
Hence in our tail end recursive Fibonacci function, the bytecode is interpreted line by line 331160281
times which hampers the python performance. Whereas the JIT compiler in V8 compiles the recursive function into machine code as soon as it realizes that is getting called frequently. Hence translation time from javascript code to machine code is saved almost 331160281
time which makes javascript faster.
Why CPython doesn't have JIT compiler?
The answer to this is simple. CPython doesn't have enough resources or money compared to V8. You might already know about pypy which is an alternate implementation of python to Cpython. Pypy
has a JIT compiler so is comparatively faster.
Let's see how much time a pypy compiler would take for calculating a fibonacci(40) using recursion
Clearly pypy is much faster than CPython so why don't we use it ?
Because pypy doesn't have support to many common libraries like numpy, pandas, scipy etc.
So what is a workaround to make recursions faster with CPython?
Now since we know that CPython is not an intelligent compiler. We have to make sure that our program is optimized the best way possible. For this particular use case, we could either have a memoized algorithm or you could use function caching.
What is function caching?
Python has this cool feature of letting us cache the return value of a function based on the arguments passed - read more
Let's check the calculation time with function caching for fib(40)
import time
import sys
from functools import lru_cache
@lru_cache(maxsize=50)
def fib(n):
if n <= 1:
return n
else:
return fib(n-1) + fib(n-2)
if __name__ == "__main__":
number = int(sys.argv[1])
pro_time = time.process_time()
fib(number)
print(f"Time taken = {pro_time}s")
Conclusion
Writing programs in python requires more performance optimization at the code level since CPython interpreter won't optimize it much for you.
Subscribe to my newsletter
Read articles from Ajitesh Singh directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by