One simple trick to speed up Python by 10x to 100x (Codon)
https://docs.exaloop.io/codon/
You can compile "Python" (Codon) to native code, or have selected Python (Codon) functions inside an existing Python codebase compiled to native code by the Codon JIT. Codon also supports OpenMP-style multithreading and GPUs.
Syntax (same as Python, with additions): https://docs.exaloop.io/codon/language/basics
The golden rule seems to be that you can compile your .py file directly. The exception is very large codebases, where you can instead apply the Codon JIT to individual functions.
1. JIT
import codon
from time import time

def is_prime_python(n):
    if n <= 1:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

@codon.jit
def is_prime_codon(n):
    if n <= 1:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

t0 = time()
ans = sum(1 for i in range(100000, 200000) if is_prime_python(i))
t1 = time()
print(f'[python] {ans} | took {t1 - t0} seconds')

t0 = time()
ans = sum(1 for i in range(100000, 200000) if is_prime_codon(i))
t1 = time()
print(f'[codon] {ans} | took {t1 - t0} seconds')
Output:
[python] 8392 | took 39.6610209941864 seconds
[codon] 8392 | took 0.998633861541748 seconds
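One thing to keep in mind when reading timings like these: a JIT-compiled function is compiled on its first call, so a measurement that includes that first call also includes compilation time. The sketch below is one way to separate the two, with a warm-up call before the timer starts. It is only an illustration: the fallback no-op `jit` decorator and the helper `count_primes` are assumptions added here so the script still runs where Codon is not installed, and the range is shortened to keep it quick.

```python
from time import perf_counter

try:
    import codon            # Codon's Python JIT package, if it is installed
    jit = codon.jit
except Exception:
    # Assumption for this sketch only: without Codon, fall back to a
    # no-op decorator so the script still runs as plain Python.
    def jit(fn):
        return fn

@jit
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

def count_primes(lo, hi):
    # Hypothetical helper: counts primes in [lo, hi)
    return sum(1 for i in range(lo, hi) if is_prime(i))

is_prime(2)  # warm-up call, so any JIT compilation happens before timing

t0 = perf_counter()
ans = count_primes(1000, 2000)
t1 = perf_counter()
print(f'{ans} primes | took {t1 - t0:.4f} seconds')
```

`perf_counter` is used instead of `time` because it is the standard-library clock intended for measuring short intervals.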
2. OpenMP
@par
for i in range(10):
    import threading as thr
    print('hello from thread', thr.get_ident())
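For comparison, here is a rough standard-CPython analogue of the same fan-out, using a thread pool from `concurrent.futures`. This is not what Codon does internally; it only mirrors the structure of "run each loop iteration on some worker thread". The function name `work` and the pool size of 4 are arbitrary choices for the sketch. Note that CPython threads are limited by the GIL for CPU-bound work, so this shows the shape of the loop rather than the speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def work(i):
    # Body of the loop: report which thread handled iteration i
    return i, threading.get_ident()

# Distribute the 10 iterations over a small pool of worker threads
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(10)))

for i, tid in results:
    print('iteration', i, 'ran on thread', tid)
```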
3. GPU
Only NVIDIA devices are supported.
import gpu

@gpu.kernel
def hello(a, b, c):
    i = gpu.thread.x
    c[i] = a[i] + b[i]

a = [i for i in range(16)]
b = [2*i for i in range(16)]
c = [0 for _ in range(16)]

hello(a, b, c, grid=1, block=16)
print(c)
The same computation can be written more simply with @par:
a = [i for i in range(16)]
b = [2*i for i in range(16)]
c = [0 for _ in range(16)]

@par(gpu=True)
for i in range(16):
    c[i] = a[i] + b[i]

print(c)
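To check the GPU result without a GPU, the same element-wise sum can be computed on the CPU in plain Python. This is just a reference computation, not part of Codon; the name `expected` is made up for the sketch.

```python
a = [i for i in range(16)]
b = [2*i for i in range(16)]

# CPU reference: element-wise sum, one entry per GPU thread index
expected = [x + y for x, y in zip(a, b)]
print(expected)
```

Since a[i] + b[i] = i + 2*i, every entry of the printed list should be 3*i, which is what the GPU versions should print for c.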