One simple trick to speed up Python by 10x to 100x (Codon)
https://docs.exaloop.io/codon/
Codon compiles "Python" (really Codon, a Python-like language) to native code. You can compile a whole program, or keep your Python codebase and have selected functions compiled natively by the Codon JIT. It also supports OpenMP and GPUs.
Syntax (same as Python, with additions): https://docs.exaloop.io/codon/language/basics
The golden rule seems to be that you can compile your .py file directly. The exception is very large codebases, where you can use the Codon JIT on individual functions instead.
1. JIT
    import codon
    from time import time

    def is_prime_python(n):
        if n <= 1:
            return False
        for i in range(2, n):
            if n % i == 0:
                return False
        return True

    @codon.jit
    def is_prime_codon(n):
        if n <= 1:
            return False
        for i in range(2, n):
            if n % i == 0:
                return False
        return True

    t0 = time()
    ans = sum(1 for i in range(100000, 200000) if is_prime_python(i))
    t1 = time()
    print(f'[python] {ans} | took {t1 - t0} seconds')

    t0 = time()
    ans = sum(1 for i in range(100000, 200000) if is_prime_codon(i))
    t1 = time()
    print(f'[codon] {ans} | took {t1 - t0} seconds')
    [python] 8392 | took 39.6610209941864 seconds
    [codon] 8392 | took 0.998633861541748 seconds
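The benchmark above calls the jitted function once per candidate number. Below is a minimal sketch of a variant, with two assumptions that are not in the original. First, trial division stops at the square root of n. Second, the whole counting loop sits inside a single jitted function, which likely reduces the overhead of crossing between Python and compiled code. The names `jit` and `count_primes` and the `try/except` fallback are illustrative. The fallback only lets the same file run under plain CPython when the `codon` package is not installed.

```python
from time import time

try:
    import codon
    jit = codon.jit          # use the Codon JIT when it is available
except ImportError:
    def jit(fn):             # assumption: no-op fallback so plain CPython still runs this file
        return fn

@jit
def count_primes(lo, hi):
    """Count primes in [lo, hi) by trial division up to sqrt(n)."""
    count = 0
    for n in range(max(lo, 2), hi):
        is_prime = True
        i = 2
        while i * i <= n:    # no divisor above sqrt(n) needs checking
            if n % i == 0:
                is_prime = False
                break
            i += 1
        if is_prime:
            count += 1
    return count

t0 = time()
ans = count_primes(100000, 200000)
t1 = time()
print(f'{ans} primes | took {t1 - t0} seconds')
```

The count should match the 8392 reported above. The timing will depend on the machine and on whether the JIT is actually in use.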
2. OpenMP
    @par
    for i in range(10):
        import threading as thr
        print('hello from thread', thr.get_ident())
3. GPU
Only Nvidia devices are supported.
    import gpu

    @gpu.kernel
    def hello(a, b, c):
        i = gpu.thread.x
        c[i] = a[i] + b[i]

    a = [i for i in range(16)]
    b = [2*i for i in range(16)]
    c = [0 for _ in range(16)]

    hello(a, b, c, grid=1, block=16)
    print(c)
This code is equivalent to the following, simpler version:
    a = [i for i in range(16)]
    b = [2*i for i in range(16)]
    c = [0 for _ in range(16)]

    @par(gpu=True)
    for i in range(16):
        c[i] = a[i] + b[i]

    print(c)
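Neither GPU snippet shows its output. As a sanity check, here is a plain-Python sketch of the same element-wise addition, with no GPU or Codon required. `hello_reference` is a hypothetical name, not part of Codon. The loop variable stands in for `gpu.thread.x`, so each iteration does the work of one GPU thread.

```python
def hello_reference(a, b, c, block):
    """CPU stand-in for the kernel launch: one iteration per GPU thread."""
    for i in range(block):   # i plays the role of gpu.thread.x
        c[i] = a[i] + b[i]

a = [i for i in range(16)]
b = [2 * i for i in range(16)]
c = [0 for _ in range(16)]

hello_reference(a, b, c, block=16)
print(c)  # each entry is a[i] + b[i] = 3 * i
```

Since a[i] = i and b[i] = 2i, every entry of c is 3i, which gives a quick way to check the output of either GPU version.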