
Numba fits very few usecases, but where it does fit it's awesome.

I've been using it in a python graph library to write graph traversal routines and it's done me very well: https://github.com/VHRanger/nodevectors

The best part, IMO, is the native OpenMP-style support for parallel for loops. It makes parallelism in data work very efficient compared to Python alternatives that use processes (instead of threads)



Why do you say it fits very few uses cases?


Numba can compile in “no Python” mode only for a subset of Python. E.g. class support is limited and still experimental. Also, I think string manipulation is slow, but the docs have details.

If you want to specify the types (for example for AOT, or just to make them explicit), the call signature becomes less flexible.

In short, pick any random Python library and you’ll find very few places where you can JIT-accelerate something effectively. It is aimed at numerical code.

Even for numerical code, it is more like writing C functions than say C++ (with classes etc).

But it does make accelerating vectorized code very easy. Even if you have a function that uses NumPy, it is likely you can speed it up with Numba using only a decorator.

But when it doesn’t work, it is often not very clear why until you gain some experience.


The ahead-of-time compilation output is... well... let's say difficult to package _properly_ (compare with Cython, where it's well supported and documented). That makes it useless for production, unless you want to ship giant containers with compilers etc.


In theory, a compiler toolchain is not required since Numba already comes with LLVM, i.e. for JIT compilation, no additional compiler is necessary.

In the past, that was also possible for AOT compilation [1], but that technique broke during some update and it seems like there is no one left who knows how to fix this.

[1] https://stackoverflow.com/a/42198101


Jax also has experimental support for persisting its JIT cache on the filesystem in the ‘jax.experimental.compilation_cache’ module.


How does jax compare to numba?


numba is more general. Any change to the shapes of arrays triggers a JIT recompilation in jax; numba is a bit more forgiving. jax has autodiff, which numba doesn't. Also, jax supports TPUs, which numba doesn't (yet).


What??? Numba has more usage in the AI/ML community than Cython has ever had by anyone, ever.

"Fits very few use cases" LOL okay, without numba there's no UMAP or HDBSCAN, and those are pretty popular and important libraries that come to mind just off the top of my head...

Also, claiming Cython is well documented also gets a huge LOL from me as someone who's actually written a bit of Cython.


I have written quite a bit of Cython code as well, and at least the last time I looked, Cython was much better documented than Numba (it has been a couple of years, though, so things might have improved on the Numba side). I would agree with the previous poster that it is generally quite well documented.


Also I am specifically referring to the documentation on creating ahead of time compiled packages using Numba vs Cython, but perhaps that was unclear.


FWIW, Numba's JIT caches the compiled function as long as you don't call it again with a different type signature (e.g. int32[:] vs int64[:])

I've successfully deployed numba code in an AWS Lambda, for instance -- llvmlite takes a lot of your 250 MB package budget, but once the Lambda is "warm" the JIT lag isn't an issue.

That said, if you absolutely want AOT you'll have to use Cython or some horrible hack dumping the compiled function binary.


Exactly!


You realize that scikit-learn is written mostly in Cython (where high performance is needed)? That's the most influential ML library in existence.


Also pandas


You're aware pandas and most of scipy is cython, right?

I like numba, but cython is clearly used more in the popular packages


It’s not really gonna be used in your database REST API, is it?


I assume the parent comment was talking about the context of computations where numba is supposed to be a drop-in for wherever numpy is used.

And I agree that it's not actually usable everywhere, since the support for NumPy's feature set is quite limited, especially around multidimensional arrays. I had to effectively rewrite my logic to make use of Numba. Still, it is pretty worth it IMO, given how it can add parallelism for free. Conforming to Numba's allowed subset of NumPy also usually results in simpler and more efficient code. In my case I had to work around the lack of support for multidimensional arrays, but I ended up with a more efficient solution that relies on broadcasting low-dimensional arrays, which eliminated a lot of duplicate computation.


I've had success with numba speeding up code that worked on Apache Arrow data returned by DuckDB, which might well go into a REST API



