I don't even write Python, really, but I've been interfacing with llama.cpp and whisper.cpp through it recently because it's been the most convenient option. Before that I was using Node.js libraries that just wrap those two C++ libs.
I guess since these models are meant to be run "client side" or "at the edge" or whatever you want to call it, it helps if they can be used from just about any wrapper. Using them from JavaScript instead of Python is sort of huge for moving ML off the server and into the client.
I hadn't really dipped my toes into the space until llama.cpp and whisper.cpp came along, because they dropped the barrier extremely low. The documentation is simple and excellent. The tools they've enabled on top, like text-generation-webui, are next-level easy.
git clone. make. download model. run.
That's it.
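For llama.cpp, those four steps look roughly like this (the repo URL is real; the model filename is just a placeholder, any GGUF-format model works, and whisper.cpp follows the same pattern):

```shell
# 1. git clone
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# 2. make
make

# 3. download model
# Grab any quantized GGUF model and drop it under models/,
# e.g. models/llama-2-7b.Q4_K_M.gguf (placeholder name).

# 4. run
./main -m models/llama-2-7b.Q4_K_M.gguf -p "Hello"
```

No CUDA toolchain, no Python environment, no dependency hell: a C++ compiler and a model file are enough to get tokens streaming.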