Don't know why the downvotes; this limitation of Python is really hard to fathom. IIRC they had a project to remove it and it went into the weeds somehow (I think it made performance worse?)
Python has GIL -> Python apps are mostly single-threaded -> Single-threaded performance is important -> Adding granular locking has impact on single-thread performance -> CPython "isn't supposed to have perf regressions" -> Python has GIL
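The "threads don't buy single-thread-style parallelism" step in that chain is easy to demonstrate. A minimal sketch (pure stdlib; the function names are just illustrative): with the GIL, several threads running CPU-bound Python bytecode take turns holding the interpreter lock, so they produce correct results but no parallel speedup.

```python
import threading

def burn(n, out, i):
    # CPU-bound work: sum of squares. Only one thread at a time can
    # execute this Python bytecode while it holds the GIL.
    total = 0
    for k in range(n):
        total += k * k
    out[i] = total

def run_threaded(n, workers):
    # Spread the same CPU-bound task across several threads. The answer
    # is right, but on CPython the threads serialize on the GIL.
    out = [0] * workers
    threads = [threading.Thread(target=burn, args=(n, out, i))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(out)
```

Timing `run_threaded` against a single-threaded loop shows essentially no speedup for CPU-bound work, which is exactly why fine-grained locking (and its single-thread cost) keeps coming up.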
Because e.g. nodejs has a GIL, too, and apparently no one thinks this is a problem.
For web applications one usually has a software chain like web server <-> wsgi server <-> dozens of python instances.
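For example, a hypothetical deployment of that chain (gunicorn as the WSGI server, the worker count, and the `myapp:app` module path are all assumptions, not something from the thread):

```shell
# Several CPython worker processes behind one WSGI server; the web
# server (e.g. Nginx) proxies to this, and each worker is a separate
# interpreter with its own GIL.
gunicorn --workers 4 --bind 127.0.0.1:8000 myapp:app
```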
Standalone programs can just use threading, which is also fairly easy (as far as threading itself can be easy).
Scientific libraries like scipy can parallelize automatically in the background (via multithreaded native libraries like BLAS), as long as the data is modelled correctly.
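To illustrate that last point, a small sketch (assumes NumPy is installed; the function name is illustrative): a single array operation is dispatched to compiled BLAS code, which can use several cores internally and releases the GIL while it runs.

```python
import numpy as np

def matmul_demo(n):
    # One Python-level call; the heavy loop runs inside the BLAS
    # library, which may fan out across CPU cores on its own.
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    return a @ b

c = matmul_demo(300)
```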
I'm just going to repost what I've written about this before:
The current state of threading and parallel processing in Python is a joke. While they are still clinging to the GIL and single-core performance, the rest of the world is moving to 32 core (consumer) CPUs.
Python's performance, in general, is crappy[1] and is beaten even by PHP these days. All the people who suggest relying on multiprocessing probably haven't done anything CPU- and memory-intensive, because if you have code that operates on a "world state", each new process has to copy that from the parent. If the state takes ~10GB, each process multiplies that.
Others keep suggesting Cython. Well, guess what? If I am required to use another programming language to use threads, I might as well go with Go/Rust/Java instead and save the trouble of dabbling with two languages.
So where does that leave (pure-)Python? It can only be used in I/O bound applications where the performance of the VM itself doesn't matter. So it's basically only used by web/desktop applications that CRUD the databases.
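For what it's worth, the I/O-bound case is the one where Python threads genuinely help today. A minimal sketch (stdlib only; `time.sleep` stands in for a blocking network call, and the names are illustrative): blocking I/O releases the GIL, so the waits overlap.

```python
import threading
import time

def fetch(results, i):
    # Stand-in for a blocking I/O call. While sleeping/waiting, the
    # thread releases the GIL, so other threads can proceed.
    time.sleep(0.2)
    results[i] = i

def run(workers):
    results = [None] * workers
    start = time.perf_counter()
    threads = [threading.Thread(target=fetch, args=(results, i))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Four 0.2s waits overlap, so total wall time stays near 0.2s,
    # not 0.8s.
    return results, time.perf_counter() - start
```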
It's really amazing that the machine learning community has managed to hack around that with C-based libraries like SciPy and NumPy. However, my suggestion would be to drop the GIL and copy whatever model has been working for Go/Java/C#. If you can't drop the GIL because some esoteric features depend on it, then drop them as well.
Every single project which has tried to drop the GIL has failed in some way. It's not some "esoteric features", it's fundamentally a hard problem that implicates the entirety of the python object model, python C api, scoping, imports, and GC.
I think multiple interpreters (subinterpreters) are the way to go, but that would still require a framework for ensuring safe memory access.
Speaking of Go, I always thought it would be neat to write a Python implementation in Go, leveraging Go's GC, and implement the 'go' keyword/function for easy parallelism. But you still have the problem of scoping and memory safety. Or a similar idea but with Rust. Something tells me that isn't a trivial undertaking, especially if you want all the libraries, which are 75% of the point of Python.
> All the people that suggest relying on multiprocessing probably haven't done anything that's CPU and Memory intensive because if you have a code that operates on a "world-state" each new process will have to copy that from a parent. If the state takes ~10GB each process will multiply that.
This is wrong; there are multiple ways for Python processes to work on shared data without copying it.
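One such way, sketched with the stdlib's `multiprocessing.shared_memory` (Python 3.8+; the function names and the value 42 are just illustrative): the child attaches to the same memory block by name instead of duplicating the parent's state.

```python
from multiprocessing import Process, shared_memory

def child(name):
    # Attach to the parent's block by name -- no copy of the state.
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[0] = 42  # mutate the shared buffer in place
    shm.close()

def demo():
    shm = shared_memory.SharedMemory(create=True, size=16)
    try:
        p = Process(target=child, args=(shm.name,))
        p.start()
        p.join()
        return shm.buf[0]  # written by the child process
    finally:
        shm.close()
        shm.unlink()
```

For large numeric state, the usual pattern is to wrap such a buffer in a NumPy array so many processes index into one copy of the data.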
> It's really amazing that the machine learning community has managed to hack around that with C-based libraries like SciPy and NumPy.
Well, the main implementation of the whole language is C-based. I can't see how that implies hackiness.
> If you can't drop GIL because some esoteric features depend on that, then drop them as well.
There have been Python implementations without a GIL available for well over 10 years, for example IronPython and Jython. Yet these never went mainstream, which strongly implies the GIL actually isn't that much of a problem in the real world.
There are (IMHO) two scenarios where this doesn't matter: if you're writing a web service, the multithreading (or multiprocessing) is handled by the front end, i.e. Nginx; and for 'scientific' or similarly computationally expensive work, Python is turning into a scripting layer over C libraries. Have a look at numpy to see what I mean.
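The "scripting layer over C" pattern in one line (assumes NumPy is installed; the function name is illustrative): Python describes the computation, and the hot loop runs inside NumPy's compiled code.

```python
import numpy as np

def sum_squares_np(n):
    # One C-level pass over the data; the equivalent pure-Python
    # generator expression sum(k * k for k in range(n)) is far slower.
    x = np.arange(n, dtype=np.float64)
    return float(np.sum(x * x))
```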