It may not be a victory across the board, but it certainly isn't a loss across the board. That's all I'm saying.
Let me answer some of your points more specifically:
NUMA is a concern, but with some work on the scheduler to give goroutines a slight affinity for threads, it can be largely mitigated. This could be as simple as a scheduler policy like `take the first queued goroutine that previously executed on this thread, looking up to 5 into the queue, otherwise take the first one` instead of `always take the first one`. The difficulty with this strategy is that goroutines could starve, and there are a ton of other complexities, which is why the current scheduler is so simple. I believe this will get better...
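To make that policy concrete, here is a rough sketch of what I mean. This is not the real runtime scheduler: the `G` type, the `LastThread` field, and `pickNext` are all names I made up for illustration, and the run queue is just a plain slice.

```go
package main

import "fmt"

// G is a hypothetical stand-in for a queued goroutine.
// It is not the runtime's real structure.
type G struct {
	ID         int
	LastThread int // ID of the thread this goroutine last ran on
}

// lookahead is how far into the queue we search for an affine goroutine.
const lookahead = 5

// pickNext returns the goroutine the given thread should run next,
// the queue with that goroutine removed, and false if the queue was empty.
// It prefers the first goroutine among the first `lookahead` entries that
// last ran on this thread; otherwise it takes the head of the queue.
func pickNext(queue []G, thread int) (G, []G, bool) {
	if len(queue) == 0 {
		return G{}, queue, false
	}
	limit := lookahead
	if len(queue) < limit {
		limit = len(queue)
	}
	idx := 0 // default: always take the first one
	for i := 0; i < limit; i++ {
		if queue[i].LastThread == thread {
			idx = i
			break
		}
	}
	g := queue[idx]
	// Build a new slice so the caller's queue is left untouched.
	rest := make([]G, 0, len(queue)-1)
	rest = append(rest, queue[:idx]...)
	rest = append(rest, queue[idx+1:]...)
	return g, rest, true
}

func main() {
	queue := []G{{1, 0}, {2, 1}, {3, 0}}
	g, queue, _ := pickNext(queue, 1)
	fmt.Println(g.ID, len(queue)) // prints "2 2"
}
```

Note that the goroutine at the head of the queue can be skipped over and over as long as affine goroutines keep arriving behind it, which is exactly the starvation problem. A real policy would need something like a skip counter to bound that.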
Context switching isn't the issue with threaded implementations of servers. Writing a server with a thread per connection is a bad idea because of the memory requirements. 1000 connections will lead to gigabytes of memory in use.
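For a back-of-the-envelope version of that arithmetic, here is a small sketch. The per-stack sizes are assumptions on my part: 8 MiB is a common default thread stack size on Linux (and it is address space reserved, not necessarily memory actually touched), and 2 KiB stands in for a small initial goroutine stack. Plug in whatever numbers apply to your platform.

```go
package main

import "fmt"

const (
	kib = 1 << 10
	mib = 1 << 20
)

// stackBytes returns the total stack space set aside for n concurrent
// connections when each one gets its own stack of perStack bytes.
func stackBytes(n, perStack int64) int64 {
	return n * perStack
}

func main() {
	const conns = 1000

	// Assumed sizes, not measurements: 8 MiB per thread, 2 KiB per goroutine.
	threads := stackBytes(conns, 8*mib)
	goroutines := stackBytes(conns, 2*kib)

	fmt.Printf("threads: %d MiB\n", threads/mib)       // prints "threads: 8000 MiB"
	fmt.Printf("goroutines: %d KiB\n", goroutines/kib) // prints "goroutines: 2000 KiB"
}
```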
I don't think Go's performance envelope has shrunk or become less predictable. I think what we've given up is control over which threads execute which goroutines, which is essentially the NUMA argument above. This will hopefully get better with time.
Alas, it is, because there's no communication between the kernel scheduler and the user-space scheduler. The resulting interaction has non-intuitive results.