"Modeling a web server as a single process per request, the supervision model, a...

toast0 · on Feb 20, 2019

I think you're right that this will flip back and forth, but the key difference in my mind between the process-per-request model of Apache and friends and the process-per-connection model of Erlang is that in Erlang, I can do a million connections/processes per machine, and that would be very unfeasible with Apache.

Both approaches _do_ give me a very straightforward programming environment for isolated processes, although the isolation guarantees are weaker in Erlang. I'd like to think it's easier to break the isolation for cross-process communication with Erlang, but that's probably debatable.

In my mind, the Erlang model is validated by the Apache model, but it adds scale in a way that doesn't require a mental flip to event-driven programming (although BEAM itself is certainly handling IO through event loops with kqueue or epoll or what have you underneath).
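To illustrate the shape being described (straight-line code per connection, with an event loop doing the IO underneath), here is a rough sketch in Python's `asyncio`. This is only an analogy: coroutines are not isolated Erlang processes, and the names `handle_connection` and `echo_once` are made up for the example.

```python
import asyncio


async def handle_connection(reader, writer):
    # One coroutine per connection. The body reads like blocking code,
    # but every await hands control back to the event loop (epoll/kqueue).
    try:
        while True:
            line = await reader.readline()
            if not line:  # peer closed the connection
                break
            writer.write(line)  # echo the line back
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def echo_once(message: bytes) -> bytes:
    # Start a server on an ephemeral loopback port, send one line, read the echo.
    server = await asyncio.start_server(handle_connection, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(message)
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply


if __name__ == "__main__":
    print(asyncio.run(echo_once(b"ping\n")))
```

The handler never mentions callbacks or readiness events, which is the "no mental flip" part; what it lacks compared to Erlang is per-connection isolation and supervision.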

nostrademons · on Feb 21, 2019

"I can do a million connections/processes per machine, and that would be very unfeasible with Apache."

It's somewhat less infeasible now than it was in the early 2000s. The main barriers to C1M with an OS process per connection are:

1. Stack size. With 8 MB stacks, 1M processes would take up 8 TB of RAM.

2. Process creation overhead - loading the executable into memory, setting up global context, opening sockets.

3. Context-switching overhead: swapping page tables, TLB flushes, saving registers, etc.
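The stack-size figure in item 1 is simple multiplication, sketched below. It assumes "1M" means 2**20 processes and that every stack is fully backed by RAM; the 4 KiB row is included only for comparison, and `total_stack_bytes` is a made-up helper name.

```python
KIB = 2**10
MIB = 2**20
GIB = 2**30


def total_stack_bytes(stack_bytes: int, processes: int) -> int:
    # Worst case: every process has its whole stack resident in RAM.
    return stack_bytes * processes


if __name__ == "__main__":
    n = 2**20  # "1M" processes, taken here as 2**20 (an assumption)
    for label, size in [("8 MiB", 8 * MIB), ("4 KiB", 4 * KIB)]:
        total = total_stack_bytes(size, n)
        print(f"{label} stacks x {n} processes = {total // GIB} GiB")
```

Under those assumptions, 8 MiB stacks come to 8192 GiB (8 TiB), while 4 KiB stacks come to 4 GiB.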

For #1, recent versions of Linux will happily let you create threads or processes with 4K stacks now. They also don't actually allocate the memory for the whole stack up front; they just map pages, and the page fault is what assigns a physical page to a virtual address, so if you never touch a memory location it doesn't exist in RAM.

For #2, new processes get COWed from their parent and can inherit file descriptors as well, so all the read-only data (executables, static data, mmapped files, etc.) is essentially free.

#3 is a legitimate reason why language-based solutions are faster (they don't have to flush the whole TLB on context switch, and they know exactly which registers they're using), but it mostly affects speed rather than the number of concurrent connections.
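The lazy-mapping point for #1 can be observed from user space. The sketch below (Unix only) maps a large anonymous region, writes to a handful of pages, and reports how much the process's peak resident set grew. The 256 MiB size, the four touched pages, and the helper names are arbitrary choices for the example, and peak RSS is only a coarse proxy for physical memory use.

```python
import mmap
import resource
import sys

PAGE = mmap.PAGESIZE


def peak_rss_bytes() -> int:
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak if sys.platform == "darwin" else peak * 1024


def map_and_touch(size: int, pages_to_touch: int):
    """Map `size` bytes anonymously, write to a few pages, and return
    (mapped_bytes, growth_in_peak_rss_bytes)."""
    before = peak_rss_bytes()
    region = mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        # Each write faults in one page; untouched pages get no physical memory.
        for i in range(pages_to_touch):
            region[i * PAGE] = 1
        after = peak_rss_bytes()
    finally:
        region.close()
    return size, after - before


if __name__ == "__main__":
    mapped, grew = map_and_touch(256 * 2**20, 4)
    print(f"mapped {mapped} bytes, peak RSS grew by about {grew} bytes")
```

The growth should be a tiny fraction of the mapped size, which is the same reason a mostly idle stack costs far less RAM than its nominal limit.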

Scarbutt · on Feb 21, 2019

"in Erlang, I can do a million connections/processes per machine, and that would be very unfeasible with Apache."

That's a very niche use case, even more so in the context of serving HTTP requests, where the JVM, Go, C#, Rust, and even Node.js will smoke Erlang because it can't compete on raw performance.

lliamander · on Feb 20, 2019

One reason I occasionally look in on DragonflyBSD is that its implementation of lightweight kernel threads seems like a compelling approach to addressing some of those trade-offs.