fork is a massive feature, not a bug.



fork() is a misfeature, as is SIGCHLD/wait and most of Unix process management. It worked fine on the PDP-11 and that's it.

But Linux also overcommits mmap-anonymous/sbrk, while Windows leaves the decision to the user space, which is significantly slower.


Not really. It elegantly solves the "create a process, letting it inherit these settings and reset these other settings" problem, where "settings" is an ever-changing and expanding list of things that you wouldn't want to bake into the API. Thus (omitting error checks and simplifying many details):

  int fd[2];
  pipe (fd);          // create a pipe to share with the child
  if (fork () == 0) { // child
    close (...);      // close some stuff
    setrlimit (...);  // add a ulimit to the child
    sigaction (...);  // change signal dispositions
    // also: clean the environment, set cgroups
    execvp (...);     // run the child
  }

It's also enormously flexible. I don't know of any other API that, in addition to the above, also lets you change the relationship of parent and child and create duplicate worker processes.

Comparing it to Windows is hilarious because Linux can create processes vastly more efficiently and quickly than Windows.


> It elegantly solves the "create a process, letting it inherit these settings and reset these other settings" problem, where "settings" is an ever-changing and expanding list of things that you wouldn't want to bake into the API.

Or, to quote a paper on deficiencies of fork, "fork() tremendously simplifies the task of writing a shell. But most programs are not shells".

Next. A first solution is trivial: make (almost) all syscalls accept the target process's pidfd as an argument, and introduce a new syscall to create an empty process in a suspended state; Windows can almost (but not quite) do this already. A second solution would be to push all the insides of the "if (fork () == 0) { ... }" into an eBPF program and pass that to fork(). That would also cut down tremendously on the syscall cost of setting up the new process's state, as opposed to Windows (which has a posix_spawn()-like API).

> create duplicate worker processes.

We have threads for this. Of course, Linux (and POSIX) threads are quite a sad sight, especially with all the unavoidable signalling nonsense and O_CLOFORK/O_CLOEXEC shenanigans.


Yes, but at what cost? 99% of fork calls are immediately followed by exec(), but now every kernel object needs to handle being forked. And a great deal of memory-management housekeeping is done only to be discarded afterward. And it doesn't work at all for AMP systems (which we will have to deal with, sooner or later).

In 1970 it might have been the only way to provide a flexible API, but nowadays we have a great variety of extensible serialization formats better than "struct".


> In 1970 it might have been the only way to provide a flexible API, but nowadays we have a great variety of extensible serialization formats better than "struct".

Actually, fork(2) was very inefficient in the 1970s and for another decade after that. That changed when 4.3BSD-Reno shipped an entirely new VMM in 1990, which subsequently allowed a CoW fork(2) to come into existence in 4.4BSD in 1993.

Those two changes sped fork(2) up dramatically. Before then, a fork entailed copying not just the process's structs but also its entire memory space.


AFAIR it was quite efficient (basically free) on pre-VM PDP-11 where the kernel swapped the whole address space on a context switch. It only involved swapping to a new disk area.

I used MINIX on 8086 which was similar and it definitely was not efficient. It had to make a copy of the whole address space on fork. It was the introduction of paging and copy-on-write that made fork efficient.

Oh, is that how MINIX did that? AIUI, the original UNIX could only hold one process in memory at a time, so its fork() would dump the process's current working space to disk, then rename it with a new PID, and return to the user space — essentially, the parent process literally turned into the child process. That's also where the misconception "after fork(), the child gets to run before the parent" comes from.

At no cost apparently, since Linux still manages to be much faster and more efficient than Windows.


