I'd agree a filesystem is also a type of hierarchical database but the author do...

throwaway09223 · on July 24, 2021

Sure, and I would agree with you here.

These are the kinds of categorizations that people can go nuts over. Rather than get too hung up on words I'd say that whatever this is, it can effectively be represented by a filesystem and therefore it should be as a matter of general architecture and security principle.

zamadatix · on July 24, 2021

I'm actually with the author that if it were going to be rewritten a freshly written columnar database would be way more efficient than representing it as a filesystem but that either would be better than what we have after 30 years. I just don't think "it wasn't a filesystem originally" has much to do with why it's so crap now. Similar case: posix specifies network sockets be accessed as files/filesystems (as most everything in posix is) but nobody actually used that representation because it's inefficient even though it's the standard and easily mappable to files/filesystems. Well I think Solaris actually allows both but the point stands.

throwaway09223 · on July 24, 2021

Sorry, I'm unfamiliar with what you mean by "network sockets be accessed as files." Do you mean unix domain sockets? These are in fact commonly used and they're certainly no less efficient (more efficient in many ways, in fact).

UDS are interfaced with via the same berkeley sockets api, not via the filesystem api. Have you ever written applications that use them?

zamadatix · on July 25, 2021

I don't mean unix domain sockets, those are known as IPC sockets. The berkeley sockets API you are familiar is actually exactly what I was talking about. It does offer both types of sockets (the other being network sockets as I originally mentioned) but it uses handles in an abstract namespace not files in a filesystem (e.g. in Linux it's still a FD but it doesn't map to an actual file on a filesystem it's just a unique handle in its own namespace).

What I was referring to were things like /dev/tcp/ and /dev/udp you'll find on Solaris (or emualated via bash on most systems) which are actual filesystem paths instead of handles in abstract namespace. A usage example of this comparable to binding to a socket with the BSD API to udp://localhost:2048 would be "echo "example" > /dev/udp/localhost/2048". The actual I/O is through the standard file/filesystem interface just like /dev/random. It's not the best for network sockets though so they tend to get a raw handle in every modern OS, even if it does mean rebuilding the wheel on some other things.

Network sockets are the canonical example of "not everything in Unix is a file". "Everything is a FD" is true but "everything is a handle" is true on any OS design, the uniqueness that things like ram and disks are just files under / did not hold true with networking.

And yes I have written plenty of apps with ipc sockets and network sockets and raw sockets and even underlying device access (for things like custom Ethernet packets). I'm in networking by profession.

throwaway09223 · on July 25, 2021

I think there's some confusion about how the sockets api works, let me see if I can clear this up.

Posix does not specify that network sockets should be accessed by file paths. It's possible to do so, but unspecified by the standard.

Sockets produced by socket(2) are regular old file descriptors, just as created by open(2) on a file path, or any other descriptor generating syscall like pipe(2) or epoll_create(2). There is no separate representation among any of these -- they are all just file descriptors. There are many, many ways to create descriptors and many aren't associated with a filesystem. There's no efficiency issue here, nor is there a divergence from a consistent pattern.

If you like, you can use fchmod(2) on a descriptor generated by socket(2) and change its permissions. You can track it by its inode. It doesn't matter that the descriptor is not linked to a filesystem, any more than for a similar descriptor created by pipe(2). They all have the same functionality and fit within the same consistent metaphor. When you run grep | grep, the pipe descriptor has permissions, mtime, ctime, atime and the rest. Everything just works.

It's trivial to write a filesystem to expose descriptors, in fact /proc does this already for all descriptor tables across all processes. There's no rebuilding of any wheel - the point of commonality is the "struct file" in linux/fs.h.

There's no such thing as a "raw handle" here, btw. That phrase has no meaning.

zamadatix · on July 25, 2021

Thanks for giving detailed points instead of just asking if I've used sockets before - I think I see the main divergence as a result. First though I noted a big mistake on my part: I referenced the wrong UNIX standard originally. This is a huge error on my part, I meant to reference TLI (later standardized XTI, the "competitor" to BSD sockets at the time) not POSIX as what defined /dev/udp, /dev/tcp, and the APIs to access it instead of BSD/POSIX style socket APIs e.g. 't_open("/dev/udp", O_RDWR);'. My apologies I'm sure I was completely misdirecting a lot of the conversation and causing a lot of confusion with that error.

For where the main divergence in what we are each talking about though when I originally said:

> Similar case: posix specifies network sockets be accessed as files/filesystems (as most everything in posix is) but nobody actually used that representation because it's inefficient even though it's the standard and easily mappable to files/filesystems.

I was talking about literally exposing networking through the filesystem by mapping the construct to files and paths as that's what the author's registry tool actually does and what the author was proposing Windows should do in a rewrite - not whether or not sockets can be backed by the preexisting FD handle in an arbitrary namespace using custom socket functions to manage the socket efficiently. As a result I was trying to explain to you why BSD-type sockets don't use a literal filesystem mapping even if they have an FD and you were trying to explain to me how they were still backed by a FD even if it's not in the filesystem (i.e. describing the same API from opposite ends). I agree fully with your take it's a standard FD which has no performance concerns once created and can be treated as such but looking back I think I tried quite hard to point out I was talking about literal files/filesystems mapping not the FD handle so I'm not sure where the split came from... perhaps normally it"everything is a file" vs "everything is a FD" not a big distinction but this case just happened to be about a literal filesystem mapping not whether or not it would end up using a FD.

Also to note when I talk about "inefficient" I don't mean "slow to compute" (after all it ends up a FD once opened, as noted heavily at this point) it's the interface which becomes inefficient (which is what the registry article was dealing with). Even though you call it trivial as in "trivial to expose a mapping", which there is no argument is trivial BSD sockets offered much more straightforward and simple moldability around internet protocol network socket concepts than the filesystem approach by the author/TLI's approach which is a big reason BSD sockets won out. The "rebuilding of the wheel" are that the BSD socket API defines functions fit to purpose instead of molded around traditional file API naming and structure like TLI/XTI did.

In the section "It's not the best for network sockets though so they tend to get a raw handle" 'raw' was meant as another adjective to point out it was just a non-filesystem mapped reference (howso depends on the OS, in *nix still a FD in others not it doesn't really matter though it's just a ref) per the prior sentence not meant to be taken that 'raw handle' was a proper noun describing a different type of handle definition you'll find in the source code.

I appreciate the time, at the very least I'm sure I'll never make the error of conflating TLI with POSIX again and at the most I may have solidified some internals I don't get to think about every day!