The thing I am describing is when you link a compilation unit using:
struct internal_state { int dummy; } state;
with another compilation unit that defined the same state differently:
struct internal_state {
int actual_meaningful_member_1;
unsigned long actual_meaningful_member_2; } state;
As far as I know, BSD sockets do not do this. Zlib was doing this (https://github.com/pascal-cuoq/zlib-fork/blob/a52f0241f72433... ), but I have had the privilege of discussing this with Mark Adler, and I think the no-longer-necessary hack was removed from Zlib.
BSD sockets probably have a different kind of UB, related to so-called “strict aliasing” rules, unless they have been carefully audited and revised since the carefree times in which they were written. I am going to have to let you read this article for details (example st1, page 5): https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.p...
BSD sockets are weird in that the first struct's (sockaddr) size wasn't big enough, so APIs all take a nominal pointer to sockaddr but may require larger storage (sockaddr_storage) depending on the actual address.
/*
* Structure used by kernel to store most
* addresses.
*/
struct sockaddr {
unsigned char sa_len; /* total length */
sa_family_t sa_family; /* address family */
char sa_data[14]; /* actually longer; address value */
};
/*
* RFC 2553: protocol-independent placeholder for socket addresses
*/
#define _SS_MAXSIZE 128U
#define _SS_ALIGNSIZE (sizeof(__int64_t))
#define _SS_PAD1SIZE (_SS_ALIGNSIZE - sizeof(unsigned char) - \
sizeof(sa_family_t))
#define _SS_PAD2SIZE (_SS_MAXSIZE - sizeof(unsigned char) - \
sizeof(sa_family_t) - _SS_PAD1SIZE - _SS_ALIGNSIZE)
struct sockaddr_storage {
unsigned char ss_len; /* address length */
sa_family_t ss_family; /* address family */
char __ss_pad1[_SS_PAD1SIZE];
__int64_t __ss_align; /* force desired struct alignment */
char __ss_pad2[_SS_PAD2SIZE];
};
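A minimal sketch of how the oversized storage type tends to be used, assuming a POSIX system. The helper names store_ipv4_loopback and stored_port are made up for illustration. The concrete address is copied in and out with memcpy, so no object is ever accessed through a pointer to a different struct type.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: store 127.0.0.1:port in *ss and return the length
 * a caller would pass as the socklen_t argument of bind()/connect(). */
socklen_t store_ipv4_loopback(struct sockaddr_storage *ss, unsigned short port)
{
    struct sockaddr_in in4;

    assert(ss != NULL);
    memset(&in4, 0, sizeof in4);
    in4.sin_family = AF_INET;
    in4.sin_port = htons(port);
    in4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    memset(ss, 0, sizeof *ss);
    memcpy(ss, &in4, sizeof in4);   /* byte copy, no struct punning */
    return (socklen_t)sizeof in4;
}

/* Hypothetical helper: return the port in host byte order, or -1 if the
 * stored address is not a complete IPv4 address. */
int stored_port(const struct sockaddr_storage *ss, socklen_t len)
{
    struct sockaddr_in in4;

    assert(ss != NULL);
    if (ss->ss_family != AF_INET || (size_t)len < sizeof in4)
        return -1;
    memcpy(&in4, ss, sizeof in4);
    return ntohs(in4.sin_port);
}
```

A caller would then pass (struct sockaddr *)&ss together with the returned length to bind or connect. On systems that have a length member in the address (sin_len), it would need to be set as well; the sketch leaves it zeroed.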
struct sockaddr_storage is insufficient as well. A Unix domain socket path can be longer than `sizeof ((struct sockaddr_un){ 0}).sun_path`. That's a major reason why all the socket APIs take a separate socklen_t argument. Most people just assume that a domain socket path is limited to a relatively short string, but it's not (except possibly Minix, IIRC).
> A Unix domain socket path can be longer than `sizeof ((struct sockaddr_un){ 0}).sun_path`
Hm, I didn't realize this, or if I knew this I had forgotten. It makes sense because sun_path is usually pretty small, I believe 108 chars is the most common choice, and typically file paths are allowed to be much longer.
Do you have a citation for this behavior? I can't seem to find it, though I'm not looking very hard.
I guess you are right that any syscall taking a struct sockaddr * also has a length passed to it... Some systems have sa_len inside struct sockaddr to indicate length, but IIRC Linux does not. I've often thought that length parameter was sort of redundant, because (1) some platforms have sa_len, and (2) even without that, you should be able to derive length from family. But your Unix domain socket example breaks (2). Without being able to do that, I start to imagine that the kernel would need to probe for NUL chars terminating the C string anytime it inspects a struct sockaddr_un, rather than block-copying the expected size of the structure -- that would be needlessly complicated.
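To make the role of the separate length argument concrete, here is a rough sketch of how a caller can compute it for a Unix domain address. The helper name fill_unix_addr is hypothetical, and the sketch only handles paths that fit in sun_path. The length counts the bytes up to the end of the path and no NUL terminator, so the kernel does not have to scan for one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *sa with a filesystem path and store in *len
 * the value to pass as the socklen_t argument of bind()/connect().
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int fill_unix_addr(struct sockaddr_un *sa, const char *path, socklen_t *len)
{
    size_t n;

    assert(sa != NULL && path != NULL && len != NULL);
    n = strlen(path);
    if (n > sizeof sa->sun_path)
        return -1;

    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    memcpy(sa->sun_path, path, n);   /* terminator deliberately not counted */
    *len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + n);
    return 0;
}
```

The result would be used as bind(fd, (struct sockaddr *)&sa, len). Whether a path that fills sun_path completely is accepted is up to the platform.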
So I just reran some tests on my existing VMs and it turns out I remembered wrong. Here's the actual breakdown:
* Solaris 11.4: .sun_path: 108; bind/connect path maximum: 1023. Length seems to be same as open. Interestingly, open path maximum seems to be 1023 (judged by trying ls -l /path/to/sock), although I always thought it was unbounded on Solaris.
* MacOS 10.14: .sun_path: 104, bind/connect path maximum: 253. Length can be bigger than .sun_path but less than open path limit.
* NetBSD 8.0: .sun_path: 104, bind/connect path maximum: 253. Same as MacOS.
* Linux 5.4: .sun_path: 108, bind/connect path maximum: 108.
* AIX 7.1: .sun_path: 1023, bind/connect path maximum: 1023. Yes, .sun_path is statically sized to 1023! And like Solaris, open path maximum seems to be 1023 (as judged by trying ls -l /path/to/socket). Thanks to Polar Home, polarhome.com, for the free AIX shell account.
Note that all the above lengths are exclusive of NUL, and the passed socklen_t argument did not include a NUL terminator.
For posterity: on all these systems you can still create sockets with long paths, you just have to chdir or use bindat/connectat if available. My test code confirmed as much. And AFAICT getsockname/getpeername will only return the .sun_path path (if anything) used to bind or connect, but that's a more complex topic (see https://github.com/wahern/cqueues/blob/e3af1f63/PORTING.md#g...)
Linux also has the unusual extension of: if sun_path[0] is NUL, the path is not a filesystem path and the rest of the name buffer is an ID. I don't remember if that can have embedded NULs in that ID. I believe so.
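A small sketch of building such an address, assuming Linux; the helper name fill_abstract_addr is invented for this example. As I read unix(7), NUL bytes inside the name have no special significance, so the name is treated as a byte string whose extent is given only by the length argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *sa with a Linux abstract-namespace name.
 * name is an arbitrary byte string of name_len bytes (embedded NULs are
 * copied as-is). Returns 0 on success, -1 if the name does not fit. */
int fill_abstract_addr(struct sockaddr_un *sa, const void *name,
                       size_t name_len, socklen_t *len)
{
    assert(sa != NULL && name != NULL && len != NULL);
    if (name_len > sizeof sa->sun_path - 1)
        return -1;

    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    sa->sun_path[0] = '\0';                  /* marks the abstract namespace */
    memcpy(sa->sun_path + 1, name, name_len);
    /* The length covers the leading NUL plus exactly name_len name bytes. */
    *len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
    return 0;
}
```

Because the length argument alone delimits the name, passing sizeof(struct sockaddr_un) instead would make the trailing zero bytes part of the name.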
I'm curious what exactly makes this undefined behavior.
And in particular, what about something like this?
struct Foo {
#ifdef __cplusplus
int bar() const { return bar_; }
private:
#endif
int bar_;
};
Or, taking this a step further:
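#include <stdlib.h>
struct _Foo;
typedef struct _Foo Foo;
// In C "struct _Foo" is never defined.
int Foo_bar(const Foo* foo) { return *(const int*)foo; }
// The setter takes the new value and stores it through an int lvalue.
void Foo_setbar(Foo* foo, int bar) { *(int*)foo = bar; }
// The cast is needed when this is compiled as C++.
Foo* Foo_new() { return (Foo*)malloc(sizeof(int)); }
#ifdef __cplusplus
struct _Foo {
void set_bar(int bar) { bar_ = bar; }
int bar() const { return bar_; }
private:
int bar_;
};
#endif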
The above isn't ideal but it does provide encapsulation in a way that doesn't seem to violate strict aliasing (the memory location is consistently read/written as "int").
I think this is perfectly OK. For one thing, if a struct has a member of type T, it's OK to access that member through a pointer to T (and the address of the struct is guaranteed to be identical to the address of its first member). For another, you are using dynamically allocated memory, so the only thing that matters is the type of the pointer when the access is finally made. It doesn't matter that it was a Foo* before, if what you dereference is an int*.
This is different from pretending that the address of a struct s { int a; double b; } is the address of a struct t { int a; long long c; } and accessing it through a pointer to that. If you do that, C compilers will (given the opportunity) assume that the write-through-a-pointer-to-struct-t does not modify any object of type “struct s”. This is what the example st1 in the article illustrates.
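For contrast, here is a sketch of the access pattern described as fine above, applied to the two struct types just mentioned. The helper names are invented for illustration. Every access to the leading member goes through an int lvalue, never through a pointer to the other struct type, so the compiler has no basis for assuming the write cannot affect a struct s.

```c
#include <assert.h>
#include <stddef.h>

struct s { int a; double b; };
struct t { int a; long long c; };

/* Hypothetical helper: write the leading int of either struct.
 * A pointer to a struct, suitably converted, points to its first member,
 * and the store is made through an lvalue of the member's own type. */
void set_leading_int(void *obj, int value)
{
    int *first = obj;

    assert(obj != NULL);
    *first = value;
}

/* Hypothetical helper: read the leading int of either struct. */
int get_leading_int(const void *obj)
{
    const int *first = obj;

    assert(obj != NULL);
    return *first;
}
```

This only covers the first member. Reaching later members of a shared prefix this way would need offsetof arithmetic or a copy with memcpy.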
The latter is what I suspect plenty of socket implementations still do (because there are several types of sockets, represented by different struct types with a common prefix). It is possible to revise them carefully so that they do not break the rules, but I doubt this work has been done.
The ability to use pointers to structures with a Common Initial Sequence goes back at least to 1974--before unions were invented. When C89 was written, it would have been plausible that an implementation could uphold the Common Initial Sequence guarantees for pointers without upholding them for unions, but rather less plausible that implementations could do the reverse. Thus, the Standard explicitly specified that the guarantee is usable for unions, but saw no need to redundantly specify that it also worked for pointers.
If compilers would recognize that an operation involving a pointer/lvalue that is freshly visibly based on another is an action that at least potentially involves the latter, that would be sufficient to make code that relies upon the CIS work. Unfortunately, some compilers are willfully blind to such things.