So you think you know C? (2016)

WalterBright · on July 6, 2019

Having written a conforming C compiler, at one point I knew everything there was to know about C (I forget details now and then, or confusing them with C++ and D).

But knowing every engineering detail is not the same thing as knowing how to program in C effectively. It's like being the engineer who designs a Grand Prix car. It does not mean you can drive it faster around the track than anyone else. Not even close.

For example, the C preprocessor is surprisingly complicated. I had to scrap it and rewrite it completely 3 times. If you try to make use of all those oddities, my advice is don't waste your time. Over time I removed all the C preprocessor tricks from my own code and just wrote ordinary C in its place. Much better.

pcwalton · on July 6, 2019

I couldn't agree more. After spending many years working with LLVM, which is at its heart a C compiler, and understanding why it has to do the sometimes-terrifying things it has to do to get C to run well, I've become very paranoid when writing C or C++. My C/C++ code is as boring as possible.

(In fact I try to avoid writing C or C++ whenever possible these days; undefined behavior in the language is too pernicious and unfixable without breaking compatibility. I think both languages are approaching obsolescence.)

WalterBright · on July 6, 2019

> My C/C++ code is as boring as possible.

One advantage to being an older programmer is I don't feel any need to show off any more. I try to make it so obvious that anyone would look at it and think that's so simple, anyone could do it.

It's surprisingly hard to write simple code. Any idjit can come up with Rube Goldberg code.

bufferoverflow · on July 6, 2019

Though different people consider different things simple. To one a loop is fine, to another a map is a better choice.

dahart · on July 6, 2019

> to another a map is a better choice.

Are you thinking more of JS or another functional scripting language than C/C++? C doesn’t have map, so it’s not an option, and in C++ it’s called something else.

In JS, I so wish that I could switch to functional constructs like map permanently, but map and foreach are much slower than loops, an order of magnitude or more for tight loops. I’m still forced to use loops in performance critical code, even if I consider map a better choice.

Coming back to C++ after having been in JS land for 5 years, C++ feels constantly difficult to use, and all the names for the functional primitives don’t seem to make intuitive sense like they do in JS.

Gene_Parmesan · on July 6, 2019

> order of magnitude or more for tight loops

I just learned this about JS's map/foreach a few weeks ago. I didn't think it was possible to be more disappointed by JS than I already was, but somehow I managed it.

WalterBright · on July 6, 2019

map/filter/reduce is simpler, but it takes some getting used to. Loops have worn a deep rut in my brain.

akavi · on July 6, 2019

As someone who was introduced to programming via map/filter/reduce, for loops are incredibly more complicated.

I don’t think I’ve ever written one correctly on first try.

WalterBright · on July 6, 2019

I've used loops so much I don't even see the loop anymore as a collection of constructs, I see it as a single thing. There are a lot of easy mistakes to make with loops, but I don't make them anymore (of course, by writing that, I will make one!). For example:

    #include <stdbool.h>
    #include <stdio.h>
    typedef long T;
    bool find(T *array, size_t dim, T t) {
      int i;
      for (i = 0; i <= dim; i++);
      {
        int v = array[i];
        if (v == t)
          return true;
      }
   }

There are 5 errors in that example.

taejo · on July 7, 2019

Is the fifth error that there are only four errors?

whynotminot · on July 6, 2019

Are you primarily a JS/front-end dev?

Because this is crazy to me! Some languages only use for loops (Go).

akavi · on July 6, 2019

Full-stack, spent most of my time on the backend in ruby or scala.

Gene_Parmesan · on July 6, 2019

> I don’t think I’ve ever written one correctly on first try.

Please don't take offense, but this is odd to me. I honestly severely doubt I am some sort of programming super genius, but I have never had any issues setting looping logic correctly. (I took four years to teach myself programming & CS and now I've been at my first professional dev job for ~6 months.) None of my colleagues seem to have such issues either. What are you experiencing trouble with most? Off-by-one?

prirun · on July 7, 2019

For those who may have trouble with loops, take heart: after 45 years of non-stop programming, often 10+ hours a day, I still find myself sometimes doing mental loop simulations with small (few element) data to make sure a loop is correct.

It's usually much easier to take extra time to make sure it's right than to debug it later.

akavi · on July 6, 2019

I mean, I've written at most a couple dozen in my 7 years as a professional programmer (usually tight loops for perf), so sheer unfamiliarity is a big factor.

But yeah, syntax (what order the arguments go in) and off-by-one issues are the majority, I think. Plus figuring out what my initial accumulator needs to be.

Idk, map/filter and friends are just a much more direct mapping of how I think of programming.

naasking · on July 6, 2019

Unfortunately, loops still seem to be the easiest way to iterate over two collections simultaneously.

akavi · on July 6, 2019

“#zip”?

Or maybe I’m misunderstanding the use case?

naasking · on July 6, 2019

Which only works if the sequences align perfectly. Handling misaligned collections is more awkward with functional constructs.

Functional graph programming is also still a bit of an open problem. There are awkward scenarios in both cases.

akavi · on July 6, 2019

Ah, I see what you mean. Yeah, that’s always awkward

Not sure how you deal with that with for loops either. Increment the iteration var in the body of the loop? (Seems scary to me, but like I said, I’ve got terrible intuition with them)

naasking · on July 8, 2019

For something like iterators:

    var ie1 = foo.GetEnumerator();
    var ie2 = bar.GetEnumerator();

    while(true)
    {
        var has1 = ie1.MoveNext();
        var has2 = ie2.MoveNext();
        if (!has1 && !has2)
            break;
        if (has1)
            // do something with ie1.Current
        if (has2)
            // do something with ie2.Current
    }

pjmlp · on July 6, 2019

Same here.

C++ only when Java or .NET need it's assistance, or integration with OS APIs that require C++ (NDK, WinUI, DX).

Then C only when there is no alternative (customer wants it, we only do C here, required lib is C only e.g. SDL, ...).

It is an herculean project, but maybe some day LLVM could be rewritten into something else. After all it isn't the first compiler stack, just the one that got most famous.

dvdbloc · on July 6, 2019

What would we write drivers in then if C was obsoleted?

pjmlp · on July 6, 2019

Apple says the future is C++ and Swift.

Microsoft says the future is a mix of constrained C++ (Core Guidelines), Rust and AOT C#.

Google says the future is C++ and Java (as of Treble) on Android, with Go, C++ and Rust on ChromeOS and Fuchsia.

ARM says C++ on mbed.

GenodeOS says C++ and Ada.

Newton, Symbian, Bada and BeOS used C++.

C is married with UNIX, they were born to each other, other OSes have long followed other paths.

rPlayer6554 · on July 6, 2019

Minor point, but isn't Android moving to Kotlin? I don't think they see Java being a sustainable choice for the long term.

pjmlp · on July 6, 2019

Yes they are, but in what concerns Treble and platform libraries, they plan to keep their Java 8 variant around.

Apparently they are also adding support for desugaring Java 10 language features (yep 10 not 12) that don't rely on new JVM bytecodes (as per Google IO talk about state of Android tooling).

At least for now.

satyenr · on July 6, 2019

Rust? I know a lot of people evangelise rust — to the point of annoyance of others — but as we move into an era where the entire world is run on computers, it is just not acceptable to have decades old infrastructure susceptible to bugs often caused by someone not understanding undefined or implementation dependent behaviour.

xondono · on July 6, 2019

Is still a little early, but as someone who works at the bare metal level, Rust shows early promise.

The features required are still unstable (as in not in the stable version of the compiler), but it’s getting there, and doing it fast.

In any case, firmware behaves a lot like banking software, language change will take time

jason0597 · on July 6, 2019

Cargo feels like too bloated to me for it to be something suitable for something low level like embedded. Unless the rust team can make it more appealing to use the language without cargo, I don't think there's much future.

xondono · on July 7, 2019

In bare metal systems, cargo would run on the developers host system, so that should not be an issue.

Also, most toolchains of the sector are way more complex to maintain and way more bloated.

xedrac · on July 6, 2019

We use rust in embedded and love cargo! Coming from C++ and the endless mess of build systems that exist there, cargo is a breath of fresh air! What don't you like about it?

Athas · on July 6, 2019

The Oberon system has drivers in, well, Oberon (which is a high-level Pascal successor with garbage collection). Low-level memory access is done through magical peek/poke functions, in which all the dirtyness is concentrated (and these might not be allowed in user programs - not sure). This means the language as a whole is not littered with unsafe pointers just to service the tiny subset of programs that need them.

pjmlp · on July 6, 2019

And it is used commercially.

Astrobe has been selling development kits for ARM based boards for years now.

gpderetta · on July 6, 2019

pcwalton might have an opinion :). He is Rust lead designer.

pcwalton · on July 6, 2019

Not lead designer :)

But yes, Rust, or even in userspace, as newer and/or more microkernel-ish OS's allow for. Apple is doing work to allow drivers to be written in Swift...

saagarjha · on July 6, 2019

I believe DriverKit is still C++.

pjmlp · on July 6, 2019

Just some parts of it.

Some device classes (not to mix with OOP ones) can only be programmed with C++, while others can be developed in any compiled language able to link to the OS APIs.

There is a WWDC session on it.

saagarjha · on July 7, 2019

I watched it, and they're pretty clear that driver extensions (using DriverKit, like I mentioned) must be written in C or C++. System extensions can use any language.

pjmlp · on July 7, 2019

I was thinking about whole driver feature set, so got it wrong.

mpweiher · on July 6, 2019

...and of course pre-Apple, DriverKit was Objective-C.

pjmlp · on July 6, 2019

And pre-OS X, Mac OS was Object Pascal. :)

merlincorey · on July 6, 2019

C has a standard with multiple competing compilers.

Rust does not have these same features to date.

pjmlp · on July 6, 2019

With multiple slightly incompatible competing compilers.

WalterBright · on July 6, 2019

D as BetterC !

ant6n · on July 6, 2019

ogoffart · on July 6, 2019

And that's why people claiming that C++ is more complicated than C because it has an even bigger specification miss the point.

What counts is how easy to use in practice. You can get along just fine in C++ without knowing the exact aliasing rules from C or or how to specialize a template. What matters is that the extra features of C++ makes actual programming simpler, not harder. (for example destructor (RAII), standard library, classes, ...)

0815test · on July 6, 2019

> You can get along just fine in C++ without knowing the exact aliasing rules from C

Nope, you can't. These are exactly the things that introduce undefined behavior (i.e. total breakage) if you aren't very careful about what you're doing at all times. Don't take my word for it, check out what the C++ designers themselves state about the issue in the C++ Core Guidelines. C/C++ is far from simple, and thinking that you can just make things up as you go along is a serious mistake.

ogoffart · on July 6, 2019

You are absolutely right, but my point is that when you do modern C++, in the application code, it is very unlikely that you need to use reinterpret_cast in your code, and therefore you don't need to know all the subtlety about it.

So despite C++ being more complex than C, if you limit yourself to some practical subset, it is actually easier than C.

seba_dos1 · on July 6, 2019

Same can be (and has been) said about JavaScript. Or C. Or any language with any kind of issues.

And it doesn't work in practice.

Leherenn · on July 6, 2019

I think it does sometimes? At least for enterprise software.

We use both C and C++ embedded as well as C++/Qt for the desktop control system, and we have no issue keeping to a sane subset of C++.

There are clearly defined rules and code review does the remaining enforcement. And it's not even hard or time consuming as everyone is pretty much aligned and every small issue can be easily resolved with a quick chat.

rofo1 · on July 6, 2019

Strong agreement here.

You can limit yourself by choice to certain areas of the language that you know inside out (you know the asm they produce, etc.) and use it to solve problems. Don't worry about every single corner case. As you said, you can easily forget those things, especially if it's not your day-to-day job.

The same idea and principle can be found in "JavaScript - The Good Parts".

mhh__ · on July 6, 2019

Having recently been doing some web development, I'd argue that JavaScript's design (even today, despite some modern additions) is so conducive to undebuggable spaghetti code (i.e. this ) that there really are no good parts.

Sometimes the technology is objectively the wrong choice i.e. compiling javascript to native code (Not a JIT).

WalterBright · on July 6, 2019

Some languages do fight your attempts to write good code every step of the way :-)

No, I won't name names.

mhh__ · on July 6, 2019

I'm currently writing a library that loads, parses and includes c headers at compile time, in D.

It's not finished but let's just say I've managed to make yours and Andrei's thoughtful design into a monster

cperciva · on July 6, 2019

Over time I removed all the C preprocessor tricks from my own code and just wrote ordinary C in its place. Much better.

I don't get it. If my macros produce standards-compliant code and they make my code easier to read and understand, why shouldn't I use them?

My goal as a developer is to write clean performant bug-free code... not to make life easy for compiler developers.

WalterBright · on July 6, 2019

> If my macros produce standards-compliant code and they make my code easier to read and understand, why shouldn't I use them?

The problem is they don't make code easier to read and understand. Worse, the unhygienic nature of C macros makes it hard to contain them.

I haven't seen your code, so I'm speaking based on what I've seen of mine and others' code. If you dial it back, the person who has to deal with your code after you leave will appreciate it.

More generally speaking, if you're doing metaprogramming with the C macro system, you've outgrown the language and should consider a more powerful one.

icedchai · on July 6, 2019

I once worked at a place that had a platform specific "DEBUG" log macro.

It worked something like: DEBUG(msg); Except, it was defined in such a way that you actually had to have two closing parens, like DEBUG(msg)); It looked syntactically invalid, but whatever the macro did required it.

The entire code base was littered with WTFs like that...

WalterBright · on July 6, 2019

I hated debug macros. There just was never a clean way to write them. I was determined that D would not suffer from that problem. `debug` is a keyword in D, and you can do things like:

    debug printf("I got here\n");

and the printf only gets compiled in when compiling with -debug. (Any statement can be used after the printf.) Even better, semantic checks for debug statements are relaxed - for example, purity is not checked for them.

Meaning you can embed debug printf's in functions marked 'pure', instead of having to use a monad.

pjmlp · on July 6, 2019

Because when you work on a team, not everyone is a C Gandalf, probably not even yourself a couple of months later when fixing a bug with everyone screaming that the system is down.

Exaggerating here, but rule of thumb is that it takes twice to debug as it takes to write it, so how long do you want to take to do maintenance fixes?

gray_-_wolf · on July 6, 2019

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

--Brian Kernighan

cperciva · on July 6, 2019

rule of thumb is that it takes twice to debug as it takes to write it

Ok, so... macros which let me write code faster should help with debugging time as well? ;-)

pjmlp · on July 6, 2019

It is an exponential curve with cleverness as input parameter.

So it depends how complex they are.

xondono · on July 6, 2019

Can you assure that your preprocessor tricks always generate compliant code?

For me writing embedded code is the ultimate test for your programming skills, since a lot of C toolchains for embedded devices (as in bare metal embedded) are unstable, only almost compliant and full of weird behaviors and hacks.

cperciva · on July 6, 2019

Here's an example: https://github.com/Tarsnap/libcperciva/blob/master/datastruc...

As long as you invoke use it properly, e.g.

ELASTICARRAY_DECL(PTRLIST, ptrlist, void *);

there's no way it will create non-compliant code.

mobilemidget · on July 6, 2019

How ('yeah im going to write this') do you start such project like a c compiler? Seems like a huge thing to write. What did you write first? Did you already have a lot of compiler knowledge so you had a good idea/structure/flow etc in your mind already? And how long did it take you?

tia

WalterBright · on July 6, 2019

I'd written a couple compiler like toys before, but nothing like a real compiler. I just started writing it. Took a couple years.

mhh__ · on July 6, 2019

A compliant C compiler is probably a year's work in a modern language give or take?

WalterBright · on July 6, 2019

It takes 3-5 months just to write a compliant preprocessor. Writing a basic code generator is a year, writing a basic optimizer is another year.

But if you want a competitive compiler, better pencil out 10 years.

m463 · on July 8, 2019

> at one point I knew everything there was to know about C

after taking the test, wouldn't this be:

at one point I knew everything there was to know about (my implementation of) C

also: one time long ago I tried to use the c-preprocessor to preprocess a data file. ha ha ha ha ha. (conclusion: don't do that)

hyeomans · on July 6, 2019

Somewhat related, what would you recommend for someone who has never programmed in C?

pksadiq · on July 6, 2019

There are a few problems with the questionnaire. "I don't know" is pretty generic choice to be given/choosen.

Say for example, in question 5, the statement "return i++ + ++i;" is undefined, because the value of i is read and modified twice in a single sequence point (and of course, the order of addition is unspecified), which is not allowed in C. So the answer is "Undefined." (The explanation given in the page not accurate enough in terms of C)

And for question 1, the code is valid, but the result is not strictly defined. It depends on the implementation. So the answer is "Implementation defined."

The usage of "main()" hurts me, the strictly conforming way is to write it as "int main(void)" (or similar)

I feel like the questionnaire piss off people who really knows C.

mytailorisrich · on July 6, 2019

To me "I don't know" is a very apt choice. It makes the point clear that indeed reading the code does not allow you to know the result, which is quite a pitfall.

In your comment you are jumping from "I don't know", which is the first step, to wanting to explain why.

mort96 · on July 6, 2019

There were multiple questions where I would have answered "It's undefined" or "It's implementation defined", but those weren't options. It's not that I don't know the answer; I know the answer ("it's implementation defined according to the spec, but on essentially every relevant platform, the result will be X"), but the "it's implementation defined" part of my answer isn't an option, so the only possible answer becomes "on essentially every relevant platform, the answer is X".

Using "I don't know" as a substitute for "I know that the standard clearly covers this, and it says that the result depends on the implementation" does seem to be designed to piss off people who know C. If they really wanted to get the point across that you don't know what is and isn't implementation defined or undefined, they shouldn't be using vague questions to mislead people; they should just plainly ask questions which people don't know the answer to.

I hate this kind of questioning where you 100% know the subject matter the quiz is asking about, but the question and possible choices is so vague you have to try to interpret what you suspect the person who wrote the quiz wants the answer to be. I once had an exam which was full of that kind of multiple choice question, and guessed the exam author's intentions wrong on most of them.

0x00000000 · on July 6, 2019

the quiz asks what each one would evaluate to. “Undefined” or “implementation dependent” are not answers to that question. “I don’t know [because it is undefined]” or “I don’t know [based on the given information]” are logically consistent answers to the question that was asked

mort96 · on July 6, 2019

Ok, say I made a quiz where I ask you about what `((((16 <= 16) << 16) >> 16) <= 16)` evaluates to. I could give you the options 0, 1, 16, or "I don't know". If I as a quiz author wanted to test your knowledge about shift operators and esoteric uses of equality operators and your ability to reason through an expression, I would mark "1" as correct, and "0", "16", and "I don't know" as incorrect (I would include the "I don't know" option just because that's a common thing to do in quizzes, to not force people to guess one of the options if they don't actually know the answer).

My point isn't that there's no logically consistent answer. My point is that there are _two_ logically consistent answers, and which one is correct depends on the unknowable state of mind of the quiz author. On the other hand, if the options included "It's implementation defined" or "it's undefined", the author would have made their expectations clear, and the quiz would actually test people's knowledge of C rather than people's ability to try to reason about what sort of answer the author expects.

giovannibajo1 · on July 6, 2019

But the point is that I do know what that will print on my computer(s), with my compiler(s), on my architecture(s). "I don't know" is too generic a statement.

Thorrez · on July 6, 2019

On your computer yes, but the question specifically doesn't tell you what computer and compiler are used. So thus you can't know what the number is.

mytailorisrich · on July 6, 2019

Even on a given CPU with a given compiler you usually do not know until you've tried.

This is an article about the C language and the starting point with all the examples is that you do not know for a fact what the result will be.

ci5er · on July 6, 2019

I understand your claim. But I've gotta say this claim is maybe a little too aggressive. I know for a fact that on a Sun-3 (68020 SunOS Desktop Pizza Box) using either gcc or the bundled cc, all of them would have been the same answer, and the answer would have been known to the coder, before running that code (unless you unleashed one of the gcc command line dogs). Except maybe #5, because who does this?

Dylan16807 · on July 6, 2019

> So the answer is "Undefined."

The code is undefined, but "I don't know." is still the correct answer for what happens to the variable.

pksadiq · on July 6, 2019

> The code is undefined, but "I don't know." is still the correct answer for what happens to the variable.

Well, then a better choice would be "I can't know."

bandushrew · on July 6, 2019

"I cant know" is a subset of "I dont know"

"I dont know" was absolutely the correct answer.

ineedasername · on July 6, 2019

Except from the perspective of a pragmatics linguistic analysis, "I don't know" has a social context of "There's an answer, and I don't know it."

In this case, a non-C programmer should answer "I don't know" to all of them. A person with a passing familiarity should answer similarly. A seasoned pro would be forced to answer the same. Making it a rather useless tool for distinguishing people who think they know C but are honest when faced with their limitations or those who truly know it and know the answer is undetermined, which is supposed to be the point of the exercise.

Thorrez · on July 6, 2019

>which is supposed to be the point of the exercise.

I don't think that assumption is justified. Someone could say the point of the exercise is to illustrate that C is confusing.

ineedasername · on July 6, 2019

You don't think the claim is justified because some might say otherwise. Some is irrelevant. Some might say lots of mutually exclusive interpretations. It's the author's intent that matters, and the context of the Author's post indicates the some interpretation isn't the author's intent. His post begins with the question "So you think you know C?". He then goes on to present a test that is, by his own words, intended identify to test takers whether or not they really understand the intricacies of C, and to think critically about that source of their knowledge "I had to learn to rely on the standard instead of folklore; to trust measurements and not presumptions; to take “things that simply work” skeptically"

Never once does the author mention that C is confusing, use the word confusing, or otherwise indicate that general idea. If you're getting that impression, it's your own reading into it. I'm not even saying you'd be incorrect, but that's not the author's intent, which was the basis of my comment.

Thorrez · on July 6, 2019

If what "some" might say about what the author intends is irrelevant, then what you say about what the author intends is also irrelevant, because you are just some person (unless you're the author). My point was why should I trust your interpretation of what the author intends more than anyone else's.

>"So you think you know C?"

That goes along with the interpretation that the point is to illustrate C is confusing. It would go along with something like "You think you know it, you think it's simple, well actually you don't know it, it's confusing."

>intended identify to test takers whether or not they really understand the intricacies of C, and to think critically about that source of their knowledge

Yes, its intent is to indicate to test takers that a lot of them don't really understand the intricacies of C, which demonstrates that C is more confusing than they originally thought.

>Never once does the author mention that C is confusing, use the word confusing, or otherwise indicate that general idea.

Here are some quotes that indicate the idea that C is confusing:

>C is not that simple.

>It’s only reasonable that the type of short int and an expression with the largest integer being short int would be the same. But the reasonable doesn’t mean right for C.

>Actually, it’s much more complicated than that. Take a peek at the standard, you’ll enjoy it.

>The third one is all about dark corners.

>The test is clearly provocative and may even be a little offensive.

Then the author says that he did C for 15 years and thought he knew it, but then realized he didn't. That indicates to me either that the author is saying that he's not smart, or that C is confusing. The second appears to be the point the author is actually making.

ineedasername · on July 6, 2019

My interpretation is based directly on what the author states. Your "some" is based on a vague aggregate group whose interpretations, in aggregate, would be diverse and often contradictory and mutually exclusive. Personally, I trust the explicit an implied interpretation of the author's direct statements than you mere speculation as to what others might interpret.

Thorrez · on July 6, 2019

If you don't like "some" then replace it with me. I interpret it as the author saying C is confusing.

My interpretation is also based on what the author states, fairly explicitly. And I don't think there's anything that explicitly contradicts my interpretation.

ineedasername · on July 7, 2019

You say it's confusing because the author says it's not simple. The same might be said of any language. Or of any learning specialty at all. It's not synonymous with confusing. You're severely stretching the meaning of the author's words when you say the author's point was to say that C is confusing. It's what you infer because you were confused, which points to this being personal to you, not the general intent of the author.

Dylan16807 · on July 6, 2019

Showing how people misunderstand the intricacies of C is much closer to "C is confusing" than "distinguishing who is honest about their limitations".

ineedasername · on July 6, 2019

And yet its not confusing. Given the confines of any particular implementation and compiler the behavior can be known without confusion. The author never directly mentions or implies that their intent is to convey that C in confusing. Quite the contrary, they indicate their intent is to demonstrate that certain segments of people who believe they know C don't in fact understand its intricacies.

Dylan16807 · on July 6, 2019

> Given the confines of any particular implementation and compiler the behavior can be known without confusion.

Only through extreme levels of compiler code inspection, as it can vary based on optimization heuristics.

> Quite the contrary, they indicate their intent is to demonstrate that certain segments of people who believe they know C don't in fact understand its intricacies.

Demonstrating that people don't know C is subtly different from an intent of testing whether people know C. The point being made is about C itself.

geezerjay · on July 6, 2019

> "I cant know" is a subset of "I dont know"

There is a world of difference between "don't know" and "can't know", as the first implies a shortcoming on the side of the developer while second one states that the question is patently meaningless to someone who does master the language.

Filligree · on July 6, 2019

Compile it, look at the assembly. You can know. The answer will vary from place to place, but it isn't non-existent.

thedufer · on July 6, 2019

With what compiler? Targeting what architecture? The question is underspecified. "I can't know" is correct.

Gibbon1 · on July 6, 2019

Q: What is a leaky partial abstraction of the C standard?

Ans: A compiler.

mikeash · on July 6, 2019

And then you update your compiler and something completely different happens.

saagarjha · on July 6, 2019

Or your standards conforming but mischievous compiler does something nondeterministic ;)

owl57 · on July 6, 2019

1) This made me curious. Are any of the compilers in real use nondeterministic?

2) Probably that's not needed? A normal optimizing compiler just inlines the function somewhere new — and boom? Then again, can that really happen with practical contemporary compilers and this exact statement?

mikeash · on July 7, 2019

Most compilers are nondeterministic in small ways. For example, it's common to use hash tables that are keyed by pointer address and then iterate over the entries in storage order, so the order in which certain things are emitted will change from run to run. This is why "deterministic builds" are such a big deal, and not just an obvious thing that you get for free.

I don't know what the chances are that such a thing could ever translate into good assembly being emitted in one run and bad assembly being emitted in the next.

Dylan16807 · on July 8, 2019

Register allocation can be quite tricky, and sometimes it can only explore a small part of the problem space, so if you don't start the algorithm with exactly the same seed you might end up with significantly different code in certain functions.

leni536 · on July 6, 2019

UB includes the code not compiling, although it rarely happens in practice.

AdieuToLogic · on July 6, 2019

> Compile it, look at the assembly. You can know. The answer will vary from place to place, but it isn't non-existent.

This pretty much is the definition of "undefined behaviour" in the context of a standardized language specification.

amluto · on July 6, 2019

Whoa there! You mean “unspecified behavior”. int i = [unspecified] means that i has some value, but the spec doesn’t determine the value. Undefined behavior means that all your secrets might be sold to the highest bidder, your centrifuges might explode, and your computer is now full of ransomware.

AdieuToLogic · on July 6, 2019

> Whoa there! You mean “unspecified behavior”. int i = [unspecified] means that i has some value, but the spec doesn’t determine the value. Undefined behavior means that all your secrets might be sold to the highest bidder, your centrifuges might explode, and your computer is now full of ransomware.

That's only when using Boehm GC[0] in kernel device drivers that self-modify. Or any MSVC binary.

:-D

0 - https://www.hboehm.info/gc/

AdieuToLogic · on July 6, 2019

Whilst my other comment was intended to be jovial, it is hard to say if that was accurately conveyed. So this one will be serious.

The original problem definition, as specified by @pksadiq, read thusly:

> Say for example, in question 5, the statement "return i++ + ++i;" is undefined ...

This inspired a response by @Filligree of:

> Compile it, look at the assembly. You can know. The answer will vary from place to place, but it isn't non-existent.

Given the original constraint of an undefined statement result, and the suggested activity to address same, I posited that the recommended action is an exemplar of observing the product of undefined behaviour.

You then contributed:

> You mean “unspecified behavior”.

As per c-faq.com[0], there are three categories identified relating to this topic:

1 - implementation-defined: The implementation must pick some behavior; it may not fail to compile the program.

2 - unspecified: Like implementation-defined, except that the choice need not be documented.

3 - undefined: Anything at all can happen; the Standard imposes no requirements.

Whereas you imply a standards-conformant implementation of "return i++ + ++i;" is unspecified (category #2), it is, in fact, undefined (category #3). The support for this assertion is as follows.

As per the same site, Question 3.8[1] includes:

> Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

And further states:

> ... if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written. This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.

And concludes with an example stating:

> ... the Standard declares that it is undefined, and that portable programs simply must not use such constructs.

Therefore, the original expression presented by @pksadiq is in fact an exemplar of an undefined expression as defined by category #3 shown above. Since both it and the message to which I originally responded satisfy same, I stand by my response given to @Filligree as having had informally defined the standard C concept of "undefined behaviour."

0 - http://c-faq.com/ansi/undef.html

1 - http://c-faq.com/expr/seqpoints.html

Dylan16807 · on July 6, 2019

> Whereas you imply a standards-conformant implementation of "return i++ + ++i;" is unspecified (category #2), it is, in fact, undefined (category #3).

You're misreading things. amluto's assertion is that "The answer will vary from place to place, but it isn't non-existent." is a description of category #2. That assertion is basically correct, depending on how exactly you define "place to place".

An informal definition of category #3 is "The answer can vary from place to place, or not exist at all." ideally followed by "It might crash or run unrelated code or even prevent the preceding code from running." It's flat-out wrong to say a value "isn't non-existent" when it comes to source code exhibiting undefined behavior.

geezerjay · on July 6, 2019

> The code is undefined, but "I don't know." is still the correct answer for what happens to the variable.

Actually, as undefined behavior should not be used at all then the correct answer should be "nevermimd these examples, they are all bug-ridden".

will4274 · on July 6, 2019

No, it isn't. The correct answer to #2 is "According to the standard, the result is implementation defined, but on my target platform, 0". "I don't know" is the wrong answer.

zAy0LfpBZLC8mAC · on July 6, 2019

The C specification does not say that undefined behaviour must give a deterministic result on a given platform. All you can say is "this one time I compiled and then ran this code, it gave 0". There is no requirement that the code compiles at all, nor that the same compiler on the same platform produces the same binary on every run, nor that the resulting binary produces the same result on every run, nor that the binary produces any result, nor that it doesn't sometimes produce a result and sometimes not, nor that the compiler doesn't sometimes produces a binary and sometimes not ... undefined behaviour is exactly that: undefined behaviour.

will4274 · on July 7, 2019

I'm well aware of what undefined behavior is. I still know it's undefined behavior and can read my compiler manual to answer the question of how the code behaves. "I don't know" is simply wrong.

zAy0LfpBZLC8mAC · on July 8, 2019

> and can read my compiler manual to answer the question of how the code behaves.

Which is both not true (because the compiler manual usually won't define undefined behaviour) and irrelevant (because the questions were about C, not about a compiler).

will4274 · on July 8, 2019

In the example I choose (#2) most compilers totally specify the behavior. And the question was (right from the article) "what the return value would be?" In order for a function to return, it must be run. In order for a function to be run, it must be compiled. In order for a function to be compiled, there must be a compiler (or interpreter, I suppose).

You're being pedantic about something silly, but you're also wrong in your pedantry.

angleofrepose · on July 6, 2019

Somewhat related, my introductory classes involved a lot of games around pre- and post- increments and short circuiting. While I get that understanding these operations is fundamentally important, is understanding ridiculous combinations of them important? I mean, these were the basis of large portions of some quizzes and midterms. I get playing with them from a theoretical perspective, as this can literally be done in many languages, but why force freshmen to play this deep mental gymnastics? Maybe a play at making the classes weeder classes and no other reason.

rayiner · on July 6, 2019

The questions are testing whether you really understand the basic rules of the language. Often times, the best way to test whether you really get the rules is to raise them in some odd context, so that you can’t just pattern match to figure out the result.

pvg · on July 6, 2019

I don't know how convoluted the questions on your midterms were but one good reason that kind of irritating thing pops up in tests is that it's quite common in real world C code. Think of the old K&R string copy example.

icedchai · on July 6, 2019

Have you ever worked with a pre-ANSI (K&R) C compiler? Omitting the return type for main is legal in those old compilers. Newer ones give you a warning.

rayiner · on July 6, 2019

Omitting the return type is perfectly conforming C89.

aey · on July 6, 2019

> I feel like the questionnaire piss off people who really knows C.

I hated this test. I’ve spent 12 years working on C targeting various flavors of arm and x86.

Just because the behavior is undefined when compiled without warnings and run on a Soviet water integrator doesn’t mean the language is undefined for the 99.995% of the industry uses.

Behavior of c89 or later with -Wall -Werror on modern clang, gcc, icc, visual studio, is well understood on arm, x86, mips, risc, ppc, Cortex-m and just about every other hardware architecture.

But, C is a pia, and I’ve been using rust instead :)

jonathankoren · on July 6, 2019

It’s not just that stuff. What pissed me off was asking about the return code of a comparator. That’s just bad form. You’re only supposed to check for zero or nonzero. I have never used the value beyond that, and if you are, that’s a problem.

drfuchs · on July 6, 2019

You’re incorrect. The result of a comparison is guaranteed to be zero or one in C. Similarly for the exclamation-point “not” operator, and || and &&.

This isn’t a recent standardization; it’s been an explicitly specified feature of C pretty much since the very beginning of the language. See page 7 of the prehistoric https://www.bell-labs.com/usr/dmr/www/cman.pdf

dmcdm · on July 6, 2019

I found it an amusing excrcise, if not terribly relevant, even as someone who spends 90% of his dev time in C.

What rubs me about these sorts of articles is they make some presumption about the importance and nessecisity of writing truely portable C, as if the "C Standard" were in and of itself a terribly useful tool. This is in contrast to where I live most of the time which is "GCC as an assembler macro language" (for a popular exposition on this subject see https://raphlinus.github.io/programming/rust/2018/08/17/unde...). And yeah, reading through the problem set I was critiquing it in context of my shop's standards, where we might be packing and padding, using cacheline alignment, static assertions about sizeof things, specific integer types, etc. So these sorts of articles just come off as a little pendantic to folks like me. I don't doubt they're useful for some folks, and I guess it's interesting to come up from the depths of non-standard GNU extensions and march= flags to see what I take for granted.

userbinator · on July 6, 2019

It's very much worth reading, Linus Torvalds' opinion of standards that's linked in that article, but I'll link it again here: https://lkml.org/lkml/2018/6/5/769

"So standards are not some kind of holy book that has to be revered. Standards too need to be questioned."

The way I see it, a lot of compiler writers are basically taking the standard as gospel and ignoring everything else "because the standard doesn't say we can't" --- and that's a huge problem, because behaviour that the standard doesn't define often has a far more common-sense meaning that programmers expect. IMHO the onus should really be on the authors of compilers to find that reasonable meaning. In fact, the standard even suggests that one possible undefined behaviour is something like "behave in a manner characteristic of the environment" (can't remember nor be bothered looking up the standard.)

pcwalton · on July 6, 2019

This is a common misconception. Compiler authors don't exploit undefined behavior to make themselves seem smart, or because they like breaking code. They exploit undefined behavior because somebody filed a bug saying some code was slow, and exploiting UB was the simplest way--or, in many cases, the only way--to fix the performance problem.

GCC and Clang do give you the option to avoid optimizations based on undefined behavior: compile at -O0. We think of the low-level nature of C as being good for optimization, but in many cases the C language as people expect it to work is at odds with fast code.

It's fascinating to actually dive into the specific instances of undefined behavior exploitation that get the most complaints. In each such case, there is virtually always a good reason for it. For example, treating signed overflow of integers as UB is important to avoid polluting perfectly ordinary loops with movsx instructions everywhere on x86-64. It's easy to see why compiler developers added these optimizations: someone filed a bug saying "hey, why is my loop full of movsx", and the developers fixed the problem.

Edit: Should be movsx instead of movzx, sorry.

jstimpfle · on July 6, 2019

Could you go into a little bit more detail regarding the movzx? Aren't 32-bit registers always zero-extended on x86-64?

pcwalton · on July 6, 2019

Sure. Here's an in-depth explanation from Fabian Giesen: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

jstimpfle · on July 6, 2019

Thanks, rygorous is always a great read - although sometimes a little overwhelming. If I got the gist of it, I have a small correction to your comment: the issue is about movsxd (sign extended integer indexes), not movzx (zero extension).

userbinator · on July 6, 2019

It's easy to see why compiler developers added these optimizations: someone filed a bug saying "hey, why is my loop full of movsx", and the developers fixed the problem.

"fixed" by breaking other expectations. Regardless of what the spec says, that's still a stupid way to do things. There's a child comment below which examines this case in detail; and the real solution is to make the analysis better, not use UB as a catch-all excuse.

saagarjha · on July 6, 2019

> compiler writers are basically taking the standard as gospel

I would be rather disappointed if they didn't, honestly.

bzbarsky · on July 6, 2019

Consider the following statements:

1) The standard says I must do this, so I must do it.

2) The standard doesn't say I must not do this (but does allow me to either do it or not do it), so it's totally OK if I do it.

I think you're thinking of cases covered by statement 1, and I think pretty much everyone agrees that compiler writers should behave that way for the standard to mean anything.

The issues arise in cases covered by statement 2. Just because the standard allows a behavior doesn't mean that the behavior is a good one. And yes, code relying on you not having the behavior is not following the standard, and that's something the authors of that code should consider addressing. But on the other hand, the standard may allow a lot of behaviors that only make sense in some situations but not others (totally true of the C standard, depending on the underlying hardware) and as a compiler writer you should think carefully about what behaviors you actually want to implement.

AS a concrete example, you _could_ write a C compiler targeting x86-64 which has sizeof(uint64_t) == 1, sizeof(unsigned int) == 1, sizeof(unsigned long) == 2, and sizeof(unsigned long long) == 2 (so 64-bit char, 64-bit short, 64-bit int, 128-bit long, 128-bit long long). Would this be a good idea? Probably not, unless you are trying to use it as a way to test for bugs in code that you will want to run on an architecture where those sizes would actually make sense...

erik_seaberg · on July 6, 2019

It's a collective action problem. If we want to give up runtime performance and get stronger guarantees about what code will be understood to mean, we should revise the standard and start using new optimizers that respect it. If every compiler goes its own way, I only benefit from what they already agreed on.

FartyMcFarter · on July 6, 2019

GCC and many other compilers have been known to change the consequences of undefined behavior unpredictably when upgrading, changing compiler flags, etc. For some examples that matters.

TazeTSchnitzel · on July 6, 2019

Knowing what the standard says and keeping to it as much as possible is important because every now and then, a major compiler finds some exciting new way to optimise code based on undefined behaviour, and breaks code that assumed GCC would always do some seemingly obvious reasonable thing it did when the author tested it.

pcwalton · on July 6, 2019

If you use C as an assembler macro language, you aren't actually writing C. You're likely to get burned someday, unless you compile at -O0.

SolarNet · on July 6, 2019

> as if the "C Standard" were in and of itself a terribly useful tool

Not necessarily, I took it to mean that engineering is holistic and things like compiler behavior in the face of undefined parts of the standard are important to account for.

bsder · on July 6, 2019

[flagged]

dang · on July 6, 2019

Hey, please don't add personal attacks on top of your substantive points in HN threads. It helps nothing and makes the thread nastier and evokes worse from others. Also it's against the site guidelines: https://hackertimes.com/newsguidelines.html.

stefan_ · on July 6, 2019

Where the author goes wrong is in assuming that somehow "I don't know" can be a final answer to these things. No, it is absolutely fucking vital that you know how the compiler will pad your structures in C. Similarly to the "what size is an int" on your architecture - on an ATmega8 this is 16 bit, but the chip can't actually do all 16 bit operations in single instructions.

lvturner · on July 6, 2019

I took that to be the point of the article though, that just looking at the code wasn't enough to know and you needed to go further to answer these cases for your exact use case or target platform.

ryandrake · on July 6, 2019

Further: Unless your code is compiled, deployed to a rocket, and fired off the Earth never to return, the question of “what is my platform?” is meaningless in the context of writing good C.

So, today, using the compiler installed on your system right now, sizeof(int) = 32. Great. That means nothing, and changes nothing about whether your code is correct. You should not write code relying on it. Just like you should not measure the output of the questions on this test, and declare that you know what the answers are.

dmcdm · on July 6, 2019

>Unless your code is compiled, deployed to a rocket, and fired off the Earth never to return, the question of “what is my platform?” is meaningless in the context of writing good C.

While I feel the tone of your comparison was intended to be a bit hyberbolic, the reality is a bulk of modern C development occurs in a context similar to the one you describe. Further the thought, utterly foreign to the vast majority of software developers, that the physical machine may not be some utterly abstract and constantly mutating target which there is no hope of understanding is, imo, one of the great dying arts of software engineering - a death perpetuated by the same sort of folks who think CS education should be carried on in Java.

I contend that, these days, most C is written to target a particular compiler, physical machine, and/or device.

erik_seaberg · on July 6, 2019

There is vastly more old C code than new, and it didn't target the x64 or ARM architectures it's running on now. Where it wasn't portable, that was a defect that had to be fixed.

My first job was a 4GL targeting customers running DOS on the 80286, complete with runtime linking. 100% of that work has been abandoned due to incompatibility. It contributed nothing to the profession beyond what I personally learned.

scarface74 · on July 6, 2019

There is a Mac program BBEdit that was first written to target 68K 32 bit Macs, then PPC 32 bit Macs, then 32 bit x86 Macs and then 64 bit Macs. Probably within the next 3 years it will target ARM Macs.

The author said he never did a full scale rewrite. He slowly migrated code from one platform to the next.

Today, Apple’s code runs on both ARM and x86 and with Marzipan, as will developers code. True most will be in Objective C, but some low level code is still in C.

unwind · on July 6, 2019

I hope I'm being on topic and reasonable to point out that the result of the sizeof operator is in "number of chars", not bits.

hedora · on July 6, 2019

This is why, decades ago, the C world moved on, and added types like int32_t and size_t, so programmers can say what they mean.

bcaa7f3a8bbc · on July 6, 2019

My score is 3/5.

One immediate redflag I have noticed is using "int", "char", "short" as if they have a definite size. They don't. C standard only guarantees a minimum size. For example, many PDPs are 36-bit. Assuming the size of a variable is a common practice nowadays, but at least one should use uint8_t, int32_t, etc. from stdint.h.

But I was still tricked, it should be obvious in hindsight, 12 years of schooling led me to think: If the author was asking these questions, at least one or two questions must be answerable (even if it's technically incorrect, but you'd better to guess the original intention of a question). So I still tried to guess and got two wrong answers... Get to be careful next time...

Raymonf · on July 6, 2019

In school, I was taught to choose the "best answer" on a test or quiz, and if you don't know something, choose the answer that looks right to you.

This test.. it reverses that entirely.

NohatCoder · on July 6, 2019

It is school that gets it wrong. In real life, knowing that you don't know something and acting accordingly is often way better than taking a guess.

scarface74 · on July 6, 2019

I only got two out of five. One so really didn’t know and I remembered enough C to remember sequence points.

gravypod · on July 6, 2019

Unfortunately the "best answer" is taken to mean "the wrong answer that the teacher programmed into you".

JKCalhoun · on July 6, 2019

4/5 here. In fact after the third question provided "I don't know" as an answer I started to suspect something was up — especially since the author said only one answer was the right one ... why even provide "I don't know" then, I wondered?

I knew "int" was sort of platform-dependent (was 16-bits generally when I was learning to code, later 32-bit became more typical) — so combined with that niggle and all the "I don't knows", I (correctly) reevaluated by first couple of answers.

Still, didn't realize the last one was compiler-dependent.

beached_whale · on July 6, 2019

The third one has another implementation defined aspect. We do not know the value of a space ' ', in ascii it is 0x20(32) but that depends on the system, in EBCDIC it is 0x40(64).

dang · on July 6, 2019

Discussed at the time: https://hackertimes.com/item?id=12902304

https://hackertimes.com/item?id=12900279

Similarly titled in 2012: https://hackertimes.com/item?id=4657317

and 2011: https://hackertimes.com/item?id=3125891

Ultimatt · on July 6, 2019

Replacing "I don't know" with "this is undefined in the spec" or "this is implementation specific" would probably increase the pass rates.

flowerlad · on July 6, 2019

“I don’t know” is not the right answer. I do know. I know the answer to be “Unspecified”.

thayne · on July 6, 2019

But you _don't_ know what it returns, because the behavior is either undefined or implementation defined, and you don't know the implementation.

ben509 · on July 6, 2019

But you do know that it's undefined or implementation defined. It's a "known unknown".

Thorrez · on July 6, 2019

Yes, so you knowingly click "I don't know". You don't know the value, and you know that you don't know the value.

H8crilA · on July 6, 2019

Look, do you know what will it return or not?

saagarjha · on July 6, 2019

Not without knowing the compiler, which I think was the point.

vkaku · on July 6, 2019

I answered Idk to all. After the first two, the pattern became clear and I felt like if I wrote a compiler myself, the answers could be very different.

I've worked on 16 bit C code, 32 and now 64 bit code. So I knew that the behavior was implementation and optimization dependent. :)

Ignorance is bliss in C.

childintime · on July 6, 2019

I posted this in the embedded C-shop I work, under the comment that exactly in this place all should pass this test. Sadly only 1 in 5 passed the test (yours truly). Admittedly, this test was binary: either you pass or fail all of the questions (which is also sort of a give-away).

In the end this test proved to be a really valuable, because the "I don't know" drove the point home, specially for smart folks who don't like to answer any test, ever, with "I don't know".

ddevault · on July 6, 2019

5/5 - but I came to Hacker News to procrastinate from the C parser I've been writing.

meithecatte · on July 6, 2019

Wait, which of your projects needs a C parser?!

ddevault · on July 6, 2019

A new one

cat_plus_plus · on July 6, 2019

Well, that's a copout. Of course, if you take absolutely any computer architecture, you can't assume simple things like sizeof(int) or data structure alignment. But if sizeof(int) is at 4 and data needs to be aligned by its own size - like on any real architecture relevant today - many of these questions have a deterministic answer. In practice, compiler bugs are a much bigger issue than architecture assumptions.

tillulen · on July 6, 2019

What’s the latest compiler bug that caused a serious issue for you? (Genuinely interested, no sarcasm here.)

usr1106 · on July 6, 2019

I failed on one, number 4. I bravely assumed 16 bit integers cannot exist. Can anyone name a concrete platform/compiler where int is/was 16 bit. Or is this just a theoretical option left open by the spec?

caf · on July 6, 2019

Turbo C on MS-DOS for one. In fact 16-bit int was the norm on that platform, because the architecture didn't have 32-bit general purpose registers.

In the C89 days, you'd use 'short' in aggregates (structs and array) for values you knew wouldn't exceed 16 bits so didn't want to potentially waste space; 'long' in situations where you knew 16 bits wouldn't be enough; and 'int' the rest of the time (where 16 bits was enough, and there weren't any storage benefits to outweigh the performance benefit of using the native word size).

Sharlin · on July 6, 2019

Why couldn't they? C had already existed for a couple of decades when 32-bit machines started getting popular. `int`, as the default integer type, usually is the size of the machine word for best performance. It would make no sense to have slow, emulated 32-bit ´int`s on a 16-bit system, never mind 8-bit ones.

segfaultbuserr · on July 6, 2019

Many C compilers that target 8-bit (sometimes 16-bit) machines. MOS 6502, Zilog Z80, and Motorola 6809. Modern examples include Intel 8051, AVR and PIC.

inamberclad · on July 6, 2019

AVR, so certain Arduinos. On an Arduino Uno, sizeof(int) returns 2.

segfaultbuserr · on July 6, 2019

So does AVR's (ex?) competitor, PIC8. This comes from the documentation of Microchip's C18 compiler. https://i.stack.imgur.com/1uV3l.jpg

fengb · on July 6, 2019

Most compilers for 8-bit and 16-bit CPUs, e.g. Z80 or 80286.

rcfox · on July 6, 2019

My first thought was MSDOS/Turbo C, and I found this page when looking for confirmation: http://synfare.com/599N105E/hwdocs/sizes.html

kurlberg · on July 6, 2019

My amiga c compiler (aztex manx) allowed either 16 or 32 bit ints. All/most systems libraries used 32 bit parameters, despite this I insisted on the 16 bit version "for performance". In hindsight this was sort of insane: one missing L (say in "1L" for casting to long) meant a not so quick floppy disk reboot). :-)

Anyhow, for a computer with 16 bit wide data bus, having 16 bit ints might be justified by performance (and/or reducing memory usage.)

WalterBright · on July 6, 2019

Digital Mars C and C++ for DOS and (early) Windows.

ummonk · on July 6, 2019

I’ve worked with Arduinos where int was 16 bit.

Narishma · on July 6, 2019

x86 for one, up until 32-bit was introduced with the 80386.

peteri · on July 6, 2019

pdp-11

wrs · on July 6, 2019

Sad to say I scored perfectly, due to a similar early disillusionment on embedded platforms, and years of pain porting code between 16- and 32-bit architectures when the author thought they knew the size of “int”.

danbolt · on July 6, 2019

At the end of the test, the author talks about automation programming for a nuclear power plant. I don’t think I could ever sleep the same at night after writing something like that.

OldHand2018 · on July 6, 2019

> I don’t think I could ever sleep the same at night after writing something like that

In these situations, you likely know your hardware and know your compiler, so you can actually provide an answer for 4 of the questions. The last one is a situation where someone should tell you not to get cute in the code review.

I wrote C in telecom and finance and in both places we enforced a rule: when you define a structure, put a comment after each element that says what you think the structure offset should be, and at the end of the structure #define a constant that says what you think the size of the structure should be. In a code review, if anyone noticed something that didn't look right, you could talk about it. In testing, you could also check that sizeof(foo_s) == FOO_S_SIZE and fail if it wasn't.

In some of our code, we would test the size of various types and structures on startup and immediately exit if they weren't what we expected. We'd print type sizes to logs to help debugging if there was ever a problem. We were supporting a single code base that ran on big endian, little endian, X86, Itanium, SPARC, ARM. Compilers change, but automated tests of type and structure sizes catch things immediately.

It may sound like a lot of work, but it actually isn't at all. It also helps a lot with long-term maintainability.

bzbarsky · on July 6, 2019

> In some of our code, we would test the size of various types and structures on startup

This is one of the things that C++ has actually improved a lot recently: doing this with static_assert is much nicer in terms of catching problems early... And yes, it's great for long-term maintainability.

unwind · on July 6, 2019

C has had standardized static assert since the C11 spec was released. See [1] for instance.

[1]: https://stackoverflow.com/a/7287341

saagarjha · on July 6, 2019

I never thought I’d hear myself saying this, but that syntax is uglier than C++’s.

bzbarsky · on July 6, 2019

Excellent! I haven't followed the C standards as closely; glad they added this as well.

rramadass · on July 6, 2019

Neat techniques. I had done similar stuff in some protocol code that i had a chance to write.

Someone1234 · on July 6, 2019

Particularly writing it in C... It isn't a language well suited to be fully defined (see this very article for why), and no, Rust/Go aren't either. But Ada derivative or Haskell perhaps, there's some amazing tooling for safety critical systems and the languages themselves lend themselves to exposing side-effects.

pdpi · on July 6, 2019

> But Ada derivative or Haskell perhaps

Ada, maybe? I don't know enough about it to comment. You definitely don't want to use Haskell for that sort of work load, though, at least not directly. Laziness-by-default is precisely the sort of hard-to-reason-about logic you don't want in that sort of application.

That said, if I I had no alternative but to try and tackle this problem, I would seriously consider a strategy where I would write a Haskell program that would generate the actual program (potentially in ASM directly) for me.

icedchai · on July 6, 2019

I scored perfectly. I've been programming in C since 1989, on various platforms (started with the Amiga, then VAX/VMS, Linux x86, and various embedded systems.)

sys_64738 · on July 6, 2019

I emailed this to my boss telling him I got 0/5. He just setup a 1:1 meeting first thing Monday AM.

Something1234 · on July 6, 2019

Let us know how it goes...

fantasticFerret · on July 6, 2019

C programer here. I try to never think assume I know C.

morpheuskafka · on July 6, 2019

If we're going to be --pedantic, shouldn't the author specify the exact standard of the C language under test? A lot of companies have varying implementations of C and perhaps some do specify some of the behavior at hand here.

saagarjha · on July 6, 2019

Implementations of C≠C standards, though of course I'm sure some random obscure compiler has tried to call their C dialect a "standard" at some point.

zw123456 · on July 6, 2019

The better wording would be D) not enough information to give a definitive answer. It's like the old gotchya question; what is 1+1 ? of course the answer is it depends on if you are using binary or a base of integer >2.

rgoulter · on July 6, 2019

Yeah, or "undefined behaviour".

But that reminds me of the joke: there are 10 kinds of people: those who know binary, those who don't, and those who didn't know the '10' was written in base 3.

bzbarsky · on July 6, 2019

Or doing the whole thing mod 2.

bandushrew · on July 6, 2019

4/5 I tricked myself into thinking I could figure out the last one....oops.

chadcmulligan · on July 6, 2019

Isn't there a more fundamental flaw in these questions? main() always returns an int, whether that's 4 or 8 bytes, 0 or 1 means success or failure depends on the implementation. here's a bit of a discussion https://stackoverflow.com/questions/204476/what-should-main-...

Reminds me of the dumb exams some teachers would set to trick you when in school to make themselves feel superior.