Given the neighboring thread where I just learned that the lexer runs before the preprocessor, I’m not sure that would be the outcome. There’s no reason to assume the comment terminator wouldn’t be ignored in strings. And even today, you can safely write printf("hello // world\n"); without risking a compile error, right?
> Given the neighboring thread where I just learned that the lexer runs before the preprocessor, I’m not sure that would be the outcome.
That is precisely why nested comments would end up breaking the C89 code example I provided above. I elaborate on this further below.
> There’s no reason to assume the comment terminator wouldn’t be ignored in strings.
There is no notion of "comment terminator in strings" in C. At any point in time, the lexer may be reading a string or a comment, but never one nested within the other. For example, in C89, C99, etc., this is an invalid C program too:
In this case, we wouldn't say that the lexer is "honoring the comment terminator in a string" because, at the point the comment terminator '*/' is read, there is no active string. There is only a comment that looks like this:
/* Comment
printf("hello */
The double quotation mark within the comment is immaterial. It is simply part of the comment. Once the lexer has read the opening '/*', it looks for the terminating '*/'. This behaviour would hold even if future C standards were to allow nested comments, which is why nested comments would break the C89 example I mentioned in my earlier HN comment.
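To make the difference concrete, here is a minimal sketch of how a scanner could find the end of a comment under the current rule and under a hypothetical nesting rule. The function name comment_len and the nested flag are made up for this illustration, the nesting rule is an assumption about how such a feature might work, and the inputs are not the C89 example from my earlier comment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the comment that starts at src[0], or 0 if the
 * comment is never closed.
 *
 * nested == 0: the first terminator ends the comment (the C89/C99 rule).
 * nested != 0: every opener found inside the comment must be closed too
 *              (a hypothetical rule, not part of any C standard).
 *
 * Quotation marks get no special treatment in either mode.
 */
size_t comment_len(const char *src, int nested)
{
    size_t i = 2;   /* skip the opener */
    int depth = 1;  /* number of comments currently open */

    assert(src != NULL && src[0] == '/' && src[1] == '*');

    while (src[i] != '\0') {
        if (nested && src[i] == '/' && src[i + 1] == '*') {
            depth++;            /* inner opener, counted only when nesting */
            i += 2;
        } else if (src[i] == '*' && src[i + 1] == '/') {
            i += 2;
            if (--depth == 0)
                return i;       /* outermost comment is closed */
        } else {
            i++;                /* any other character, quotes included */
        }
    }
    return 0;                   /* ran out of input inside the comment */
}
```

For the text `/* a /* b */`, a call with nested set to 0 returns 12, so the comment is complete. With nested set to 1 it returns 0, because the inner opener now needs its own terminator. That is the kind of breakage I am worried about: text that is a complete comment today would become an unterminated one.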
> And even today, you can safely write printf("hello // world\n"); without risking a compile error, right?
Right. But it is not clear what this has to do with my concern that nested comments would break valid C89 programs. In this printf() example, we only have an ordinary string, so obviously this compiles fine. Once the lexer has read the opening quotation mark as the beginning of a string, it looks for an unescaped terminating quotation mark. So clearly, everything until the unescaped terminating quotation mark is a string!
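The "whichever delimiter comes first decides" behaviour can be sketched as a tiny scanner. This is only an illustration under simplifying assumptions: the names scan and enum kind are hypothetical, and it ignores character constants, line splicing, trigraphs and // comments, all of which a real C lexer has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification used only by this sketch. */
enum kind { KIND_OTHER, KIND_STRING, KIND_COMMENT };

/*
 * Classify the construct that starts at src[0] and return its length.
 * Returns 0 for an unterminated string or comment, and for empty input.
 */
size_t scan(const char *src, enum kind *kind)
{
    size_t i;

    assert(src != NULL && kind != NULL);

    if (src[0] == '/' && src[1] == '*') {
        /* Inside a comment: only the terminator matters, quotes do not. */
        *kind = KIND_COMMENT;
        for (i = 2; src[i] != '\0'; i++) {
            if (src[i] == '*' && src[i + 1] == '/')
                return i + 2;
        }
        return 0;
    }

    if (src[0] == '"') {
        /* Inside a string: only an unescaped quote matters, slashes do not. */
        *kind = KIND_STRING;
        for (i = 1; src[i] != '\0'; i++) {
            if (src[i] == '\\' && src[i + 1] != '\0')
                i++;                /* skip the escaped character */
            else if (src[i] == '"')
                return i + 1;
        }
        return 0;
    }

    *kind = KIND_OTHER;
    return src[0] != '\0' ? 1 : 0;  /* consume one ordinary character */
}
```

Given the string literal from the printf() example as input, scan() reports KIND_STRING and consumes the whole literal, slashes included. Given the comment shown earlier, it reports KIND_COMMENT and stops at the terminator, even though a quotation mark was opened before it.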