Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

> Do you write parsers or state machines very often?

Yes and yes. Never have used or needed a `goto` in C for these. Could you provide an example of how it would be useful?



If you have a state machine with several mutually recursive states, you can end up with a dangerously deeply nested call stack* . (Particularly if the state machine is reacting to an infinite stream of data and never actually returns.) One way around this is moving as much of the logic as is feasible into one large function and use gotos instead of function calls, managing the accumulated data yourself. (In a language with tail-call optimization, you can just use function calls. Much better.) This is usually a bit cumbersome to manage by hand, but a good strategy when generating C code from a DSL. (Look at the output from lex, for example.) This is also common when implementing a virtual machine in C.

* Especially on embedded hardware, which may only have a few KB for the stack.

Another use case for gotos is handling cleanup on error, when writing code that has to be fault-tolerant. Here's a rough example, generalized from a VM I'm working on:

    typedef struct thing {
        int id;         /* object ID */
        int buf_sz;     /* current size of the buffer */
        char *buf;      /* internal buffer */
        foo *f;         /* some other thing that needs alloc / init */
    } thing;
    
    thing *thing_new(int id, int buffer_size) {
        thing *t = malloc(sizeof(*t));
        foo *f = NULL;
        char *buf = NULL;
        if (t == NULL) goto cleanup;
        buf = malloc(buffer_size);
        if (buf == NULL) goto cleanup;
        t->buf = buf;
        f = foo_new();
        if (f == NULL) goto cleanup;
        t->f = f;
        return t;
    
    cleanup:
        /* Avoid leaking memory if any part failed. */
        if (f) foo_free(f);
        if (buf) free(buf);
        if (t) free(t);
        return NULL;
    }
I don't tend to use goto by hand much outside of that particular idiom, but it's common enough that it should be recognized, and the equivalent with ifs and multiple returns would be much worse: "if not B, free A; if not C, free A and B; if not D, free A, B, and C; if not E, ...".


This is ok as it is what is deemed exception handerling. Though personaly I like to handle as many known exceptions individualy. I also like to have all functions to return a status code and with that you can cleanup any possible exception and still have all your error handerling code in one place and allowing graceful handerling of any exception. It's a little bit more effort, but when you need that level of assurance you pay for it one way or another.


I would not really call that a state machine. (Or a parser.) Which were the topic at hand. All of my state machine have been for embedded targets and generally look like

    rules(struct stateful state) {
        switch (state->foo) {
        case ....
        }
    }

    sensors(struct stateful state) {
        state->button1 = ....
    }

    motors(struct stateful state) {
        switch(state->foo) {
            case ....
            portBar = ....
        }
    }

    main() {
        for(;;){
            sensors(state)
            rules(state)
            motors(state)
        }
     }
Simple state machine, no gotos are needed. Works fine for non-simple ones too. And the first rule of fault tolerant code (particularly for embedded) is never use malloc.

I agree that goto is useful for error handling, but not parsers or state machines. Anyway, I'll play your bait-and-switch. Here is error handling without goto.

    typedef struct thing {
        int id;         /* object ID */
        int buf_sz;     /* current size of the buffer */
        char *buf;      /* internal buffer */
        foo *f;         /* some other thing that needs alloc / init */
    } thing;
    
    thing *thing_new(int id, int buffer_size) {
        do {
            thing *t = malloc(sizeof(*t));
            foo *f = NULL;
            char *buf = NULL;
            if (t == NULL) {break;}
            buf = malloc(buffer_size);
            if (buf == NULL) {break;}
            t->buf = buf;
            f = foo_new();
            if (f == NULL) {break;}
            t->f = f;
            return t;
        } while (0)
        // clean up
        if (f) {foo_free(f);}
        if (buf) {free(buf);}
        if (t) {free(t);}
        return NULL;
    }
Now normally I would never do that. More typically I would actually use the loop. Because I would not be malloc-ing, I would be trying to initialize a piece of hardware over SPI or i2c. The do...while would be replaced with a for(i=maxtries; i; i--) loop. After maxtries, the loop terminates and the peripheral is shut down.


And now, how do you break from within the 2nd and 3rd level of nesting? break can only break one of them. If only "break" had a label, so you could say "break toplevel;" instead of just "break;" (and name the relevant scope "toplevel", of course).

Turns out, you actually can! instead of "break toplevel;", you just write "goto toplevel;". There's another minor change, in that you have to put the name at the end of the scope, rather than the beginning of the scope, which is why people tend to name it after the next block (e.g. "goto cleanup;" in this case).


I do not believe you have actually read any of the conversation.

Still waiting for someone to show me a state machine that absolutely needs goto.


I was only replying to your "here's error handling without goto". And I'm not implying that it is impossible - just that your solution does not scale as is to more than one level of scoping.

There is nothing that absolutely needs goto (including error handling), because Turing completeness does not require goto. so you might wait forever; I'm not sure what it is that you guys are arguing about with respect to state machines.

(of note, I keep waiting for TCO proponents to show me an example in which the guarantee of TCO in scheme makes the world so-much-better. The claim always comes up, and every example I've seen so far requires at most adding two more lines in Python, and no adding of TCO)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: