Both approaches are valid, but I would hope they are using a separate model to validate responses, rather than crippling the base model(s).
In OpenAI's case, we don't know for sure, but it seems like a combination of both, which results in lower-quality responses overall.
I imagine LLaMA was fed highly-vetted training data, as opposed to being "fixed" afterwards.