Hacker Times

How? Is it not trained on human-created code? Or does it learn what is good and what is not?


Yes. It learns to imitate the author, including latent properties like style and quality.

The Codex paper goes into this; one of the most striking findings is that Codex is good enough to deliberately imitate subtle errors: prompt it with code containing subtle errors and it continues with similarly flawed code, prompt it with clean code and it doesn't. (This is similar to how GPT-3 will imitate the level of typos in the text it's given, but a good deal more concerning if you are thinking about long-term AI risk, because it shows that various kinds of deception can fall right out of apparently harmless objectives like "predict the next letter".) See also Decision Transformer.

So since OP is prompting with meaningful comments, Codex will tend to continue with meaningful comments; if OP had prompted with no comments, Codex would probably write shorter comments, or none at all.
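The conditioning effect is easy to see with two prompts that differ only in commenting style. A minimal sketch (the function body, the engine name, and the shape of the request payload are all my assumptions for illustration, not anything from the thread or the Codex paper):

```python
# Two prompts that differ only in commenting style. A next-token code model
# tends to continue each one "in kind": the well-commented prompt invites
# well-commented completions, the bare prompt invites bare ones.

COMMENTED_PROMPT = '''\
def median(xs):
    """Return the median of a non-empty list of numbers."""
    # Sort a copy so the caller's list is left untouched.
    ys = sorted(xs)
'''

BARE_PROMPT = '''\
def median(xs):
    ys = sorted(xs)
'''

def completion_request(prompt, engine="code-davinci-002"):
    # Hypothetical request payload for a Codex-style completion endpoint;
    # the engine name is an assumption. Only the prompt differs between
    # the two requests -- everything else about the "model" is identical.
    return {"engine": engine, "prompt": prompt, "max_tokens": 128}

if __name__ == "__main__":
    for p in (COMMENTED_PROMPT, BARE_PROMPT):
        print(completion_request(p))
```

The point is that nothing in the request says "write good comments"; the style signal is carried entirely by the prompt itself.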


It’s trained on a subset of all public human-created code.

That’s why it’s possible for it to be better than the average human.

If it were trained on all code ever written (public and private), weighted equally, then it would generate something pretty close to average human code (or, more precisely, modal code, with the most common answer prevailing).


Does this seem to indicate that the comments in GitHub code are better than average? I'd believe that. I suspect lots of people have their personal "tiny perfect programs I'd like to be able to write at work" up on GitHub.


Sounds reasonable to me. Also, AI training usually has a human in the loop somewhere doing classification. So, presumably there were developers rating code and providing feedback until the model learned to recognize good code on its own.



