You're right, RLHF fine-tuning doesn't add any information to the model; it just steers the model towards our intentions.
But regular fine-tuning is plain language modelling: you can fine-tune GPT-3 on any collection of texts to refresh information that has gone stale since the 2021 cutoff of the public model, roughly like the sketch below.
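Since GPT-3's weights aren't public, here's a minimal sketch of that kind of causal-LM fine-tuning using GPT-2 via Hugging Face transformers instead; the file name and hyperparameters are placeholders, not anything from the original setup:

```python
# Minimal sketch: next-token (causal LM) fine-tuning on raw text.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Any collection of texts works; "new_docs.txt" is a placeholder file.
dataset = load_dataset("text", data_files={"train": "new_docs.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ft-out",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
    # mlm=False gives the plain autoregressive language-modelling objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The same idea applies to GPT-3 through OpenAI's hosted fine-tuning API, just with your text uploaded as training files rather than trained locally.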