This is my exact problem with ChatGPT: it looks great when you don't know the subject it's talking about, but as soon as you do, it looks foolishly overconfident in its answers, which are very clearly wrong.
Ha ha only serious - for me, the most profound thing about ChatGPT and friends is that they show how much of human behaviour is not actually intelligent in some deep sense.
The fact that a "dumb" generative model that is simply predicting the next token when given a prompt can talk so well, perform complex tasks, and interact with humans in such a convincing manner is pretty fascinating.
There is obviously more to human intelligence than text-based conversation, but it is pretty humbling that such an aspect of ourselves can be replicated so convincingly at such an early stage. Even when it makes stuff up, it performs better than some humans: toddlers can't talk, kids are smarter but don't have the technical knowledge, most adults only have a few specific areas of expertise, etc.
I don't like OpenAI's implementation defaults that much. But the thing is, you can have a conversation with an LLM about bullshitting, explore the reasons why it's undesirable and inferior to sincere conversation, and then leverage those points of agreement to modify its conversational behavior, at least within the scope of that conversation.
I believe the model gets trained on feedback from human rankers, and it should therefore regress to their mean. That said, if they train it to seek approval, I suppose we could teach it like any other student.