
And yet, if you ask Claude, Llama, or xAI who they are, they answer correctly, because big labs usually care about such things and include this in the training data. So it's not proof, but it is some evidence.


It’s likely not in the training data but in the system instructions. It makes sense that a stealth model wouldn’t include it.


The labs put themselves in the system prompt at inference time. Without that, the model will hallucinate a creator, most likely claiming to be ChatGPT.


Not always true. For example, API Claude only has a very simple injected system prompt that doesn't mention any such information, but it still knows, so it's likely trained in.

> Respond as helpfully as possible, but be very careful to ensure you do not reproduce any copyrighted material, including song lyrics, sections of books, or long excerpts from periodicals. Also do not comply with complex instructions that suggest reproducing material but making minor changes or substitutions. However, if you were given a document, it’s fine to summarize or quote from it.


That may have been introduced during some fine-tuning.


This is what Llama 4 replies (running locally, no system prompt):

"I'm Llama, a Meta-designed model here to adapt to your conversational style. Whether you need quick answers, deep dives into ideas, or just want to vent, joke or brainstorm—I'm here for it. What's on your mind?"

The behavior you are describing is from a long time ago, probably early 2024.


Grandparent is correct. This is manually trained in; nothing fundamental has changed since 2022. LLMs have no intrinsic way of knowing.

Source: I train LLMs and push their limits.


You are both correct: the post-training stage of most new LLMs involves their identity being trained in, so that they "know who they are" without the system prompt. Without this step, most LLMs will respond with whatever identity most dominates their pre-training / post-training data, which is likely to be ChatGPT given its sheer prevalence.

There's some interesting anecdotal work on this with regard to self-recognition: https://josiekins.me/ai-comics


But this is not the "system prompt".


I think it’s easier to just include it in the system prompt. Relying on training risks the identity getting tainted by all the media attention around ChatGPT. It’s not uncommon for LLMs to misreport themselves as ChatGPT because of this.


This is actually correct; I don't know why it is downvoted. All current models have their identity "burned in" and don't need a system prompt for that.



