
Not really. They're still non-deterministic language predictors. Believing that a prompt is an effective way to control these machines' actual behavior is really far-fetched.

They come like that from the factory. Hardcoded to never say no.



They're not hardcoded to never say no, but some of the models were trained to be "yes men" because their creators thought it would be a good property to have. GPT-4o for example.


The thing is that they are completely incapable of meta-cognition. Reasoning models don’t show their actual reasoning at all.


Right — they're not reasoning, they're generating text that statistically models reasoning. Anyone who says differently is selling something.


As the meme goes, "they are the same picture".


Language has reasoning encoded within it.


It certainly does. But so too do complex neural network functions, as do attention mechanisms.


That is what a base model does. After RL it is a very different thing, and anyone who says they know what it is, is naive or dishonest. These things are grown, not made, and we really do not understand how they work in many important ways.


Yeah, but they’re not magic; we can still do experiments and see what happens. Anthropic did a lot of work on this and showed that the models don’t accurately describe their own reasoning process.


Humans are also not magic; we also do experiments on humans, and humans also do not accurately describe our reasoning process.

Lot of confabulation going on in the moist blob of electrochemistry found encased within the hydroxyapatite crystal cage we call a skull. Why is it that things which rhyme, or get repeated a lot, seem more true? How come some of us suffer uncontrollable seizures from flashing lights?

We have to study reason to get any good at it, we absolutely suck at this without training. Our natural state is illiterate, innumerate, and illogical.

The most magical* part of our intelligence (all animals, not just humans) is how efficiently we learn from few examples, not how we reason.

* Questions about consciousness will have to wait until we can agree which of the 40+ definitions of the word actually answers the question we care about, and then figure out how to actually test for whatever that is.


Of course, the fact that they have to do that proves my point.


> non-deterministic language predictors.

Non?? Only those with sh*tty code, surely.

There's nothing inherently non-deterministic about inference.
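
To make that concrete, here is a minimal sketch with a toy stand-in for the model. Everything named here (VOCAB, toy_logits, generate, generate_greedy) is made up for illustration and is not any real library's API. The point is only that if the forward pass is a pure function of the context and the sampler's random state is fixed, the output repeats exactly.

```python
import math
import random

VOCAB = ["yes", "no", "maybe", "<eos>"]  # hypothetical toy vocabulary

def toy_logits(context):
    """Hypothetical stand-in for a forward pass: a pure function of the context."""
    n = len(context)
    return [math.sin(n + i) for i in range(len(VOCAB))]

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, subtracting the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def generate(seed, steps=8, temperature=1.0):
    """Temperature sampling with a seeded RNG: same seed, same tokens."""
    rng = random.Random(seed)
    context = []
    for _ in range(steps):
        probs = softmax(toy_logits(context), temperature)
        context.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return context

def generate_greedy(steps=8):
    """Greedy decoding: always take the highest-scoring token, no randomness at all."""
    context = []
    for _ in range(steps):
        logits = toy_logits(context)
        context.append(VOCAB[logits.index(max(logits))])
    return context

if __name__ == "__main__":
    print(generate(42) == generate(42))            # seeded sampling repeats
    print(generate_greedy() == generate_greedy())  # greedy decoding repeats
```

Where run-to-run variation does show up in real deployments, it is usually attributed to things outside the sampling math, such as unseeded sampling, floating-point reduction order on parallel hardware, or batching. That is an assumption about typical setups, not something this sketch demonstrates.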


Not believing that a prompt is an effective way to actually control their behavior is obviously incorrect to anyone who's actually used these things.

It's not a guaranteed way to control their behavior, but you can more than move the needle.


The word most relevant to this conversation is “influence.” Influence is possible and users observe it and use it to increase margins of useful outcomes. “Control” is incorrect.


Yeah, that distinction is pretty important, and I believe that guy IS making the point: if you cannot control it with guaranteed outcomes, you cannot control it.


You can't control it any more than you can control a draw from a deck of cards, but you can absolutely control the deck of cards that you choose to draw from.


The problem is that nobody really does that? Like, as far as I'm aware, even simple stuff such as not considering tokens that would result in a syntax error when writing code isn't being done.
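
For what it's worth, the "choose the deck" idea is usually called constrained decoding: before each draw, zero out the probability of any token that would make the output invalid. A minimal sketch, assuming a toy three-token vocabulary and random numbers standing in for model scores (allowed_tokens, constrained_sample, and is_balanced are hypothetical names, not a real library's API). The "grammar" here is just balanced parentheses, standing in for "no syntax errors".

```python
import math
import random

VOCAB = ["(", ")", "<eos>"]  # hypothetical toy vocabulary

def allowed_tokens(depth, remaining):
    """Tokens that keep the output completable as a balanced string.

    depth: number of currently unclosed "(".
    remaining: tokens we may still emit, including this one.
    """
    ok = set()
    if depth + 1 <= remaining - 1:  # room left to close a new "("
        ok.add("(")
    if depth > 0:                   # something is open, so ")" is legal
        ok.add(")")
    if depth == 0:                  # may only stop when everything is closed
        ok.add("<eos>")
    return ok

def constrained_sample(seed, max_len=10):
    """Sample token by token, masking out tokens the grammar forbids."""
    rng = random.Random(seed)
    out, depth = [], 0
    for remaining in range(max_len, 0, -1):
        # Assumption: random scores stand in for a real model's logits.
        logits = {tok: rng.gauss(0.0, 1.0) for tok in VOCAB}
        ok = allowed_tokens(depth, remaining)
        # Forbidden tokens get weight 0, so they can never be drawn.
        weights = [math.exp(logits[t]) if t in ok else 0.0 for t in VOCAB]
        tok = rng.choices(VOCAB, weights=weights, k=1)[0]
        if tok == "<eos>":
            break
        depth += 1 if tok == "(" else -1
        out.append(tok)
    return "".join(out)

def is_balanced(s):
    """Check that every ")" closes an earlier "(" and nothing stays open."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

if __name__ == "__main__":
    for seed in range(5):
        print(repr(constrained_sample(seed)))
```

Each individual draw is still random, but every possible draw comes from a deck that only contains valid continuations, so the final string is balanced regardless of what the scores were.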


Magicians can probably make you change your mind on the former.


That's silly. My car is not absolutely guaranteed to turn left when I turn the steering wheel left, but you wouldn't say I can't control my car on that basis.

Steering an LLM with a prompt is way less reliable than steering a car with a steering wheel, but there's still control. It's just not absolute.


If your car doesn't turn left when you turn the steering wheel left, the problem is that the car is broken. If an LLM does something unexpected after you gave it instructions, that's possible even when the LLM is functioning entirely correctly.


Nothing in this world is guaranteed. That doesn't mean it's uniformly random either. LLMs can still do something unexpected if you give them clear instructions, but that doesn't mean it'll be arbitrary and unpredictable in scope. It's the same way C/C++ undefined behavior technically means a program can give you nasal demons, but in reality it won't do anything unusual (like format your C: drive) unless someone purposefully coded it to do that.


This is all going to flash through your mind when your car mysteriously doesn't turn left. I would prefer to think of machines as things with defined outputs and failure is failure, more than as fluffy little kittens who might do the wrong thing, if the consequences are going to fall on someone who doesn't deserve it.



