
What people currently refer to as "generative AI" is statistical output generation. It cannot do anything but statistically generate output. You can, should you so choose, feed that output to a system with actual operational capabilities -- and people are of course starting to do this with LLMs, in the form of MCP servers (and other mechanisms before the MCP concept came along) -- but that's not new. Automation systems (including ones with feedback and machine-learning capabilities) have been put in control of all sorts of things for decades. (Sometimes people even referred to them in anthropomorphic terms, despite their relative simplicity.)

Designing those systems and their interconnects not to do dangerous things is basic safety engineering. It's not a special discipline that is new or unique to working with LLMs, and all the messianic mysticism around "AI safety" is just obscuring (at this point, one presumes intentionally) that basic fact.

Just as with those earlier automation and control systems, if you hook up a statistical text generator to an operational mechanism, you should put safeguards on the mechanism to stop it from doing (or design it to inherently lack the ability to do) costly or risky things -- much as you might fit a throttle limiter to a machine where an overspeed commanded by computer control would be damaging. But not because the control system has "misaligned values".
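
To make that concrete, here is a minimal sketch in Python of what actuator-side safeguarding looks like. Everything in it (the GuardedThrottle class, the set_rpm interface, the specific limits) is hypothetical and purely illustrative, not any real control API; the point is that the safety property lives in a wrapper around the mechanism, which neither knows nor cares whether the commands come from an LLM, a PID loop, or a human:

    # Hypothetical sketch: enforce limits at the actuator, independent of
    # whatever controller produced the command. All names are illustrative.

    MAX_RPM = 4000       # hard mechanical limit; never exceeded
    MAX_STEP_RPM = 250   # largest allowed change per command

    class GuardedThrottle:
        def __init__(self, actuator):
            # 'actuator' is any object exposing set_rpm(int); assumed here.
            self._actuator = actuator
            self._current_rpm = 0

        def command(self, requested_rpm: int) -> int:
            # Clamp the request to the hard limit...
            target = max(0, min(requested_rpm, MAX_RPM))
            # ...then rate-limit how fast the setpoint may move.
            delta = max(-MAX_STEP_RPM,
                        min(target - self._current_rpm, MAX_STEP_RPM))
            self._current_rpm += delta
            self._actuator.set_rpm(self._current_rpm)
            return self._current_rpm

A controller that "asks" for 50,000 RPM simply gets 4,000, reached in bounded steps. No theory of the controller's "values" is required for the guarantee to hold.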

Nobody talks about a malfunctioning thermostat that makes a room too cold being "misaligned with human values", or a miscalibrated thermometer exhibiting "deception", even though both can pose very real risks to humans -- or mislead them -- depending on what they control or on what relies on their readings being accurate. (Just ask the 737 MAX engineers about software taking improper actions based on faulty inputs: the MAX's MCAS was not malicious; it was poorly engineered.)

As to the last point, the burden of proof is not on showing that a nonliving thing lacks mind or will -- it's the other way around. Back in the day, people without a programming background regularly described ELIZA as "insightful" or "friendly" or in other such anthropomorphic terms, but nobody with even rudimentary knowledge of how it worked said "well, prove ELIZA isn't exhibiting free will".

Christopher Strachey's commentary on the ability of the computers of his day to do things like write simple "love letters" seems almost tailor-made for the current LLM hype:

"...with no explanation of the way in which they work, these programs can very easily give the impression that computers can 'think.' They are, of course, the most spectacular examples and ones which are easily understood by laymen. As a consequence they get much more publicity -- and generally very inaccurate publicity at that -- than perhaps they deserve."



LLMs are already capable of complex behavior. They are capable of goal-oriented behavior. And they are already capable of carrying out the staples of instrumental convergence, such as goal guarding and instrumental self-preservation.

We also keep training LLMs to work with greater autonomy, over longer timescales, and on more complex goals.

Whether LLMs are "actually thinking" or have "volition" is pointless pseudo-philosophical bickering. What's real and measurable is that they are extremely complex and extremely capable -- and both are expected to keep increasing.

If you expect an advanced AI to pose the same risks as a faulty thermostat, you're delusional.



