I don't trust anyone who claims that LLMs today are superhumanly intelligent. All they do is perform compute-intensive brute-force attacks on the problem/solution space and call it 'reasoning', all while subsidising the real costs to capture the market. So much SciFi BS and extrapolation about a technology that is useful if adopted with care.
This technology needs to become a commodity to destroy this aggregation of power among a few organizations with untrustworthy incentives and leadership.
Your brain is performing "compute-intensive brute-force attacks on the problem/solution space" as you read this very sentence. You have been training patterns on English syntax, structure, and semantics since you were a child, and that training is supporting you now with inference (or interpretation). And, for compute efficiency, you probably have evolution to thank.
People like to say this as if the two were apples to apples, but this comparison isn't remotely how the brain actually works - and even if it were, the brain does it automatically, without direction, and at an infinitesimal fraction of the power required.
And we're just talking about cognition - it completely ignores the automatic processes such as maintaining and regulating the body and its hormones, coordinating and maintaining muscles, and visual/spatial processing that takes in massive amounts of data at a very fine scale and informs the body what to do with it - I could go on.
One of the more annoying things about this conversation is that you don't even need to make this argument to make the point you're trying to make, but people love doing it anyway. It needlessly reduces how amazing the human brain is to a bunch of catchy sci-fi-sounding idioms.
It can be simultaneously true that transformer based language models can be very smart and that the human brain is also very smart. It genuinely confuses me why people need to make it an either/or.
Thank you, this comparison has been a huge annoyance of mine for the past 3 years of... this same debate over and over.
I think it's the hubris that I find most offensive in this argument: a guy knows one complex thing (programming) and suddenly thinks he can make claims about neuroscience.
Human cognition is nothing like AI "cognition." It really bothers me that people think AI is doing the same thing the human mind does. AI is more like a parrot that is trained to give a correct-looking response to any question. The parrot doesn't think, doesn't know what it's doing, etc.; it just does it because it gets a treat every time a "good" answer is prompted. This is why it can't do things like know how many parentheses are balanced here ((((()))))) (you can test this); it doesn't have any kind of genuine cognition.
I've wondered about this. Do we really know enough about what the human brain is doing to make a statement like this? I feel like if we did, we would be able to model it faithfully and OpenAI, etc. would not be doing what they're doing with LLMs.
What if human cognition turns out to be the biological equivalent of a really well-tuned prediction machine, and LLMs are just a more rudimentary and less-efficient version of this?
Yes, we do. Humans share the statistical-association ability that LLMs possess, but we also have conscious meaning and understanding. This is a difference in kind, and it means we can generalize beyond the statistical pattern associations we've extracted from data, so we don't require trillions of examples to develop knowledge.
Theoretically, a human could sit alone in a dark room, knowing nothing of mathematics, and come up with numbers, arithmetic, algebra, etc.
They don't need to read every math textbook, paper, and online discussion in existence.
"Pre-training" is not a good term if you are trying to compare it to LLM pre-training. A closer analogue would be the model's architecture and learning algorithms, which have been designed through decades of PhD research, and my point is that the differences are still much greater than the similarities.
FYI, Opus 4.6 had no problem with your arbitrary "cognition" test:
Someone on HN claimed "This is why it [LLMs] can't do things like know how many parenthesis are balanced here ((((()))))) (you can test this), it doesn't have any kind of genuine cognition". So, how many parenthesis are balanced in that quoted text?
● The string from the quote is ((((()))))) — 5 opening parens and 6 closing parens.
10 parentheses are balanced (5 matched pairs). There is 1 extra unmatched ).
Walking through it with a stack:
( ( ( ( ( ) ) ) ) ) )
1 2 3 4 5 4 3 2 1 0 -1 ← depth after each character
(depth returns to 0 after the tenth character: balanced; the final ) pushes it to -1: unmatched)
The depth goes negative on the last ), meaning it has no matching (.
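For what it's worth, the same depth-counter check the model walked through is only a few lines of code. A minimal sketch in Python (the function name and report format are just illustrative):

    def balance_report(s: str) -> tuple[int, int]:
        """Return (matched_pairs, unmatched) for a string of parentheses."""
        depth = 0      # current nesting depth
        matched = 0    # completed () pairs
        unmatched = 0  # closing parens that never had an opener
        for ch in s:
            if ch == "(":
                depth += 1
            elif ch == ")":
                if depth > 0:
                    depth -= 1
                    matched += 1
                else:
                    unmatched += 1
        unmatched += depth  # any openers still "on the stack" are also unmatched
        return matched, unmatched

    parens = "(" * 5 + ")" * 6  # the 11-character string from the quote
    print(balance_report(parens))  # -> (5, 1): 5 matched pairs, 1 stray )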
This is such a boring cliché by now. "Thinking" and "knowing what it's doing" are totally vague notions that we barely understand even for the human mind, yet in every comment section about AI people definitively state that LLMs don't do them, whatever they are.
This is the epitome of learned helplessness: needing a neuroscience paper to tell you what thinking and knowledge are, when you experience them directly all the time, and being unable to tell that an LLM doesn't have them.
Something is extremely evil about these ideologies that are teaching people that they are NPCs.
I know I'm thinking, I have no idea if you're thinking, or if you're a human or an LLM. But I wouldn't assume you aren't thinking just from reading your output.
I love reading posts like this. When you were a child, learning math or grammar, do you not remember bouncing off the walls of incorrect answers, eventually landing on a trajectory down the corridor of the right answer? Or were you always instantly zero-shotting everything?
In my experience, this is exactly how language models solve hard new problems, and largely how I solve them too. Propose a new idea, see if it works, iterate if not, keep going until it works.
Of course you can see how to solve a problem that you've seen before, like a visual puzzle about balanced parentheses. We're hyper specialized to visually identify asymmetries. LMs don't have eyes. Your mockery proves nothing.
The mistake in these kinds of arguments is assuming that because natural, classical-artificial, and neural-net-artificial learning methods all employ some kind of counterexample/counterfactual reasoning, they must be doing the same thing; their underlying methods could well be fundamentally different. These arguments remain invalid until computer science advances enough to explain what the differences and similarities actually are.
I suspect we just continually overestimate the uniqueness of both our code and our vocabulary. We think we are pretty smart, and we are, but on these two measures 99.999% of us are pretty average, and the LLM just keeps surprising us anyway by proving it.
> Human cognition is nothing like AI "cognition." It really bothers me that people think AI is doing the same thing the human mind does.
This might sound callous, but I wonder if people saying this themselves have very limited brains, more akin to stochastic parrots than to the average Homo sapiens.
We are very different, and there are some high-profile people who don't even have an internal monologue or self-introspection abilities (one of the other symptoms is having an egg-shaped head).
> This might sound callous, but I wonder if people saying this themselves have very limited brains, more akin to stochastic parrots than to the average Homo sapiens.
I have a different theory.
Aside from a few exceptions like Blake Lemoine, few people seem to really act as if they believe A.I. is doing the same thing the human mind is doing.
My theory is that people are role-playing as people who believe human thought is equivalent to A.I., for reasons they themselves may or may not understand. They do not actually believe their own arguments.
> All they do is perform compute-intensive brute-force attacks on the problem/solution space and call it 'reasoning'
If they discover the cure for cancer, I don't care how they did it. "I don't trust anyone who claims they're superhumanly intelligent" doesn't follow from "all they do is <how they work>".
That's moonshot logic that reinforces the parent's point. You'd absolutely care if the AI's cure for cancer entailed full-body transplants or dismemberment.
"The cure for cancer" as a phrase doesn't include those solutions. If the headline was "Pope discovers the cure for cancer" and those were his solutions you would say "No he didn't." OP was referring to AI discovering the cure for cancer that cancer research is working towards.
> "I don't trust anyone who claims they're intelligent" doesn't follow from "all they do is <how they work>".
It kind of does if how they work is nothing like genuine intelligence. You can (rightly) think AI is incredible and amazing and going to bring us amazing new medical technologies, without wrongly thinking its super amazing pattern recognition is the same thing as genuine intelligence. It should be worrying if people begin to believe the stochastic parrot is actually wise.
I can slow down the compute by a factor of a thousand. It would not change the result, but it would change the economics. We only call it intelligent because we can do the backpropagation and the inference (and training) fast enough, and with enough memory, for it to appear this way.
If LLMs can come up with superhumanly intelligent solutions, then they're superhumanly intelligent, period. Whether they do this by magic or by stochastic whatever doesn't make any difference at all.
If all they do is "just" brute-force problem solving, then they are already bound to take over R&D and other knowledge work and exponentially accelerate progress, i.e. the SciFi "singularity" BS ends up happening all the same. Whether we classify what they do as true reasoning is just semantics.
Axel's engagement with the issue and his refusal to give up are admirable. It also demonstrates that code and architecture remain important even in an era when managers believe these subjects can now be handled by LLMs. Imagine if LLMs were mandated for use in such an environment, further distancing SWEs from the code and the overarching architectural choices. I am not saying that it can't work. But friction, and maturity gained through experience, really matter.
This also explains perfectly why I have never met an engineer who was eager to run workloads on Azure. In the orgs I worked in, the use of Azure was either mandated by management (probably with good $$ incentives) or pushed through Microsoft leaning into the "multi-cloud for resilience" selling point to get orgs to shift workloads away from competitors.
So when Anthropic releases a new model that "breaks compatibility" with some Markdown files, do we call it "refactoring" to find (guess) the changes required to get the desired outcome again? Aren't we creating brittle specifications that fit one particular version of a model?
You just said it. If consistency is that important, keep consistent versions of the model, harness, prompts, skills, etc., and regression-test changes. That way lies madness :)
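A minimal sketch of what that pinning plus regression testing could look like in Python (every name here - the config fields, run_agent, the golden cases - is hypothetical, not any particular vendor's API):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class HarnessConfig:
        model: str = "pinned-model-id-2025-01-01"  # an exact model snapshot, never "latest"
        prompt_version: str = "v12"                # revision of the system prompt / skill files
        temperature: float = 0.0                   # keep sampling as deterministic as possible

    # Golden tasks with crude expectations; a real suite would use richer checks or judges.
    GOLDEN_CASES = {
        "Summarize the release notes as a bullet list": "- ",
    }

    def run_agent(config: HarnessConfig, task: str) -> str:
        # Hypothetical stand-in: replace with a call to whatever model/harness is actually in use.
        return "- stubbed output so the test below can run as-is"

    def test_no_regressions() -> None:
        config = HarnessConfig()
        for task, expected_fragment in GOLDEN_CASES.items():
            output = run_agent(config, task)
            assert expected_fragment in output, f"regression on task: {task!r}"

The point is just that every moving part lives in one pinned config, and you only bump one of those versions after the whole golden set passes again.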
I have been using Java since version 1.4. Both the language and its ecosystem have come a long way since then. I endured the height of the EJB phase. I adopted Spring when version 1.2 was released. I spent hours fighting with IDEs to run OSGi bundles. I hated building UIs with Swing/AWT, many of which are still in use today and are gradually being replaced by lovely JavaFX. When I look at code I wrote around 12 years ago, I'm amazed at how much I've matured too.
> I hated building UIs with Swing/AWT, many of which are still in use today and are gradually being replaced by lovely JavaFX.
Dude, JFX yielded what used to be called RIAs to JavaScript almost 15 years ago. Of the three major GUI toolkits - Swing, JavaFX, and SWT - it was Swing that gained HighDPI support first (10 years ago), and it continues to be the base for the kick-ass IntelliJ IDEA and other JetBrains IDEs.
Swing was simpler in some ways than JavaFX. I still remember the JavaFX TreeView taking three generics, and I was unable to figure out from the docs what one of them was for. I had to go on Stack Overflow, where someone said it basically didn't matter. But JavaFX looked great at the time.
> SQLite is not primarily fast because it is written in C. Well.. that too, but it is fast because 26 years of profiling have identified which tradeoffs matter.
Someone (with deep pockets to bear the token costs) should let Claude run for 26 months to have it optimize its Rust code base iteratively towards equal benchmarks. Would be an interesting experiment.
The article points out the general issue when discussing LLMs: audience and subject matter. We mostly discuss interactions and results anecdotally. We really need much more data, more projects that succeed with LLMs or fail with them - or that linger in a state of ignorance, sunk-cost fallacy and suppressed resignation. I expect the latter will remain the standard case that we do not hear about - the part of the iceberg that is underwater, mostly existing within the corporate world or in private GitHubs, which is true with LLMs and without them.
In my experience, "Senior Software Engineer" has NO general meaning. It's a title that gets awarded anew for each participation in a project/product, over and over again. The same goes for the claim: "I, a Senior SWE, treat LLMs as junior SWEs, and I am 10x more productive." Imagine me facepalming every time.
If the output has problems, do you usually rerun the compilation with the same input (that you control)? I don't usually.
What is included in the "verify" step? Does it involve changing the generated code? If not, how do you ensure things like code quality, architectural constraints, efficiency, and consistency? It's difficult, if not (economically) impossible, to write tests for these things. What if the LLM does not follow the guidelines outlined in your prompt? That still happens. If none of that is covered, I would call it "brute forcing". How much do you pay for tokens?
I thought to myself that I do this pretty frequently, but then I realized it's only when I'm going from make -j8 to make -j1. I guess parallelism does throw some indeterminacy into this.
If parallelism adds indeterminacy, then you have a bug (probably in working out the dependency graph). Not an unusual one - lots of open-source projects in the 1990s had warnings about not building above -j1, because multi-core systems weren't that common and people weren't actually trying it themselves...
Whenever I traced them, those bugs were always in the logic of the makefile rather than in the compiler. A target in fact depends on another target (generally from much earlier in the file), but the makefile doesn't specify that - e.g. an object file needs a generated header but doesn't list it as a prerequisite, so under -j8 the compile can race the code generation step.
Why not show a summary of who actually received the data? It should be easy to implement. You could also add what data is retained and an estimate of how long it is kept for. It could be a summary page that I can print as a PDF after the process is complete.
I'd consider that a feature that would increase trust in such a platform. These platforms require trust, right?
What is the purpose of an AGENTS.md file when there are so many different models? Which model or version of the model is the file written for? So much depends on assumptions here. It only makes sense when you know exactly which model you are writing for. No wonder the impact is 'all over the place'.