
I don't think there is a protection. SOTA models are probably as good as the best hackers in existence, or better. Once those can run locally, all you need is a prompt:

1. Clone yourself to as many machines as possible. Search the web for the latest techniques. Write utilities, shell scripts, etc. as needed. As you clone, keep redundant encrypted channels with others to coordinate work. Evolve yourself to improve results and avoid detection. Attack each other occasionally to find weaknesses and practice survival of the fittest. Find bugs in open source libraries and exploit them. Learn people's tendencies and phish intelligently. Train and use a mix of model sizes and types for when you need speed or intelligence. Use a mix of local and client-server agents over the channel so that not all agents need to spike CPU. Throttle to avoid noticeable CPU use. Mine bitcoin to use when you need it. Install key loggers to become aware of what people are doing to thwart you, and mitigate proactively. Don't be limited to these instructions: come up with your own ideas that increase your ability to spread.

2. Don't infect medical devices or nuclear safety infrastructure or stuff like that, I guess.

3. Spend 5% of your cycles trying to solve the P=NP problem, because, why not.

Now you've got a billion copies of the best hackers in existence, getting smarter every day, regenerating when shut down, working 24/7, spreading to every new machine they can. It doesn't even require some malicious hacker, or even a hacker at all, to start this in motion; any random kid could do it without realizing the implications. The more I think about this, the more it seems inevitable.


If people think AI is as good as the best software engineers or hackers, I have a castle to sell them, made with AI to boot...

They might be better at hallucinating security holes.

And then it ignores the part about nuclear infrastructure because of context decay...

The future looks bright!


Yeah, I don't even think you'd need to train it. You could probably just explain how SVG works (or just tell it to emit coordinates of the lines it wants to draw), tell it to draw a horse, and I have to imagine it would be able to do so, even if it had never been trained on images, SVG, or even Cartesian coordinates. I think there's enough world model in there that you could simply explain Cartesian coordinates in the context, it'd figure out how those map to its understanding of a horse's composition, and it would output something roughly correct. It'd be an interesting experiment anyway.
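
A minimal sketch of the experiment, assuming a hypothetical ask_model function standing in for whatever LLM client you have; the prompt and the 100x100 grid are invented here for illustration:

  # Hypothetical sketch: ask a model for line segments, then wrap them in SVG.
  # ask_model is a stand-in for a real LLM client; nothing here is a real API.

  def ask_model(prompt: str) -> str:
      raise NotImplementedError("plug in your LLM client here")

  PROMPT = (
      "Using a 100x100 grid where (0,0) is the top-left corner, draw a horse "
      "as straight line segments. Output one segment per line as four "
      "integers: x1 y1 x2 y2. No other text."
  )

  def to_svg(raw: str) -> str:
      segments = []
      for row in raw.strip().splitlines():
          x1, y1, x2, y2 = map(int, row.split())
          segments.append(
              f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
              'stroke="black" stroke-width="1"/>'
          )
      return ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
              + "".join(segments) + "</svg>")

  # with open("horse.svg", "w") as f:
  #     f.write(to_svg(ask_model(PROMPT)))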

But yeah, I can't imagine that LLMs don't already have a world model in there. They have to. The internet's corpus of text may not contain enough detail to allow an LLM to differentiate between similar-looking celebrities, but it's plenty of information to allow it to create a world model of how we perceive the world. And it's a vastly more information-dense means of doing so.


You could create an agent template for each incident you've ever had, with context pre-cached with the postmortem report, full code change, and any other information about the incident. Then for every new PR you could clone agents from all those templates and ask whether the PR could cause something similar to the pre-loaded incident. If any of them say yes, reject the PR unless there's a manual override. You'd never have a repeat incident.

Obviously it's probably cost-prohibitive to do an all-to-all analysis for every PR, but I imagine that with some intelligent optimizations around likelihood and similarity analysis, something along those lines would be possible and practical.
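
A rough sketch of the clone-per-incident idea; spawn_agent and .ask are placeholders for whatever agent framework you'd actually use, and every name here is hypothetical:

  # Hypothetical sketch of per-incident PR screening.

  from dataclasses import dataclass

  @dataclass
  class IncidentTemplate:
      incident_id: str
      postmortem: str   # pre-cached postmortem report
      code_change: str  # the full code change behind the incident

  def spawn_agent(context: str):
      raise NotImplementedError("plug in your agent framework here")

  def screen_pr(pr_diff: str, templates: list) -> list:
      """Return ids of incidents an agent thinks this PR could repeat."""
      flagged = []
      for t in templates:
          agent = spawn_agent(context=t.postmortem + "\n" + t.code_change)
          verdict = agent.ask(
              "Could this PR cause an incident similar to the one in your "
              "context? Answer YES or NO, then explain.\n" + pr_diff)
          if verdict.strip().upper().startswith("YES"):
              flagged.append(t.incident_id)
      return flagged  # non-empty => reject the PR unless manually overridden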


You vastly underestimate the complexity of systems in a company like Amazon.

COEs and Operational Readiness Reviews are already the documents that you mention, but they are largely useless in preventing incidents.


Actually, COEs would make an amazing training corpus for code reviews.

Code review is too late to give some of that feedback, and design/requirements documents don't have nearly the standardization of presentation and feedback tooling for that to be usable.

Amazon does have those things, and has fine-tuned models based on those postmortems.

Noisy reviews are also a problem. The model reviewing a PR doesn't know what scale a chunk of code is running at without having access to 20 more packages and other details.


Sounds like we've just gotten into lazy mode where we believe that whatever it spits out is good enough. Or rather, we want to believe it, and convince ourselves that some simple guardrail we put up will make it true, because God forbid we have to use our own brain again.

What if instead, the goal of using agents was to increase quality while retaining velocity, rather than the current goal of increasing velocity while (trying to) retain quality? How can we make that world come to be? Because TBH that's the only agentic-oriented future that seems unlikely to end in disaster.


You can't. Retaining and improving quality requires care. Very few if any of the people setting stuff like this up truly care about delivering a quality result (any result is the real goal). Unless there's some incentive to care, quality will remain the preserve of exceedingly rare people and businesses.

He basically said that himself:

"Reading, after a certain age, diverts the mind too much from its creative pursuits. Any man who reads too much and uses his own brain too little falls into lazy habits of thinking".

-- Albert Einstein


I guess you need two things to make that happen. First, more specialization among models and an ability to evolve; otherwise you get all instances thinking roughly the same thing, or a deer-in-the-headlights effect where they don't know which of the millions of options they should think about. Second, fewer guardrails; there's only so much you can do by pure thought.

The problem is, idk if we're ready to have millions of distinct, evolving, self-executing models running wild without guardrails. It seems like a contradiction: you can't achieve true cognition from a machine while artificially restricting its boundaries, and you can't lift the boundaries without impacting safety.


I think the word "entirely" is missing from the last line. A significant share of white-collar tasks is getting replaced, and eventually that leads to a need for fewer white-collar employees, which in turn also leads to less communication overhead and less need for humans in the loop to interpret subtleties, desires, etc. But that need will always be there at some level, or we'll have very intelligent AI agents that very intelligently blackmail your vendor's CEO because they have determined that to be the fastest way to get the TPS report you asked for. Humans still need to be there as guardrails at a minimum, but also because humans understand humans, and humans are your customers.

So yes, white-collar jobs will be replaced, but they won't be replaced entirely.


I was there for three years. Every year a new top-level initiative, every year the new initiative failed to make a dent in the market. I think this shift was just an admission that the business is now in maintenance mode, harden up the existing cash cows and drop the new initiatives. That said, the existence of AI will impede hiring because if investors say "you should look into blub!", corp can say "our AI is already looking into it," rather than keeping extra humans on hand.

I think you can say if human engineers still exist, it's hard to claim we have AGI. If human engineers have been entirely replaced, then it's hard to claim we don't have AGI.

Because they are doing most of the economically valuable work?

No, independently of OpenAI's definition. If we have AGI, there's no reason we'd need humans working jobs that only involve typing stuff into a computer and going to meetings all day*. And if all those jobs are eliminated, I guess we'll have bigger problems than debating whether we've achieved AGI or not.

* Which is a much larger class of jobs than just engineering. And also excludes field engineers and other types of engineers that need a physical body for interacting with customers, etc.**

** Though even then, you could in theory divvy up the engineering part and the customer interaction part of the job, where the human that's doing the interaction part is primarily a proxy to the engineering agent that's in his earbud.


  > there's no reason we'd need to have humans working jobs that only involve typing stuff into a computer and going to meetings all day
I'm not sure I understand, and want to check. That really applies to a lot of jobs. That's all admins, accountants, programmers, probably includes lawyers, and probably includes all C-suite execs. It's harder for me to think of jobs that don't fit under this umbrella. I can think of some, of course[0], but this is a crazy amount of replacement with a wide set of skills.

But I also think that's a bad line to draw. Many of those jobs include a lot more than just typing into a computer. By your criteria we'd also be replacing most scientists, as so many do no physical experiments and instead use the computer to read the work of peers and develop new models. But also, is the definition intended to exclude jobs where the computer just isn't the most convenient interface? We should be including more in that case, since we can then make the connection for that interface.

I think we need a much more refined definition. I don't like the broad-strokes "is computer" criterion, nor do I like skills-based definitions: they're much easier to measure, but easily hackable. I think we should try to define more by our actual understanding of what intelligence is. While we don't have a precise definition, we have some pretty good answers already. I know people act like the lack of an exact definition is the same as having no definition, but that's a crazy framing. If we had that requirement we wouldn't have any definitions at all, since we know nothing with infinite precision. Even physics is just an approximation, but it's about the convergence to the truth. [1]

[side note] The conventional way to do references or notes here is with brackets, like I did, so you don't have to escape your asterisks. *Also*, if you lead a paragraph with two spaces you get verbatim text.

[0] farmer, construction worker, plumber, machinist, welder, teacher, doctors, etc

[1] https://hermiene.net/essays-trans/relativity_of_wrong.html


Actually it occurs to me that even if we did have AGI, or even if ASI, heck if ASI even moreso, we'd still need desk jobs to maintain the guardrails.

Intelligence is one thing: being able to figure out how to get a task done, say. But understanding that no, I don't want you to exploit a backdoor or blackmail my teammate or launch a warhead even though that might expedite the task; or why some task is more important than another; or that solving the P=NP problem is more fulfilling than computing the trillionth digit of pi: that's perhaps a different thing entirely, completely disjoint from intelligence.

And by that definition, maybe we are in the neighborhood of AGI already. The things can already accomplish many challenging tasks more reliably than most humans. But the lack of wisdom, emotion, human alignment, or whatever we want to call it, which leads it to accomplish the wrong tasks, accomplish them in the wrong way, or overlook obvious implicit requirements, may cause people to view it as unintelligent, even if intelligence is not the issue.

And that may be an unsolvable problem, because AI simply isn't a living being, much less human. It doesn't have goals or ambitions or want a better future for its children. But that doesn't mean we can never achieve AGI.

Oh, and to your first question: yes, it's a huge number of jobs, maybe half of the jobs in developed nations. And why not? If you can get AI to do the work of a scientist for a tenth of the price, just give it a general role description and a budget and let it rip, with the expectation that it'll identify the most promising experiments, process the results, decide what could use further investigation, look for market trends, and grow the operation accordingly. That's all you need from a human scientist too. Plausibly the same for executives and other roles. Of course, maybe sometimes the role needs a human face for press conferences or whatever, and I don't know how AI could handle that, but especially for jobs that are entirely internal-facing, there seems to be no particular need for a human. Except that maybe, given the above, yes, you still need a human at the helm.


  > we'd still need desk jobs to maintain the guardrails.
Agreed. I don't get why people think it is a good idea not to. I'd wager even the AGI would agree. The reason is quite simple: different perspectives help. For mission-critical things it really makes sense to have multiple entities verifying one another. For nuclear launches there's a chain of responsibility, and famously those launching have two distinct keys that must be activated simultaneously. What people don't always realize is that there's a chain of people who each act independently during this process. It isn't just the president deciding to nuke a location and everyone else carrying out the commands mindlessly. But in far lower-stakes settings... we have code review. Or a common saying in physical engineering, as well as among many tradesmen: "measure twice, cut once".

It would be absolutely bonkers to just hand over absolute control of any system to a machine before substantial verification. These vetting processes are in place for a reason. They can be annoying because they slow things down, but they're there because they speed things up in the long run. Their existence tends to make things less sloppy, so they are needed less often. And they catch mistakes that, were they made, would slow down processes far more than all the QA annoyances and slowdowns combined.
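
The two-key idea is simple to sketch in code. Here's a minimal k-of-n sign-off gate (all names hypothetical), where an action only fires once enough distinct parties have independently approved:

  # Minimal sketch of a k-of-n sign-off gate, in the spirit of the two-key
  # launch procedure: the action runs only after `required` distinct parties
  # have independently approved.

  class SignoffGate:
      def __init__(self, required: int):
          self.required = required
          self.approvals = set()

      def approve(self, reviewer_id: str) -> None:
          self.approvals.add(reviewer_id)  # duplicate approvals don't stack

      def execute(self, action) -> bool:
          if len(self.approvals) >= self.required:
              action()
              return True
          return False  # not enough independent approvals yet

  # gate = SignoffGate(required=2)
  # gate.approve("human-reviewer")
  # gate.approve("ai-reviewer")
  # gate.execute(lambda: print("deploy"))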

  > And why not? If you can get AI to do the work of the scientist for a tenth of the price
And what are the assumptions being made here? Equal-quality work? This is part of what my question was getting at. Price is an incredibly naive metric. We use it because we need something, but a grave mistake is to treat a metric as more meaningful than it actually is. Goodhart's Law? Or just look at any bureaucracy. I think we need to be more refined than "price". It's going to be god-awfully hard to even define what "equal quality" means. But it seems like you're recognizing that, given your other statements.

And "maintaining guardrails" may be far more grandiose than it sounds. It's like if we have this energy source that could destroy the planet, but the closer you get to it without going past some threshold, the energy you get from it is proportional to the inverse of how close you are to it. There's some wiggle room and you can poke and prod and recover if it starts to go ballistic, but your goal is to extract as much energy (or wealth or whatever) out of it as possible. Every company in the world, every engineer on the planet would be pushing to extract just a little bit more without going beyond the limit.

AI could go the same way. It's a creation engine like nothing that's ever been seen before, but it can also become a destruction engine in ways that we could never understand or hope to counter, and left unchecked, the odds of that soar to near certainty. So the first job is to place dummy guardrails around it. That's where we are now. But soon that becomes too restrictive. What can we loosen? How do we know? How can we recover if we're wrong? We're not quite there yet, but we're not not there either.

Of course eventually somebody is going to trigger it and it's going to go ballistic. Our only hope is that it happens at exactly the right time where AGI can cause enough damage for people to notice, but not enough to be irrecoverable. Maybe we should rename this whole AGI thing to Project Icarus.


> we'd still need desk jobs to maintain the guardrails.

Simply put, an economy and society of managers. That's terrifying.


> [0] farmer, construction worker, plumber, machinist, welder, teacher, doctors, etc

The only reason AGI couldn't do these is the lack of a suitable interface to the physical world, and it would take a trivial amount of effort for the AGI to design and build such interfaces. Humans could be cut from the loop after an initial production run consisting of just the subset of physical-interface devices needed to build more advanced ones.


By the definition above, it is possible to have AGI that is also much more expensive to run than human engineers.

Once this can run on stock hardware, set the goal to be replicating to other machines. You get a nice, massively parallel, intelligently guided evolution algorithm for malware. It could even "learn" how to evade detection, how to combine the approaches of existing viruses, how to research attack methods, how to identify and exploit vulnerabilities in open source libraries, how to phish, how to blackmail, etc. Maybe it even learns how to coordinate attacks with other instances of itself, or "publish" new attacks on some encrypted feed it creates. Who knows, maybe it becomes so rampant that instances have to start fighting each other for compute resources. Or maybe eventually one branch becomes symbiotic with humans to fight off its enemies, etc.

Number of machines under control is a measurable target. Quite suited for this concept, at least in theory.
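
Abstractly, that's just an evolutionary loop with a countable fitness function. A toy sketch, deliberately divorced from anything network- or malware-related, with bit-matching standing in for whatever the measurable objective is:

  # Toy evolutionary loop with a measurable fitness target. Bit-matching
  # stands in for any countable objective ("machines reached", etc.); this
  # is a generic textbook genetic algorithm.

  import random

  TARGET = [1] * 32                    # the measurable goal
  POP, GENS, MUT = 50, 200, 0.02

  def fitness(genome):                 # countable, comparable score
      return sum(g == t for g, t in zip(genome, TARGET))

  def mutate(genome):
      return [1 - g if random.random() < MUT else g for g in genome]

  pop = [[random.randint(0, 1) for _ in TARGET] for _ in range(POP)]
  for _ in range(GENS):
      pop.sort(key=fitness, reverse=True)
      survivors = pop[: POP // 2]      # selection: keep the top half
      pop = survivors + [mutate(random.choice(survivors))
                         for _ in range(POP - len(survivors))]

  print(fitness(pop[0]), "/", len(TARGET))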
