cupofjoakim · 2026-06-01T11:15:05 1780312505

Interesting. there are some parts i like a lot here, but two things that I really dislike syntax wise. One is the lean towards a chainable syntax - this has proven to be a big footgun for many devs in both java streams and typescript, making it very easy to go from O(n) to O(2n). The other part i really dislike is the first argument principle noted. If i myself define `string_and_reverse` and I can call it both through `string_and_reverse(42)` and `42.string_and_reverse()` i could definitely see this leading to some very funky looking chaining.

Perhaps it's just one point from me - not liking chaining :D

KolmogorovComp · 2026-06-01T11:40:53 1780314053

> making it very easy to go from O(n) to O(2n)

Strictly speaking I assume everyone knows O(n) = O(2n) = O(kn) for any constant k > 0.

But I see your point. I assume any decent compiler would merge the loops though

cupofjoakim · 2026-06-01T12:50:44 1780318244

Fair! That'd depend on the operations right? For example, AFAIK typescript can't do much about multiple chained `map` calls, and i've seen quite a few `.filter(...).map(...).filter(Boolean).map(...)` :/
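To make the cost concrete, here is a minimal TypeScript sketch of the chain quoted above. The `User` type and the sample data are made up for illustration. Each chained call walks its input once and allocates a new intermediate array, while the hand-fused loop walks the data once. Whether a given engine fuses the chained passes on its own is not something this sketch demonstrates.

```typescript
// Hypothetical record type and data, only for illustration.
interface User {
  name: string;
  age: number;
  email: string | null;
}

const users: User[] = [
  { name: "Ann", age: 31, email: "ann@example.com" },
  { name: "Bob", age: 17, email: "bob@example.com" },
  { name: "Cid", age: 45, email: null },
];

// Chained version: four passes over the data, three intermediate arrays.
function emailsChained(input: User[]): string[] {
  return input
    .filter((u) => u.age >= 18)
    .map((u) => u.email)
    .filter(Boolean)
    .map((e) => (e as string).toUpperCase());
}

// Fused version: one pass, one output array.
function emailsFused(input: User[]): string[] {
  const out: string[] = [];
  for (const u of input) {
    if (u.age >= 18 && u.email) {
      out.push(u.email.toUpperCase());
    }
  }
  return out;
}

console.log(emailsChained(users), emailsFused(users));
```

Both functions return the same result; the difference is only in how many times the data is traversed and how many temporary arrays are created.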

c0balt · 2026-06-01T13:10:36 1780319436

To be fair this likely should be handled by the interpreter/compiler for the compiled JS. V8 can probably merge this into one loop, or do something similar, based on runtime types

frwrfwrfeefwf · 2026-06-01T14:04:53 1780322693

what if k = n

MarkusQ · 2026-06-01T14:20:44 1780323644

Then either n is a constant and it's really O(1), or k isn't a constant and its naming was in violation of the International Conventions on Naming Things to Avoid Silly Arguments, section 3, paragraph IV.7 & ff.

xigoi · 2026-06-01T14:14:19 1780323259

> i could definitely see this leading to some very funky looking chaining.

At least for me,

  thing
    .doThis()
    .thenDoThat()
    .andFinallyThis()

is much more readable than

  andFinallyThis(
    thenDoThat(
      doThis(thing)
    )
  )

cupofjoakim · 2026-06-01T14:21:36 1780323696

Well, nesting is not the only option.

```
thing.doThis()
thing.thenDoThat()
thing.andFinallyThis()

// or

doThis(thing)
thenDoThat(thing)
andFinallyThis(thing)
```

xigoi · 2026-06-01T15:00:18 1780326018

That’s not equivalent.
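A small TypeScript sketch of why the two forms differ, assuming the methods return new values instead of mutating `thing` (as `map` and `filter` do on arrays). In the chained form each step receives the previous step's result; in the statement form each step receives the original `thing` and its return value is discarded.

```typescript
const thing: number[] = [1, 2, 3, 4];

// Chained: filter sees the doubled values.
const chained = thing.map((x) => x * 2).filter((x) => x > 4);

// Separate statements: each call starts from the original array,
// and the returned arrays are thrown away.
thing.map((x) => x * 2);
thing.filter((x) => x > 4);
const afterStatements = thing;

// To keep the statement form, each result has to be threaded through by hand.
const doubled = thing.map((x) => x * 2);
const threaded = doubled.filter((x) => x > 4);

console.log(chained, afterStatements, threaded);
```

The statement form only matches the chain when every method mutates `thing` in place.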

cupofjoakim · 2026-04-16T14:43:53 1776350633

> Opus 4.7 uses an updated tokenizer that improves how the model processes text. The tradeoff is that the same input can map to more tokens—roughly 1.0–1.35× depending on the content type.

caveman[0] is becoming more relevant by the day. I already enjoy reading its output more than vanilla so suits me well.

[0] https://github.com/JuliusBrussee/caveman/tree/main

Tiberium · 2026-04-16T14:47:57 1776350877

I hope people realize that tools like caveman are mostly joke/prank projects - almost the entirety of the context spent is in file reads (for input) and reasoning (in output), you will barely save even 1% with such a tool, and might actually confuse the model more or have it reason for more tokens because it'll have to formulate its response in the way that satisfies the requirements.
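A back-of-the-envelope version of that argument in TypeScript. The token counts below are invented for illustration and are not measurements of any real session; the sketch also ignores that input and output tokens are usually priced differently.

```typescript
// Token counts for one hypothetical agent session (made-up numbers).
interface SessionTokens {
  fileReads: number; // input: files pulled into context
  reasoning: number; // output: reasoning tokens
  finalProse: number; // output: the user-facing reply text
}

// Fraction of all tokens saved if only the final prose shrinks by `cut`
// (cut = 0.5 means the reply text becomes half as long).
function fractionSaved(s: SessionTokens, cut: number): number {
  const total = s.fileReads + s.reasoning + s.finalProse;
  return (s.finalProse * cut) / total;
}

const exampleSession: SessionTokens = {
  fileReads: 80000,
  reasoning: 18000,
  finalProse: 2000,
};

const savedFraction = fractionSaved(exampleSession, 0.5);
console.log((savedFraction * 100).toFixed(1) + "% of tokens saved");
```

With these assumed numbers, halving the reply text saves 1.0% of the session's tokens, because the reply is a small slice of the total.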

embedding-shape · 2026-04-16T15:10:46 1776352246

> I hope people realize that tools like caveman are mostly joke/prank projects

This seems to be a common thread in the LLM ecosystem; someone starts a project for shits and giggles, makes it public, most people get the joke, others think it's serious, author eventually tries to turn the joke project into a VC-funded business, some people are standing watching with the jaws open, the world moves on.

ghgr · 2026-04-30T08:43:33 1777538613

And not only in the LLM ecosystem. Flask was originally an April Fool's joke too.

https://hackertimes.com/item?id=13436724

simonw · 2026-04-16T15:33:37 1776353617

I was convinced https://github.com/memvid/memvid was a joke until it turned out it wasn't.

embedding-shape · 2026-04-16T15:36:00 1776353760

To be fair, most of us looked at GPT1 and GPT2 as fun and unserious jokes, until it started putting together sentences that actually read like real text, I remember laughing with a group of friends about some early generated texts. Little did we know.

Alifatisk · 2026-04-16T15:52:06 1776354726

Are there any public records I can see from GPT1 and GPT2 output and how it was marketed?

embedding-shape · 2026-04-16T16:22:29 1776356549

HN submissions have a bunch of examples in them, but worth remembering they were released as "Look at this somewhat cool and potentially useful stuff" rather than what we see today, LLMs marketed as tools.

https://hackertimes.com/item?id=21454273 / https://hackertimes.com/item?id=19830042 - OpenAI Releases Largest GPT-2 Text Generation Model

HN search for GPT between 2018-2020, lots of results, lots of discussions: https://hn.algolia.com/?dateEnd=1577836800&dateRange=custom&...

Den_VR · 2026-04-17T03:36:52 1776397012

I still think of The Unreasonable Effectiveness of Recurrent Neural Networks and related writings.

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

embedding-shape · 2026-04-17T11:21:56 1776424916

Fun to revisit no doubt, the comments make it even better.

> SuckCocker 7 years ago - "in short: SKYNET is not far away. Be proud to be a part of it!"

dalemhurley · 2026-04-17T04:14:37 1776399277

Wild how many people were predicting the AI slop, but was dismissing it as unlikely beyond some trolls.

wat10000 · 2026-04-16T16:29:39 1776356979

You can run GPT2! Here's the medium model: https://huggingface.co/openai-community/gpt2-medium

I will now have it continue this comment:

I've been running gps for a long time, and I always liked that there was something in my pocket (and not just me). One day when driving to work on the highway with no GPS app installed, I noticed one of the drivers had gone out after 5 hours without looking. He never came back! What's up with this? So i thought it would be cool if a community can create an open source GPT2 application which will allow you not only to get around using your smartphone but also track how long you've been driving and use that data in the future for improving yourself...and I think everyone is pretty interested.

[Updated on July 20] I'll have this running from here, along with a few other features such as: - an update of my Google Maps app to take advantage it's GPS capabilities (it does not yet support driving directions) - GPT2 integration into your favorite web browser so you can access data straight from the dashboard without leaving any site! Here is what I got working.

[Updated on July 20]

fancyfredbot · 2026-04-16T20:51:36 1776372696

Wow that is terrible. In my memory GPT 2 was more interesting than that. I remember thinking it could pass a Turing test but that output is barely better than a Markov chain.

I guess I was using the large model?

daveguy · 2026-04-16T21:23:41 1776374621

Here is the XL model. About 4x the size of the medium model. Still just 1.5B parameters, but on the bright side it was trained pre-wordslop.

https://huggingface.co/openai-community/gpt2-xl

sillysaurusx · 2026-04-16T22:27:41 1776378461

There’s an art to GPT sampling. You have to use temperature 0.7. People never believe it makes such a massive difference, but it does.
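For readers unfamiliar with what temperature does, here is a minimal TypeScript sketch of temperature-scaled softmax. The logits are made up, and this only illustrates the math; it is not taken from any particular GPT-2 sampling code and says nothing about which value is best.

```typescript
// Convert raw scores (logits) to probabilities at a given temperature.
// Temperature below 1 sharpens the distribution; above 1 flattens it.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) throw new Error("temperature must be positive");
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

// Made-up logits for three candidate tokens.
const exampleLogits = [2.0, 1.0, 0.1];
const atOne = softmaxWithTemperature(exampleLogits, 1.0);
const atPointSeven = softmaxWithTemperature(exampleLogits, 0.7);

console.log(atOne, atPointSeven);
```

At 0.7 the most likely token gets a larger share of the probability mass than at 1.0, so sampling picks unlikely continuations less often.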

wat10000 · 2026-04-16T21:32:23 1776375143

Probably a much better prompt, too. I just literally pasted in the top part of my comment and let fly to see what would happen.

mlsu · 2026-04-16T16:55:40 1776358540

I was first made aware of GPT2 from reading Gwern -- "huh, that sounds interesting" -- but really didn't start really reading model output until I saw this subreddit:

https://www.reddit.com/r/SubSimulatorGPT2/

There is a companion Reddit, where real people discuss what the bots are posting:

https://www.reddit.com/r/SubSimulatorGPT2Meta/

You can dig around at some of the older posts in there.

walthamstow · 2026-04-16T15:56:32 1776354992

I don't think it was marketed as such, they were research projects. GPT-3 was the first to be sold via API

maplethorpe · 2026-04-16T16:27:27 1776356847

From a 2019 news article:

> New AI fake text generator may be too dangerous to release, say creators

> The Elon Musk-backed nonprofit company OpenAI declines to release research publicly for fear of misuse.

> OpenAI, an nonprofit research company backed by Elon Musk, Reid Hoffman, Sam Altman, and others, says its new AI model, called GPT2 is so good and the risk of malicious use so high that it is breaking from its normal practice of releasing the full research to the public in order to allow more time to discuss the ramifications of the technological breakthrough.

https://www.theguardian.com/technology/2019/feb/14/elon-musk...

ethbr1 · 2026-04-16T16:38:02 1776357482

Aka 'We cared about misuse right up until it became apparent that there was profit to be had'

OpenAI sure speed ran the Google and Facebook 'Don't be evil' -> 'Optimize money' transition.

sfn42 · 2026-04-16T17:39:47 1776361187

Or - making sensational statements gets attention. A dangerous tool is necessarily a powerful tool, so that statement is pretty much exactly what you'd say if you wanted to generate hype, make people excited and curious about your mysterious product that you won't let them use.

eric_h · 2026-04-16T18:24:00 1776363840

Much like what Anthropic very recently did re: Mythos

xpe · 2026-04-16T21:35:37 1776375337

Think about all the possible explanations carefully. Weight them based on the best information you have.

(I think the most likely explanation for Mythos is that it's asymmetrically a very big deal. Come to your own conclusions, but don't simply fall back on the "oh this fits the hype pattern" thought terminating cliché.)

Also be aware of what you want to see. If you want the world to fit your narrative, you're more likely to construct explanations for that. (In my friend group at least, I feel like most fall prey to this, at least some of the time, including myself. These people are successful and intelligent by most measures.)

Then make a plan to become more disciplined about thinking clearly and probabilistically. Make it a system, not just something you do sometimes. I recommend the book "the Scout Mindset".

Concretely, if one hasn't spent a couple of quality hours really studying AI safety I think one is probably missing out. Dan Hendrycks has a great book.

PufPufPuf · 2026-04-16T19:19:07 1776367147

I used GPT-2 (fine-tuned) to generate Peppa Pig cartoons, it was cutely incoherent https://youtu.be/B21EJQjWUeQ

Bombthecat · 2026-04-16T16:13:56 1776356036

And now gpt is laughing,while it replaces coders lol

MarcelOlsz · 2026-04-16T15:44:59 1776354299

Why? Doesn't have jokey copy. Any thoughts on claude-mem[0] + context-mode[1]?

[0] https://github.com/thedotmack/claude-mem

[1] https://github.com/mksglu/context-mode

simonw · 2026-04-16T16:00:30 1776355230

The big idea with Memvid was to store embedding vector data as frames in a video file. That didn't seem like a serious idea to me.

nico · 2026-04-16T16:19:04 1776356344

Very cool idea. Been playing with a similar concept: break down one image into smaller self-similar images, order them by data similarity, use them as frames for a video

You can then reconstruct the original image by doing the reverse, extracting frames from the video, then piecing them together to create the original bigger picture

Results seem to really depend on the data. Sometimes the video version is smaller than the big picture. Sometimes it’s the other way around. So you can technically compress some videos by extracting frames, composing a big picture with them and just compressing with jpeg

jermaustin1 · 2026-04-16T16:20:18 1776356418

> embedding vector data as frames in a video file

Interesting, when I heard about it, I read the readme, and I didn't take that as literal. I assumed it was meant as we used video frames as inspiration.

I've never used it or looked deeper than that. My LLM memory "project" is essentially a `dict<"about", list<"memory">>` The key and memories are all embeddings, so vector searchable. I'm sure its naive and dumb, but it works for my tiny agents I write.
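A toy TypeScript sketch of that kind of structure, assuming a "topic to list of memories" layout where both keys and memories carry embeddings. All names here are hypothetical, and the 3-dimensional vectors are hand-written stand-ins; producing real embeddings is out of scope.

```typescript
interface Memory {
  text: string;
  embedding: number[];
}

interface Topic {
  about: string;
  embedding: number[];
  memories: Memory[];
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Return the topic whose key embedding is closest to the query embedding.
function nearestTopic(topics: Topic[], query: number[]): Topic | undefined {
  let best: Topic | undefined;
  let bestScore = -Infinity;
  for (const t of topics) {
    const score = cosine(t.embedding, query);
    if (score > bestScore) {
      bestScore = score;
      best = t;
    }
  }
  return best;
}

const memoryStore: Topic[] = [
  {
    about: "user preferences",
    embedding: [1, 0, 0],
    memories: [{ text: "prefers tabs", embedding: [0.9, 0.1, 0] }],
  },
  {
    about: "project layout",
    embedding: [0, 1, 0],
    memories: [{ text: "tests live in /spec", embedding: [0.1, 0.9, 0] }],
  },
];

const nearest = nearestTopic(memoryStore, [0.8, 0.2, 0]);
console.log(nearest ? nearest.about : "no match");
```

A linear scan like this is fine for a handful of entries; larger stores would typically want an approximate nearest-neighbour index.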

niuzeta · 2026-04-16T15:50:12 1776354612

Just read through the readme and I was fairly sure this was a well-written satire through "Smart Frames".

Honestly part of me still thinks this is a satire project but who knows.

pred_ · 2026-04-17T13:39:21 1776433161

Time for https://itsid.cloud/index2.html to be acquired by one of the big players, I guess.

DiffTheEnder · 2026-04-16T16:17:49 1776356269

Is this... just one file acting as memory?

paulddraper · 2026-04-16T23:41:10 1776382870

One video file

imiric · 2026-04-16T15:46:01 1776354361

A major reason for that is because there's no way to objectively evaluate the performance of LLMs. So the meme projects are equally as valid as the serious ones, since the merits of both are based entirely on anecdata.

It also doesn't help that projects and practices are promoted and adopted based on influencer clout. Karpathy's takes will drown out ones from "lesser" personas, whether they have any value or not.

combobyte · 2026-04-16T17:20:05 1776360005

> most people get the joke

I hope you're right, but from my own personal experience I think you're being way too generous.

dakolli · 2026-04-16T17:39:10 1776361150

It's the same as crypto/NFT hype cycles, except this time one of the joke projects is going to crash the economy.

msikora · 2026-04-16T22:01:34 1776376894

This has been a thing way before AI. Anyone remember Yo, the single button social media app that raised $1M in 2014?

stingraycharles · 2026-04-16T15:21:28 1776352888

While the caveman stuff is obviously not serious, there is a lot of legit research in this area.

Which means yes, you can actually influence this quite a bit. Read the paper “Compressed Chain of Thought” for example, it shows it’s really easy to make significant reductions in reasoning tokens without affecting output quality.

There is not too much research into this (about 5 papers in total), but with that it’s possible to reduce output tokens by about 60%. Given that output is an incredibly significant part of the total costs, this is important.

https://arxiv.org/abs/2412.13171

altruios · 2026-04-16T16:05:11 1776355511

Who would suspect that the companies selling 'tokens' would (unintentionally) train their models to prefer longer answers, reaping a HIGHER ROI (the thing a publicly traded company is legally required to pursue: good thing these are all still private...)... because it's not like private companies want to make money...

stingraycharles · 2026-04-16T17:55:56 1776362156

I don’t think this is a plausible argument, as they’re generally capacity constrained, and everyone would like shorter (= faster) responses.

I’m fairly certain that in a few more releases we’ll have models with shorter CoT chains. Whether they’ll still let us see those is another question, as it seems like Anthropic wants to start hiding their CoT, potentially because it reveals some secret sauce.

Ifkaluva · 2026-04-17T01:27:53 1776389273

I guess mainly they don’t want you to distill on their CoT

stingraycharles · 2026-04-17T10:46:28 1776422788

Yes, which I understand, but I think they’re crippling their product for users this way.

I don’t think it’s just this, because the thinking tokens often reveal more about Anthropic’s inner workings. For example, it’s how the whole existence of Claude’s soul document was reverse engineered, it often leaks details about “system reminders” (eg long conversation reminders).

I think it’s also just very convenient for Anthropic to do this. The fact that they’re also presenting this as a “performance optimization” suggests they’re not giving the real reason they do this.

fancyfredbot · 2026-04-16T21:00:15 1776373215

Try setting up one laundry which charges by the hour and washes clothes really really slowly, and another which washes clothes at normal speed at cost plus some margin similar to your competitors.

The one which maximizes ROI will not be the one you rigged to cost more and take longer.

sebastiennight · 2026-04-16T22:50:29 1776379829

I don't think the analogy is correct here.

Directionally, tokens are not equivalent to "time spent processing your query", but rather a measure of effort/resource expended to process your query.

So a more germane analogy would be:

What if you set up a laundry which charges you based on the amount of laundry detergent used to clean your clothes?

Sounds fair.

But then, what if the top engineers at the laundry offered an "auto-dispenser" that uses extremely advanced algorithms to apply just the right optimal amount of detergent for each wash?

Sounds like value-added for the customer.

... but now you end up with a system where the laundry management team has strong incentives to influence how liberally the auto-dispenser will "spend" to give you "best results"

bombcar · 2026-04-17T02:02:57 1776391377

Shades of “repeat” in lather, rinse, repeat.

gwern · 2026-04-16T19:12:01 1776366721

LLM APIs sell on value they deliver to the user, not the sheer number of tokens you can buy per $. The latter is roughly labor-theory-of-value levels of wrong.

ACCount37 · 2026-04-16T15:25:20 1776353120

Some labs do it internally because RLVR is very token-expensive. But it degrades CoT readability even more than normal RL pressure does.

It isn't free either - by default, models learn to offload some of their internal computation into the "filler" tokens. So reducing raw token count always cuts into reasoning capacity somewhat. Getting closer to "compute optimal" while reducing token use isn't an easy task.

stingraycharles · 2026-04-16T15:30:32 1776353432

Yeah the readability suffers, but as long as the actual output (ie the non-CoT part) stays unaffected it’s reasonably fine.

I work on a few agentic open source tools and the interesting thing is that once I implemented these things, the overall feedback was a performance improvement rather than performance reduction, as the LLM would spend much less time on generating tokens.

I didn’t implement it fully, just a few basic things like “reduce prose while thinking, don’t repeat your thoughts” etc would already yield massive improvements.

AdamN · 2026-04-16T15:35:02 1776353702

Yeah you could easily imagine stenography like inputs and outputs for rapid iteration loops. It's also true that in social media people already want faster-to-read snippets that drop grammar so the desire for density is already there for human authors/readers.

ieie3366 · 2026-04-16T15:17:34 1776352654

All LLMs also effectively work by ”larping” a role. You steer it towards larping a caveman and well.. let’s just say they weren’t known for their high iq

roughly · 2026-04-16T15:28:39 1776353319

Fun fact: Neanderthals actually had larger brains than Homo Sapiens! Modern humans are thought to have outcompeted them by working better together in larger groups, but in terms of actual individual intelligence, Neanderthals may have had us beat. Similarly, humans have been undergoing a process of self-domestication over the last couple millennia that has resulted in physiological changes that include a smaller brain size - again, our advantage over our wilder forebears remains that we're better in larger social groups than they were and are better at shared symbolic reasoning and synchronized activity, not necessarily that our brains are more capable.

(No, none of this changes that if you make an LLM larp a caveman it's gonna act stupid, you're right about that.)

adwn · 2026-04-16T15:58:35 1776355115

I thought we were way past the "bigger brain means more intelligence" stage of neuroscience?

seba_dos1 · 2026-04-16T16:27:26 1776356846

Bigger brain does not automatically mean more intelligence, but we have reasons to suspect that homo neanderthalensis may have been more intelligent than contemporary homo sapiens other than bigger brains.

dtech · 2026-04-16T17:12:53 1776359573

You can't draw conclusions on individuals, but at a species level bigger brain, especially compared to body size, strongly correlates with intelligence

nomel · 2026-04-16T16:19:46 1776356386

All data shows there's a moderate correlation.

waffletower · 2026-04-16T16:24:32 1776356672

Even neuronal density is simplistic, and the dimension of size alone doesn't consider that.

Hikikomori · 2026-04-16T15:25:03 1776353103

Modern humans were also cavemen.

DiogenesKynikos · 2026-04-16T15:23:42 1776353022

This is why ancient Chinese scholar mode (also extremely terse) is better.

bensyverson · 2026-04-16T15:22:49 1776352969

Exactly. The model is exquisitely sensitive to language. The idea that you would encourage it to think like a caveman to save a few tokens is hilarious but extremely counter-productive if you care about the quality of its reasoning.

andai · 2026-04-16T22:21:21 1776378081

Does this imply that if you train it on Gwern style output, the quality will improve?

gwern · 2026-04-16T22:57:30 1776380250

Unfortunately, that is an oversimplification for a highly RLed/chatbot trained LLM like Claude-4.7-opus. It may have started life as a base model (where prompting it with correctly spelled prompts, or text from 'gwern', would - and did with davinci GPT-3! - improve quality), but that was eons ago. The chatbots are largely invariant to that kind of prompt trickery, and just try to do their best every time. This is why those meme tricks about tips or bribery or my-grandmother-will-die stop working.

reacharavindh · 2026-04-16T16:00:43 1776355243

This specific form may be a joke, but token conscious work is becoming more and more relevant.. Look at https://github.com/AgusRdz/chop

And

https://github.com/toon-format/toon

alex7o · 2026-04-16T16:52:04 1776358324

Also https://github.com/rtk-ai/rtk but some people see that changing how commands output stuff can confuse some models

SEJeff · 2026-04-16T20:14:21 1776370461

I believe tools like graphify cut down the tokens in thinking dramatically. It makes a knowledge graph and dumps it into markdown that is honestly awesome. Then it has stubs that pretend to be some tools like grep that read from the knowledge graph first so it does less work. Easy to setup and use too. I like it.

https://graphify.net/

causal · 2026-04-16T16:49:41 1776358181

Output tokens are more expensive

xnx · 2026-04-17T00:40:59 1776386459

There's a tremendous amount of superstition around LLMs. Remember when "prompt engineering" "best practices" were to say you were offering a tip or some other nonsense?

sidrag22 · 2026-04-16T18:02:55 1776362575

I hesitated 100% when i saw caveman gaining steam, changing something like this absolutely changes the behaviour of the model's responses, simply including like a "lmao" or something casual in any reply will change the tone entirely into a more relaxed style like ya whatever type mode.

I think a lot of people echo my same criticism, I would assume that the major LLM providers are the actual winners of that repo getting popular as well, for the same reason you stated.

> you will barely save even 1% with such a tool

For the end user, this doesn't make a huge impact, in fact it potentially hurts if it means that you are getting less serious replies from the model itself. However as with any minor change across a ton of users, this is significant savings for the providers.

I still think just keeping the model capable of easily finding what it needs without having to comb through a lot of files for no reason, is the best current method to save tokens. it takes some upfront tokens potentially if you are delegating that work to the agent to keep those navigation files up to date, but it pays dividends when future sessions your context window is smaller and only the proper portions of the project need to be loaded into that window.

egorfine · 2026-04-16T15:14:09 1776352449

They are indeed impractical in agentic coding.

However in deep research-like products you can have a pass with LLM to compress web page text into caveman speak, thus hugely compressing tokens.

claytongulick · 2026-04-16T15:29:20 1776353360

I don't understand how this would work without a huge loss in resolution or "cognitive" ability.

Prediction works based on the attention mechanism, and current humans don't speak like cavemen - so how could you expect a useful token chain from data that isn't trained on speech like that?

I get the concept of transformers, but this isn't doing a 1:1 transform from english to french or whatever, you're fundamentally unable to represent certain concepts effectively in caveman etc... or am I missing something?

egorfine · 2026-04-16T17:00:44 1776358844

Good catch actually.

Okay maybe not exactly caveman dialect, but text compression using LLM is definitely possible to save on tokens in deep research.

Waterluvian · 2026-04-16T15:31:34 1776353494

Help me understand: I get that the file reading can be a lot. But I also expand the box to see its “reasoning” and there’s a ton of natural language going on there.

make3 · 2026-04-16T14:51:08 1776351068

I wonder if you can have it reason in caveman

0123456789ABCDE · 2026-04-16T15:01:31 1776351691

would you be surprised if this is what happens when you ask it to write like one?

folks could have just asked for _austere reasoning notes_ instead of "write like you suffer from arrested development"

Sohcahtoa82 · 2026-04-16T15:22:08 1776352928

> "write like you suffer from arrested development"

My first thought was that this would mean that my life is being narrated by Ron Howard.

sambellll · 2026-04-16T20:56:00 1776372960

Someone should make an MCP that parses every non-code file before it hits claude to turn it into caveman talk

addandsubtract · 2026-04-16T16:45:35 1776357935

We started out with oobabooga, so caveman is the next logical evolution on the road to AGI.

micromacrofoot · 2026-04-16T16:10:18 1776355818

I mean we had a shoe company pivot to AI and raise their stock value by 300%, how can we even know anymore

bombcar · 2026-04-17T02:06:03 1776391563

Lemonade and blockchain rides again!

Or was it ice tea?

acedTrex · 2026-04-16T14:50:48 1776351048

You really think the 33k people that starred a 40 line markdown file realize that?

andersa · 2026-04-16T15:14:10 1776352450

You mean the 33k bots that created a nearly linear stars/day graph? There's a dip in the middle, but it was very blatant at the start (and now)

verdverm · 2026-04-16T15:04:10 1776351850

Stars are more akin to bookmarks and likes these days, as opposed to a show of support or "I use this"

giraffe_lady · 2026-04-16T15:12:53 1776352373

I intentionally throw some weird ones on there just in case anyone is actually ever checking them. Gotta keep interviewers guessing.

zbrozek · 2026-04-16T15:10:01 1776352201

I use them like bookmarks.

LPisGood · 2026-04-16T15:11:05 1776352265

I use them as likes

gghootch · 2026-04-16T15:55:20 1776354920

Caveman is fun, but the real tool you want to reduce token usage is headroom

https://github.com/gglucass/headroom-desktop (mac app)

https://github.com/chopratejas/headroom (cli)

SnowLprd · 2026-04-17T06:51:29 1776408689

This smells heavily of astroturfing. Particularly because Headroom is a paid product, and that fact is not mentioned here or in the GitHub README.

Here was my experience…

I download and run the Mac application, which starts installing a bunch of things. Then the following happens without advance notice:

- Adds background item(s) from "Idiosyncratocracy BV"

- Downloads over 2 GB of files

- Pollutes home with ~/.headroom directory

- Adds hook(s) to ~/.claude/hooks/

- Modifies your ~/.claude/settings.json to add above hook(s)

… and then I see something in the settings that talks about creating an account. That's when I realized that this is a paid product, after all of the above has happened.

Headroom seems to use https://github.com/rtk-ai/rtk under the hood. What does Headroom offer over the actually-free RTK? Who knows.

At this point I have had it with this subterfuge — I immediately trash the app and every related file and folder I can find, of which there are many. Hopefully I got them all, but who knows. There should have been an easy way to uninstall this mess, but of course there isn't.

The lack of transparency here is really concerning.

gghootch · 2026-04-17T08:53:35 1776416015

Thanks for the feedback, will work on making this more transparent so future users do not have this experience.

I did want to call out that headroom is not based on RTK - it includes RTK sure, but headroom cli has a lot more going on under the hood. For more see https://github.com/chopratejas/headroom

shapeling · 2026-04-17T12:29:02 1776428942

I installed Headroom to give it a try, quickly decided to uninstall when I realized how invasive it is and requires a subscription. Spent the next few hours having issues with CC where it was asking for permission on every command. It was using absolute paths for all commands - turns out it was running into `zsh: command not found: rtk`. To fully uninstall I had to:

- Remove hook from `~/.claude/settings.local.json`

- rm -rf ~/.headroom

- rm ~/.claude/hooks/headroom-rtk-rewrite.sh

- launchctl unload ~/Library/LaunchAgents/Headroom.plist

- rm ~/Library/LaunchAgents/Headroom.plist

- rm -rf ~/Library/Preferences/com.extraheadroom.headroom*

- rm -rf ~/Library/Caches/com.extraheadroom.headroom

gghootch · 2026-04-21T09:18:35 1776763115

Thanks for sharing your experiences. We incorporated changes in the latest version to improve this:

1. On install we explain what Headroom installs

2. We added an uninstall feature that removes all of this for you

3. On quit of the app, we immediately remove all items that may interfere with normal Claude Code behavior

gilles_oponono · 2026-04-16T18:40:40 1776364840

Different positioning:

- headroom compresses inputs and is an open source project

- caveman is output and open source

- edgee is a more corporate offer

stavros · 2026-04-16T17:42:05 1776361325

I tried to use rtk for the same, and my agent session would just loop the same tool call over and over again. Does headroom work better?

gghootch · 2026-04-16T18:13:55 1776363235

Way better. You don’t notice it’s there.

selcuka · 2026-04-17T05:56:08 1776405368

Note that Headroom GUI installs rtk by default.

stavros · 2026-04-16T18:18:47 1776363527

Thanks, I'll try it!

firemelt · 2026-04-17T03:34:31 1776396871

rtk vibes a product of vibe code

kokakiwi · 2026-04-16T16:11:31 1776355891

Headroom looks great for client-side trimming. If you want to tackle this at the infrastructure level, we built Edgee (https://www.edgee.ai) as an AI Gateway that handles context compression, caching, and token budgeting across requests, so you're not relying on each client to do the right thing.

(I work at Edgee, so biased, but happy to answer questions.)

anandvshah · 2026-04-17T05:32:47 1776403967

I have used Edgee.AI and it is amazing.

gilles_oponono · 2026-04-16T18:39:18 1776364758

100% agree

computomatic · 2026-04-16T14:56:28 1776351388

I was doing some experiments with removing top 100-1000 most common English words from my prompts. My hypothesis was that common words are effectively noise to agents. Based on the first few trials I attempted, there was no discernible difference in output. Would love to compare results with caveman.

Caveat: I didn’t do enough testing to find the edge cases (eg, negation).
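The negation edge case is easy to reproduce. Below is a minimal TypeScript sketch of common-word stripping; the word list is a tiny stand-in chosen for illustration, not the list used in the experiment above.

```typescript
// A tiny stand-in for a "most common English words" list. A real
// experiment would use a proper frequency list of 100 to 1000 words.
const COMMON_WORDS: string[] = ["the", "a", "an", "is", "to", "of", "and", "not", "do", "in"];

// Drop every word that appears in the common-word list,
// comparing case-insensitively and ignoring punctuation.
function stripCommonWords(prompt: string, common: string[]): string {
  return prompt
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .filter((w) => common.indexOf(w.toLowerCase().replace(/[^a-z']/g, "")) === -1)
    .join(" ");
}

const withNegation = stripCommonWords("do not delete the production database", COMMON_WORDS);
const withoutNegation = stripCommonWords("delete the production database", COMMON_WORDS);

console.log(withNegation);
console.log(withoutNegation);
```

Both prompts collapse to `delete production database`, since "do" and "not" are among the most frequent English words. Any such filter needs an allow-list for negations and similar function words that flip meaning.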

computerphage · 2026-04-16T15:31:38 1776353498

Yeah, when I'm writing code I try to avoid zeros and ones, since those are the most common bits, making them essentially noise

ruairidhwm · 2026-04-16T15:28:49 1776353329

I literally just posted a blog on this. Some seemingly insignificant words are actually highly structural to the model. https://www.ruairidh.dev/blog/compressing-prompts-with-an-au...

cheschire · 2026-04-16T15:32:40 1776353560

I suspect even typos have an impact on how the model functions.

I wonder if there’s a pre-processor that runs to remove typos before processing. If not, that feels like a space that could be worked on more thoroughly.

ruairidhwm · 2026-04-16T16:07:50 1776355670

I guess just a spell-check in the repo? But yes, I'd imagine that they have an effect. Even running the same input twice is non-deterministic.

cheschire · 2026-04-16T16:14:43 1776356083

The ability for audio processing to figure out spelling from context, especially with regards to acronyms that are pronounced as words, leads me to believe there’s potential for a more intelligent spell check preprocess using a cheaper model.

mathieudombrock · 2026-04-16T17:54:26 1776362066

The same input twice is only nondeterministic if you don't control the seed.
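
A toy illustration of the point, using Python's `random` module and made-up logits rather than a real model. It only shows that sampling is reproducible once the RNG is seeded; hosted APIs may have other sources of variation that a seed alone does not cover:

```python
import math
import random

# Made-up next-token scores, purely for illustration.
LOGITS = {"hello": 2.0, "hi": 1.5, "hey": 1.0}

def sample_token(logits, temperature, rng):
    """Draw one token from softmax(logits / temperature) using the given RNG."""
    scaled = [score / temperature for score in logits.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(list(logits), weights=weights, k=1)[0]

def generate(seed, n=5, temperature=0.8):
    """Sample n tokens with a private RNG seeded once up front."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(n)]

print(generate(42) == generate(42))  # same seed, same sequence
```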

0123456789ABCDE · 2026-04-16T15:52:48 1776354768

there is no pre-processor, i've had typos go through, with claude asking to make sure i meant one thing instead of the other

PhilipRoman · 2026-04-16T16:10:46 1776355846

I strongly suspected that there was some pre/postprocessing going on when trying to get it to output rot13("uryyb, jbyeq"), but it's probably just due to massively biased token probabilities. Still, it creates some hilarious output, even when you clearly point out the error:

  Hmm, but wait — the original you gave was jbyeq not jbeyq:
  j→w, b→o, y→l, e→r, q→d = world
  So the final answer is still hello, world. You're right that I was misreading the input. The result stands.
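
For reference, the decoding is easy to check with Python's built-in `rot13` codec. The second string is my assumption of what the un-garbled input would have been:

```python
import codecs

garbled = "uryyb, jbyeq"   # the string as given in the prompt
intended = "uryyb, jbeyq"  # assumed intended input, with 'e' and 'y' swapped back

print(codecs.decode(garbled, "rot13"))   # hello, wolrd
print(codecs.decode(intended, "rot13"))  # hello, world
```

So the letter-by-letter mapping in the model's reply is right, but reading it off in order gives "wolrd", not "world".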

AlecSchueler · 2026-04-16T15:35:36 1776353736

Doesn't it just use more tokens in reasoning?

slashdave · 2026-04-17T00:21:02 1776385262

> My hypothesis was that common words are effectively noise to agents

Umm... a few words can be combined in a rather large number of ways.

Punctuation is used a lot. Why not just remove all the periods and commas and see what happens? Probably not pretty

user34283 · 2026-04-16T15:32:57 1776353577

I used Opus 4.7 for about 15 minutes on the auto effort setting.

It nicely implemented two smallish features, and already consumed 100% of my session limit on the $20 plan.

See you again in five hours.

alach11 · 2026-04-16T19:29:06 1776367746

On my private internal oil and gas benchmark, I found a counterintuitive result. Opus 4.7 scores 80%, outperforming Opus 4.6 (64%) and GPT-5.4 (76%). But it's the cheapest of the three models by 2x.

This is mainly driven by reduced reasoning token usage. It goes to show that "sticker price" per token is no longer adequate for comparing model cost.
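
A back-of-the-envelope sketch of why that happens. The prices and token counts below are hypothetical, chosen only to show the effect, and are not taken from my benchmark:

```python
def cost_per_task(price_per_mtok, tokens_per_task):
    """Dollar cost of one task, given a price per million tokens."""
    return price_per_mtok * tokens_per_task / 1_000_000

# Hypothetical numbers: a pricier model that reasons tersely
# versus a cheaper model that burns more reasoning tokens.
pricey_but_terse = cost_per_task(price_per_mtok=25.0, tokens_per_task=20_000)
cheap_but_chatty = cost_per_task(price_per_mtok=15.0, tokens_per_task=80_000)

print(f"pricey but terse: ${pricey_but_terse:.2f} per task")
print(f"cheap but chatty: ${cheap_but_chatty:.2f} per task")
```

The comparison that matters is price per token times tokens used per task, not the price alone.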

TIPSIO · 2026-04-16T15:05:06 1776351906

Oh wow, I love this idea even if it's relatively insignificant in savings.

I am finding my writing prompt style is naturally getting lazier, shorter, and more caveman just like this too. If I was honest, it has made writing emails harder.

While messing around, I did a concept of this with HTML to preserve tokens, worked surprisingly well but was only an experiment. Something like:

> <h1 class="bg-red-500 text-green-300"><span>Hello</span></h1>

AI compressed to:

> h1 c bgrd5 tg3 sp hello sp h1

Or something like that.

Leynos · 2026-04-16T15:15:10 1776352510

Combine that with emmet / zen coding: https://en.wikipedia.org/wiki/Emmet_%28software%29?wprov=sfl...

naoru · 2026-04-16T15:14:32 1776352472

You'd like Emmet notation. Just look at the cheat sheet: https://docs.emmet.io/cheat-sheet/

fzaninotto · 2026-04-16T17:24:55 1776360295

To reduce token count on command outputs you can also use RTK [0]

[0]: https://github.com/rtk-ai/rtk

chrisweekly · 2026-04-16T15:50:03 1776354603

I really enjoy the party game "Neanderthal Poetry", in which you can only speak using monosyllabic words. I bet you would too.

motoboi · 2026-04-16T16:21:45 1776356505

Caveman hurts model performance. If you need a dumber model with less token output, just use sonnet-4-6 or another non-reasoning model.

hayd · 2026-04-16T18:34:30 1776364470

Does it? I'm not sure I'd necessarily notice but I haven't found it noticeably worse.

nickspag · 2026-04-16T16:36:20 1776357380

I find grep and common cli command spam to be the primary issue. I enjoy Rust Token Killer https://github.com/rtk-ai/rtk, and agents know how to get around it when it truncates too hard.

JustFinishedBSG · 2026-04-16T17:25:47 1776360347

Interesting, it doesn't seem intuitive at all to me.

My (wrong?) understanding was that there was a positive correlation between how "good" a tokenizer is in terms of compression and the downstream model performance. Guess not.

stacktraceyo · 2026-04-16T23:48:39 1776383319

What about something like

https://github.com/rtk-ai/rtk

willsmith72 · 2026-04-17T00:50:25 1776387025

That's such a poor way to communicate a number. I take it they mean an increase of up to 35%?

ojuschugh1 · 2026-04-20T17:33:32 1776706412

try this - https://github.com/ojuschugh1/sqz

p_stuart82 · 2026-04-16T17:04:33 1776359073

caveman stops being a style tool and starts being self-defense. once prompt comes in up to 1.35x fatter, they've basically moved visibility and control entirely into their black box.

hayd · 2026-04-16T15:33:12 1776353592

me feel that it needs some tweaking - it's a little annoyingly cute (and could be even terser).

OtomotO · 2026-04-16T14:55:33 1776351333

Another supply chain attack waiting?

Have you tried just adding an instruction to be terse?

Don't get me wrong, I've tried out caveman as well, but these days I am wondering whether something as popular will be hijacked.

pawelduda · 2026-04-16T15:09:50 1776352190

People are really trigger-happy when it comes to throwing magic tools on top of AI that claim to "fix" the weak parts (often placeboing themselves because anthropic just fixed some issue on their end).

Then the next month 90% of this can be replaced with new batch of supply chain attack-friendly gimmicks

Especially Reddit seems to be full of such coding voodoo

xienze · 2026-04-16T15:16:53 1776352613

> coding voodoo

Well, we've sacrificed the precision of actual programming languages for the ease of English prose interpreted by a non-deterministic black box that we can't reliably measure the outputs of. It's only natural that people are trying to determine the magical incantations required to get correct, consistent results.

JohnMakin · 2026-04-16T15:48:23 1776354503

My favorite to chuckle at are the prompt hack voodoo stuff, like, “tell it to be correct” or “say please” or “tell it someone will die if it doesn't do a good job,” often presented very seriously and with some fast cutting animations in a 30 second reel

OtomotO · 2026-04-20T16:22:39 1776702159

That's the same thing employees were told... right?! :)

pawelduda · 2026-04-16T18:17:40 1776363460

Make no mistakes!

ctoth · 2026-04-16T17:01:44 1776358904

1.35 times! For Input! For what kinds of tokens precisely? Programming? Unicode? If they seriously increased token usage by 35% for typical tasks this is gonna be rough.

4b11b4 · 2026-04-17T03:53:31 1776398011

but what about DDD

cupofjoakim · 2026-02-27T11:16:20 1772190980

This is great. I wish however there was some sandbox support, perhaps running the whole hive inside a vm for example

cupofjoakim · 2026-01-21T12:28:31 1768998511

The US being stuck in imperial is such a meme nowadays with "freedum units" and the like. It's yet another odd thing that makes it easy for the rest of the world to laugh at the US. In these isolationist times I doubt this will change soon though, but it'd definitely help international collaboration.

yurishimo · 2026-01-21T12:37:09 1768999029

Everyone who wants to collaborate internationally is already doing it. Science in the US is entirely metric. Construction and domestic measurements are the two biggest holdouts and honestly they’re both negligible. Given the proliferation of global manufacturing, most businesses are converting at the end before retail for US customers.

If the government was competent, they could rip off the bandaid and everyone would adapt within a year or two, but we need to wait at least 3 years for that to even begin to become a possibility again.

kevin_thibedeau · 2026-01-21T12:59:50 1769000390

The US has gone almost fully metric on plywood thickness due to globalization.

t-3 · 2026-01-21T13:06:00 1769000760

Honestly, I don't think anyone would raise much of a fuss over changing distance measurements to metric. Both centimeters and inches are easy enough to eyeball or rule-of-thumb, meters and yards are basically the same, and larger units are only relevant for speed limits and travel planning. Metric lacks a good "foot", but I guess people would get used to eyeballing things in ~50cm increments instead.

Weights are even easier as pretty much everyone uses grams as the smallest daily unit and most people can convert to and from metric on the fly for ounces, lbs, kgs. Liters aren't uncommon, and ml<->gram equivalence for water is well-known. Traditional kitchen volumes probably wouldn't be displaced because metric has no answer for those in the first place.

Temperature is where metric will fail to gain adoption because Celsius totally sucks unless your daily life consists only of boiling or freezing water at sea level. No advantages over Fahrenheit except maybe arguably for science, because it's Kelvin with an offset.

kakacik · 2026-01-21T15:53:31 1769010811

Not sure what sucks about Celsius; water freezing and boiling have been some of the most important facts for science, and for plain existence, since we evolved. We humans have 10 fingers, so it's split on the decimal system.

Since we humans operate 99% of our existence in a narrow band between 0 and 100 degrees Celsius, I'd say it's more important than starting from absolute 0 and dealing constantly with big offsets.

0 or 100 or -100 or 10 Fahrenheit is what? From Gemini: "0°F was the lowest temperature achievable with a mixture of ice, water, and salt (brine), while 96°F was set as the approximate temperature of the human body (blood heat), chosen because 96 is easily divisible by many numbers, allowing for finer divisions" - rather insignificant things.

t-3 · 2026-01-21T17:15:43 1769015743

In Celsius, my daily life uses values from ~ -20 to +30 for the weather, but from ~0 - 90F. For cooking both are equally arbitrary, as the only place I set or read a temperature when cooking is candymaking, setting the oven, or cooking large amounts of meat.
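
For anyone who wants to check the ranges, the conversion is a one-liner each way. A quick sketch:

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def f_to_c(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9

for c in (-20, 30):
    print(f"{c} C = {c_to_f(c):.0f} F")
for f in (0, 90):
    print(f"{f} F = {f_to_c(f):.1f} C")
```

-20 C to +30 C works out to -4 F to 86 F, so the two weather ranges cover roughly the same span.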

shoxidizer · 2026-01-21T13:47:01 1769003221

> Metric lacks a good "foot", but I guess people would get used to eyeballing things in ~50cm increments instead.

Perhaps as a compromise we could adopt the meter but divide it by halves, quarters, and so on. Binary fractions are so much more universal than arbitrary base ten ;)

ungovernableCat · 2026-01-21T16:16:44 1769012204

In a country where every single facet of life is being increasingly politicised, you think this wouldn't cause a fuss?

Oddly enough if any government could just push and shove this through it might be Trump. I bet 20 years later you'd have a sizeable constituency who could be convinced that the change from imperial to foreign units was the beginning of the fall and decline and that everything could be fixed if you went back.

ebiester · 2026-01-21T15:40:47 1769010047

Oh, you bet they would. Nothing causes old white people to riot like mild inconvenience.

That's only mildly sarcastic. For many people, it's become a part of being American, especially on the conservative side of the aisle. Now, I personally live in Celsius and work comfortably in kilometers, liters, and grams. However, it has become a weird point of pride for some Americans.

walthamstow · 2026-01-21T15:51:06 1769010666

re construction, we use 8x4ft sheets of timber in Britain still, same as before, we just call them twelve-twenty-by-two-four-forty now.

cucumber3732842 · 2026-01-21T14:16:55 1769005015

The unit itself doesn't actually matter. Even industries with the least precision set their stuff up with so much precision that the unit you use basically doesn't matter.

Your machine may spit out widgets that are plus or minus an inch. But when you set up the machine you set it up to the 1/16 regardless. Swapping all that to metric doesn't actually change anything other than the number the guy setting it up dials it in to.

relaxing · 2026-01-21T15:10:38 1769008238

1/16” is just over 1.5 mm, so yes, the guy setting the machine in millimeters is giving you more precision. In the real world measurements aren’t just abstract figures you can move around losslessly.

I have a socket set in half-millimeter sizes for the absolute plague of cheap bolts and nuts that are being manufactured with obscene levels of slop.

throwway120385 · 2026-01-21T16:05:24 1769011524

He's not giving you more accuracy though. A machine that's accurate to 1/32" is accurate to about 0.8mm. If those cheap bolts were in US customary they would still need to be in smaller increments.
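
The fractions being thrown around here are easy to check, since an inch is defined as exactly 25.4 mm. A quick sketch:

```python
MM_PER_INCH = 25.4  # exact by definition

def inches_to_mm(inches):
    """Convert a length in inches to millimeters."""
    return inches * MM_PER_INCH

for label, inches in [('1/16"', 1 / 16), ('1/32"', 1 / 32)]:
    print(f"{label} = {inches_to_mm(inches):.5f} mm")
```

That gives roughly 1.59 mm for 1/16" and 0.79 mm for 1/32".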

cucumber3732842 · 2026-01-21T16:36:18 1769013378

You're missing the point.

The guy buying the widgets doesn't care because he's expecting a widget that's plus or minus dozens of the unit the machine is being set to. The setting is just as precise as it is in order to set the fat part of your output curve over the middle of your quality control pass range.

The machine might not even be calibrated in a direct measurement, it might be calibrated in a secondary measurement. Like tons of force or rpm or cycle speed or something that then translates to the dimension of your output part.

The units on machines mostly only exist for calibration. Beyond that they can just be made up "my amp goes to 11" type scales because they're so divorced from the outputs, either in precision (or are literally indirect as described above) that you "just have to know" that if you want a "X<unit>" widget you'll actually set the machine

Tons and tons and tons of stuff in our world is even intentionally spec'd out in this manner. A 14" tire rim is not 14, there's a tolerance. A 3" pipe isn't 3". These are all just nominal sizes. Just about everything in our world is nominally sized. A nut and bolt manufacturer doesn't care whether they're making 12mm or 1/2 on a given day. Those are just nominal sizes, arbitrary names, in their minds. It doesn't matter whether the factory runs on metric or imperial or something else because they're just shooting for an arbitrary number.

The only time your unit really matters is when interfacing with other parties, and it only matters insofar as you need to know which units each of you is using.

mgoetzke · 2026-01-21T12:44:24 1768999464

The fact that Canadian lumber companies seem to be switching their machinery to metric is funny though. https://woodcentral.com.au/canadas-sawmills-weigh-metric-swi...

FridayoLeary · 2026-01-21T13:03:16 1769000596

They got rid of the penny. Just suggest that the Imperial system is some leftist conspiracy and they'll have moved over by the end of the month.

neutronicus · 2026-01-21T12:59:40 1769000380

Construction is negligible?

I guess you imagine we’ll all be calling half inch pipe twelve seven after this year adjustment period?

I guess people do it with bullet calibers.

bluGill · 2026-01-21T13:06:12 1769000772

There is nothing half inch in a half inch pipe. One inch emt is not one inch, it is 27mm outside diameter (for some reason I know that one)

criddell · 2026-01-21T13:35:06 1769002506

I believe 1/2" pipe is exactly the same as DN15 pipe. 1/2" and 15mm are both just nominal sizes. Calipers will only help you if you happen to know the pipe schedule.

strken · 2026-01-21T13:06:03 1769000763

You might eventually end up calling 15mm pipe half-inch, depending on where the cheapest pipe can be sourced from.

leosarev · 2026-01-21T17:16:42 1769015802

Ironically, a lot of countries on the metric system call half-inch pipe "1/2" :-)

deadbabe · 2026-01-21T13:20:08 1769001608

Most likely the current administration will pass executive orders banning the use of metric system, and then force other countries to switch to imperial or face heavy tariffs.

zahlman · 2026-01-21T16:46:01 1769013961

Jokes of this form were tired by about November 5, 2024.

cupofjoakim · 2026-01-13T14:21:03 1768314063

That's great if you need everything. If you need one of them, not so much.

bayindirh · 2026-01-13T14:24:41 1768314281

You can buy the individual tools, if you want.

al_borland · 2026-01-13T14:22:25 1768314145

I'm curious how many people actually use all this stuff themselves. It seems like an extreme niche, and more often than not will have people paying for apps they will never use.

d_runs_far · 2026-01-13T15:33:56 1768318436

Maybe I'm old skool... but for the last 30+ years I've been using a combination of photoshop, illustrator, FCP, after effects (back when it was CoSA...), some audio editing and mixing in quite a bit of code as well. While others on my team specialize in one or two domains, I've managed to keep my skills in many.

Back in the day I was considered a 'MultiMedia' creative. I don't even know what to call myself these days.

cupofjoakim · 2025-11-26T16:29:20 1764174560

I find your comment a bit funny on multiple levels. "Linux" does not force anything on you right? It's the community that has by and large decided to move to maintaining other solutions. If you still want to use fvwm you can still run it on arch with x11 until x11 is not maintained and the kernel breaks it somehow

serf · 2025-11-26T16:54:01 1764176041

>"Linux" does not force anything on you right?

>It's the community that has by and large decided to move to maintaining other solutions. If you still want to use fvwm you can still run it on arch with x11 until x11 is not maintained and the kernel breaks it somehow

well you just framed it perfectly; it's still forced on the end-user regardless of whether or not you want to call it 'linux' or 'the community that controls and steers linux'.

MyOutfitIsVague · 2025-11-26T17:04:09 1764176649

It's not forced if you were getting it all for free anyway and can walk away at any time. "They've stopped giving away old thing for free and are now only doing new thing" doesn't put you in the position of a captive who has no freedom. You can complain, you can develop your own solutions, you can leave, but I find it over the line the number of people in the X11/Wayland conversation whose position amounts to looking at people who are working for free, and demanding that they do a specific kind of free work without compensation or help. It's all people working on their free time, or companies sponsoring the developments they need. It's hard to make demands as an end user who isn't paying or even helping.

Gud · 2025-11-26T17:09:09 1764176949

"Linux" is mostly controlled by a few corporations and their interests. It's been a loooooooooong time since it was a grass roots movement.

I don't know about you, but corporate dictates always leave a bad taste in my mouth.

jauntywundrkind · 2025-11-26T18:16:03 1764180963

Oh sure there's absolutely places where this is true. But there's so many many counter examples. Sway, Niri, Hyprland desktops... Top-tier incredible experiences that began as small personal passions. So many incredible tools have become must-have daily drivers for folks, like those in this modern shell tools thread. https://hackertimes.com/item?id=41292835

The narrative that everything is corporate and greed is, imo, a deep deep dis-service. Incredible things are happening on the edge, and there's nothing else on the planet remotely resembling the conjunctive discollaboration here. Folks have incredible leverage from existing open source works, & add their own sparkle, time and time again. (Nearly never does this box us in.)

For sure there are big projects too, with huge corporate influence and millions of users.

But it is a deeply rotten proposition to me to try selling some corrupt world case, that this land here is just as rotten and poisoned as the application/appliance-ized rest of world. That there's coercion. There's some being left behind the pack, some, but so little. "Linux" is still the best freest most augment-intelligence computing out there by a light year, and its trends are healthy.

(Wayland in fact has improved & strengthened that stance, freed us from a nasty monolith that everyone had to use, and given us actual freedom of implementation. Wayland is part of the liberation, the addition of choice & liberty. It's wild to me that people seek those old chains.)

Gud · 2025-11-27T16:45:21 1764261921

I disagree. The freedom loving hippie hackers of the 90s you can mostly find in the BSD communities, in my not so humble opinion.

jauntywundrkind · 2025-11-27T20:22:10 1764274930

Do you have anything to argue that with? I'm all ears, would be neat to see.

I already mentioned quite an array of projects that I find inspiring. I can't say I know of anything in particular coming from BSD land. ZFS has adherents for sure but there doesn't seem like there's any innovation or creation or downstream net new coming out of that.

bigyabai · 2025-11-26T19:32:05 1764185525

"Linux" is a kernel, mostly controlled by one old curmudgeonly guy. Corporations can offer him code, but he can reject it.

The Linux software environment is more broadly controlled by corporations, but that goes for every single mainstream operating system.

jmclnx · 2025-11-28T03:41:13 1764301273

Actually it did, well technically it was userland.

Pulseaudio was forced on me because Firefox and a few others need it.

PAM was forced on me because some applications needed it; I believe that was due to KDE.

Until v15, slackware had no need for pam or pulseaudio

Now many of us are waiting until systemd is forced on us :(

cupofjoakim · 2025-11-24T22:29:08 1764023348

an image is available in the PR where it was added: https://github.com/0xhckr/ghostty-shaders/pull/30

cupofjoakim · 2025-11-13T15:49:41 1763048981

It'd be cool, but I guess the hodgepodge of different solutions in that space would make it really hard. For example, many mods for GW2 don't work in linux if you're on something like Hyprland due to them having to act as overlays. Not sure if that's a wayland issue or just a typical hyprland thing though

sho_hn · 2025-11-13T17:41:51 1763055711

You're right, but a plugin for a common compositor like Plasma's KWin would still make it accessible to a large number of users. Shouldn't be too hard to do either. Maybe I'll do it this weekend!

cupofjoakim · 2025-10-31T22:07:04 1761948424

Dead Internet Theory is no longer a theory huh?

cupofjoakim · 2025-10-29T15:02:51 1761750171

Not sure if that's satire or not but how would you even identify the party to sue? What do you do if they're based in a country where you can't sue them over relatively trivial matters such as this?

2OEH8eoCRo0 · 2025-10-29T15:11:49 1761750709

Not satire but it's a huge problem with the internet. Everyone washes their hands and people can harm you without liability.

immibis · 2025-10-30T23:22:58 1761866578

You could start by subpoenaing the registered holder of the IP address.

In general, you can ask a lawyer for your options. Chances are good there are more than zero. But only if you can afford a lawyer.

If you're getting scraped from a country where you don't do business, you can block the country. It's not good to block a country, but it works as a temporary measure. If they really want your data, they will move to a country where you do business, which means a country where you can get a lawyer. Assuming you can afford a lawyer. If they're using rotating IPs, likely some of them are from your country. You might show a judge: "Hey, look, we're getting so much traffic, from a wide variety of IP addresses but it all seems to be the same person on the other end, which would make it illegal DDoS. Can we trace back some of these?" and if you're lucky, the judge might say yes.

DDoS is not a relatively trivial matter.