turns out the schizos were right. most of OpenAI's *real* investment money comes from Gulf countries. without that money flow they can't sustain the cash burn anymore.
Faster is not always better. I still remember when VS Code switched to ripgrep I had to change how I used it: before that I could just open VS Code on any folder and work in it, even if the folder contained millions of small text files. That worked fine, but once rg was picked it happily used all of my CPU cores scanning files, and the machine was unusable for a while.
To be honest I hate all the new Rust replacement tools; they introduce new behavior just for the sake of it, and it's annoying.
Tensor core performance is inversely proportional to precision across all generations (i.e., halving the precision doubles the OPS). 8-bit precision gives you the same improvement ratio. A100 supported INT4 but H100 dropped it, if I remember correctly.
So FP4/INT4 will likely see the same ~30% OPS/W improvement. You could get a separate improvement by reducing precision further, but going to 1-bit for a 4x jump feels unlikely for now.
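To make that scaling rule concrete, here's a tiny sketch; the 16-bit baseline number is arbitrary (only the ratios matter), and it assumes the "halve the bits, double the OPS" rule holds all the way down:

```python
# Illustrative only: peak throughput if OPS scale inversely with precision.
# The baseline value is arbitrary; the point is the ratios between bit widths.

BASELINE_BITS = 16
BASELINE_OPS = 1000  # hypothetical peak at 16-bit, arbitrary units

def peak_ops(bits: int) -> float:
    """Peak throughput assuming halving the bit width doubles the OPS."""
    return BASELINE_OPS * (BASELINE_BITS / bits)

for bits in (16, 8, 4, 1):
    ratio = peak_ops(bits) / BASELINE_OPS
    print(f"{bits:>2}-bit: {peak_ops(bits):6.0f} units ({ratio:.0f}x over 16-bit)")
```

Under that assumption, 4-bit is 4x over 16-bit, and 1-bit would be another 4x on top of that, which is the jump that feels unlikely for now.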
the problem is that the price point increases sharply every time.
gemini 2 flash lite was $0.3 per 1Mtok output, gemini 2.5 flash lite is $0.4 per 1Mtok output; now guess the pricing for gemini 3 flash lite.
yes, you guessed it right: $1.5 per 1Mtok output. you could have easily guessed that because google did the same thing before: gemini 2 flash was $0.4, then 2.5 flash jumped to $2.5.
and that is only the base price; in reality the newer models are all thinking models, so they burn even more tokens for the same task.
at some point it stops being viable to use the gemini api for anything.
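as a rough sketch of why the base price understates the real increase, something like this (the prices are the ones quoted above; the tokens-per-task figure and the thinking multipliers are just guesses, plug in your own workload):

```python
# Rough cost-per-task sketch. Prices are $ per 1M output tokens from the
# comment above; token counts and thinking multipliers are hypothetical.

PRICES_PER_MTOK = {
    "gemini 2 flash lite":   0.30,
    "gemini 2.5 flash lite": 0.40,
    "gemini 3 flash lite":   1.50,
}

TOKENS_PER_TASK = 2_000       # hypothetical average output tokens per task
THINKING_MULTIPLIER = {       # hypothetical extra tokens spent on reasoning
    "gemini 2 flash lite":   1.0,
    "gemini 2.5 flash lite": 2.0,
    "gemini 3 flash lite":   3.0,
}

for model, price in PRICES_PER_MTOK.items():
    tokens = TOKENS_PER_TASK * THINKING_MULTIPLIER[model]
    cost = tokens / 1_000_000 * price
    print(f"{model}: ${cost:.4f} per task")
```

with even modest thinking overhead, the per-task cost grows much faster than the headline price does.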