More

LukeB42 · 2026-03-01T16:38:44 1772383124

Would you prefer to see ESM or neither?

LukeB42 · 2026-03-01T14:49:08 1772376548

React components might eventually be removed in favor of making the templating system as fast and as elegant as possible but for the time being they provide flexibility.

You can read https://lukeb42.github.io/vertex-interop.html for more info.

LukeB42 · 2026-03-01T14:31:58 1772375518

9kb (minifier+brotli)

We could remove 3kb by removing the router but that's not gonna happen. You're more than welcome to minify+brotli it yourself if you use vertex.js in production.

LukeB42 · 2026-03-01T13:55:29 1772373329

Beauty /is/ in the eye of the beholder. The rationale /here/ is that the more text in a page the more code you'll fit in your head the more you'll get done, the more confidence you'll have and again the more you'll achieve.

turtlebits · 2026-03-01T15:10:31 1772377831

OP has a valid point. On my Mac, it's unreadable without zooming in. I immediately left the page.

LukeB42 · 2026-03-01T15:26:36 1772378796

Thanks for the valuable feedback. I've added a mediaquery for your use case. This should be more legible now: https://lukeb42.github.io/vertex-manual.html

Send me a screenshot via https://catbox.moe if it's unreadable and I'll get this done and dusted before leaving the thread. Thanks.

DiabloD3 · 2026-03-01T14:26:25 1772375185

Zero code is in your head if you can't read it.

The predominant monitor in existence is your average 24" 1080p monitor, sat at, on average, 32" away from the head. The average person has worse than 20/20 vision.

You must test your website in such conditions and make sure it is readable, and also make sure it meets at minimum WCAG A, but preferably the whole way to AAA if possible.

LukeB42 · 2026-03-01T14:38:22 1772375902

Thank you but the predominant monitor's probably a smartphone. The average professional is probably using a 4k monitor at the moment.

Everything in the free documentation I've provided you out of my own time and money that you're referring to exists in relation to the other elements in that page, so to get the experience you're after simply ctrl+scroll and change the CSS zoom level like the riot at parties that you could be or catch up with circa 2013-2014 and invest in a 4k display please.

rschristian · 2026-03-01T19:53:22 1772394802

"Zoom in" or "buy a different monitor" is not an appropriate response to people bringing up the plethora of objective & subjective a11y issues on the page.

If you really don't care about providing an accessible experience, try this: no one will use the tool if they can't read the docs. With my monitor and eyesight, it's entirely illegible.

LukeB42 · 2026-03-01T20:24:44 1772396684

Name one objective issue with what I've provided your ungrateful self for free and do it in HN's 12pt again because I need a laugh.

LukeB42 · 2025-04-04T14:10:41 1743775841

Agitation by community-based agents leading to honest people dropping out and wicked people doubling down.

LukeB42 · on Oct 17, 2023

>you'd need a pretty custom toolchain

Just port/extend pytorch to make it easy to utilize fused ops.

elwypea · on Oct 17, 2023

Easier said than done. Even with Google level resources, TPU support for pytorch is patchy (https://arxiv.org/abs/2309.07181). Device abstraction is not great, assumes CUDA in unexpected places.

visarga · on Oct 17, 2023

The Groq AI chip startup has solved this problem. They don't use hand written kernels at all, instead they use a compiler, and they have the top speed in the world on LLaMA2-70B, 240tokens/s.

https://www.youtube.com/@GroqInc/videos

Other interesting Groq tidbits - their models are deterministic, the whole system up to thousands of chips runs in sync on the same clock, memory access and network are directly controlled without any caches or intermediaries so they also run deterministically.

That speeds up communication and allows automatic synchronisation across thousands of chips running as one single large chip. The compiler does all the orchestration/optimisation. They can predict the exact performance of an architecture from compile time.

What makes Groq different is that they started from the compiler, and only later designed the hardware.

elwypea · on Oct 17, 2023

What is the pass rate on torchbench? This gives a more realistic measure of how good a vendor's pytorch support is.

All the big chip startups have their own pytorch compiler that works on the examples they write themselves. From what I've seen of Groq it doesn't appear to be any different.

The problem is that pytorch is incredibly permissive in what it lets users do. torch.compile is itself very new and far from optimal.

omneity · on Oct 17, 2023

Pytorch XLA is such a pain to use. And once you go TPU you need the same energy to switch back, so you can’t quickly test out how it performs on your problem.

lumost · on Oct 17, 2023

One of the big reasons custom hardware solutions struggle.

IMO - you’d have better luck as a hardware vendor implementing an LLM toolchain and bypassing a general purpose DL framework. At the very least you should be able to post impressive results with this approach rather than a half baked pytorch port.

omneity · on Oct 17, 2023

I feel like that would make it harder for a vendor to keep up with the industry.

Say you took all the effort in the world to build your custom LLM toolchain to train a Llama on custom hardware. And then suddenly someone comes up with LoRA. You didn't even finish porting it to your toolkit then someone comes up with GPTQ.

Can't keep up with a custom toolchain imo.

It's like a forked linux kernel. Eventually you're gonna have to upstream if you're serious about it, which is what AMD is actively doing with pytorch for ROCm (masquerading it as CUDA for compatibility).

ronsor · on Oct 17, 2023

I disagree. llama.cpp[0] is a good counterpoint to this, since it uses a custom ML framework created from scratch. Despite not having the developer team of a large company, it still keeps up with many of the advancements in LLMs.

[0] https://github.com/ggerganov/llama.cpp

elwypea · on Oct 17, 2023

llama.cpp is not necessary for creating lots of demand for the chip it was originally written for (Apple M1), whereas new hardware vendors need to demonstrate they can plugin to existing tools to generate enough demand to ship in volume.

ronsor · on Oct 18, 2023

> lots of demand for the chip it was originally written for (Apple M1)

To be fair, the M1/M2 chip can't be purchased or used separately from the Mac, unlike GPUs or socketed CPUs, and demand for Macs is already fairly high.

elwypea · on Oct 17, 2023

That might be good enough to get a hardware startup acquired, but not good enough to get major sales. Users want pytorch and negligible switching cost between chips.

Bigger problem for startups trying to muscle in on LLMs is that there isn't much room for improvement on existing solutions to do something radically different.

lumost · on Oct 17, 2023

>Bigger problem for startups trying to muscle in on LLMs is that there isn't much room for improvement on existing solutions to do something radically different.

aye - unless you are able to notch a 10x cost/performance improvement. The migration overhead will just make it not worth it to switch.

adolph · on Oct 17, 2023

Typically Google resources go to TF before PyTorch, no?

elwypea · on Oct 17, 2023

Even after prioritising tensorflow, keras, jax etc., they can still afford to have a very large team working on torch_xla and still hedge their bets with a separate team on torch_mlir.

LukeB42 · on June 6, 2023

Everything in an organism is learned.

If you think it isn't then you're not observing the right timescale.

kayodelycaon · on June 6, 2023

Read the article. They got the mechanism of “learned helplessness” wrong. It’s actually a “learned lack of control”.

This makes every difference in how to treat people who have this problem. All you have to do is teach people how to feel they have control again.

LukeB42 · on Feb 9, 2023

Is this because you don't know how to implement the weight updates in NumPy yourself?

LukeB42 · on Feb 9, 2023

Terrible joke and COMPLETELY missing the point.

LukeB42 · on Feb 9, 2023

Can you explain like I'm 5 why this matters distinctly from how transformers are normally trained with autodiff and what its possible applications are?

adamnemecek · on Feb 9, 2023

I’m talking about attention only transformers. Those don’t have an autodiff but still learn. The math is actually really cool.

lostmsu · on Feb 9, 2023

> attention only transformers

Can you share any good link on the subject?

adamnemecek · on Feb 10, 2023

https://transformer-circuits.pub/2021/framework/index.html

lostmsu · on Feb 10, 2023

Maybe I am missing something, but I don't see any learning without autodiff.

adamnemecek · on Feb 11, 2023

I thought you were asking about attention only transformers. This paper touches on some of it https://arxiv.org/abs/2212.10559v2.

lostmsu · on Feb 13, 2023

The paper speculates that it is analogous to gradient descent and empirically confirms it is similar in behavior, but it is not a rigorous proof of any kind.

The momentum experiment they made also does not seem related. E.g. it just adds past values to V, which extends the effective context length.

adamnemecek · on Feb 14, 2023

> but it is not a rigorous proof of any kind.

Such is the nature of early theories.