They haven't made the chart very clear, but it seems to have a configurable number of passes: at 2 passes it's better than Haiku and Sonnet, and at 16 passes it starts closing in on Opus without quite getting there, while consistently being cheaper than Sonnet.
pass@k means that you run the model k times and give it a pass if any of the answers is correct. I guess Lean is one of the few use cases where pass@k actually makes sense, since you can automatically validate correctness.
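For what it's worth, the commonly used unbiased pass@k estimator (from the Codex paper) doesn't literally resample k times; it samples n completions, counts the c correct ones, and computes the chance that a random subset of k contains at least one correct answer. A quick sketch:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, is a correct answer."""
    if n - c < k:
        return 1.0  # fewer wrong answers than slots: a correct one is guaranteed
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# e.g. 16 generations, 4 of them correct:
print(pass_at_k(16, 4, 1))  # 0.25
print(round(pass_at_k(16, 4, 8), 3))  # ~0.962
```

The numbers here (n=16, c=4) are just made up for illustration, not from the chart.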
Oh my bad. I'm not sure how that works in practice. Do you just keep running it until the tests pass? I guess with formal verification you can run it as many times as you need, right?
The text file part has the instructions for the LLM, but it can also have scripts along with it that the LLM can invoke. At least that's how I understand it.
It's the UX, whether the information is deliberately omitted or not. There at least used to be toggles, for example, with no indication that they meant anything more than a minor load-balancer configuration change, yet flipping one added something like $200/month to the bill. No indication at all that they have a meaningful monetary impact.
I'm being rather snarky here, but the main point of front-end JS UI frameworks is to exist and to survive in their environment. For this purpose they have evolved to form a parasymbiotic relationship with others in their environment, for example with influencers. The frameworks with the best influencers win out over older ones that do not have the novelty value anymore and fail to attract the best influencers.
Next is the Microsoft Sharepoint of the JavaScript world. It’s a terrible solution to just about anything, and yet gets crammed into places and forced on people due to marketing-led decision making.
My 10 minute Next build was replaced with a 1 minute 30 second Vite build.
And such an extraordinary difference usually means you're holding the tool wrong, but Next has years-old open issues for many of the causes here (like forced output tracing) and has simply ignored them. Possibly because the Next team's preferred deployment environment isn't affected?
The "just retry" approach is truly bothersome. I think it is at least partly an organizational issue, because it happens far more often when QA is a separate team.
I think not so long from now the exotic meal experience for the young ones will be real grilled chicken that looks like a chicken. Like zebra or crocodile meat was for us northerners.
From my own little box, I think that if lab-grown meat were available and affordable, I would never eat a bite of real chicken, pork or beef again. I know veganism is an option too, but... I grew up with meat and it's very difficult to give up.
Have you tried tempeh? It satisfies 95% of my chicken cravings since I found the right recipe and spices. It's also cheaper, nutritious, faster to cook and barely processed.
I have the same problem. The "What It Is" section starts with "Mycelium is a Clojure workflow framework built on Maestro" and that's a bit generic. Maybe it's something to test some AI-generated code, and then test whether the tests themselves are tested well enough, using Clojure, but I'm not entirely sure.
The main question, which is not obvious, is: what should I use it for?
This is far more brilliant than I thought. I know my purpose now, "AI" told me. It's to drink wine and eat macaroni!
The only problem is that larp as ai comes back with "no work yet. check back later :(" a lot, but once you run out of credits, that's it. So... did everyone run out of credits? I feel like something's up with it.
I'll take one addiction and a possible oral cancer for the company, thank you so much. No, I understand it's not guaranteed, but I am seriously flabbergasted by the careless actions of some companies...