If you don’t know what library to use in your specific language, do you think yo...

neerajsi · 2026-05-25T05:37:30 1779687450

I just joined a new team and have been using Copilot with Opus models.

We have our core code in a weird dialect of C and Rust. C I know well, but not Rust. Our tests are in Python. The pipeline descriptions are in YAML.

Outside of the core code there are so many arcana to learn. Writing syntactically and semantically correct YAML/Python test code would be a nightmare. The agents have flaws, but they provide a huge leg up in improving the tests.

And they are great at providing a first-pass review of the core code before bothering a human reviewer. Lastly, we run some of our test failures through AI triage, which often enough finds the root cause or rules out simple failures.

This shows up in a higher check-in rate. I'm curious to see whether this will lead to a higher-quality end product, since we have more support for the more manually written and reviewed core product code.

exallocatepool2 · 2026-05-27T18:08:33 1779905313

What do you think about using AI when dealing with more complex matters (I'm thinking of Ke and friends)? Is it at least viable for testing purposes?

simianwords · 2026-05-25T05:22:31 1779686551

YES. This line of thought is exactly why people are still skeptical of LLMs.

LLMs are directionally right, and if their answer "fits" then I take it at face value.

I wrote a blog post detailing the computational difference between "generation" and "verification" and why it matters for LLMs: https://simianwords.bearblog.dev/the-generation-vs-verificat...

As an example: I asked the LLM for a 'synonym for "provides" that also means "places" on you', and it gave me 5 answers and I immediately knew the right one was "confers". How? It just fits. Just like most things.

vips7L · 2026-05-25T05:45:40 1779687940

People are skeptical of LLMs because of the experiences they’ve had with LLMs. You can’t blog your way out of those experiences.

I’m skeptical because I’ve seen this exact situation and I’ve seen the result be something that anyone experienced wouldn’t do.

baq · 2026-05-25T07:02:54 1779692574

Sometimes I think folks having ‘experiences’ should play more poker.

The point is, it’s a game of chance and yet good players beat bad players in the long run. Your job in the new era of software engineering is to design the process so LLMs doing your code monkeying avoid the losses (including discarding bad changes) and take the wins. Win often enough and you’ll come out ahead.

simianwords · 2026-05-25T08:05:10 1779696310

Hmm, you have written it in a way that gives me a new line of thinking.

I think what you are saying is that people should learn and appreciate working in high-variance environments and still exploit small gains. This is clearly not something that is easily digestible to people, so they end up rejecting LLMs.

xarope · 2026-05-25T06:16:55 1779689815

that's a very dangerous analogy, because you would be considered the domain expert and you are just asking for synonyms for something you already know but may not remember off-hand.

now, what if you asked for the synonym for "provides" in a language that has gender differences (e.g. Spanish/Portuguese) as well as societal nuances (e.g. Japanese) and it gives you "confers", how would you now know that's correct?

ah, so you say you tell it to take into consideration gender differences, as well as societal nuances. What are those, if you were not already familiar with the language?

simianwords · 2026-05-25T08:18:57 1779697137

You are pushing the limits of my framework and it’s a good thing.

The extent to which LLMs help is determined by how well acquainted you are with the domain. But they will always push you in the right direction.

In your case, you used a language example, and this is one where LLMs have a natural strength. I don’t need to be an expert in Spanish to trust it because I know that LLMs are specifically good at catching these problems.

But again, there are limits, and it’s good to understand them.