#1 rejection reason: missing context. 80% of agent-written PRs needed human fixes. Agents can write code fine; they just don't know what "done" looks like in your codebase.
Count successful merges into repos with real history instead of LOC, and it becomes clear the hard part is specification, not execution.
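The merge-count metric suggested here can be sketched in a few lines. The PR records and field names below are hypothetical stand-ins for whatever your forge's API actually returns, not a real schema:

```python
# Hedged sketch: score agent contributions by PRs merged cleanly into an
# established repo, rather than by lines of code produced.

def merge_rate(prs):
    """Fraction of agent-authored PRs merged without human fixup."""
    agent_prs = [p for p in prs if p["author"] == "agent"]
    if not agent_prs:
        return 0.0
    clean = [p for p in agent_prs if p["merged"] and not p["human_fixes"]]
    return len(clean) / len(agent_prs)

# Hypothetical records; in practice you'd pull these from your forge's API.
prs = [
    {"author": "agent", "merged": True,  "human_fixes": False},
    {"author": "agent", "merged": True,  "human_fixes": True},
    {"author": "agent", "merged": False, "human_fixes": False},
    {"author": "human", "merged": True,  "human_fixes": False},
]
print(merge_rate(prs))  # 1 of 3 agent PRs merged cleanly
```

The point of the denominator being all agent PRs (not just merged ones) is that it penalizes the "missing context" failures described above, which LOC counts hide.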
That's like asking why we don't switch from reviewing PRs to reviewing Jira tickets.
Sure, there's probably a world where you could do that, if the spec were written in a formal language with no ambiguity and there were a rigorous system for translating from spec to code.
Hm, that's an interesting concept. What if we were able to create an unambiguous, rigorous specification language for creating prompts so that we could get consistent and predictable output from AI? Maybe we could call it a "prompt programming language" or something
At Augment Code we specifically built our code review tool to address the noise-to-signal ratio problem. In benchmarks, our comments are 2-3x more likely to get fixed than those from Bugbot, CodeRabbit, etc.
At augmentcode.com, we've been evaluating Haiku for some time, and it's actually a very good model. We found it's 90% as good as Sonnet and ~34% faster!
Where it doesn't shine as much is on very large coding tasks, but it's a phenomenal model for small coding tasks, and the speed improvement is very welcome.
90% as good as Sonnet 4 or 4.5?
OpenRouter just started reporting, and it shows Haiku at roughly 2x the throughput (125 tps vs Sonnet's 60 tps) and 2-3x lower latency (1s vs 2-3s).
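Taking those reported figures at face value (an assumption; OpenRouter's numbers fluctuate), a quick back-of-envelope shows how throughput and latency combine into end-to-end response time:

```python
# Back-of-envelope: total time ≈ time-to-first-token + tokens / throughput.
# Figures are the reported OpenRouter numbers, taken at face value.

def response_time(latency_s, tps, tokens):
    """End-to-end seconds to stream a response of `tokens` tokens."""
    return latency_s + tokens / tps

tokens = 500  # hypothetical response length
haiku = response_time(1.0, 125, tokens)   # 1s latency, 125 tps
sonnet = response_time(2.5, 60, tokens)   # midpoint of 2-3s latency, 60 tps
print(f"Haiku: {haiku:.1f}s, Sonnet: {sonnet:.1f}s, ratio: {sonnet/haiku:.1f}x")
# → Haiku: 5.0s, Sonnet: 10.8s, ratio: 2.2x
```

For short responses the latency gap dominates, which is why the speedup feels larger in interactive use than the raw tps ratio suggests.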
Fivetran acquired Census (reverse ETL) & Tobiko (a dbt alternative).
I wonder who's next to really consolidate their platform play and compete with old legacy MDM providers like Informatica. Data observability or catalog players like Monte Carlo and Atlan, maybe? The whole Modern Data Stack has either died, been acquired, or merged by now. I also wonder what's missing for Fivetran to IPO.
I also wonder what these acquisitions mean for Airbyte, who raised $150M at a $1.5B valuation in 2023.
Observability is a good guess, but I'd venture to guess that the conversations going on internally are about how to capture value across the entire stack. I wouldn't be surprised if we hear about them acquiring either a database/warehouse company and/or an analytics solution. Or vice versa, them getting acquired by a bigger player that wants to offer more connectors and data modeling functionality.
The METR study is a joke. It surveyed only 16 devs, in the era of Sonnet 3.5.
Can we stop citing this study?
I'm not saying the DORA study is more accurate, but at least it surveyed 5000 developers, globally and more recently (between June 13 and July 21, 2025), which means it covers the most recent SOTA models.
> I'm not saying the DORA study is more accurate, but at least it surveyed 5000 developers, globally and more recently
It's asking a completely different question; it is a survey of people's _perceptions of their own productivity_. That's basically useless; people are notoriously bad at self-evaluating things like that.
It didn't "survey" devs. It paid them to complete real tasks while they were randomly assigned to use AI or not, and measured the actual time taken to complete the tasks vs. just the perception. It is much higher quality evidence than a convenience sample of developers who just report their perceptions.
Wrote about this topic @ https://www.augmentcode.com/blog/the-end-of-linear-work