How many of those wtf moments come simply from not “being in the room when it happened”? Most enterprise software is riddled with wtf moments that were demanded as one compromise or another.
Request: “manual step X should not be part of the automated build script”
Fulfilled as: build script is now split in two. X is still done as a manual step in between. Rather than prompting and waiting for it to be done, the documentation and scripts no longer mention X.
Part poorly written requirements, part implementing under pressure, and part lack of engineering discipline.
The main issue is catching stuff like this early enough to course-correct. Differences in time zone, language and cultural norms can make that a challenge, and those are all areas where LLMs have the advantage.
There's always the "wtf, why did we add this feature" kind, but at least in my experience, once a week or so I run into something in this category. Me: "AI, please clean up/refactor/improve this thing" AI: "Roger that! I deleted the file so now it's perfectly clean" ... insert W.T.F.
Same, but I have seen it try to change my tests because it decided that its code was correct and my tests were incorrect.
I saw its thinking tokens say something along the lines of “I have implemented it correctly but the test is failing. I’ll update the tests so the pipeline passes.”