The one good usecase I've found for AI chatbots, is writing ffmpeg commands. You...

corobo · 2025-12-27T11:16:58 1766834218

LLMs are an amazing advance in natural language parsing.

The problem is someone decided that and the contents of Wikipedia was all something needs to be intelligent haha

madeofpalk · 2025-12-27T12:19:38 1766837978

The confusion was thinking that language is the same thing as intelligence.

Marazan · 2025-12-27T12:31:15 1766838675

This seems like a glib one liner but I do think it is profoundly insightful as to how some people approach thinking about LLMs.

It is almost like there is hardwiring in our brains that makes us instinctively correlate language generation with intelligence and people cannot separate the two.

It would be like if for the first calculators ever produced instead of responding with 8 to the input 4 + 4 = printed out "Great question! The answer to your question is 7.98" and that resulted in a slew of people proclaiming the arrival of AGI (or, more seriously, the ELIZA Effect is a thing).

Kiro · 2025-12-27T14:43:55 1766846635

You and me are great examples of that. We are both extremely stupid and yet we can speak.

vovavili · 2025-12-27T15:24:49 1766849089

Can attest.

andrepd · 2025-12-27T15:30:25 1766849425

And reddit, that bastion of human achievement.

Terr_ · 2025-12-27T09:29:09 1766827749

As pessimistic about it as I am, I do think LLMs have a place in helping people turn their text description into formal directives. (Search terms, command-line, SQL, etc.)

... Provided that the user sees what's being made for them and can confirm it and (hopefully) learn the target "language."

Tutor, not a do-for-you assistant.

left-struck · 2025-12-27T10:20:28 1766830828

I agree apart from the learning part. The thing is unless you have some very specific needs where you need to use ffmpeg a lot, there’s just no need to learn this stuff. If I have to touch it once a year I have much better things to spend my time learning than ffmpeg command

rolfus · 2025-12-27T12:41:04 1766839264

Agreed. I have a bunch of little command-line apps that I use 0.3 to 3 times a year* and I'm never going to memorize the commands or syntax for those. I'll be happy to remember the names of these tools, so I can actually find them on my own computer.

* - Just a few days ago I used ImageMagick for the first time in at least three years. I downloaded it just to find that I already had it installed.

serial_dev · 2025-12-27T11:22:23 1766834543

There is no universe where I would like to spend brain power on learning ffmpeg commands by heart.

skydhash · 2025-12-27T13:36:35 1766842595

No one learns those. What people do is just learning the UX of the cli and the terminology (codec, opus, bitrate, sampling,…)

lukeschlather · 2025-12-27T17:40:58 1766857258

The thing about ffmpeg is there's no substitute for learning. It's pretty common that something simple like "ff convert" simply doesn't work and you have to learn about resolution, color space, profiles, or container formats. An LLM can help but earlier this year I spent a lot of time looking at these sorts of edge cases, and I can easily make any LLM wildly hallucinate by asking questions about how to use ffmpeg to handle particular files.

famahar · 2025-12-27T13:14:13 1766841253

Do most devs even look at the source code for packages they install? Or the compiled machine code? I think of this as just a higher level of abstraction. Confirm it works and not worry about the details of how it works

d-us-vb · 2025-12-27T13:45:19 1766843119

For the kinds of things you’d need to reach for an LLM, there’s no way to trust that it actually generated what you actually asked for. You could ask it to write a bunch of tests, but you still need to read the tests.

It isn’t fair to say “since I don’t read the source of the libraries I install that are written by humans, I don’t need to read the output of an llm; it’s a higher level of abstraction” for two reasons:

1. Most Libraries worth using have already been proven by being used in actual projects. If you can see that a project has lots of bug fixes, you know it’s better than raw code. Most bugs don’t show up unless code gets put through its paces.

2. Actual humans have actual problems that they’re willing to solve to a high degree of fidelity. This is essentially saying that humans have both a massive context window and an even more massive ability to prioritize important things that are implicit. LLMs can’t prioritize like humans because they don’t have experiences.

skydhash · 2025-12-27T13:32:57 1766842377

I don’t because I trust the process to get the artifacts. Why? Because it’s easy to replicate and verify. Just like how proof works in math.

You can’t verify LLM’s output. And thus, any form of trust is faith, not rational logic.

josephg · 2025-12-28T01:16:50 1766884610

> You can’t verify LLM’s output. And thus, any form of trust is faith, not rational logic.

Well, you can verify an LLM's output all sorts of ways.

But even if you couldn't, its still very rational to be judicious with how you use your time and attention. If I spent a few hours going through the ffmpeg documentation I could probably learn it better than chatgpt. But, its a judgement call whether its better to spend 5 minutes getting chatgpt to generate an ffmpeg command (with some error rate) or spend 2 hours doing it myself (with maybe a lower error rate).

Which is a better use of my time depends on lots of factors. How much I care. How important it is. How often that knowledge will be useful in the future. And so on. If I worked in a hollywood production studio, I'd probably spend the 2 hours (and many more). But if I just reach for ffmpeg once a year, the small% error rate from chatgpt's invocations might be fine.

Your time and attention are incredibly limited resources. Its very rational to spend them sparingly.

ben_w · 2025-12-27T14:29:30 1766845770

I don't install 3rd party dependencies if I can avoid them. Why? Because although someone could have verified them, there's no guarantee that anybody actually did, and this difference has been exploited by attackers often enough to get its own name, a "supply-chain attack".

With an LLM’s output, it is short enough that I can* put in the effort to make sure it's not obliviously malicious. Then I save the output as an artefact.

* and I do put in this effort, unless I'm deliberately experimenting with vibe coding to see what the SOTA is.

skydhash · 2025-12-27T15:39:03 1766849943

> Because although someone could have verified them, there's no guarantee that anybody actually did

In the case of npm and the like, I don't trust them because they are actually using insecure procedures, which is proven to be so. And the vectors of attacks are well known. But I do trust Debian and the binaries they provide as the risks are for the Debian infrastructure to be compromised, malicious code in in the original source, and cryptographic failures. All threes are possibles, but there's more risk of bodily harm to myself that them happening.

xattt · 2025-12-27T11:15:46 1766834146

It you stretch it little further, those formal directives also include language and vocabulary of a particular domain (legalese, etc…).

eviks · 2025-12-27T09:52:34 1766829154

The "provided" isn't provided, of course, especially the learning part, that's not what you'd turn to AI for vs more reliable tutoring alternatives

Tempest1981 · 2025-12-27T09:00:02 1766826002

One that older AI struggled with was the "bounce" effect: play from 0:00 to 0:03, then backwards from 0:03 to 0:00, then repeat 5 times.

geysersam · 2025-12-27T12:29:40 1766838580

Just tried it and got this, is it correct?

> Write an ffmpeg command that implements the "bounce" effect: play from 0:00 to 0:03, then backwards from 0:03 to 0:00, then repeat 5 times.

  ffmpeg -i input.mp4 \
  -filter_complex "
  [0:v]trim=0:3,setpts=PTS-STARTPTS[f];
  [f]reverse[r];
  [f][r]concat=n=2:v=1:a=0[b];
  [b]loop=loop=4:size=150:start=0
  " \
  output.mp4

Tempest1981 · 2025-12-27T19:40:24 1766864424

Thanks, but no luck. I tested it on a 3 second video, and got a 6 second video. I.e. it bounced 1 time, not 5 times.

Maybe this should be an AI reasoning test.

Here is what eventually worked, iirc (10 bounces):

  ffmpeg -i input.mkv -filter_complex "split=2[fwd][rev_in]; [rev_in]reverse[rev]; [fwd][rev]concat=n=2,split=10[p1][p2][p3][p4][p5][p6][p7][p8][p9][p10]; [p1][p2][p3][p4][p5][p6][p7][p8][p9][p10]concat=n=10[outv]" -map "[outv]" -an output.mkv

beepbooptheory · 2025-12-27T10:02:46 1766829766

But doesnt something like this interface kind of show the inefficiency of this? Like we can all agree ffmpeg is somewhat esoteric and LLMs are probably really great at it, but at the end of the day if you can get 90% of what you need with just some good porcelain, why waste the energy spinning up the GPU?

pixelpoet · 2025-12-27T10:39:14 1766831954

Requiring the installation of a massive kraken like node.js and npm to run a commandline executable hardly screams efficiency...

RadiozRadioz · 2025-12-27T13:20:58 1766841658

That's a deficiency with this particular implementation, not an inherent disadvantage to the method

chpatrick · 2025-12-27T11:19:07 1766834347

Because FFmpeg is a swiss army knife with a million blades and I don't think any easy interface is really going to do the job well. It's a great LLM use case.

beepbooptheory · 2025-12-27T15:33:11 1766849591

I know everybody uses a subscription for these things, but doesn't it at least feel expensive to use an LLM like this? Like turning on the oven to heat up a single slice of pizza.

lukeschlather · 2025-12-27T17:43:00 1766857380

No, LLMs are extremely useful for dealing with ffmpeg. Also I don't think they're sufficient, they get confused too easily and ffmpeg is extremely confusing.

ThrowawayTestr · 2025-12-27T22:13:48 1766873628

ChatGPTs free tier is just fine for me.

beepbooptheory · 2025-12-28T01:58:23 1766887103

skydhash · 2025-12-27T13:40:18 1766842818

But you only need to find the correct tool once and mark it in some way. Aka write a wrapper script, jot down some notes. You are acting like you’re forced to use the cli each time.

NewsaHackO · 2025-12-27T19:27:57 1766863677

One can do that with LLM as well. Honestly, I almost always just save the command if I think I am going to use it later. Also, I can just look back at the chat history.

geysersam · 2025-12-27T12:44:33 1766839473

Because getting 90% might not be good enough, and the effort you need to expend to reach 97% costs much more than the energy the GPU uses.

imiric · 2025-12-27T10:12:58 1766830378

Because the porcelain is purpose built for a specific use case. If you need something outside of what its author intended, you'll need to get your hands dirty.

And, realistically, compute and power is cheap for getting help with one-off CLI commands.