I would go even further. People are not even that advanced stochastic parrots. In the similar task, we failed just as spectacularly as the GPT-4 did right now. Rust, Zig, Julia, Kotlin, and dozens smaller "new" languages fail to bear a smear of originality in them. Their authors remixed old ideas in different proportions and got different grammars as a result. That's something an LLM can do and much faster too.
But there is also Unison. That one new language that stands out, and the very fact that it does makes it improbable for an LLM to generate. LM is a language model, it avoids marginality and originality altogether by its design.
Language models can and are useful if you specifically want to avoid marginality. To reduce noise, to remove errors. There is huge potential in them. If the problem was to design the most average programming language with no purpose, no market niche, and no technological context - then GPT-4 is clearly a winner.
> Rust, Zig, Julia, Kotlin, and dozens smaller "new" languages fail to bear a smear of originality in them.
To be fair, most programming languages don't try to be original. They try to solve problems. They also try to keep their syntax close enough to established programming trends that people have an easier time learning them.
> I would go even further. People are not even that advanced stochastic parrots. In the similar task, we failed just as spectacularly as the GPT-4 did right now. Rust, Zig, Julia, Kotlin, and dozens smaller "new" languages fail to bear a smear of originality in them. Their authors remixed old ideas in different proportions and got different grammars as a result. That's something an LLM can do and much faster too.
We are in good part stochastic parrots for sure, but there's a small nuance that is completely lost in your reasoning and which is what makes the difference today between humans and GPT.
In those languages, reusing commonly known syntax is an explicit feature. Whenever the language had no reason to introduce something new, it used known idioms to reduce the number of things one has to learn to use the language.
Occasionally though, a language will have something unusual, but it won't be for random reasons, but instead to mark something unique and interesting that was added to the language to tackle a facet of programming in a new way.
> Language models can and are useful if you specifically want to avoid marginality. To reduce noise, to remove errors.
But this makes me think of Shannon Information Theory. What you're describing is something that has no actual information (in the Shannon sense) at all.
And maybe that's why so much GPT output reads so blandly. Even the ones that are not glaringly wrong still read like... like food with no seasoning.
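To put rough numbers on the Shannon point, here is a minimal sketch. The `next_token` distribution is made up purely for illustration and is not taken from any real model. It computes the self-information of each candidate: the likeliest choice carries the fewest bits, though strictly speaking not zero unless its probability is 1.

```python
import math

def surprisal_bits(p):
    """Shannon self-information, in bits, of an outcome with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)

def entropy_bits(dist):
    """Expected surprisal (Shannon entropy), in bits, of a distribution."""
    return sum(p * surprisal_bits(p) for p in dist.values() if p > 0)

# Hypothetical next-token distribution, invented for this example.
next_token = {"the": 0.5, "a": 0.25, "this": 0.125, "quux": 0.125}

for token, p in next_token.items():
    print(f"{token!r}: p={p}, surprisal={surprisal_bits(p)} bits")

print("entropy:", entropy_bits(next_token), "bits")
```

Always emitting the most probable token means always taking the lowest-surprisal branch, which is one way to read "avoids marginality".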
Really? As far as I know, Rust’s borrow checker was a newly developed approach to memory safety. There have been other non-GC/RC approaches to memory safety in e.g. Cyclone or Ada but Rust’s system is novel.
Likewise, Julia’s use of just in time compilation of dynamic dispatched code to achieve zero overhead execution was rather groundbreaking.
Small LMs do tend to avoid originality, but large LMs can use the context as a starting point to walk outside their training data distribution. They can do that by being better at composing concepts and using context-conditioning, which is also known as in-context learning.
> I would go even further. People are not even that advanced stochastic parrots. In the similar task, we failed just as spectacularly as the GPT-4 did right now. Rust, Zig, Julia, Kotlin, and dozens smaller "new" languages fail to bear a smear of originality in them. Their authors remixed old ideas in different proportions and got different grammars as a result.
It wasn’t an originality contest, so there’s at least one non sequitur in that statement. Anyone can create an original syntax, given modern character sets and formatting abilities. The key issue is that few developers want to learn such syntaxes and switch between them.
> But there is also Unison. That one new language that stands out, and the very fact that it does makes it improbable for an LLM to generate. LM is a language model, it avoids marginality and originality altogether by its design.
> Language models can and are useful if you specifically want to avoid marginality. To reduce noise, to remove errors. There is huge potential in them. If the problem was to design the most average programming language with no purpose, no market niche, and no technological context - then GPT-4 is clearly a winner.