Long-term trust in a project always was, and continues to be, a concern when choosing a critical dependency.
The concern basically boils down to how large and serious the team is, and what happens if they abandon the project in a few weeks or months.
These were always the risks; many here have been burned by betting years of their career building on projects that looked promising but turned out to be weak.
OP is alluding to the fact that today commit frequency, lines of code, or how active the contributors are in the issue tracker are no longer good signals to use as a proxy.
When the project underlying yours is a few million lines of code written only by machines, it is not going to be feasible to fork and maintain it, or bring it in-house, if the maintainers abandon it.
To be clear, users of a library or a tool aren’t owed anything when it is available gratis and fully open source.
However not everyone has access to unlimited tokens, so not everyone can completely disregard the quality (in terms of history and usage) or size of the underlying project.
I think the primary value of a project like this is the demonstration that this is possible and a proof that it does not incur some unknown tradeoff you'll discover after spending resources doing it.
IMO the maintenance story is more or less solved if you can keep AI agents refactoring and improving it in a loop.
> However not everyone has access to unlimited tokens
Apologies. I did not consider this when writing my comment, being spoilt by unlimited 'free' AI.
Free in quotes because, presumably, training agents on AI usage from developers is worth more than the cost of providing free AI.
> IMO the maintenance story is more or less solved if you can keep AI agents refactoring and improving it in a loop.
That’s a weak argument, though, if the future of AI is totally unreliable when it comes to cost and quality. Right now I definitely wouldn’t want to depend on being able to infinitely access AI tools for such an important part of the toolchain.
Aside from that it’s just not attractive to trust a project made by one person.
> rather than the optimizations involved in rendering the text.
Any views they have on this topic are going to come across as quite opinionated, given their choices for text rendering in this post and the general aesthetics of the website.
Naw, the truth is I'm not really smart or intelligent enough to build a semantic diff system. For that you'll need to wait on a post from one of our smarter devs, this was a post about rendering diffs in a browser.
Using the keyword “Workflow” like “Ultrathink” is problematic?
Ultrathink is uncommon enough that it is unlikely to be used in code or a prompt outside its intended purpose.
Workflow is a generic keyword, used in so many contexts both inside the codebase and in orchestration tooling such as temporal.io or others that name their constructs “workflows”.
Everyone has critical risk on multiple parts of the supply chain. GPUs and memory are just the ones OAI mitigated.
Power: perhaps a bigger bottleneck than GPUs or RAM. New grid-connected capacity is typically on a 10+ year timescale with a lot of regulatory friction. Captive capacity is also quite constrained; gas turbines now have a 7+ year wait time.
There are plenty of hard constraints that OAI cannot easily solve either.
Comparing $/MTok for models makes as much sense as comparing $/GHz for CPUs. Models have different tokenizers and take a varying amount of "thinking" to get to a solution. A far better proxy is how much it costs to do a run, which takes all of that into account. Such metrics are much harder to gather, but one source claims $3357 for gpt-5.5 vs $4686 for opus, the opposite of your conclusion.
There is no conclusion; I only stated the one objective fact to compare with, one that will not change from you to me.
Everything else depends on your setup, use case, configuration tuning and so forth.
More importantly, bean-counters and decision makers at even 150+ seat orgs are looking at pricing sheets and enterprise contracts, not at how a model performs for some team in a specific harness today, when signing million-dollar annual contracts. It is not common for procurement teams to commission the level of detailed analysis or large-scale pilots that would actually hold for the duration of the contract.
That doesn't mean that GPT-5.5 is selling less than Claude at all, just that cost is not the primary driver if the list price is not cheaper. There is a reason these are published in the same format by every vendor: the common metric is what finance likes to compare with.
Most variants of GPT-5.5 are less chatty and token-intensive than Opus 4.8/4.7, so despite the output token price being higher, it generates fewer tokens, so the net cost is lower.
Per-token pricing is totally sensible from the provider's perspective of mapping COGS to revenue, but for a consumer, different models will produce more or fewer tokens, meaning the cost calculation is multi-dimensional.
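A minimal sketch of that calculation, in Python. The prices and token counts are made-up placeholders, not figures for any real model, and `run_cost` is a hypothetical helper:

```python
def run_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Dollar cost of one run, given token counts and $/MTok list prices."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000

# Hypothetical numbers: model A has the higher output price but is terser.
cost_a = run_cost(1_000_000, 200_000, 5.0, 30.0)
cost_b = run_cost(1_000_000, 400_000, 5.0, 25.0)
print(cost_a, cost_b)  # 11.0 15.0
```

With these assumed numbers, the model with the pricier output tokens still comes out cheaper per run, because the token count is a second axis the price sheet does not show.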
You can configure the model to be terse/concise with an output style? There are plenty of popular projects like https://github.com/JuliusBrussee/caveman which even do it for you.
Input/cache/output ratios are use case and configuration dependent. Any benefit in one model can usually be roughly replicated in another with configuration tuning, and discussions devolve into subjective experience.
The pricing sheet is the objective way to compare cost.
>Airbus reported a commercial aircraft backlog of 9,031
> 10.4 years of production coverage
Kinda true. Airlines and manufacturers like to do big order announcements/deals covering several years of future needs all upfront. If Airbus suddenly delivered all 9k aircraft, most airlines simply could not afford them, or even take possession of and use them.
For example, Indigo is an Airbus-only operator with a fleet of 450 today and has around 920 more Airbus aircraft (10% of the book) on order. Neither Indigo nor the Indian aviation sector (of which Indigo is 60%) can triple capacity today. India needs serious infrastructure upgrades (terminals, runways, gates, new airports) to come online, and demand to mature, i.e. more people able to afford to fly, for that kind of volume to make sense, which even in the best scenario will happen over the next decade.
For more mature/slow-growing airlines it is a function of existing fleet age and the optimal point at which each aircraft is retired/sold; doing it too early will make them unprofitable.
It is less a backlog and more their next 10 years of committed sales.
P.S. There is a whole other industry aspect around buy-sell-and-leaseback financial engineering that can drive order volumes a bit. The backlog/order book also has commodity-futures aspects.
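For anyone who wants the arithmetic behind the numbers above, a rough sketch in Python. It uses only the figures quoted in this thread and assumes every aircraft on order is an addition to the fleet rather than a replacement, which overstates growth:

```python
backlog = 9_031            # aircraft, from the quoted Airbus figure
coverage_years = 10.4      # from the quoted figure
implied_rate = backlog / coverage_years  # deliveries per year implied by the two figures

indigo_fleet = 450
indigo_on_order = 920
share_of_book = indigo_on_order / backlog
# Assumption: no retirements, so all orders add to today's fleet.
fleet_multiple = (indigo_fleet + indigo_on_order) / indigo_fleet

print(f"{implied_rate:.0f} aircraft/year")           # 868 aircraft/year
print(f"{share_of_book:.1%} of the order book")      # 10.2% of the order book
print(f"{fleet_multiple:.2f}x today's fleet")        # 3.04x today's fleet
```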
One factor to consider: the base will not remain the same over the next 5 years.
Every generation of developer tooling that increases absolute code throughput creates a new class of developers (and users).
This has always been the case, from the first compilers through the eras of frameworks to today, and the skill level needed to be a developer has dropped each time. In the mid/late 80s only Master's or Doctorate-level Comp Sci professionals could write applications. The bar dropped to undergrads and plain Information Technology engineers, with comp sci theory becoming mostly optional, then further to anyone college-educated with some training. It had been trending lower still with no/low-code tools like Retool before 2022, and that was before agent codegen services such as v0, Replit and so on.
The next generation of developers will not produce applications and architecture as previous generations did, just as most of us here don't produce the level of quality that pg did when building this platform[1], but as long as the user can find value it doesn't matter, as countless enterprise applications of middling quality already prove today.
All this to say: the thesis for these businesses is that the 200M/30M numbers will not remain the same. Will they change by enough, at a fast enough pace, to justify the capex? I don't think so either. However, the web 1.0 and then 2.0, SaaS and mobile revolutions were pretty quick to bring in new classes of users and developers, so it is not completely unrealistic.
[1] While HN is a heavy outlier with its custom Lisp language implementation, there are any number of examples from previous eras that are more moderate in their choices but written with solid architecture, at skill levels that would be hard to find among today's generation of founders.
> Play money means it's much easier for anyone anywhere in the world to get started and try out forecasting without any risk. It also means there's more freedom to create and bet on any type of question.
Their investors/lawyers probably did not want to back the risk that using real currency adds.
i18n language models are not an area frontier labs are focusing a ton of resources on? (Certainly not Norwegian.)
The corpus of content in Norwegian may not require very large clusters, and even if it does, this is the best the library could do; it would certainly be more than anyone else is investing in Norwegian models.
SOTA models do not have access to the quality of content that the national library does? The article mentions licensing with newspapers specifically, and the library has access to its own content archive.
English and Norwegian are not closely related language families, perhaps LoRA is not the best approach?
I am curious if there is published research on how well localization works with LoRA depending on how far off the target language grammar/vocabulary is from English.
Projects like this typically have more than one objective: not only building a SOTA project, but also building/training foundational local talent, similar to universities launching satellites.
> English and Norwegian are not closely related language families, perhaps LoRA is not the best approach?
Yes, they are. English is a West Germanic language. Norwegian is a North Germanic language. The French vocabulary in English obscures it a bit, but the two languages have similar grammar and the vocabulary has a huge number of close cognates.
E.g. day -> dag, ship -> skip, apple -> eple, cow -> ku (which makes more sense when you pronounce them correctly out loud), bairn (child; mostly Scotland and Northern England) -> barn, hop -> hopp, yule -> jul just to give a random selection of English Germanic words.
But more than that, the frontier models a) know Norwegian quite well and b) certainly know German and Dutch well, and there's a continuum of language transfer around the North Sea, especially when accounting for sounds rather than modern orthography (e.g. to take a couple of examples from above: ship -> schip -> Schiff -> skib -> skip; day -> dag -> Tag -> dag). The "jump" to Dutch already weeds out most of the French. A lot of modern Norwegian orthography comes from Danish, which again shares more than modern Norwegian does with German.
Knowing any of these helps a lot with learning Norwegian and vice versa. E.g. I'm Norwegian, I've never learnt Dutch, but I have learnt English and German, and I can read Dutch fairly well from that alone.
This makes me deeply curious about how LLMs understand language. Do LLMs relate cognates more than words that are dissimilar in different languages? I wonder if that plays some role in the effectiveness of tokenization.
I have no idea if the similar spelling will somehow help. I used that mostly because it's a simple way of illustrating the close relationship, but I suspect you'd find that the meanings of closely related words are likely to more directly overlap.
The grammar is perhaps more likely to help: similar word order etc. Even weirdness like German: my only top grade on a German essay in school was one where I on purpose ignored what I thought I knew about German and tried to evoke "old fashioned" Norwegian. The result was guessing at a bunch of grammatical structures that I didn't know were valid German. Turned out I was right about most of it: century-old Norwegian was far closer to century-old Danish, which was a lot closer to valid German, and enough so to impress my teacher into overlooking a number of orthographic mistakes.
The same thing works for guessing German grammar from English. The farther back you go in English, the more its grammar resembles German.
"What sayest thou?" -> "Was sagst du?"
In fact, for the above, you don't even have to know a single German word. You just have to know that for question words "wh" -> "w", that the English "y" at the end of a syllable usually comes from an older Germanic "g" sound, and that "th" was replaced by "d" in German. That gets you 90% of the way from early modern English to modern German in the above example.
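As a toy illustration only, here are those three rules applied literally as string substitutions in Python. `toy_shift` is a hypothetical helper, the "y at the end of a syllable" rule is approximated as "y right after a vowel", and none of this is a real model of sound change:

```python
import re

def toy_shift(text):
    """Apply the three spelling rules from the comment, literally and naively."""
    text = re.sub(r"\bwh", "w", text)           # question words: "wh" -> "w"
    text = re.sub(r"th", "d", text)             # "th" -> "d"
    text = re.sub(r"(?<=[aeiou])y", "g", text)  # approximation: "y" after a vowel -> "g"
    return text

print(toy_shift("what sayest thou"))  # wat sagest dou
```

Compare the result with "Was sagst du?": not German, but most of the consonants already line up.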
That's interesting. I haven't thought about it in that direction before. I'm "of course" aware of the High German consonant shift, which also muddled things a lot (the continuum around the North Sea is a lot "cleaner" if you look at Plattdeutsch instead), but never thought much about what other simple transformations to apply with standard modern German.
All memory products use many shared resources in the supply chain, so if there is high demand in one product line, others have to raise prices to compete for the resources or stop making those lines altogether.
That is to say, at least you were able to buy them at $350 today; on the current trajectory there will be no supply at all in a few months.