> In the future if LLMs are more efficient and eventually commoditized, needing less infrastructure to be run (think on-device as chip efficiency increases).
You speak as if this is a given. Sure, LLMs are gonna get more efficient, but frontier ones are always going to be hosted.
Also, on-device LLMs are gonna have a 'cutoff' for training data. You can't ask a gpt-oss4 "Who won the Arsenal x Atlético Madrid game?". It has to go to the internet and do the 'search'. You certainly can't ask the local model "What were Google's Q1 2026 earnings?".
The self-hosted / on-device LLMs are going to do a lot, but not everything, and the moment 'search' is involved, people will reach for Google.
--
Lots of commercial queries need to be 'fresh' ("Build me an itinerary for Paris"). While a self-hosted LLM can do it, you might not want to trust it, because its info might be stale.
I encourage you to go beyond groupthink (HN is guilty of that) and really evaluate whether your position is actually valid.