It's also a good play to try to take resources away from local, self-hosted "Feasible AI" solutions. With compute resources, I think Microsoft hopes Mistral skews its focus and resources towards large models that can only run in the cloud, trying to lure them away with the bait:
"Don't you want to build the best AI possible, independent of compute?"
I'd be surprised if they didn't consider the notion that they are hitting two birds with one stone: OpenAI and Indie AI.
It's not like Microsoft is working on "Windows AI Studio" [1], or released Orca, or Phi.
It's not like there's any talk of AI PCs with mandatory TOPS requirements for Windows 12.
Big bad Microsoft coming for your local AI, beware.
> Mistral Remove "Committing to open models" from their website
That was 5 hours ago.
Without having insider details it is hard to know why, but the coincidence of timing with the Microsoft deal is not lost on me. It could have even been a stipulation.
I have no explanation for why Microsoft has started aggressively innovating again (since Satya took over) other than my theory that the US DoD realized the country's tool of dominance in the future will be predominantly tech superiority instead of military power. Microsoft's new strategy of running everything in the cloud aligns with this, even if it may also have been motivated by the fact that most people now only own a battery-constrained mobile device and laptops are getting smaller and thinner.
From my understanding, which may be wrong, you only need the massive compute resources initially to create a compiled vector space LLM - and then that LLM once compiled can be run locally?
This is why anti-CSAM policy measures are possible: compiled-release LLMs can have certain vector spaces removed before release. But apparently people are creating cracks for these types of locks?
You are a little confused. There’s no “compiling” of LLMs. It’s just once it’s trained, inference takes less compute than further training. So you can run things locally that you couldn’t necessarily train locally.
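A rough back-of-the-envelope sketch of that gap, assuming the common rule of thumb of about 2 FLOPs per parameter per token for a forward pass and about 6 for a training step (forward plus backward). The 7B model size and the 1 trillion training tokens are made-up illustration values, not figures for any real model:

```python
# Rule-of-thumb estimates only; real numbers depend on architecture and hardware.

def inference_flops(n_params: int, n_tokens: int) -> int:
    """Approximate FLOPs to generate n_tokens (forward pass only, ~2 per param per token)."""
    return 2 * n_params * n_tokens

def training_flops(n_params: int, n_tokens: int) -> int:
    """Approximate FLOPs to train on n_tokens (forward + backward, ~6 per param per token)."""
    return 6 * n_params * n_tokens

# Hypothetical 7B-parameter model (assumption for illustration).
N_PARAMS = 7_000_000_000
TRAIN_TOKENS = 1_000_000_000_000   # assumed training set size
REPLY_TOKENS = 1_000               # one longish answer

train = training_flops(N_PARAMS, TRAIN_TOKENS)
reply = inference_flops(N_PARAMS, REPLY_TOKENS)

print(f"training: {train:.1e} FLOPs")
print(f"one reply: {reply:.1e} FLOPs")
print(f"ratio: {train // reply:,}x")
```

Under those assumptions, the training run costs billions of times more compute than a single reply, which is why a consumer machine can run a model it could never have trained.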
Not sure where you are getting the CSAM bit. We aren’t that good at blanking out weights in any kind of model, certainly not good enough to lobotomize specific types of content.
The CSAM bit then seems to be propaganda from at least one AI company putting out PR to falsely quell people's concerns about their LLMs being able to generate sexualized content involving children.
I've yet to see details of the minimum compute and server requirements necessary to run LLMs. Maybe you know a source who's compiling a list in a feature matrix that includes such details?
Large LLMs like GPT-3 and GPT-4 need very serious hardware. They have hundreds of billions of parameters (or more), which need to be loaded in memory all at once.
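To put a number on "very serious hardware", here is a minimal sketch that estimates the memory needed just to hold the weights. It assumes memory is simply parameter count times bytes per parameter, and it ignores activations, the KV cache, and runtime overhead, so real requirements are higher. The 175B figure is GPT-3's published size; the 7B entry is a generic small-model example, and the precision table is an illustration:

```python
# Weights-only estimate: params * bytes per param. Real usage is higher.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,   # 4-bit quantization, two params per byte
}

def weight_memory_gb(n_params: int, precision: str) -> float:
    """Approximate GB (10^9 bytes) needed to hold the weights at a given precision."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9

models = {
    "GPT-3 (175B)": 175_000_000_000,
    "generic 7B model": 7_000_000_000,
}

for name, n_params in models.items():
    for precision in ("fp16", "int4"):
        gb = weight_memory_gb(n_params, precision)
        print(f"{name:>18} @ {precision}: ~{gb:.1f} GB")
```

By this estimate, a 175B model at fp16 needs roughly 350 GB for weights alone, far beyond a single consumer GPU, while a 7B model quantized to 4 bits fits in a few GB.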
I don't see why Mistral would acquiesce. Like the other comment says, Microsoft has a lot of chips on the table for local AI. They didn't even mention DirectML, ONNX or Microsoft's other local AI frameworks - suffice to say Microsoft does care about on-device AI.
So... would Mistral deliberately sabotage their low-end models to appease Microsoft's cloud demand? I don't think so. Microsoft probably knows that letting Mistral fall behind would devalue their investment. It makes more sense to bolster the small models to increase demand for the larger ones, at least from where I'm standing.
If you're asking about Microsoft's APIs - I'd keep an eye on ONNX. It's the most ambitious, and it also supports an insane number of acceleration targets. It would be the proverbial "big guns" if vendors continued investing in more insular frameworks like Metal and CUDA.