
This is just a smart move by Satya Nadella after the non-standard drama that occurred with OpenAI a few months back where it nearly imploded and then didn't.

You want both a backup for OpenAI as well as negotiating leverage if OpenAI gets too powerful and this achieves both.



It's also a good play to try to take resources away from local, self-hosted "Feasible AI" solutions. With compute resources, I think Microsoft hopes Mistral skews their focus and resources towards large models that can only run in the cloud, trying to lure them away with the bait: "Don't you want to build the best AI possible, independent of compute?"

I'd be surprised if they didn't consider the notion that they are hitting two birds with one stone: OpenAI and Indie AI.


It's not like Microsoft is working on "Windows AI Studio" [1], or released Orca, or Phi. It's not like there's any talk of AI PCs with mandatory TOPs requirements for Windows 12. Big bad Microsoft coming for your local AI, beware.

[1] https://github.com/microsoft/windows-ai-studio


The whole embrace, extend.. ?

> Mistral Remove "Committing to open models" from their website

That was 5 hours ago.

Without having insider details it is hard to know why, but the coincidence of timing with the Microsoft deal is not lost on me. It could have even been a stipulation.


I have no explanation for why Microsoft has started aggressively innovating again (with the introduction of Satya) other than my theory that the US DoD realized the country's tool of dominance in the future will be predominantly tech superiority rather than military power. Microsoft's new strategy of running everything in the cloud aligns with this, even if it may also have been motivated by the fact that most people now only own a battery-constrained mobile device and laptops are getting smaller and thinner.


You’re downvoted for the snarky tone I guess, but you’re absolutely right


Even easier to rug pull your own team's project than someone else's.


Moving straight from Embrace to Extinguish, why not!


Antitrust. Using their dominance in one market to destroy another. I hope the EU tears them a new one if the US doesn't.


From my understanding, which may be wrong, you only need the massive compute resources initially to create a compiled vector space LLM - and then that LLM once compiled can be run locally?

This is why anti-CSAM policy measures are possible: compiled-release LLMs can have certain vector spaces removed before release. But apparently people are creating cracks for these types of locks?


You are a little confused. There’s no “compiling” of LLMs. It’s just once it’s trained, inference takes less compute than further training. So you can run things locally that you couldn’t necessarily train locally.
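To put rough numbers on the training-versus-inference gap, here is a minimal sketch. It assumes the common back-of-the-envelope approximations of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token; the 7B model size and the 1T-token dataset are hypothetical values chosen for illustration, not figures from any vendor.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training cost, assuming ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens


def inference_flops(n_params: float, n_tokens: float) -> float:
    """Rough forward-pass cost, assuming ~2 FLOPs per parameter per token."""
    return 2.0 * n_params * n_tokens


# Hypothetical model and dataset sizes (illustrative assumptions only).
N_PARAMS = 7e9        # 7B parameters
TRAIN_TOKENS = 1e12   # 1T training tokens

train = training_flops(N_PARAMS, TRAIN_TOKENS)
reply = inference_flops(N_PARAMS, 1_000)  # a single 1,000-token reply

print(f"training run:      {train:.2e} FLOPs")
print(f"one 1k-token reply: {reply:.2e} FLOPs")
print(f"ratio:             {train / reply:.1e}")
```

Under these assumptions a full training run costs billions of times more compute than answering one prompt, which is why a machine that could never train a model can often still run it.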

Not sure where you are getting the CSAM bit. We aren’t that good at blanking out weights in any kind of model, certainly not good enough to lobotomize specific types of content.


Thanks for the clarification.

The CSAM bit seems to then be propaganda from at least one AI company putting out PR to falsely quell people's concerns about their LLMs being able to generate content involving children that's sexualized.

I've yet to see details of the minimum compute and server requirements needed to run LLMs. Maybe you know a source who's compiling a list in a feature matrix that includes such details?


Large LLMs like GPT-3 and GPT-4 need very serious hardware. They have hundreds of billions of parameters (or more) which need to be loaded into memory all at once.
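For a feel of what "loaded in memory all at once" means, here is a small sketch that estimates the memory needed for the weights alone. The bytes-per-parameter table and the helper name are assumptions for illustration; the estimate ignores activations, the KV cache, and runtime overhead, so real requirements are higher. The 175B figure is GPT-3's published parameter count; the 7B entry is a hypothetical small model.

```python
# Approximate storage per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str = "fp16") -> float:
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


models = [("7B (hypothetical small model)", 7e9), ("175B (GPT-3 size)", 175e9)]

for name, n_params in models:
    for dtype in BYTES_PER_PARAM:
        gb = weight_memory_gb(n_params, dtype)
        print(f"{name:32s} {dtype:5s} {gb:8.1f} GB")
```

At 16-bit precision a 175B-parameter model needs roughly 350 GB for weights, far beyond a single consumer GPU, while a 7B model quantized to 4 bits fits in a few GB.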


I love that you are using the word lobotomize.


I don't see why Mistral would acquiesce. Like the other comment says, Microsoft has a lot of chips on the table for local AI. They didn't even mention DirectML, ONNX or Microsoft's other local AI frameworks - suffice to say Microsoft does care about on-device AI.

So... would Mistral deliberately sabotage their low-end models to appease Microsoft's cloud demand? I don't think so. Microsoft probably knows that letting Mistral fall behind would devalue their investment. It makes more sense to bolster the small models to increase demand for the larger ones, at least from where I'm standing.


Money. How do they get revenue in your model?


what’s the most promising?


If you're asking about Microsoft's APIs - I'd keep an eye on ONNX. It's the most ambitious, but also supports an insane amount of acceleration targets. It would be the proverbial "big guns" if vendors continued investing in more insular frameworks like Metal and CUDA.


So.. Embrace, Extend, Extinguish?


> This is just a smart move by Satya Nadella

Diversifying their AI bets definitely makes total sense. If this wasn't their strategy originally, it almost certainly became so the moment the OpenAI board fired Sam Altman.

It's easy to make simplistic judgements from the outside, but with the limited information we have, it does seem like Satya Nadella came out of this OpenAI debacle looking pretty competent.

It's hard to reconcile the fact that the Microsoft that handled the unexpected OpenAI issue so well is the same Microsoft that seems intent on literally setting fire to their flagship product! (Windows)


I totally agree. It’s also like the move where Microsoft supports Linux on their systems and cloud, not as a backup but just to lock you into their ecosystem. Honestly, I could see Microsoft buying Hugging Face.


Yes, Microsoft doesn't have to pick the sole winner in AI, but rather they could just start eating the AI ecosystem bit by bit so that they win by default. It is what large players can do. May open themselves up to some scrutiny for too many acquisitions and reducing competition though, but that is a separate issue.


This is how Microsoft has been doing data for at least 10 years (see Databricks).

Step 1: Get the industry leaders to be purchasable via Azure. Step 2: Slowly build your own clone and start stealing user share even though your offering is still worse.


And they are used to that issue too. A long history of it.


"Microsoft recommends OpenAI as your default overlord. Did you know it can do everything your current AI can do, sometimes better, but always more profitably for us? [Switch now] [Ask me again in 30 seconds]"


I’ve had similar thoughts. Microsoft buying Hugging Face would be very similar to them buying GitHub.


Please, god, no. I can’t think of two more antithetical companies.


Would you mind elaborating why? I'm not super experienced in the AI world, and barely use Hugging Face. Frankly, the name makes it difficult to take it seriously.


Hugging Face is very supportive of the open source machine learning community, both in the work they do with the transformers library, as well as going above and beyond in developer and community relations to build an all around great product offering and user experience. Microsoft does the opposite of all of those things and has only made GitHub worse and more unstable since acquiring them.


This has me thinking about the context behind the striking quote in https://www.theinformation.com/articles/how-microsoft-swallo... (May 2023, months before the OpenAI drama):

> Nadella [in December 2022] abruptly cut off Lee midsentence, demanding to know how OpenAI had managed to surpass the capabilities of the AI project Microsoft’s 1,500-person research team had been working on for decades. “OpenAI built this with 250 people,” Nadella said, according to Lee, who is executive vice president and head of Microsoft Research. “Why do we have Microsoft Research at all?”

> At the same time, even as the company began weaving OpenAI into the fabric of Microsoft’s products, Nadella decided not to abort Microsoft’s own research efforts in AI. During the tense exchange at the December meeting between the Microsoft CEO and Lee, other executives spoke up to defend the work of Microsoft’s researchers, including Mikhail Parakhin, who oversees Microsoft’s Bing search and Edge browser groups, Lee said. After grilling Lee in the meeting, Nadella called him privately, thanking him for the work Microsoft Research had done to understand and implement OpenAI’s work in a way that passed muster for corporate customers. Nadella said he saw Lee’s group as a “secret weapon.”

While this is entirely speculation, it's easy to imagine that there are many levels of PR magic going on here, to share a quote that on the surface feels "leaked" and "explosive" but, among investors and clients who read beyond the (very good) paywall, actually shores up a narrative that Microsoft has a capability that significantly augments OpenAI's, and allows the existence of MSR to become headline news without even needing a product release.

The Mistral deal feels like yet another step in this direction. Microsoft is not afraid of seeming "messy" in the press as long as it can control the narrative around its value-add to customers in the context of its partnerships. By contrast, the rest of FAANG's more consumer-facing positioning makes it a lot harder for them to maneuver in a similar way.


> Nadella [in December 2022] abruptly cut off Lee midsentence, demanding to know how OpenAI had managed to surpass the capabilities of the AI project Microsoft’s 1,500-person research team had been working on for decades. “OpenAI built this with 250 people,” Nadella said, according to Lee, who is executive vice president and head of Microsoft Research. “Why do we have Microsoft Research at all?”

The answer to that is that until Google released the "Attention Is All You Need" paper in 2017, there were no breakthroughs allowing models like the ones we have now to be built. OpenAI, being a small and nimble team, picked up on which direction the wind was blowing with LLMs and quickly brought a product to market, whilst MS just did what corps do: move slowly (same for Google etc.).


Microsoft Research has also not been solely devoted to AI. I have seen a lot from them in quantum computing, programming language research, and general computer science.


They did that because they can't compete with Linux and had no relevancy in the tech world outside of providing business users with terrible software


The pro is that MS buys more AI hype to pump up their share price.

The con is that MS attracts more attention from regulators.


Those same regulators who check Microsoft's stock price regularly to see how 25% of their retirement plan is doing?


I guess it has a cost, though? I presume OpenAI didn’t like this move. If that’s the case, what might be the consequences?


> I guess it has a cost, though? I presume OpenAI didn’t like this move. If that’s the case, what might be the consequences?

Until OpenAI releases GPT 5 and it blows everyone away, OpenAI's leverage is constantly decreasing as the gap between their best model and everyone else's best model decreases.

There doesn't seem to be moats right now in this industry except for pure model performance.

Maybe someone should ask ChatGPT what OpenAI should do to maintain long-term leadership in this industry?


If I had to pick one player who wanted to win the AI race and was willing to be ruthless to do it, I'd pick Nvidia. Computation is the excludable bottleneck, and Nvidia is essentially the singular company who makes AI computers.

Hire Ilya, get him to hire as many of the best folks he can.

Stop selling GPUs. Hoard them. Introduce some subtle bug into the drivers that dramatically increases their rate of burn out.

Figure out some reasonable way to give attribution to original content creators, approximately solve the content ID problem of the AI age. Cut the content creators into the rev share in proportion to their data importance to the model. Make the content creators incredibly pissed off that their work is being stolen by big AI companies unfairly and encourage them to sue the other big AI firms. Their content share multiplier increases if they get injunctions against LLM firms.

Convince politicians that the AI firms have performed an intellectual heist of epic proportions, and that they must not be allowed to even generate synthetic training data from poisoned models. With the content creators united behind you, convince Congress that poisoned models must be destroyed, that even using synthetic training data from poisoned models must be illegal. Make them start over from a clean room with no copyrighted data.


> Make them start over from a clean room with no copyrighted data.

And when such models become popular[0], all the artists now have no job and no way to get compensation for being unable to work through no fault of their own.

I don't think that's really a winning condition. It might make you feel better about the world, but the end result is still all the artists being out of work.

[0] some models are already trained that way, although I assume you're using the word "copyrighted" in the conventional sense of "neither public domain nor an open license", as e.g. all my MIT licensed stuff is still copyrighted but it's fine to use.


In my hypothetical future, at least the people who create the content used to train the models can get "training royalties", which they aren't getting now.

There is still also money to be made in producing physical art or performances, even when AI can produce amazing digital works.


"Make them start over from a clean room with no copyrighted data." makes "the people who create the content used to train the models" the empty set.

> There is still also money to be made in producing physical art or performances, even when AI can produce amazing digital works.

Perhaps, but it may be akin to the way there is still money to be made from horse drawn carriages in city centres, even when cars displaced them over a century ago — a rare treat for special occasions, to demonstrate wealth.


> It might make you feel better about the world, but the end result is still all the artists being out of work.

Is it not the same with every leap in technology? There were professions like street lamp lighters, alarm services etc that have become redundant now?


Sure, though I suspect "art" is the human version of a peacock tail (the difficulty is the point, it's how we signal our worth to others, and cheapening it breaks that signal), which would suggest that making all forms of art easy messes with (many of) us at a deep, essentially automatic, level.

More specifically, I was responding to the idea that "compensating creators whose works are used to train the models" would actually solve anything; to use your examples, it would be as if the literal luddites were suggesting passing laws saying that "all textile machines that work like humans need to compensate the humans they displace, and also you need to make your new machines from scratch without talking to any textile workers to make sure you don't cheat", and my response would be analogous to saying "there's already machines which don't work like humans, so you're going to be out of work and have no compensation".

The Luddite movement preceded The Communist Manifesto by about 30 years. Everything's sped up since then, so I'd be surprised if we have to wait 30 years for a political shift which is to AI what Communism was to industrialisation. I'm just hoping we don't get someone analogous to Stalin or Pol Pot this time.


> If I had to pick one player who wanted to win the AI race and was willing to be ruthless to do it, I'd pick Nvidia. Computation is the excludable bottleneck, and Nvidia is essentially the singular company who makes AI computers.

I've thought the same thing. NVIDIA getting into AI seriously is a vertical integration play and they often do that -- like NVIDIA trying to buy ARM.


Sounds like someone has finally asked ChatGPT for a feasible plan to consolidate the AI landscape.


>Stop selling GPUs. Hoard them. Introduce some subtle bug into the drivers that dramatically increases their rate of burn out.

Well, they didn't stop selling GPUs when cryptomining was going strong. Instead they continued to sell them (with a hefty markup, tho).

It's like selling shovels and picks during a gold rush.


If Google's benchmarks are to be believed, Gemini 1.5 will be better than GPT, and they use their own chips (Google TPU), no Nvidia involved. There is also Groq. I don't see Nvidia keeping their lead and profit margins forever.


Not to mention Intel rapidly catching up in the $/perf calculation.


> Stop selling GPUs.

I'm not sure Wall St. would reward that plan.


Yeah, he seems to have left out the critical middle step ("Gather investors to take $2T company private.")


Don't stop, just raise the margin slightly and limit the number available of the higher-end chips, using the proceeds to self-fund building their own datacenter.

Do runs of cards for themselves with higher core counts and clock speeds that they don't release to others.


> There doesn't seem to be moats right now in this industry except for pure model performance.

Hard disagree. OpenAI's function calling is something no other commercial model provides, not even Gemini and Mistral Large.



The Mistral Large marketing at the top of HN right now claims JSON mode and native function calling. I haven't tried it, but that's what they say.


They also have brand recognition, for what it’s worth. Every non-engineer in the world practically thinks AI == ChatGPT.


Sure that helps with the consumer market, but most people will use AI integrated into other products and not directly.

Those integrated AI solutions will usually be done via enterprise deals where brand name is not quite as important. It will be done by people who care about cost, reliability and ease of use.

Think of nginx's dominance in web servers even though it has no name recognition among the general population. Or Stripe's payment system.


Yes, however it's increasingly likely that the GPT in ChatGPT will not be limited to OpenAI (in the US), so I'm not sure how much ChatGPT will be worth with countless other platforms containing GPT in their names.

https://techcrunch.com/2024/02/15/no-gpt-trademark-for-opena...


The thing is that there is almost no lock-in with the models. So brand recognition doesn't help much, since people will look at benchmarks and price at some point, if not right when starting out.


Meh, I don't think it's worth much. In a few years that'll be like claiming that so-and-so had name brand recognition for transistors. Most people don't need to care who manufactures their transistors.


It’s not necessarily a bad thing. Most people don’t know that TSMC exists, or what Microsoft does beyond Windows and Xbox (which are a small fraction of its business).


Correct, I didn't say it's a bad thing. I said it's not clear that it's a good thing (i.e. an asset)


Brands can change quickly, but they do matter in the short term. I've witnessed customer support teams use Firefox to say they only supported Internet Explorer and government ministers who thought it was "good" that IE was the "only" web browser, and weirdly a phone company whose customer support person thought their SIM cards worked better on Android than iPhone and that their web chat wouldn't work with a Mac even though they were talking to me on a Mac at the time.

And when I was a kid, it seemed like all the teachers thought it would be a waste of time to learn MacOS because "Apple would be bankrupt soon". (Given how much all the app UIs changed, right decision for the wrong reason).


All of these examples are end-products. "AI" itself will not be. The winner in AI will be whoever permeates other products/brands most successfully, and end-user brand familiarity doesn't matter much for that. Familiarity among engineering and product leaders is what matters.


Maybe, but maybe AI will become front and center of consumer and productivity IT products and their premier brand ambassadors will be anthropomorphized AI agents. Hello Clippy, this time for real.


Indeed, I'm not disagreeing on that, merely opining that "ChatGPT" as a name could well be relevant for a bit.


> Most people don't need to care who manufactures their transistors.

They might, in an upside-down world where the Shockley Semiconductor board tried to fire Shockley, and where the Traitorous Eight not only didn't bail out but took his side.


Unless your market is direct to end user, end user brand name recognition doesn't matter. In the case of AI, at least so far, the primary income won't be from end-users directly, but rather via enterprise integrations into existing tools that already have end user market share (e.g. Microsoft Office, Microsoft Windows, VS Code, Notion, etc.)


Eh, all this talk of "moats" etc. feels weird when just a few years ago it seemed like everyone was complaining they'd rearranged their corporate structure to include a fully-owned profit-making subsidiary to attract investments, and all the loud voices seemed to think a cap of x100 return on investment was so large it was unlikely to be reached.

And then OpenAI tripped and fell over a magic money printing factory, and the complaints are now in the set ["it's just a stochastic parrot", "it's so good it's a professional threat to $category", "they've lobotomised it", "they don't have a moat", "they're too expensive"].

As the saying goes, "Prediction is very difficult, especially if it’s about the future!"


> There doesn't seem to be moats right now in this industry except for pure model performance.

Compute?

At least in the short term, it seems like the biggest wallets are going to win by default.


I suppose they didn't but Microsoft has a $3T market cap and OpenAI is theoretically valued at $80B.

OpenAI has 700 people, Microsoft has 220,000 people.

OpenAI is strong but they're still dependent on MSFT.


MSFT needs companies like OpenAI to give Azure credits to for their valuation to continue soaring. The deferred revenue on their balance sheet from the unspent Azure credits they give as investment is worth much more to their market cap than $80B.


It sounds like they took the Federal Reserve's business model and applied it to computing.


Just to play this out, what possible moves would OpenAI make at this point that they wouldn't have until this happened?

Altman is out there trying to raise ridiculous sums to get away from Azure, didn't he make the first move here?


I think the main move would be some type of true AGI that leads to a hard takeoff scenario, but it isn't clear whether we are close to that or not.

Basically something that is more than just another bump in the scorecard for GPT 5 over GPT 4. Otherwise it is still just a horse race between relatively interchangeable GPT engines.


OpenAI will want to expand to all clouds to increase TAM.

Microsoft will want to avoid things regulators in the current regime will go after.

This seems like a step towards both and ultimately good for developers as it seems likely to bring costs down by increasing competition.


There are no consequences for Microsoft. It owns a 49% stake in OpenAI, so the only action that OpenAI could take to hurt Microsoft would be to deliberately destroy its own value.


They are under an exclusive contract with MSFT, they can't do anything.


Putting multiple bets and having multiple partnerships is smart regardless of OpenAI drama.

This way Microsoft is less dependent on a single deal and can diversify their offering based on use cases.


Satya is known to play 4-D chess. With this deal, MSFT is at least two dimensions ahead of the competition. /jk


Yes, it’s their tried and true maneuver: embrace, extend, extinguish.


Can we stop having this comment in every Microsoft post? It is like people have no clue what EEE is/was, but if it is MS, let's post this.


We should just automate these comments: MS: Oh No! Embrace Extend Extinguish! Google: Oh No! killedbygoogle! Meta: Oh No! So many ads! Apple: Oh No! Evil App Store policy!


Uh, sorry, but this seems pretty consistent with trying to co-opt and kill open source AI competition:

> [EEE] describe its strategy for entering product categories involving widely used standards, extending those standards with proprietary capabilities, and then using those differences in order to strongly disadvantage its competitors.


> "The US tech giant will provide the 10-month-old Paris-based company with help in bringing its AI models to market. Microsoft will also take a minor stake in Mistral although the financial details have not been disclosed"

Where are the "widely used standards"? Where are the "extending the standards with proprietary capabilities"? Where is the "strongly disadvantaging competitors"?


Mistral is the most used and fine-tuned open source model by a mile, close to the standard for open models, and Microsoft has now locked them into offering their models behind an API and in Azure. The Azure offering sets them up to be the safest, most GDPR-compliant offering for enterprises in Europe, where Microsoft already has a huge reach and customer base, bolstered by Mistral being a homegrown brand.


Also I've never heard the term myself


I suggest reading up on this, it is an important element that significantly shaped the computer world.

But yeah, not sure it is at play here.


A smart move by Microsoft is to not be reliant on another company for their AI needs.



