For the first time since it invested more than $10 billion in OpenAI in exchange for the rights to reuse the startup’s AI models, Microsoft is training a new, in-house AI model large enough to compete with state-of-the-art models from Google, Anthropic, and OpenAI itself, The Information reported.
The new model, internally referred to as MAI-1, is being overseen by Mustafa Suleyman, the ex-Google AI leader who most recently served as CEO of the AI startup Inflection before Microsoft hired the majority of the startup’s staff and paid $650 million for the rights to its intellectual property in March.
But MAI-1 is a Microsoft model, not one carried over from Inflection: it is separate from the models Inflection previously released, although it may build on training data and other technology from the startup, according to two Microsoft employees with knowledge of the effort.
MAI-1 will be far larger than any of the smaller, open-source models that Microsoft has previously trained, meaning it will require more computing power and training data and will therefore be more expensive, according to the people. MAI-1 will have roughly 500 billion parameters, the settings adjusted during training that determine what a model learns.
By comparison, OpenAI’s GPT-4 is reported to have more than 1 trillion parameters, while smaller open-source models released by firms like Meta Platforms and Mistral have around 70 billion parameters.
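For a rough sense of where a figure like 500 billion comes from, here is a minimal back-of-the-envelope sketch in Python. The layer count, model dimension, and vocabulary size below are hypothetical assumptions chosen only to illustrate the arithmetic; Microsoft has not disclosed MAI-1’s architecture.

```python
# Rough, illustrative estimate of a dense transformer's parameter count.
# All architecture numbers here are hypothetical; they are chosen to land
# near 500 billion parameters, not taken from any disclosed MAI-1 spec.

def transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Approximate parameter count for a decoder-only transformer,
    ignoring biases, layer norms, and positional parameters."""
    attention = 4 * d_model * d_model   # Q, K, V, and output projections
    ffn = 8 * d_model * d_model         # two linear layers with 4x expansion
    embeddings = vocab_size * d_model   # token embedding table
    return n_layers * (attention + ffn) + embeddings

# Hypothetical configuration in the ~500B range:
print(f"{transformer_params(n_layers=120, d_model=18432, vocab_size=100_000):,}")
```

Under these assumed numbers the formula lands near 491 billion parameters; real models add biases, normalization weights, and other small terms that this sketch ignores.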
Ars Technica reported that this marks the first time Microsoft has developed an in-house AI model of this magnitude since investing over $10 billion in OpenAI for the rights to reuse the startup’s AI models. OpenAI’s GPT-4 powers not only ChatGPT but also Microsoft Copilot.
The development of MAI-1 is being led by Mustafa Suleyman, the former Google AI leader who most recently served as CEO of the AI startup Inflection before Microsoft acquired the majority of the startup’s staff and intellectual property for $650 million in March. Although MAI-1 may build on techniques brought over by former Inflection staff, it is reportedly an entirely new large language model (LLM), according to two Microsoft employees familiar with the project.
With approximately 500 billion parameters, MAI-1 will be significantly larger than Microsoft’s previous open-source models (such as Phi-3, which we covered last month), requiring more computing power and training data. This reportedly places MAI-1 in a similar league as OpenAI’s GPT-4, which is rumored to have over 1 trillion parameters (in a mixture-of-experts configuration), and well above smaller models like Meta’s and Mistral’s 70-billion-parameter models.
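The mixture-of-experts detail matters because in an MoE model only a fraction of the total parameters is active for any given token. The sketch below illustrates that distinction with entirely made-up numbers; OpenAI has never confirmed GPT-4’s architecture, so treat every figure here as an assumption for illustration.

```python
# Illustrative contrast between total and active parameters in a
# mixture-of-experts (MoE) model. All numbers are hypothetical;
# GPT-4's actual expert count and sizes have not been published.

def moe_params(n_experts: int, params_per_expert: int,
               experts_per_token: int, shared_params: int) -> tuple[int, int]:
    """Return (total, active-per-token) parameter counts for a simple MoE."""
    total = shared_params + n_experts * params_per_expert
    active = shared_params + experts_per_token * params_per_expert
    return total, active

# Assumed shape: 16 experts of ~110B each, 2 routed per token, ~55B shared.
total, active = moe_params(n_experts=16, params_per_expert=110_000_000_000,
                           experts_per_token=2, shared_params=55_000_000_000)
print(f"total: {total:,}  active per token: {active:,}")
```

Under these assumed numbers the model would store about 1.8 trillion parameters but activate only around 275 billion per token, which is how a model with “over 1 trillion parameters” can still be served at a tolerable per-query cost.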
The Verge reported Microsoft’s head of communications Frank Shaw posted that “sometimes news is just a blinding flash of the obvious,” and linked to a longer statement on LinkedIn by CTO Kevin Scott. There, Scott says that Microsoft plans to keep working closely with OpenAI “well into the future” while continuing to train and release its own models.
In my opinion, there’s a clear trend of big tech companies each building their own large AI models. It feels a bit like an “everyone else is doing it” situation.