Microsoft is taking a notable step forward with its first in-house AI models: MAI-Voice-1 and MAI-1-preview. Designed with everyday users in mind, MAI-Voice-1 can generate a full minute of audio in under a second on a single GPU. It already powers features such as Copilot Daily, where it narrates news stories and drives engaging, podcast-style discussions that break down complex topics. You can even experiment with different speech styles through Copilot Labs, making the experience uniquely yours.
MAI-1-preview, meanwhile, was trained on roughly 15,000 Nvidia H100 GPUs. It is built to handle everyday queries by following instructions closely and delivering practical responses, signalling Microsoft's shift away from relying on external language models such as OpenAI's. As Microsoft's AI chief, Mustafa Suleyman, explained: "My logic is that we have to create something that works extremely well for the consumer and really optimise for our use case."
This development not only reflects a fresh, consumer-centred approach but also points to a future where specialised models serve diverse user needs. With MAI-1-preview already being tested publicly on the LMArena benchmarking platform, you can expect a new wave of tailored AI experiences that move beyond one-size-fits-all solutions.