Google’s next major AI model has arrived to combat a slew of new offerings from OpenAI, TechCrunch reported.
On Wednesday, Google announced Gemini 2.0 Flash, which the company says can natively generate images and audio in addition to text. 2.0 Flash can also use third-party apps and services, allowing it to tap into Google Search, execute code, and more.
An experimental release of 2.0 Flash will be available through the Gemini API and Google’s AI developer platforms AI Studio and Vertex AI, starting today. However, the audio and image generation capabilities are launching only for “early access partners” ahead of a wide rollout in January.
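For developers wanting to try the experimental model through the Gemini API, a minimal sketch of a `generateContent` REST call might look like the following. The model identifier `gemini-2.0-flash-exp` and the `v1beta` endpoint path are assumptions based on the announcement's "experimental release" wording; check Google AI Studio's documentation for the current names.

```python
# Sketch of calling the experimental Gemini 2.0 Flash model via the
# Gemini API REST interface. Model name and endpoint version are
# assumptions; consult Google's AI Studio docs for current identifiers.
import json
import os
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-2.0-flash-exp"  # assumed experimental-release name


def build_request(prompt: str) -> dict:
    # generateContent payloads wrap the prompt in a list of content "parts".
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate(prompt: str, api_key: str) -> str:
    url = f"{API_BASE}/models/{MODEL}:generateContent?key={api_key}"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Return the first text part of the first candidate response.
    return body["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(generate("Summarize Gemini 2.0 Flash in one sentence.", key))
```

The same model should also be reachable through Google's client SDKs and Vertex AI; this raw-HTTP form just makes the request shape explicit.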
In the coming months, Google says that it’ll bring 2.0 Flash in a range of flavors to products like Android Studio, Chrome, DevTools, Firebase, Gemini Code Assist, and others.
The production version of 2.0 Flash will land in January. But in the meantime, Google is releasing an API, the Multimodal Live API, to help developers build apps with real-time audio and video streaming functionality.
Google posted "Gemini 2.0: Our latest, most capable AI model yet" at https://blog.google/products/gemini/google-gemini-ai-collection-2024/.
Today, we’re announcing Gemini 2.0 — our most capable AI model yet, designed for the agent era. Gemini 2.0 has new capabilities, like multimodal output with native image generation and audio output, and native tools including Google Search and Maps.
We're releasing an experimental version of Gemini 2.0 Flash, our workhorse model with low latency and enhanced performance. Developers can start building with this model in the Gemini API via Google AI Studio and Vertex AI. And Gemini and Gemini Advanced users globally can try out a chat-optimized version of Gemini 2.0 by selecting it in the model dropdown on desktop.
We’re also using Gemini 2.0 in new research prototypes including Project Astra, which explores the future capabilities of a universal AI assistant; Project Mariner, an early prototype capable of taking actions in Chrome as an experimental extension; and Jules, an experimental AI-powered code agent.
We continue to prioritize safety and responsibility with these projects, which is why we’re taking an exploratory and gradual approach to development, including working with trusted testers.
Engadget reported: The battle for AI supremacy is heating up. Almost exactly a week after OpenAI made its o1 model available to the public, Google today is offering a preview of its next-generation Gemini 2.0 model.
In a blog post attributed to Google CEO Sundar Pichai, the company says 2.0 is its most capable model yet, with the algorithm offering native support for image and audio output.
“It will enable us to build new AI agents that bring us closer to our vision of a universal assistant,” says Pichai.
Google is doing something different with Gemini 2.0. Rather than starting today’s preview by first offering its most advanced version of the model, Gemini 2.0 Pro, the search giant is instead kicking things off with 2.0 Flash.
As of today, the more efficient (and affordable) model is available to all Gemini users. If you want to try it yourself, you can enable Gemini 2.0 from the dropdown menu in the Gemini web client, with availability within the mobile app coming soon.
In my opinion, it will be interesting to see where Google takes Gemini 2.0, as several other companies are racing to build the most capable AI assistant.