With new multimodal capabilities and enhanced agentic models, Gemini 2.0 promises to revolutionise how AI assists users
Google’s vision of AI continues to evolve with the launch of Gemini 2.0, a groundbreaking model designed to usher in the “agentic era” of artificial intelligence. Announced by Sundar Pichai, CEO of Google and Alphabet, Gemini 2.0 promises to expand on the capabilities of its predecessor, Gemini 1.0, by introducing advanced features that will enable AI to think multiple steps ahead and act on users’ behalf, with human oversight.
“Information is at the core of human progress,” Pichai explained in a message to the public, reflecting on the company’s commitment to organising the world’s information. This latest AI model takes that mission further by combining multimodal inputs — such as text, images, video, audio, and code — with long-context understanding, making it not only smarter but more adaptable to diverse tasks.
A significant step forward, Gemini 2.0 introduces several new features, including the ability to generate native image and audio outputs, making it a more versatile tool for developers and everyday users. One of the most anticipated additions is the Gemini 2.0 Flash experimental model, which will be available to all Gemini users and promises faster, more responsive results in a variety of applications.
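For developers keen to try the experimental model, access is expected to come through the Gemini API. The snippet below is a minimal sketch, assuming the google-generativeai Python SDK and the “gemini-2.0-flash-exp” model identifier; Google’s own documentation remains the definitive reference.

```python
# Minimal sketch: prompting the experimental Gemini 2.0 Flash model.
# The SDK (pip install google-generativeai) and the model identifier
# "gemini-2.0-flash-exp" are assumptions based on the announcement.
import os
import google.generativeai as genai

# Authenticate with an API key (e.g. one issued via Google AI Studio).
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Summarise the key features of Gemini 2.0.")
print(response.text)  # the model's text reply
```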
The reimagined Gemini 2.0 also brings a new research assistant feature, Deep Research, which leverages advanced reasoning and long-context capabilities to explore complex topics and generate comprehensive reports. This tool is already available in Gemini Advanced and is expected to be a game-changer for anyone needing to dive deep into multifaceted subjects.
Google’s focus on integrating AI into its existing products is another hallmark of Gemini 2.0. A key example is the transformation of Google Search, which now benefits from the model’s advanced reasoning capabilities. This enhancement allows AI Overviews to tackle more complex questions — including advanced maths, coding problems, and multimodal queries. The feature is gaining traction, with 1 billion users already benefitting from AI-powered summaries, and will expand to more regions and languages next year.
The technological advancements behind Gemini 2.0 are supported by Google’s investments in proprietary hardware such as Trillium, its sixth-generation Tensor Processing Unit (TPU). This hardware powers Gemini 2.0’s training and inference, and is now available to developers, enabling them to build their own AI-driven applications.
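As a small illustration of building on this hardware, a developer on a Cloud TPU VM can check which accelerators JAX can address before running any workload. This is a minimal sketch, assuming a TPU VM with the jax[tpu] extras installed; it is not specific to Trillium or to Gemini itself.

```python
# Minimal sketch: listing the accelerators visible to JAX on a Cloud
# TPU VM (assumes `pip install "jax[tpu]"` on a TPU host).
import jax

print(f"Backend: {jax.default_backend()}")  # e.g. "tpu" on a TPU VM
for device in jax.devices():
    print(device.device_kind, device.id)  # device model name and index
```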
While Gemini 1.0 was about organising and understanding vast amounts of information, Gemini 2.0 takes this a step further, emphasising action and utility. By integrating AI’s reasoning, problem-solving, and interactive capabilities, it creates more efficient workflows for developers and users alike.
Pichai’s excitement about the next phase of AI innovation is palpable, as he anticipates the ways Gemini 2.0 will reshape the AI landscape. As it becomes more deeply integrated into Google products, the possibilities for creating more intelligent, context-aware, and proactive tools seem endless. The future of AI has arrived, and with it, a new era of potential for innovation and problem-solving.