Google is preparing to roll out live video and screen-sharing capabilities to Gemini Live, its conversational AI platform, extending it to process and respond to real-time visual information from a user's device. Announced at Mobile World Congress (MWC) 2025 in Barcelona, the features are designed to enable real-time, interactive experiences in which the AI reacts to what the user shows it.
The new live video feature lets users point their smartphone cameras at their surroundings and get real-time, contextual advice from Gemini Live. For example, a ceramic artist could ask which glazes would best suit the vases she has just made by aiming the camera at them and describing the look she is going for. Interactions like these move conversations well beyond text-based queries.
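Gemini Live's internal pipeline is not public, but a developer can approximate this camera-to-advice loop with Google's public generative AI SDK by sending a captured frame alongside a text prompt to a multimodal Gemini model. The sketch below is illustrative only; the model name, the frame file `vase_frame.jpg`, and the prompt are assumptions, not part of Google's announcement.

```python
# Minimal sketch of a "show the camera, get advice" turn using the public
# google-generativeai SDK. Gemini Live's actual pipeline is not public;
# this assumes a frame has already been captured from the camera and saved
# as vase_frame.jpg (hypothetical file name).
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # any multimodal Gemini model

frame = Image.open("vase_frame.jpg")  # one frame from the live camera feed
response = model.generate_content(
    [frame, "I want a matte, earthy finish on these vases. Which glazes would suit them?"]
)
print(response.text)  # contextual advice grounded in the image
```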
The screen-sharing feature lets users share their device screen with Gemini Live and receive recommendations based on what is displayed. This could be particularly useful for activities such as shopping, where Gemini Live can offer style suggestions or real-time product comparisons. In one demonstration, a user browsing clothing options asked for fashion advice, and Gemini Live analyzed the on-screen content to recommend similar products.
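A screen-sharing turn can be approximated the same way: instead of a camera frame, the model receives a screenshot of the current screen. Again a hedged sketch under assumptions; the `ImageGrab` capture and the prompt are illustrative, and Gemini Live performs this analysis natively in-app rather than through this API.

```python
# Sketch of a screen-sharing turn: feed a screenshot plus a question to a
# multimodal Gemini model. The capture method and prompt are assumptions.
import google.generativeai as genai
from PIL import ImageGrab

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")

screenshot = ImageGrab.grab()  # capture the current screen contents
response = model.generate_content(
    [screenshot, "I'm browsing these jackets. Which similar styles would you recommend?"]
)
print(response.text)  # recommendations grounded in the on-screen content
```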
These capabilities are powered by Project Astra, Google's research effort into real-time multimodal AI, which enables Gemini Live to interpret live visual input. The features will first reach Gemini Advanced subscribers on Android devices in the last week of this month, with support for additional platforms and languages planned over time.
Joshua Hawkins of BGR underscored the importance of these updates, calling them "a significant evolution in how we are able to communicate with AI on our mobile devices." He pointed to the potential for more natural, easier-to-use interactions and noted that the continued development of these features aligns with Google's broader vision for integrating AI into everyday life.
These announcements followed other recent improvements to the Gemini platform, including the launch of Gemini 2.0 Pro and the experimental Gemini 2.0 Flash Thinking model, which displays its reasoning process to improve explainability. This ongoing evolution reflects Google's push to advance its AI technology and offer people more capable, personalized tools.
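For developers, these newer models are exposed through the same public SDK. The sketch below is an assumption-laden illustration: the model identifier shown matches Google's experimental naming convention but may vary by release or region.

```python
# Querying an experimental "thinking" model through the public SDK.
# The model identifier is an assumption based on Google's experimental
# naming and may differ depending on availability.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")
response = model.generate_content("Why does ice float on water?")
print(response.text)  # how much reasoning is surfaced depends on the API
```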