Google has launched a new AI model called Gemini 2.5 Computer Use. This model can navigate web pages and interact with them just like a person — filling out forms, clicking buttons, dragging items, and more. The move marks a major step in making AI agents more useful in real-world web tasks.
What Is Gemini 2.5 Computer Use?
Unlike standard AI models that rely on application programming interfaces (APIs) or backend access, Gemini 2.5 Computer Use operates within a browser. It perceives the interface visually and then takes steps based on what it “sees”, such as opening links, typing text, and dragging items.
This makes it especially useful for websites or tools that don’t offer APIs or programmatic access: the AI can “use” the website much as a person would.
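Conceptually, this works as an observe-and-act loop: the client captures a screenshot of the page, the model proposes the next UI action, the client performs it, and the cycle repeats until the task is finished. The sketch below is a minimal illustration of that loop only; `Action` and `request_next_action` are hypothetical placeholders, not the real Gemini API, which is reached through Google AI Studio or Vertex AI.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """A simplified stand-in for the model's proposed UI action."""
    kind: str            # e.g. "click", "type", "open_url", "drag", "done"
    x: int = 0
    y: int = 0
    text: str = ""


def request_next_action(screenshot_png: bytes, goal: str) -> Action:
    """Hypothetical placeholder for a call to the computer-use model."""
    raise NotImplementedError("Replace with a real call via AI Studio or Vertex AI.")


def run_task(goal: str, take_screenshot, execute, max_steps: int = 25) -> None:
    """Observe-and-act loop: screenshot in, action out, repeat until done."""
    for _ in range(max_steps):
        action = request_next_action(take_screenshot(), goal)
        if action.kind == "done":
            break
        execute(action)      # click, type, etc. in the browser
```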
Key Features & Capabilities
Here are some of the capabilities and features of this new model:
| Feature | Description |
|---|---|
| Browser interaction | Supports 13 predefined actions such as click, type, open URL, and drag & drop (see the browser-side sketch after this table) |
| Visual understanding & reasoning | The AI “sees” interfaces and figures out how to act, rather than relying on structured inputs |
| Agentic tasks | It can carry out multi-step tasks, e.g. filling forms, navigating menus |
| Access & availability | Developers can use it via Google AI Studio and Vertex AI |
| Demo access | Public demos are available through Browserbase |
| Benchmark performance | Google claims it outperforms alternatives on multiple web and mobile benchmark tests |
| Limitations | Restricted to browser-level actions; it does not have full operating-system control |
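To make the browser side concrete, here is an illustrative sketch of how a client might replay one of the model’s predefined actions on a live page. Playwright is used purely as a stand-in for the browser layer (the public demos reportedly run through Browserbase), and the action names and dictionary shape ("click", "type_text", "open_url", "drag") are assumptions for illustration, not the official action schema.

```python
from playwright.sync_api import sync_playwright, Page


def execute_action(page: Page, action: dict) -> None:
    """Apply one model-proposed action to the live page (illustrative names)."""
    kind = action["kind"]
    if kind == "open_url":
        page.goto(action["url"])
    elif kind == "click":
        page.mouse.click(action["x"], action["y"])
    elif kind == "type_text":
        page.keyboard.type(action["text"])
    elif kind == "drag":
        page.mouse.move(action["x"], action["y"])
        page.mouse.down()
        page.mouse.move(action["to_x"], action["to_y"])
        page.mouse.up()
    else:
        raise ValueError(f"Unsupported action: {kind}")


with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    execute_action(page, {"kind": "open_url", "url": "https://example.com"})
    screenshot = page.screenshot()   # would go back to the model as the next observation
    browser.close()
```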
Why It Matters
- Bridging gaps: Many web services don’t offer APIs or structured access. This model can operate directly through the visual interface, opening up automation for parts of the web that are otherwise hard to reach programmatically.
- More capable agents: With the ability to “use” the web, AI assistants become more powerful; they can fetch data, fill out forms, and carry out multi-step research on a user’s behalf.
- Developer tools: By making it available through AI Studio and Vertex AI, Google lets developers experiment and build new apps using this capability.
- Competitive edge: Google’s approach differs from models that take deep, operating-system-level control of a machine. Gemini 2.5 Computer Use is more constrained, but that makes it easier to secure and supervise.
What’s Next & Challenges
- Google notes this is currently limited to browser-level actions, so it doesn’t have full operating-system control.
- Safety, privacy, and trust remain important concerns. Letting an AI act inside web interfaces must be carefully audited and gated to prevent harmful actions (a simple human-in-the-loop gate is sketched after this list).
- Google may expand the action set, improve performance, and add more safety constraints over time.
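One common mitigation is a human-in-the-loop gate that pauses before high-stakes steps. The sketch below shows the idea only; the set of “sensitive” action kinds and the function names are assumptions made for illustration, not part of Google’s published safety controls.

```python
# Assumed action kinds considered risky enough to require explicit approval.
SENSITIVE_KINDS = {"submit_form", "make_purchase", "delete", "send_message"}


def confirm_with_user(action: dict) -> bool:
    """Ask a human operator to approve or reject a proposed action."""
    answer = input(f"Allow action {action}? [y/N] ")
    return answer.strip().lower() == "y"


def guarded_execute(page, action: dict, execute_action) -> bool:
    """Run the action only if it is low-risk or explicitly approved."""
    if action["kind"] in SENSITIVE_KINDS and not confirm_with_user(action):
        return False          # skipped; the agent can log this and replan
    execute_action(page, action)
    return True
```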