Google has unveiled Project Mariner, an experimental browser-based agent powered by Gemini. The prototype can navigate websites, fill forms, extract data, and complete multi-step web tasks — all through a Chrome extension.
How It Works
Project Mariner uses Gemini’s vision capabilities to understand web page layouts, combined with a specialized action model that determines the next interaction. The system:
- Captures the current browser state
- Identifies relevant page elements
- Plans a sequence of actions
- Executes through the Chrome DevTools Protocol
- Adapts based on results
Demo Use Cases
In Google’s demonstrations, Project Mariner handled:
- Shopping research — Comparing prices across multiple retailers
- Travel planning — Searching flights, comparing options, and filling booking forms
- Data extraction — Pulling structured data from multiple web pages into a spreadsheet
- Form automation — Completing multi-page application forms
Competitive Landscape
Project Mariner enters a crowded space alongside:
- Claude Computer Use (Anthropic) — Full desktop, not just browser (see our deep dive on Claude’s computer use)
- Operator (OpenAI) — Browser-based agent in preview
- Browser Use (open source) — Community-driven browser agent framework
What makes Project Mariner distinctive is its deep Chrome integration — since it’s built by Google, it has access to browser internals that third-party agents can’t reach.
Privacy and Safety
Google has implemented several safeguards:
- Users must explicitly activate the agent per session
- Sensitive actions (payments, logins) require manual confirmation
- The agent operates in an isolated browser context
- All actions are logged for user review
Project Mariner hasn’t been publicly released yet, but it signals Google’s serious commitment to the agent space. For more on Google’s agent ambitions, see our coverage of Google’s Remy agent and DeepMind’s AlphaEvolve.