Google's Project Mariner: Agents in the Browser

Google has unveiled Project Mariner, an experimental browser-based agent powered by Gemini. The prototype runs as a Chrome extension and can navigate websites, fill in forms, extract data, and complete multi-step web tasks.

How It Works

Project Mariner uses Gemini’s vision capabilities to understand web page layouts, combined with a specialized action model that determines the next interaction. The system:

Captures the current browser state
Identifies relevant page elements
Plans a sequence of actions
Executes through the Chrome DevTools Protocol
Adapts based on results
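
Google has not published Mariner's internals, so the following is only a minimal sketch of the capture, identify, plan, execute, adapt cycle listed above. Everything in it is an assumption made for illustration: FakeBrowser is a hypothetical in-memory stand-in for a tab, plan_next is a rule-based placeholder for the model's planning step, and run_agent is an invented name. A real agent would issue browser commands in execute (for example over the Chrome DevTools Protocol) instead of mutating a dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Simplified page element (hypothetical; a real agent derives these from the page)."""
    label: str
    kind: str  # "input" or "button"
    value: str = ""


@dataclass
class Action:
    kind: str  # "type" or "click"
    target: str  # label of the element to act on
    text: str = ""


@dataclass
class FakeBrowser:
    """Hypothetical in-memory stand-in for a browser tab."""
    elements: dict = field(default_factory=dict)
    submitted: bool = False

    def capture_state(self) -> list:
        # Step 1: capture the current browser state.
        return list(self.elements.values())

    def execute(self, action: Action) -> None:
        # Step 4: execute the action. A real agent would send a browser command here.
        element = self.elements[action.target]
        if action.kind == "type":
            element.value = action.text
        elif action.kind == "click" and element.kind == "button":
            self.submitted = True


def plan_next(state: list, goal: dict) -> Optional[Action]:
    """Steps 2 and 3: find relevant elements and choose the next action.

    Rule-based placeholder for the model: fill any input that does not yet
    match the goal, then click the first button.
    """
    for element in state:
        if element.kind == "input" and element.label in goal and element.value != goal[element.label]:
            return Action("type", element.label, goal[element.label])
    for element in state:
        if element.kind == "button":
            return Action("click", element.label)
    return None


def run_agent(browser: FakeBrowser, goal: dict, max_steps: int = 10) -> list:
    """Loop until the form is submitted, nothing is left to do, or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        if browser.submitted:
            break
        state = browser.capture_state()
        action = plan_next(state, goal)
        if action is None:
            break
        browser.execute(action)
        # Step 5: the next iteration re-captures the state, so the plan adapts to results.
        history.append(action)
    return history


if __name__ == "__main__":
    browser = FakeBrowser(elements={
        "name": Element("name", "input"),
        "email": Element("email", "input"),
        "Submit": Element("Submit", "button"),
    })
    actions = run_agent(browser, {"name": "Ada", "email": "ada@example.com"})
    print([a.kind for a in actions])  # ['type', 'type', 'click']
```

The loop re-captures the page before every action instead of planning all steps up front, which is what lets an agent of this shape recover when a page responds differently than expected.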

Demo Use Cases

In Google’s demonstrations, Project Mariner handled:

Shopping research — Comparing prices across multiple retailers
Travel planning — Searching flights, comparing options, and filling booking forms
Data extraction — Pulling structured data from multiple web pages into a spreadsheet
Form automation — Completing multi-page application forms

Competitive Landscape

Project Mariner enters a crowded space alongside:

Claude Computer Use (Anthropic) — Full desktop, not just browser (see our deep dive on Claude’s computer use)
Operator (OpenAI) — Browser-based agent in preview
Browser Use (open source) — Community-driven browser agent framework

What sets Project Mariner apart is its deep Chrome integration: because Google builds both the browser and the agent, Mariner can reach browser internals that third-party agents can’t.

Privacy and Safety

Google has implemented several safeguards:

Users must explicitly activate the agent per session
Sensitive actions (payments, logins) require manual confirmation
The agent operates in an isolated browser context
All actions are logged for user review
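
To make the first, second, and fourth safeguards concrete, here is a hedged sketch of how an agent could gate its actions. It is not Google's implementation: GuardedExecutor, SENSITIVE_KINDS, and the confirm callback are hypothetical names, and the sensitive categories are taken only from the examples in the list above. Browser-context isolation is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumption: the sensitive categories are the two examples the article names.
SENSITIVE_KINDS = {"payment", "login"}


@dataclass
class GuardedExecutor:
    """Hypothetical wrapper that enforces activation, confirmation, and logging."""
    confirm: Callable[[str], bool]  # asks the user; returns True to allow the action
    session_active: bool = False
    log: list = field(default_factory=list)

    def activate(self) -> None:
        # The user must explicitly switch the agent on for this session.
        self.session_active = True

    def run(self, kind: str, description: str) -> str:
        if not self.session_active:
            outcome = "blocked: agent not activated"
        elif kind in SENSITIVE_KINDS and not self.confirm(description):
            outcome = "blocked: user declined"
        else:
            outcome = "executed"
        # Every attempt is recorded, including blocked ones, for later review.
        self.log.append((kind, description, outcome))
        return outcome


if __name__ == "__main__":
    executor = GuardedExecutor(confirm=lambda description: False)
    executor.activate()
    print(executor.run("click", "open product page"))
    print(executor.run("payment", "pay for order"))
    for entry in executor.log:
        print(entry)
```

Logging blocked attempts alongside executed ones is a design choice in this sketch: it gives the user a complete record of what the agent tried to do, not only what it did.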

Project Mariner hasn’t been publicly released yet, but it signals Google’s serious commitment to the agent space. For more on Google’s agent ambitions, see our coverage of Google’s Remy agent and DeepMind’s AlphaEvolve.