Playwright MCP: Browser Automation for AI Agents with 34k+ GitHub Stars

Microsoft's Playwright MCP enables AI agents to automate web browsers using accessibility trees instead of screenshots. With 34k+ GitHub stars and active development, it's transforming browser automation for agentic workflows.

Playwright MCP is Microsoft's Model Context Protocol server that transforms Playwright's browser automation into a structured, LLM-friendly interface. With 34,300+ GitHub stars and active development (commits within the last 2 days), it enables AI agents to interact with web pages through accessibility trees instead of screenshots—making browser automation faster, cheaper, and more deterministic for agentic workflows.

What is Playwright MCP?

Playwright MCP is a Model Context Protocol (MCP) server developed by Microsoft that bridges the gap between large language models and web browsers. Rather than relying on vision models to interpret screenshots, Playwright MCP exposes Playwright's browser automation capabilities as structured, JSON-based tools that LLMs can call directly.

The project is maintained by Microsoft's Playwright team and is actively developed. It supports integration with Claude Desktop, VS Code, Cursor, Windsurf, Goose, Cline, and dozens of other AI agent platforms. The core innovation is using accessibility trees instead of pixel-based snapshots—this approach is more token-efficient, deterministic, and works reliably across different screen sizes and rendering contexts.

Playwright MCP runs as a local server that can be configured via JSON, supports multiple browsers (Chromium, Firefox, WebKit), and provides fine-grained control over browser behavior including authentication, permissions, proxies, and network interception.

Core Features and Architecture

1. Accessibility Tree-Based Interaction

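Instead of sending screenshots to vision models, Playwright MCP generates structured accessibility trees that describe page elements, their properties, and relationships. Because the agent reads a compact text description rather than image data, token consumption can drop sharply (figures of 10-100x are cited against vision-based methods, though the saving depends on the page and the model). Each element also carries an explicit reference, which removes the ambiguity of targeting by guessed coordinates.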

The accessibility tree includes semantic information about buttons, forms, links, and interactive elements—exactly what LLMs need to make decisions about what to click or type.
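To make the idea concrete, here is a minimal sketch of how an agent-side helper might pick an element out of a snapshot. The snapshot text, its line format, and the ref values (e5, e6) are assumptions made for illustration; the real output may differ between versions, so treat the regular expression as a starting point only.

```python
import re

# Hypothetical snapshot excerpt; the real format may differ by version.
SNAPSHOT = """\
- heading "Example Store" [level=1] [ref=e2]
- textbox "Search products" [ref=e5]
- button "Search" [ref=e6]
- link "Cart" [ref=e9]
"""

# One node per line: a role, an optional quoted name, and a ref attribute.
NODE = re.compile(
    r'^\s*-\s+(?P<role>\w+)(?:\s+"(?P<name>[^"]*)")?.*\[ref=(?P<ref>\w+)\]'
)


def parse_snapshot(text):
    """Return a list of {role, name, ref} dicts, one per recognizable line."""
    nodes = []
    for line in text.splitlines():
        match = NODE.match(line)
        if match:
            nodes.append(match.groupdict())
    return nodes


def find_ref(nodes, role, name_contains):
    """Return the ref of the first node with this role whose name contains the text."""
    wanted = name_contains.lower()
    for node in nodes:
        if node["role"] == role and wanted in (node["name"] or "").lower():
            return node["ref"]
    return None


nodes = parse_snapshot(SNAPSHOT)
print(find_ref(nodes, "textbox", "search"))
print(find_ref(nodes, "button", "search"))
```

In practice the LLM does this matching itself by reading the tree; the helper only shows why role-plus-name lookup is less ambiguous than locating pixels.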

2. 23+ Core Automation Tools

Playwright MCP exposes tools like:

  • browser_click – Click elements with optional modifiers (Ctrl, Shift, Alt)
  • browser_type – Type text into input fields
  • browser_navigate – Navigate to URLs
  • browser_take_screenshot – Capture a screenshot of the page (optional)
  • browser_evaluate – Execute JavaScript on the page
  • browser_fill_form – Populate form fields
  • browser_snapshot – Capture the accessibility snapshot of the current page, from which agents read structured data
  • browser_wait_for – Wait for text to appear or disappear, or for a set time to pass

Each tool is fully typed and includes detailed descriptions, making it easy for LLMs to understand when and how to use them.
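As an illustration of what "fully typed" means for a client, the sketch below checks a set of arguments against a tool definition before sending it. The definition follows the general name/description/inputSchema shape that MCP servers return when listing tools, but the specific properties shown for browser_click (element, ref) are an assumption here, and the checker covers only a small subset of JSON Schema.

```python
# Hypothetical tool definition in the shape MCP servers return when listing tools.
# The property names are assumptions; consult the server's actual tool list.
CLICK_TOOL = {
    "name": "browser_click",
    "description": "Click an element on the page",
    "inputSchema": {
        "type": "object",
        "properties": {
            "element": {"type": "string"},
            "ref": {"type": "string"},
        },
        "required": ["element", "ref"],
    },
}

# Only the JSON types this sketch knows how to check.
JSON_TYPES = {"string": str, "boolean": bool, "object": dict, "array": list}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the checked subset passes."""
    schema = tool["inputSchema"]
    problems = [
        f"missing required argument: {key}"
        for key in schema.get("required", [])
        if key not in arguments
    ]
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unknown argument: {key}")
            continue
        expected = JSON_TYPES.get(spec.get("type"), object)
        if not isinstance(value, expected):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


print(check_arguments(CLICK_TOOL, {"element": "Search button", "ref": "e6"}))
print(check_arguments(CLICK_TOOL, {"element": "Search button"}))
```

A production client would use a full JSON Schema validator; the point is that the schema gives both the model and the client something to check against.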

3. Multi-Browser Support

Playwright MCP supports Chromium, Firefox, and WebKit browsers. You can specify which browser to use via configuration, and the server handles all the complexity of browser launch, context management, and cleanup.

4. Persistent and Isolated Sessions

The server supports three session modes:

  • Persistent Profile – Browser state (cookies, local storage, login sessions) persists across sessions
  • Isolated Mode – Each session starts fresh; useful for testing
  • Browser Extension – Connect to an existing browser tab with your logged-in sessions
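The three modes map onto different launch arguments. The sketch below builds a client entry for each; the flag names (--user-data-dir, --isolated, --extension) reflect my understanding of the command-line options and should be confirmed with npx @playwright/mcp@latest --help, and the profile path is a made-up example.

```python
import json


def launch_args(mode, user_data_dir=None):
    """Build npx arguments for one session mode (flag names are assumptions)."""
    args = ["@playwright/mcp@latest"]
    if mode == "persistent":
        # Without an explicit directory the server picks its own profile location.
        if user_data_dir:
            args += ["--user-data-dir", user_data_dir]
    elif mode == "isolated":
        args.append("--isolated")
    elif mode == "extension":
        args.append("--extension")
    else:
        raise ValueError(f"unknown session mode: {mode}")
    return args


def client_entry(mode, user_data_dir=None):
    """Wrap the arguments in the standard mcpServers configuration shape."""
    return {
        "mcpServers": {
            "playwright": {"command": "npx", "args": launch_args(mode, user_data_dir)}
        }
    }


# Hypothetical profile path, used only for the demonstration.
print(json.dumps(client_entry("persistent", "/tmp/pw-profile"), indent=2))
print(json.dumps(client_entry("isolated"), indent=2))
```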

5. Comprehensive Configuration

Playwright MCP accepts a JSON configuration file that controls:

  • Browser launch options (headless, channel, executable path)
  • Context options (viewport, device emulation, permissions)
  • Network settings (proxy, allowed/blocked origins)
  • Timeouts (action, navigation, expect)
  • Output handling (snapshots, console logs, network logs)
  • Security (secrets masking, file access restrictions)

6. Docker and Standalone Server Support

Playwright MCP can run as a standalone HTTP server (with SSE transport) or inside Docker. This enables deployment scenarios where the MCP server runs on a remote machine and multiple clients connect to it.
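A remote setup has two halves: the command that starts the server, and a client entry that points at its URL instead of spawning a process. The sketch below generates both. The --port and --host flags and the /mcp endpoint path are assumptions based on my reading of the project documentation; check them against the version you run.

```python
import json


def server_command(port, host="localhost"):
    """Command line for a standalone server (flag names are assumptions)."""
    return ["npx", "@playwright/mcp@latest", "--port", str(port), "--host", host]


def remote_client_entry(port, host="localhost", path="/mcp"):
    """Client configuration that connects to a running server by URL."""
    return {"mcpServers": {"playwright": {"url": f"http://{host}:{port}{path}"}}}


print(" ".join(server_command(8931)))
print(json.dumps(remote_client_entry(8931), indent=2))
```

Binding to anything other than localhost exposes browser control over the network, so a remote deployment should sit behind whatever access controls the environment provides.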

Getting Started

Installation

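The simplest way to get started is to register Playwright MCP in your MCP client. Most clients accept the standard configuration below; VS Code's own mcp.json names the top-level key "servers" rather than "mcpServers":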

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

For Claude Desktop, add the same configuration to your MCP settings file. For Cursor, use the MCP settings UI to add a new server with command npx @playwright/mcp@latest.
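If a settings file already lists other servers, the entry should be merged in rather than pasted over the top. The helper below is a minimal sketch of that merge; the file name mcp_settings.json and the pre-existing "other" server are hypothetical, and the real location of the settings file depends on the client.

```python
import json
import tempfile
from pathlib import Path


def add_server(config_path, name, command, args):
    """Add or update one server entry, leaving any other entries untouched."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstration against a throwaway file with one hypothetical existing server.
with tempfile.TemporaryDirectory() as tmp:
    settings = Path(tmp) / "mcp_settings.json"
    settings.write_text(
        json.dumps({"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}})
    )
    merged = add_server(settings, "playwright", "npx", ["@playwright/mcp@latest"])

print(sorted(merged["mcpServers"]))
```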

Basic Usage Example

Once installed, you can ask your AI agent to automate browser tasks:

"Navigate to https://example.com, find the search box, type 'AI agents', and click the search button."

The agent will:

  1. Call browser_navigate with the URL
  2. Receive an accessibility tree of the page
  3. Identify the search box using the tree
  4. Call browser_type to enter text
  5. Call browser_click to submit the search
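On the wire, each of those steps is a JSON-RPC request using MCP's tools/call method. The sketch below builds the three calls for the example prompt. The argument names (url, element, ref, text) and the refs e5 and e6 are assumptions for illustration; in a real session the refs come from the snapshot returned after navigation, and the MCP client library handles the message envelope.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_ids = itertools.count(1)


def tool_call(name, arguments):
    """Build one MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Refs are hypothetical; an agent would read them from the page snapshot.
steps = [
    tool_call("browser_navigate", {"url": "https://example.com"}),
    tool_call("browser_type", {"element": "Search box", "ref": "e5", "text": "AI agents"}),
    tool_call("browser_click", {"element": "Search button", "ref": "e6"}),
]

for message in steps:
    print(json.dumps(message))
```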

Configuration File Setup

For advanced scenarios, create a config.json:

{
  "browser": {
    "browserName": "chromium",
    "launchOptions": {
      "headless": true
    },
    "contextOptions": {
      "viewport": { "width": 1280, "height": 720 }
    }
  },
  "server": {
    "port": 8931,
    "host": "localhost"
  },
  "capabilities": ["core", "pdf", "vision"]
}

Then launch with: npx @playwright/mcp@latest --config config.json
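A typo in the configuration file is easier to catch before the server starts. The following sketch runs a few sanity checks on the example above and then prints the launch command. It inspects only the keys shown in that example and is not a substitute for the server's own validation; the function name check_config is invented for this illustration.

```python
import json

SUPPORTED_BROWSERS = {"chromium", "firefox", "webkit"}

# The example configuration from this article.
CONFIG = {
    "browser": {
        "browserName": "chromium",
        "launchOptions": {"headless": True},
        "contextOptions": {"viewport": {"width": 1280, "height": 720}},
    },
    "server": {"port": 8931, "host": "localhost"},
    "capabilities": ["core", "pdf", "vision"],
}


def check_config(config):
    """Return a list of problems found in the keys this sketch knows about."""
    problems = []
    browser = config.get("browser", {})
    name = browser.get("browserName")
    if name is not None and name not in SUPPORTED_BROWSERS:
        problems.append(f"unsupported browserName: {name}")
    viewport = browser.get("contextOptions", {}).get("viewport")
    if viewport is not None:
        for side in ("width", "height"):
            value = viewport.get(side)
            if not isinstance(value, int) or value <= 0:
                problems.append(f"viewport {side} must be a positive integer")
    port = config.get("server", {}).get("port")
    if port is not None and not (isinstance(port, int) and 0 < port < 65536):
        problems.append(f"port out of range: {port}")
    return problems


print(check_config(CONFIG))
print(json.dumps(CONFIG, indent=2))
print(" ".join(["npx", "@playwright/mcp@latest", "--config", "config.json"]))
```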

Real-World Use Cases

1. Autonomous Web Testing

AI agents can navigate test scenarios, fill forms, verify page state, and generate test reports—all without pre-written test scripts. Playwright MCP's accessibility tree makes it easy for agents to understand page structure and make intelligent decisions about what to test next.

2. Data Extraction and Web Scraping

Extract structured data from dynamic websites that require interaction (login, pagination, filtering). The agent can navigate the site, interact with controls, and read the data out of the accessibility snapshot, or run a script with browser_evaluate when the values are not exposed there.

3. Workflow Automation

Automate repetitive tasks like filling out forms, uploading documents, or managing accounts across multiple SaaS platforms. Agents can handle multi-step workflows with conditional logic based on page state.

4. Accessibility Auditing

Since Playwright MCP uses accessibility trees, it's naturally suited for auditing web accessibility. Agents can navigate sites and report on ARIA labels, semantic HTML, keyboard navigation, and screen reader compatibility.
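One simple audit an agent could run is to flag interactive controls that have no accessible name. The sketch below does this over a snapshot excerpt. The excerpt, its line format, and the set of roles treated as interactive are all assumptions for illustration, and a real audit would cover far more than missing names.

```python
import re

# Hypothetical snapshot excerpt; the real format may differ by version.
SNAPSHOT = """\
- link "Home" [ref=e1]
- button [ref=e3]
- textbox "" [ref=e4]
- img "Logo" [ref=e7]
- button "Submit" [ref=e8]
"""

LINE = re.compile(r'^\s*-\s+(?P<role>\w+)(?:\s+"(?P<name>[^"]*)")?')
REF = re.compile(r"\[ref=(\w+)\]")

# Roles treated as interactive in this sketch; not an exhaustive list.
INTERACTIVE = {"button", "link", "textbox", "checkbox", "combobox"}


def unnamed_controls(snapshot):
    """Return (role, ref) pairs for interactive nodes with a missing or empty name."""
    findings = []
    for line in snapshot.splitlines():
        match = LINE.match(line)
        if not match or match["role"] not in INTERACTIVE:
            continue
        if not (match["name"] or "").strip():
            ref = REF.search(line)
            findings.append((match["role"], ref.group(1) if ref else None))
    return findings


for role, ref in unnamed_controls(SNAPSHOT):
    print(f"{role} {ref} has no accessible name")
```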

How It Compares

Playwright MCP vs. Playwright CLI with SKILLS

Playwright MCP is designed for exploratory automation, self-healing tests, and long-running workflows where maintaining continuous browser context is valuable. It's ideal for agents that benefit from rich introspection and iterative reasoning.

Playwright CLI with SKILLS is optimized for high-throughput coding agents that need to balance browser automation with large codebases and reasoning within limited context windows. CLI invocations are more token-efficient because they avoid loading large tool schemas.

Choose MCP for interactive, exploratory tasks; choose CLI+SKILLS for high-volume coding tasks.

Playwright MCP vs. Selenium WebDriver

Selenium is a mature, language-agnostic automation framework that is driven from test code. Playwright MCP is designed specifically for LLM integration: it exposes structured, JSON-based tool definitions that an agent can call directly, with no vision model and no custom glue code in between. Selenium has broader language support and a larger ecosystem, but it needs an integration layer before an agent can drive it.

Playwright MCP vs. Puppeteer

Puppeteer is a Node.js library for headless Chrome automation. Playwright MCP is a protocol server that works with any MCP client (Claude, VS Code, Cursor, etc.). Playwright MCP supports multiple browsers and is optimized for LLM interaction; Puppeteer is lower-level and requires custom integration code.

What's Next

The Playwright MCP roadmap includes:

  • Enhanced Vision Capabilities – Optional coordinate-based interactions for complex UI elements
  • PDF Generation and Manipulation – Tools for creating and parsing PDFs
  • DevTools Integration – Performance profiling and debugging tools for agents
  • Broader MCP Client Support – Continued integration with new AI agent platforms
  • Self-Healing Tests – Agents that can adapt to UI changes automatically

The project is actively maintained with commits every few days. The community is growing, and adoption is accelerating as more AI agent platforms standardize on MCP.
