openbrowser
Let AI agents browse the web. An autonomous toolkit for browser-based AI agents.
AI-powered autonomous web browsing framework for TypeScript.
Give an AI agent a browser. It clicks, types, navigates, and extracts data — autonomously completing tasks on any website. Built on Playwright with first-class support for OpenAI, Anthropic, and Google models.
Production-ready since v1.0. Contributions welcome.
Why Open Browser?
- Autonomous agents: Describe a task in natural language, and an AI agent navigates the web to complete it — clicking, typing, scrolling, and extracting data without manual scripting
- Multi-model support: Works with OpenAI, Anthropic, and Google out of the box via the Vercel AI SDK — swap models with a single flag
- Interactive REPL: Drop into a live browser session and issue commands interactively — great for debugging, prototyping, and exploration
- Sandboxed execution: Run agents in resource-limited environments with CPU/memory monitoring, timeouts, and domain restrictions
- Production-ready: Stall detection, cost tracking, session management, replay recording, and comprehensive error handling
- Open source: MIT licensed, fully extensible, bring your own API keys
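The domain restrictions mentioned under sandboxed execution can be pictured as an allow-list check made before each navigation. The following is a minimal sketch under that assumption, not Open Browser's actual implementation; `isDomainAllowed` and the allow-list format are hypothetical.

```typescript
// Hypothetical helper: not part of the open-browser API.
// Returns true when the URL's host equals an allowed domain or is a subdomain of it.
function isDomainAllowed(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are rejected
  }
  return allowedDomains.some((domain) => {
    const d = domain.toLowerCase();
    // Require a dot boundary so "evil-example.com" does not match "example.com".
    return host === d || host.endsWith("." + d);
  });
}

const allowed = ["news.ycombinator.com", "example.com"];
console.log(isDomainAllowed("https://news.ycombinator.com/item?id=1", allowed)); // true
console.log(isDomainAllowed("https://evil-example.com/", allowed)); // false
```

Matching on the parsed hostname rather than on the raw URL string avoids being fooled by an allowed domain that appears in a path or query string.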
Quick Start
# Install dependencies
bun install
# Set up your API keys
cp .env.example .env
# Edit .env with your API keys
# Run an agent
bun run open-browser run "Find the top story on Hacker News and summarize it"
# Or open a browser interactively
bun run open-browser interactive
Architecture
Open Browser is a monorepo with three packages:
| Package | Description |
|---|---|
| `open-browser` | Core library: agent logic, browser control, DOM analysis, LLM integration |
| `@open-browser/cli` | Command-line interface for running agents and browser commands |
| `@open-browser/sandbox` | Sandboxed execution with resource limits and monitoring |
CLI Commands
Run an AI Agent
open-browser run <task> [options]
Describe what you want done. The agent figures out the rest.
# Search and extract information
open-browser run "Find the price of the MacBook Pro on apple.com"
# Fill out forms
open-browser run "Sign up for the newsletter on example.com with test@email.com"
# Multi-step workflows
open-browser run "Go to GitHub, find the open-browser repo, and star it"
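Conceptually, a run like the ones above is an observe-decide-act loop bounded by a step limit. The sketch below illustrates that idea with the model and the browser stubbed out. Every type and function name in it is hypothetical and not taken from the Open Browser codebase, and the stall check is a deliberately naive stand-in for whatever the library really does. Only the default of 25 steps comes from this README.

```typescript
// Conceptual sketch only: none of these names come from the open-browser API.
type Action =
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "done"; result: string };

interface PageState {
  url: string;
  title: string;
}

type Decide = (task: string, state: PageState, history: Action[]) => Promise<Action>;
type Act = (action: Action, state: PageState) => Promise<PageState>;

interface RunResult {
  status: "done" | "stalled" | "max-steps";
  steps: number;
  result?: string;
}

async function runAgent(
  task: string,
  decide: Decide,
  act: Act,
  start: PageState,
  maxSteps = 25, // same default as the documented --max-steps option
  stallLimit = 3, // hypothetical: consecutive repeats of one action that count as a stall
): Promise<RunResult> {
  let state = start;
  const history: Action[] = [];
  let repeats = 0;

  for (let step = 1; step <= maxSteps; step++) {
    const action = await decide(task, state, history);
    if (action.kind === "done") {
      return { status: "done", steps: step, result: action.result };
    }
    // Naive stall detection: the same action proposed several times in a row.
    const previous = history[history.length - 1];
    repeats = previous && JSON.stringify(previous) === JSON.stringify(action) ? repeats + 1 : 0;
    if (repeats >= stallLimit) {
      return { status: "stalled", steps: step };
    }
    history.push(action);
    state = await act(action, state);
  }
  return { status: "max-steps", steps: maxSteps };
}

// Stubbed model and browser, so the sketch runs without any API key.
const demoDecide: Decide = async (_task, state): Promise<Action> =>
  state.url.endsWith("/done")
    ? { kind: "done", result: `Finished at ${state.url}` }
    : { kind: "click", selector: "a.next" };
const demoAct: Act = async (_action, state) => ({ ...state, url: state.url + "/done" });

runAgent("demo task", demoDecide, demoAct, { url: "https://example.com", title: "Example" })
  .then((result) => console.log(result)); // status "done" after 2 steps
```

In a real agent, `decide` would call the language model with the task and the current page state, and `act` would perform the chosen action in the browser.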
| Option | Description |
|---|---|
| `-m, --model <model>` | Model to use (default: gpt-4o) |
| `-p, --provider` | Provider: openai, anthropic, google |
| `--headless` / `--no-headless` | Hide or show the browser window |
| `--max-steps <n>` | Max agent steps (default: 25) |
| `-v, --verbose` | Show detailed step info |
| `--no-cost` | Hide cost tracking |
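When the `run` command is driven from a script, the options above can be assembled into an argument list. This is a minimal sketch: `RunOptions` and `buildRunArgs` are hypothetical helpers, and only the flag names and defaults come from the table. It also assumes that `--provider` takes the provider name as its value, which the table implies but does not state.

```typescript
// Hypothetical helper: only the flag names and defaults come from the options table.
type Provider = "openai" | "anthropic" | "google";

interface RunOptions {
  model?: string;      // --model, CLI default: gpt-4o
  provider?: Provider; // --provider (assumed to take the provider name as its value)
  headless?: boolean;  // --headless / --no-headless
  maxSteps?: number;   // --max-steps, CLI default: 25
  verbose?: boolean;   // --verbose
  showCost?: boolean;  // false adds --no-cost
}

function buildRunArgs(task: string, options: RunOptions = {}): string[] {
  const args = ["run", task];
  if (options.model !== undefined) args.push("--model", options.model);
  if (options.provider !== undefined) args.push("--provider", options.provider);
  if (options.headless !== undefined) {
    args.push(options.headless ? "--headless" : "--no-headless");
  }
  if (options.maxSteps !== undefined) {
    if (!Number.isInteger(options.maxSteps) || options.maxSteps < 1) {
      throw new RangeError("maxSteps must be a positive integer");
    }
    args.push("--max-steps", String(options.maxSteps));
  }
  if (options.verbose) args.push("--verbose");
  if (options.showCost === false) args.push("--no-cost");
  return args;
}

const runArgs = buildRunArgs("Find the top story on Hacker News and summarize it", {
  provider: "anthropic",
  headless: true,
  maxSteps: 10,
});
console.log(["open-browser", ...runArgs].join(" "));
```

The resulting array could be passed to Node's `child_process.spawn("open-browser", runArgs)`. Passing arguments as an array rather than a single command string avoids shell-quoting problems when the task contains spaces or quotes.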
Browser Commands
open-browser open <url> # Open a URL
open-browser click <selector> # Click an element
open-browser type <selector> <text> # Type into an input
open-browser screenshot [output] # Capture a screenshot
open-browser eval <expression> # Run JavaScript on the page
open-browser extract <goal> # Extract content as markdown
open-browser state # Show current URL, title, and tabs
open-browser sessions # List active browser sessions
Interactive REPL
open-browser interactive
Drop into a live browser> prompt with full control: