
openbrowser

Let AI agents browse the web. An autonomous toolkit for browser-based AI agents.


AI-powered autonomous web browsing framework for TypeScript.


Give an AI agent a browser. It clicks, types, navigates, and extracts data — autonomously completing tasks on any website. Built on Playwright with first-class support for OpenAI, Anthropic, and Google models.

Production-ready since v1.0. Contributions welcome.

Why Open Browser?

  • Autonomous agents: Describe a task in natural language, and an AI agent navigates the web to complete it — clicking, typing, scrolling, and extracting data without manual scripting
  • Multi-model support: Works with OpenAI, Anthropic, and Google out of the box via the Vercel AI SDK — swap models with a single flag
  • Interactive REPL: Drop into a live browser session and issue commands interactively — great for debugging, prototyping, and exploration
  • Sandboxed execution: Run agents in resource-limited environments with CPU/memory monitoring, timeouts, and domain restrictions
  • Production-ready: Stall detection, cost tracking, session management, replay recording, and comprehensive error handling
  • Open source: MIT licensed, fully extensible, bring your own API keys
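
The domain restrictions and timeouts mentioned under sandboxed execution can be pictured with a minimal sketch. The helpers below, `isAllowedDomain` and `withTimeout`, are hypothetical names invented for illustration and are not taken from the `@open-browser/sandbox` API; they only show one plausible way such guards could work.

```typescript
// Hypothetical sketch: these helpers are NOT part of the open-browser packages.

/** Returns true when the URL's host equals an allowed domain or is a subdomain of one. */
function isAllowedDomain(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // Unparseable URLs are rejected outright.
  }
  return allowedDomains.some((domain) => {
    const d = domain.toLowerCase();
    // Require an exact match or a "." boundary so lookalike hosts do not pass.
    return host === d || host.endsWith("." + d);
  });
}

/** Rejects if `work` does not settle within `ms` milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

Note that `withTimeout` only stops waiting; it does not cancel the underlying work, so a real sandbox would presumably also need to close the browser session when the deadline passes.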

Quick Start

# Install dependencies
bun install

# Set up your API keys
cp .env.example .env
# Edit .env with your API keys

# Run an agent
bun run open-browser run "Find the top story on Hacker News and summarize it"

# Or open a browser interactively
bun run open-browser interactive

Architecture

Open Browser is a monorepo with three packages:

Package                 Description
open-browser            Core library: agent logic, browser control, DOM analysis, LLM integration
@open-browser/cli       Command-line interface for running agents and browser commands
@open-browser/sandbox   Sandboxed execution with resource limits and monitoring

CLI Commands

Run an AI Agent

open-browser run <task> [options]

Describe what you want done. The agent figures out the rest.

# Search and extract information
open-browser run "Find the price of the MacBook Pro on apple.com"

# Fill out forms
open-browser run "Sign up for the newsletter on example.com with test@email.com"

# Multi-step workflows
open-browser run "Go to GitHub, find the open-browser repo, and star it"

Option                       Description
-m, --model <model>          Model to use (default: gpt-4o)
-p, --provider               Provider: openai, anthropic, google
--headless / --no-headless   Hide (--headless) or show (--no-headless) the browser window
--max-steps <n>              Max agent steps (default: 25)
-v, --verbose                Show detailed step info
--no-cost                    Hide cost tracking
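
To make the step budget concrete, here is a minimal sketch of a loop bounded by a maximum step count. Everything in it (`RunOptions`, `StepResult`, `runWithStepLimit`) is a hypothetical name for illustration, not the library's actual API; only the defaults (gpt-4o, 25 steps) come from the option list above.

```typescript
// Hypothetical types for illustration; NOT the open-browser API.
interface RunOptions {
  model: string;
  maxSteps: number;
}

// Defaults mirror the documented CLI defaults: gpt-4o and 25 steps.
const DEFAULT_OPTIONS: RunOptions = { model: "gpt-4o", maxSteps: 25 };

type StepResult = { done: boolean; note: string };

interface RunSummary {
  completed: boolean;
  stepsUsed: number;
  notes: string[];
}

/** Calls `step` until it reports done or the step budget is exhausted. */
async function runWithStepLimit(
  step: (index: number) => Promise<StepResult>,
  overrides: Partial<RunOptions> = {},
): Promise<RunSummary> {
  const options: RunOptions = { ...DEFAULT_OPTIONS, ...overrides };
  const notes: string[] = [];
  for (let i = 0; i < options.maxSteps; i++) {
    const result = await step(i);
    notes.push(result.note);
    if (result.done) {
      return { completed: true, stepsUsed: i + 1, notes };
    }
  }
  // Budget exhausted without the step function reporting completion.
  return { completed: false, stepsUsed: options.maxSteps, notes };
}
```

In this sketch a task that never signals completion stops after `maxSteps` iterations instead of running (and incurring model cost) indefinitely, which is the usual reason for exposing such a cap.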

Browser Commands

open-browser open <url>              # Open a URL
open-browser click <selector>        # Click an element
open-browser type <selector> <text>  # Type into an input
open-browser screenshot [output]     # Capture a screenshot
open-browser eval <expression>       # Run JavaScript on the page
open-browser extract <goal>          # Extract content as markdown
open-browser state                   # Show current URL, title, and tabs
open-browser sessions                # List active browser sessions

Interactive REPL

open-browser interactive

Drop into a live browser> prompt with full control.

