Open-source web data agent optimized for structured web research
Firecrawl runs a research-grade autonomous agent at firecrawl.dev/app/agent, powered by Spark 1 models optimized for structured web research. This repo gives you the open-source foundation to build your own — fork it, swap models, add skills, and deploy however you want.
```bash
# 1. Install the Firecrawl CLI and authenticate
npx -y firecrawl-cli@latest init -y --browser

# 2. Scaffold an agent project
firecrawl create agent -t next
```
Each layer builds on the one below it. Start at the top for a ready-to-use app, or go lower in the stack for finer control over the primitives.
| Layer | Description | Get started |
|---|---|---|
| **Next.js Template** | Chat UI, streaming, Skills, Subagents, structured output | `firecrawl create agent -t next` |
| **Express Template** | API server with Skills, Subagents, structured output | `firecrawl create agent -t express` |
| ↑ | | |
| **Agent Core** | Orchestrator built on Deep Agents (LangChain). Skills, Subagents, structured output | `firecrawl create agent -t library` |
| ↑ | | |
| **Firecrawl AI SDK** | Search, Scrape, Interact as Vercel AI SDK tools | `npm i firecrawl-aisdk` |
| ↑ | | |
| **Firecrawl SDK** | Core API client for Scrape, Search, Crawl, Extract | `npm i @mendable/firecrawl-js` |
| ↑ | | |
| **API Reference** | REST API, use from any language | docs.firecrawl.dev |

| Level | Examples |
|---|---|
| Next.js | Full template |
| Express | API server |
| Agent Core | Basic · Structured output · Parallel Subagents · With Skills · Streaming |
| Firecrawl AI SDK | npmjs.com/package/firecrawl-aisdk |
The agent combines web tools with an AI model in a loop: it plans, acts, observes, and repeats until the task is done. The harness is Deep Agents (from LangChain), which provides the plan-act loop, parallel subagent spawning through the `task` tool, and on-demand `SKILL.md` loading out of the box. Our `agent-core` wires Firecrawl’s tools into that runtime and layers on structured output and streaming.
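The plan-act-observe loop described above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the Deep Agents or `agent-core` implementation: the `Step`, `Tool`, and `Model` types, the `runAgent` function, and the stub model and tool are all hypothetical names invented for this example.

```typescript
// Hypothetical types for illustration; not the Deep Agents or agent-core API.
type ToolCall = { tool: string; input: string };
type Step = { done: true; answer: string } | { done: false; call: ToolCall };
type Tool = (input: string) => Promise<string>;
type Model = (task: string, observations: string[]) => Promise<Step>;

async function runAgent(
  task: string,
  model: Model,
  tools: Map<string, Tool>,
  maxSteps = 8,
): Promise<string> {
  const observations: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // Plan: the model sees the task plus everything observed so far.
    const step = await model(task, observations);
    if (step.done) return step.answer;

    // Act: run the requested tool, reporting unknown tools back to the model.
    const tool = tools.get(step.call.tool);
    const result =
      tool === undefined
        ? `unknown tool: ${step.call.tool}`
        : await tool(step.call.input);

    // Observe: record the result so the next planning step can use it.
    observations.push(`${step.call.tool}(${step.call.input}) -> ${result}`);
  }
  // Stop instead of looping forever if the model never finishes.
  throw new Error(`no answer after ${maxSteps} steps`);
}

// Stub model and tool so the sketch runs without network access or API keys.
const stubModel: Model = async (_task, observations) =>
  observations.length === 0
    ? { done: false, call: { tool: "search", input: "firecrawl pricing" } }
    : { done: true, answer: `Found: ${observations[0]}` };

const stubTools = new Map<string, Tool>([
  ["search", async (query) => `1 result for "${query}"`],
]);
```

Calling `runAgent("find pricing", stubModel, stubTools)` makes one tool call and then returns an answer built from the single observation. The `maxSteps` cap is a design choice for this sketch: a loop that depends on a model deciding it is done needs some bound, so the function throws rather than running indefinitely.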
- **Skills**: `SKILL.md` files in `agent-core/src/skills/definitions/`, loaded on demand via Deep Agents’ skills middleware.
- **Subagents**: spawned in parallel through the `task` tool. Each has its own tool set and session state (e.g. an isolated interact browser session).
- **Structured output**: `formatOutput` (JSON) and data processing via `bashExec`, a set of bash tools powered by just-bash.

| Directory | What’s inside |
|---|---|
| `agent-core/` | Core agent logic, orchestrator, Skills, tools |
| `agent-templates/` | Deployment templates: Next.js, Express, Library |
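To make the Subagents idea concrete (parallel workers, each with its own session state), here is a rough sketch. It does not use the `task` tool or any `agent-core` code: `Session`, `SubagentResult`, `runSubagent`, and `runInParallel` are hypothetical names, and the `example.com` URL stands in for real search, scrape, or interact calls.

```typescript
// Hypothetical shapes for illustration; not the agent-core or Deep Agents API.
type Session = { id: string; visited: string[] };
type SubagentResult = { task: string; sessionId: string; findings: string[] };

// Each subagent creates a fresh session object, so state never leaks
// from one task into another.
async function runSubagent(task: string, index: number): Promise<SubagentResult> {
  const session: Session = { id: `session-${index}`, visited: [] };
  // Stand-in for real tool calls (search, scrape, interact).
  session.visited.push(`https://example.com/${encodeURIComponent(task)}`);
  return { task, sessionId: session.id, findings: [...session.visited] };
}

// Fan out one subagent per task and wait for all of them to finish.
// Promise.all keeps results in the same order as the input tasks.
async function runInParallel(tasks: string[]): Promise<SubagentResult[]> {
  return Promise.all(tasks.map((task, i) => runSubagent(task, i)));
}

// Serialize the combined results as JSON, a simple form of structured output.
async function researchAsJson(tasks: string[]): Promise<string> {
  return JSON.stringify(await runInParallel(tasks), null, 2);
}
```

Because every subagent builds its own `Session`, a failure or side effect in one task cannot corrupt another task's state, which is the property an isolated browser session is meant to give.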
MIT