
How do I crawl an entire website and get content for every page?

Point a crawl API at a starting URL and it discovers and follows every internal link automatically, returning each page's content as clean markdown. You don't need to know the site structure or list URLs in advance.

Firecrawl's crawl endpoint handles discovery, JavaScript rendering, rate limiting, and robots.txt automatically:

from firecrawl import Firecrawl
from firecrawl.types import ScrapeOptions

firecrawl = Firecrawl(api_key="fc-YOUR_API_KEY")

# Crawl up to 100 pages starting from the root URL, returning each
# page's main content (nav, footers, etc. stripped) as markdown.
result = firecrawl.crawl(
    "https://example.com",
    limit=100,
    scrape_options=ScrapeOptions(formats=["markdown"], only_main_content=True)
)

Use limit to cap pages, max_depth to control link depth, and include_paths / exclude_paths to scope the crawl to specific sections. For a quick one-off crawl from the terminal, the Firecrawl CLI also works: firecrawl crawl https://example.com --wait.
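To make the scoping behavior concrete, the helper below glob-matches a URL's path against include/exclude patterns, the way include_paths / exclude_paths scope a crawl. It is purely illustrative, not the library's implementation:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def in_scope(url, include_paths, exclude_paths):
    """Illustrative only: return True if the URL's path matches an include
    pattern and no exclude pattern, mimicking crawl scoping."""
    path = urlparse(url).path
    if any(fnmatch(path, pattern) for pattern in exclude_paths):
        return False
    return any(fnmatch(path, pattern) for pattern in include_paths)

print(in_scope("https://example.com/docs/intro", ["/docs/*"], ["/docs/legacy/*"]))  # True
print(in_scope("https://example.com/blog/post", ["/docs/*"], ["/docs/legacy/*"]))   # False

# A crawl scoped the same way might look like this (parameter names assumed
# to match the text above; check your SDK version):
# firecrawl.crawl("https://example.com", limit=100, max_depth=3,
#                 include_paths=["/docs/*"], exclude_paths=["/docs/legacy/*"])
```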

Last updated: Feb 23, 2026