Changelog

  • v1.1.0

    Changelog Highlights

    Feature Enhancements

    • New Features:
      • Geolocation, mobile scraping, 4x faster parsing, and better webhooks.
      • Credit packs, auto-recharges, and batch scraping support.
      • Iframe support and query parameter differentiation for URLs.
      • Similar URL deduplication.
      • Enhanced map ranking and sitemap fetching.

    Performance Improvements

    • Faster crawl status filtering and improved map ranking algorithm.
    • Optimized Kubernetes setup and simplified build processes.
    • Improved sitemap discoverability and performance.

    Bug Fixes

    • Resolved issues:
      • Badly formatted JSON, scrolling actions, and encoding errors.
      • Crawl limits, relative URLs, and missing error handlers.
    • Fixed self-hosted crawling inconsistencies and schema errors.

    SDK Updates

    • Added dynamic WebSocket imports with fallback support.
    • Optional API keys for self-hosted instances.
    • Improved error handling across SDKs.

    Documentation Updates

    • Improved API docs and examples.
    • Updated self-hosting URLs and added Kubernetes optimizations.
    • Added articles: mastering /scrape and /crawl.

    Miscellaneous

    • Added new Firecrawl examples
    • Enhanced metadata handling for webhooks and improved sitemap fetching.
    • Updated blocklist and streamlined error messages.
  • Batch Scrape

    Introducing Batch Scrape

    You can now scrape multiple URLs simultaneously with our new Batch Scrape endpoint.

    • Read more about the Batch Scrape endpoint here.
    • Python SDK (1.4.x) and Node SDK (1.7.x) updated with batch scrape support.
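As a rough sketch of what a batch scrape request could look like over the REST API (the `/v1/batch/scrape` route and the `urls`/`formats` field names here are assumptions based on the v1 API conventions, not confirmed signatures):

```python
import json
import urllib.request

# Assumed batch scrape route, following the v1 API layout.
API_URL = "https://api.firecrawl.dev/v1/batch/scrape"

def build_batch_payload(urls, formats=("markdown",)):
    """Assemble the JSON body for a batch scrape request."""
    return {"urls": list(urls), "formats": list(formats)}

def submit_batch(payload, api_key):
    """POST the batch and return the parsed JSON response (job id, status URL, ...)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

payload = build_batch_payload(
    ["https://firecrawl.dev", "https://docs.firecrawl.dev"],
    formats=["markdown", "html"],
)
# submit_batch(payload, "fc-YOUR_API_KEY")  # uncomment with a real API key
print(payload)
```

Using the SDKs, the same batch can be submitted through the updated Python (1.4.x) and Node (1.7.x) clients instead of calling the REST endpoint directly.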
  • Cancel Crawl in the SDKs, More Examples, Improved Speed

    • Added crawl cancellation support for the Python SDK (1.3.x) and Node SDK (1.6.x)
    • OpenAI Voice + Firecrawl example added to the repo
    • CRM lead enrichment example added to the repo
    • Improved our Docker images
    • Limit and timeout fixes for the self-hosted Playwright scraper
    • Improved speed of all scrapes
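For reference, cancelling a crawl over the REST API might look like the sketch below (the `DELETE /v1/crawl/{id}` route is an assumption based on the v1 API's resource layout; the SDK methods wrap an equivalent call):

```python
import json
import urllib.request

BASE_URL = "https://api.firecrawl.dev/v1"  # assumed v1 base path

def crawl_cancel_url(crawl_id):
    """Build the URL for cancelling a crawl job (DELETE /crawl/{id}, assumed route)."""
    return f"{BASE_URL}/crawl/{crawl_id}"

def cancel_crawl(crawl_id, api_key):
    """Send the DELETE request and return the parsed JSON cancellation status."""
    req = urllib.request.Request(
        crawl_cancel_url(crawl_id),
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

# cancel_crawl("123e4567-e89b-12d3-a456-426614174000", "fc-YOUR_API_KEY")
print(crawl_cancel_url("123e4567"))
```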
  • Fixes + Improvements (no version bump)

    • Fixed 500 errors that frequently occurred on some crawled websites and when servers were at capacity
    • Fixed an issue where v1 crawl status wouldn’t properly return pages over 10MB
    • Fixed an issue where screenshots would return undefined
    • Pushed improvements that reduce wait times when a scraper fails
  • Actions

    Introducing Actions

    Interact with pages before extracting data, unlocking more data from every site!

    Firecrawl now allows you to perform various actions on a web page before scraping its content. This is particularly useful for interacting with dynamic content, navigating through pages, or accessing content that requires user interaction.

    • Version 1.5.x of the Node SDK now supports type-safe Actions.
    • Actions are now available in the REST API and Python SDK (no version bumps required!).

    Here is a Python example that uses actions to navigate to google.com, search for Firecrawl, click on the first result, and take a screenshot.

    from firecrawl import FirecrawlApp
    
    app = FirecrawlApp(api_key="fc-YOUR_API_KEY")
    
    # Scrape google.com, performing the actions below before extracting content:
    scrape_result = app.scrape_url('google.com',
        params={
            'formats': ['markdown', 'html'],
            'actions': [
                {"type": "wait", "milliseconds": 2000},
                {"type": "click", "selector": "textarea[title=\"Search\"]"},
                {"type": "wait", "milliseconds": 2000},
                {"type": "write", "text": "firecrawl"},
                {"type": "wait", "milliseconds": 2000},
                {"type": "press", "key": "ENTER"},
                {"type": "wait", "milliseconds": 3000},
                {"type": "click", "selector": "h3"},
                {"type": "wait", "milliseconds": 3000},
                {"type": "screenshot"}
            ]
        }
    )
    print(scrape_result)
    

    For more examples, check out our API Reference.

  • Firecrawl E2E Type Safe LLM Extract

    Mid-September Updates

    Typesafe LLM Extract

    • E2E Type Safety for LLM Extract in Node SDK version 1.5.x.
    • 10x cheaper in the cloud version: from 50 credits down to 5 credits per extract.
    • Improved speed and reliability.
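In Python, LLM extraction is driven by a JSON schema rather than compile-time types. As a hedged sketch (the `formats`/`extract` parameter shapes below follow the v1 scrape conventions, and `pricing_schema` is a hypothetical schema invented for illustration):

```python
def build_extract_params(schema, prompt=None):
    """Assemble scrape params asking the API to run LLM extraction against a JSON schema."""
    extract = {"schema": schema}
    if prompt:
        extract["prompt"] = prompt
    return {"formats": ["extract"], "extract": extract}

# Hypothetical JSON schema describing the fields we want back.
pricing_schema = {
    "type": "object",
    "properties": {
        "plan_name": {"type": "string"},
        "monthly_price": {"type": "number"},
    },
    "required": ["plan_name"],
}

params = build_extract_params(pricing_schema, prompt="Extract the pricing plan details.")
print(params)
# With the Python SDK, these params would then be passed to a scrape call, e.g.:
#   app.scrape_url("firecrawl.dev/pricing", params=params)
```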

    Rust SDK v1.0.0

    • Rust SDK v1 is finally here! Check it out here.

    Map Improved Limits

    • Map smart results limits increased from 100 to 1000.

    Faster Scrapes

    • Scrape speed improved by 200ms-600ms depending on the website.

    Launching the Changelog

    • From now on, every new release will get a changelog entry here.

    Improvements

    • Lots of improvements pushed to the infra and API. For all Mid-September changes, refer to the commits here.
  • September 8, 2024

    Patch Notes (No version bump)

    • Fixed an issue where some custom header params were not being set properly in the v1 API. You can now pass headers with your requests as expected.
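As a sketch of what this enables, custom headers can be included in a v1 scrape request body so they are forwarded to the target site (the top-level `headers` field name is an assumption based on the v1 request shape; the header values are illustrative):

```python
def build_scrape_payload(url, headers=None, formats=("markdown",)):
    """Assemble a v1 /scrape request body; 'headers' are forwarded to the target site."""
    payload = {"url": url, "formats": list(formats)}
    if headers:
        payload["headers"] = headers
    return payload

payload = build_scrape_payload(
    "https://example.com",
    headers={"User-Agent": "my-bot/1.0", "Cookie": "session=abc"},
)
print(payload)
```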
  • Firecrawl V1

    Firecrawl V1 is here! With it, we introduce a more reliable and developer-friendly API.

    Here is what’s new:

    • Output Formats for /scrape: choose which formats you want your output in.
    • New /map endpoint: get most of the URLs of a website.
    • Developer-friendly API for /crawl/id status.
    • 2x Rate Limits for all plans.
    • Go SDK and Rust SDK.
    • Teams support.
    • API Key Management in the dashboard.
    • onlyMainContent now defaults to true.
    • /crawl webhooks and WebSocket support.
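To illustrate the new /map endpoint, here is a minimal sketch of a request against it (the `POST /v1/map` route and the `url`/`search` field names are assumptions based on the v1 conventions described above):

```python
import json
import urllib.request

MAP_URL = "https://api.firecrawl.dev/v1/map"  # assumed v1 route

def build_map_body(url, search=None):
    """Assemble the /map request body; 'search' is an assumed optional keyword filter."""
    body = {"url": url}
    if search:
        body["search"] = search
    return body

def map_site(url, api_key, search=None):
    """POST to /map and return the parsed JSON list of discovered URLs."""
    req = urllib.request.Request(
        MAP_URL,
        data=json.dumps(build_map_body(url, search)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

# map_site("https://firecrawl.dev", "fc-YOUR_API_KEY", search="docs")
print(build_map_body("https://firecrawl.dev", search="docs"))
```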

    Learn more about it here.

    Start using v1 right away at https://firecrawl.dev