Changelog
v1.1.0
Changelog Highlights
Feature Enhancements
- New Features:
- Geolocation, mobile scraping, 4x faster parsing, and improved webhooks.
- Credit packs, auto-recharges, and batch scraping support.
- Iframe support and query parameter differentiation for URLs.
- Similar URL deduplication.
- Enhanced map ranking and sitemap fetching.
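As a rough sketch, the new geolocation and mobile scraping features would be driven through scrape options like the ones below. The field names (`mobile`, `location`) are assumptions for illustration, not confirmed parameters; check the API reference for your version.

```python
# Hypothetical scrape options exercising the new geolocation and mobile
# features. The "mobile" and "location" field names are assumptions --
# verify them against the current API reference before use.
scrape_params = {
    "formats": ["markdown"],
    "mobile": True,  # emulate a mobile device when rendering the page
    "location": {"country": "DE", "languages": ["de"]},  # geolocation hint
}

# With the Python SDK this would look roughly like:
# app.scrape_url("https://example.com", params=scrape_params)
```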
Performance Improvements
- Faster crawl status filtering and improved map ranking algorithm.
- Optimized Kubernetes setup and simplified build processes.
- Improved sitemap discoverability and performance.
Bug Fixes
- Resolved issues:
- Badly formatted JSON, scrolling actions, and encoding errors.
- Crawl limits, relative URLs, and missing error handlers.
- Fixed self-hosted crawling inconsistencies and schema errors.
SDK Updates
- Added dynamic WebSocket imports with fallback support.
- Optional API keys for self-hosted instances.
- Improved error handling across SDKs.
Documentation Updates
- Improved API docs and examples.
- Updated self-hosting URLs and added Kubernetes optimizations.
- Added articles: mastering /scrape and /crawl.
Miscellaneous
- Added new Firecrawl examples
- Enhanced metadata handling for webhooks and improved sitemap fetching.
- Updated blocklist and streamlined error messages.
New Features: Introducing Batch Scrape
You can now scrape multiple URLs simultaneously with our new Batch Scrape endpoint.
- Read more about the Batch Scrape endpoint here.
- Python SDK (1.4.x) and Node SDK (1.7.x) updated with batch scrape support.
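As a minimal sketch, a batch scrape request boils down to a list of URLs plus the usual scrape options. The payload shape below mirrors the /scrape options but is an assumption for illustration; the linked Batch Scrape docs are the authoritative reference.

```python
# Sketch of a Batch Scrape request body. Built as plain data so it runs
# offline; the exact endpoint shape is an assumption, not confirmed docs.
def build_batch_scrape_payload(urls, formats=("markdown",)):
    """Build a JSON-serializable body for a batch scrape request."""
    return {"urls": list(urls), "formats": list(formats)}

payload = build_batch_scrape_payload(
    ["https://firecrawl.dev", "https://docs.firecrawl.dev"],
    formats=["markdown", "html"],
)

# With the Python SDK (1.4.x) this would be roughly (method name assumed):
# app.batch_scrape_urls(payload["urls"], {"formats": payload["formats"]})
```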
Cancel Crawl in the SDKs, More Examples, Improved Speed
- Added crawl cancellation support for the Python SDK (1.3.x) and Node SDK (1.6.x)
- OpenAI Voice + Firecrawl example added to the repo
- CRM lead enrichment example added to the repo
- Improved our Docker images
- Limit and timeout fixes for the self-hosted Playwright scraper
- Improved speed of all scrapes
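To sketch the new cancellation flow: a crawl is started asynchronously, and its job id is then used to cancel it. The SDK method names shown in the comments are assumptions based on the SDK's naming style; the helper below only builds the REST path such a cancel request would hit.

```python
# Rough sketch of crawl cancellation. The REST path shape (DELETE
# /v1/crawl/{id}) and the SDK method names in the comments are
# assumptions -- check the SDK docs for your exact version.
def cancel_endpoint(base_url, crawl_id):
    """Build the REST path a cancel request would hit."""
    return f"{base_url.rstrip('/')}/v1/crawl/{crawl_id}"

url = cancel_endpoint("https://api.firecrawl.dev", "job-123")

# Equivalent SDK flow (requires an API key, so shown as comments):
# app = FirecrawlApp(api_key="fc-YOUR_API_KEY")
# job = app.async_crawl_url("https://example.com", params={"limit": 10})
# app.cancel_crawl(job["id"])
```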
Fixes + Improvements (no version bump)
- Fixed 500 errors that frequently occurred on some crawled websites and when servers were at capacity
- Fixed an issue where v1 crawl status wouldn’t properly return pages over 10 MB
- Fixed an issue where screenshot would return undefined
- Pushed improvements that reduce scrape times when a scraper fails
Introducing Actions
Interact with pages before extracting data, unlocking more data from every site!
Firecrawl now allows you to perform various actions on a web page before scraping its content. This is particularly useful for interacting with dynamic content, navigating through pages, or accessing content that requires user interaction.
- Version 1.5.x of the Node SDK now supports type-safe Actions.
- Actions are now available in the REST API and Python SDK (no version bumps required!).
Here is a Python example that uses actions to navigate to google.com, search for Firecrawl, click on the first result, and take a screenshot.

```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# Scrape a website:
scrape_result = app.scrape_url('firecrawl.dev',
    params={
        'formats': ['markdown', 'html'],
        'actions': [
            {"type": "wait", "milliseconds": 2000},
            {"type": "click", "selector": "textarea[title=\"Search\"]"},
            {"type": "wait", "milliseconds": 2000},
            {"type": "write", "text": "firecrawl"},
            {"type": "wait", "milliseconds": 2000},
            {"type": "press", "key": "ENTER"},
            {"type": "wait", "milliseconds": 3000},
            {"type": "click", "selector": "h3"},
            {"type": "wait", "milliseconds": 3000},
            {"type": "screenshot"}
        ]
    }
)
print(scrape_result)
```
For more examples, check out our API Reference.
Mid-September Updates
Typesafe LLM Extract
- E2E Type Safety for LLM Extract in Node SDK version 1.5.x.
- 10x cheaper in the cloud version. From 50 to 5 credits per extract.
- Improved speed and reliability.
Rust SDK v1.0.0
- Rust SDK v1 is finally here! Check it out here.
Map Improved Limits
- Map’s smart results limit increased from 100 to 1,000.
Faster scrape
- Scrape speed improved by 200ms-600ms depending on the website.
Launching changelog
- From now on, we will create a changelog entry here for every new release.
Improvements
- Lots of improvements pushed to the infra and API. For all Mid-September changes, refer to the commits here.
September 8, 2024
Patch Notes (No version bump)
- Fixed an issue where some custom header params were not being set properly in the v1 API. You can now pass headers to your requests just fine.
Firecrawl V1 is here! With it, we introduce a more reliable and developer-friendly API.
Here is what’s new:
- Output Formats for /scrape: Choose what formats you want your output in.
- New /map endpoint: Get most of the URLs of a webpage.
- Developer-friendly API for /crawl/id status.
- 2x Rate Limits for all plans.
- Go SDK and Rust SDK.
- Teams support.
- API Key Management in the dashboard.
- onlyMainContent now defaults to true.
- /crawl webhooks and websocket support.
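To make the new output formats concrete, here is a minimal sketch of a v1 /scrape request body. It is plain data (so it runs offline); the onlyMainContent default shown matches the note above.

```python
# Minimal sketch of a v1 /scrape request body using the new output
# formats. POST it to the v1 scrape endpoint with your API key, or pass
# the equivalent params through an SDK.
scrape_body = {
    "url": "https://firecrawl.dev",
    "formats": ["markdown", "html"],  # choose the output formats you want
    "onlyMainContent": True,          # now the default in v1
}
```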
Learn more about it here.
Start using v1 right away at https://firecrawl.dev