October 29, 2024

Eric Ciarla

Launch Week II - Day 2: Introducing Location and Language Settings

Welcome to Day 2 of Firecrawl’s second Launch Week! Today, we’re thrilled to introduce our latest feature: Location and Language Settings.

Discover Location and Language Settings

With this new feature, you can now specify a country and preferred languages to receive content that’s tailored to your target location and linguistic preferences. This means more relevant and localized data for your web scraping projects.

How It Works

When you set the location parameters, Firecrawl uses an appropriate proxy (if available) and emulates the corresponding language and timezone settings. If you do not specify a country, it defaults to 'US'.

Getting Started with Location and Language Settings

To use these settings, include a location object in your request body with the following properties:

  • country: ISO 3166-1 alpha-2 country code (e.g., 'US', 'AU', 'DE', 'JP'). Defaults to 'US'.
  • languages: An array of preferred languages and locales for the request in order of priority. Defaults to the language of the specified location.
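
As a rough illustration of the two properties above, the sketch below assembles a location object before it is sent. The helper name build_location is hypothetical and not part of the Firecrawl SDK. It only checks that the country looks like a two-letter code, not that the code is actually assigned in ISO 3166-1, and it leaves out languages when none are given so the documented default applies.

```python
import re


def build_location(country="US", languages=None):
    """Build a `location` dict with the two documented properties.

    Hypothetical helper for illustration; not part of the Firecrawl SDK.
    """
    code = country.strip().upper()
    # Shape check only: two ASCII letters, as in ISO 3166-1 alpha-2.
    if not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError(f"expected a two-letter country code, got {country!r}")

    location = {"country": code}
    # Omit `languages` when unset so the service falls back to the
    # language of the specified location.
    if languages:
        location["languages"] = list(languages)
    return location


print(build_location())                 # country only, default 'US'
print(build_location("br", ["pt-BR"]))  # country normalized to 'BR'
```

The resulting dict can be passed as the 'location' value in the request parameters.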

Example Usage

Here’s how you can get started using Python:

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# Scrape a website with location and language settings
scrape_result = app.scrape_url('airbnb.com',
    params={
        'formats': ['markdown', 'html'],
        'location': {
            'country': 'BR',
            'languages': ['pt-BR']
        }
    }
)
print(scrape_result)

Understanding the Response

By specifying the location as Brazil ('BR') and the preferred language as Brazilian Portuguese ('pt-BR'), you’ll receive content as it appears to users in Brazil, in Portuguese.
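
To see how the same page differs between regions, one option is to repeat the request with several location settings and compare the results. The sketch below is a minimal example under stated assumptions: scrape_by_location is a hypothetical helper, and the scrape argument is any callable that takes a URL and a params dict, so the loop can be tried without an API key.

```python
def scrape_by_location(scrape, url, locations, formats=("markdown",)):
    """Scrape one URL once per (country, languages) pair.

    `scrape` is any callable taking (url, params); hypothetical helper,
    not part of the Firecrawl SDK.
    """
    results = {}
    for country, languages in locations:
        params = {
            "formats": list(formats),
            "location": {"country": country, "languages": list(languages)},
        }
        results[country] = scrape(url, params)
    return results


# Stand-in scraper that just echoes the requested location.
def fake_scrape(url, params):
    return f"{url} as seen from {params['location']['country']}"


pages = scrape_by_location(
    fake_scrape,
    "airbnb.com",
    [("BR", ["pt-BR"]), ("DE", ["de-DE"])],
)
for country, page in pages.items():
    print(country, "->", page)
```

With the client from the example above, the callable could be lambda url, params: app.scrape_url(url, params=params).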

Why Use Location and Language Settings?

  • Relevance: Access content that’s specific to a particular country or language.
  • Localization: Scrape websites as they appear to users in different regions.
  • Customization: Tailor your scraping to match your target audience or market.

What’s Next?

We’re just getting warmed up with Launch Week II! Location and Language Settings are one of several new features we’re rolling out to enhance your web scraping capabilities.

We’d love to hear how you plan to use these new settings in your projects. Your feedback helps us continue to improve and tailor our services to better meet your needs.

Happy scraping, and stay tuned for Day 3 of Launch Week II tomorrow!

About the Author

Eric Ciarla (@ericciarla)

Eric Ciarla is the Chief Operating Officer (COO) of Firecrawl and leads marketing. He also worked on Mendable.ai and sold it to companies like Snapchat, Coinbase, and MongoDB. He previously worked at Ford and Fracta as a Data Scientist. Eric also co-founded SideGuide, a tool for learning code within VS Code with 50,000 users.