Sep 16, 2024

How to Use OpenAI's o1 Reasoning Models in Your Applications

OpenAI recently released its o1 series of models, which are built for complex reasoning. These models are designed to “think before they answer,” producing a long internal chain of thought before responding. In this guide, we’ll show how to integrate them into your applications, using a practical example: crawling a website with the o1-preview model.

Introduction to o1 Models

The o1 models are large language models trained with reinforcement learning to excel in complex reasoning tasks. There are two models available:

o1-preview: An early preview designed for reasoning about hard problems using broad general knowledge.
o1-mini: A faster, cost-effective version ideal for coding, math, and science tasks that don’t require extensive general knowledge.

While these models are a significant advance in reasoning, they are not intended to replace GPT-4o in every use case. If your application requires image inputs, function calling, or consistently fast response times, GPT-4o and GPT-4o mini remain the better choices.
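
To make that trade-off concrete, here is a minimal sketch of a model chooser. The function name `choose_model` and its flags are hypothetical and only encode the guidance above; they are not part of the OpenAI SDK.

```python
def choose_model(
    needs_images: bool = False,
    needs_function_calling: bool = False,
    needs_low_latency: bool = False,
    needs_broad_knowledge: bool = True,
) -> str:
    """Pick a model name from the requirements described in this section.

    Hypothetical helper: the flags simply mirror the prose guidance.
    """
    # Image inputs, function calling, and fast responses point to GPT-4o.
    if needs_images or needs_function_calling or needs_low_latency:
        return "gpt-4o"
    # Hard problems that need broad general knowledge: o1-preview.
    # Coding, math, and science tasks without that need: o1-mini.
    return "o1-preview" if needs_broad_knowledge else "o1-mini"
```

For the crawling example in this guide, `choose_model()` with its defaults returns "o1-preview", which is the model we use below.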

Getting Started with o1 Models in Your Applications

To demonstrate how to integrate the o1 models into your apps, we’ll walk through a practical example: crawling a website to find specific information using the o1-preview model.

Prerequisites

Ensure you have the following libraries installed:

pip install firecrawl-py openai python-dotenv

Step 1: Import Necessary Libraries

We’ll start by importing the required modules.

import os
from firecrawl import FirecrawlApp
import json
from dotenv import load_dotenv
from openai import OpenAI

Step 2: Load Environment Variables

We’ll use environment variables to securely manage our API keys.

# Load environment variables
load_dotenv()

# Retrieve API keys from environment variables
firecrawl_api_key = os.getenv("FIRECRAWL_API_KEY")
openai_api_key = os.getenv("OPENAI_API_KEY")
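
Because `os.getenv` returns `None` when a variable is missing, a missing key would otherwise only surface later as an authentication error. As a small optional sketch, a helper like the one below fails early with a clear message. The name `require_env` is our own and is not part of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Example usage (assumes the variables are set in your shell or .env file):
# firecrawl_api_key = require_env("FIRECRAWL_API_KEY")
# openai_api_key = require_env("OPENAI_API_KEY")
```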

Step 3: Initialize the FirecrawlApp and OpenAI Client

# Initialize the FirecrawlApp and OpenAI client
app = FirecrawlApp(api_key=firecrawl_api_key)
client = OpenAI(api_key=openai_api_key)

Step 4: Define the Objective and URL

Set the website you want to crawl and the objective of the crawl.

url = "https://example.com"
objective = "Find the contact email for customer support"

Step 5: Determine the Search Parameter Using o1-preview

We’ll use the o1-preview model to come up with a 1-2 word search parameter based on our objective.

map_prompt = f"""
The map function generates a list of URLs from a website and accepts a search parameter. Based on the objective: {objective}, suggest a 1-2 word search parameter to find the needed information. Only respond with 1-2 words.
"""

# Ask o1-preview to suggest the search parameter
completion = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {"role": "user", "content": map_prompt}
    ]
)

map_search_parameter = completion.choices[0].message.content.strip()
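
The prompt asks for 1-2 words, but a model may still return quotes, trailing punctuation, or extra words. The sketch below shows one way to normalize the reply before passing it on. `clean_search_parameter` is a hypothetical helper, and lowercasing plus a two-word cap are our own assumptions about what makes a good search term.

```python
import re


def clean_search_parameter(raw: str, max_words: int = 2) -> str:
    """Reduce a model reply to at most `max_words` plain lowercase words."""
    # Keep word-like tokens only, which drops quotes and trailing punctuation.
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9'-]*", raw)
    return " ".join(words[:max_words]).lower()


# Example usage:
# map_search_parameter = clean_search_parameter(map_search_parameter)
```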

Step 6: Map the Website Using the Search Parameter

Use the FirecrawlApp instance to map the website and find links relevant to the search parameter.

map_website = app.map_url(url, params={"search": map_search_parameter})
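
The shape of the mapping result may depend on the SDK version you have installed. The sketch below assumes it is either a plain list of URLs or a dict with a "links" key; both shapes are assumptions, so check what your version actually returns. `extract_links` is a hypothetical helper, not part of the Firecrawl SDK.

```python
from typing import Any, List


def extract_links(map_result: Any, limit: int = 3) -> List[str]:
    """Return up to `limit` URL strings from a map result.

    Assumes the result is a list of URLs or a dict with a "links" key.
    Anything else yields an empty list.
    """
    if isinstance(map_result, dict):
        candidates = map_result.get("links", [])
    elif isinstance(map_result, list):
        candidates = map_result
    else:
        candidates = []
    # Keep only string entries and cap the number of pages to scrape.
    return [link for link in candidates if isinstance(link, str)][:limit]
```

Capping the list keeps the number of scrape and model calls small, since each page sent to o1-preview adds latency and cost.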

Step 7: Scrape the Top Pages and Check for the Objective

We’ll scrape the top links from the mapping result and check if they meet our objective.

# Get top 3 links
top_links = map_website[:3] if isinstance(map_website, list) else []

for link in top_links:
    # Scrape the page
    scrape_result = app.scrape_url(link, params={'formats': ['markdown']})

    # Check if objective is met
    check_prompt = f"""
    Given the following scraped content and objective, determine if the objective is met with high confidence.
    If it is, extract the relevant information in a simple and concise JSON format.
    If the objective is not met with high confidence, respond with 'Objective not met'.

    Objective: {objective}
    Scraped content: {scrape_result['markdown']}
    """

    completion = client.chat.completions.create(
        model="o1-preview",
        messages=[
            {"role": "user", "content": check_prompt}
        ]
    )

    result = completion.choices[0].message.content.strip()

    if result != "Objective not met":
        try:
            extracted_info = json.loads(result)
            break
        except json.JSONDecodeError:
            continue
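else:
    # for/else: this branch runs only if the loop never hit `break`,
    # i.e. none of the scraped pages produced valid JSON for the objective.
    extracted_info = None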
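
The loop above expects the reply to be either the exact string 'Objective not met' or bare JSON. In practice a model may wrap JSON in a Markdown code fence, which `json.loads` rejects. The following sketch shows a more tolerant parser; `parse_model_json` is a hypothetical helper, and the fence handling is an assumption about how replies might be formatted.

```python
import json
import re
from typing import Any, Optional

# Build the three-backtick fence marker without writing it literally.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_model_json(reply: str) -> Optional[Any]:
    """Return parsed JSON from a model reply, or None if there is none."""
    text = reply.strip()
    if text.lower().startswith("objective not met"):
        return None
    # If the reply contains a fenced block, parse only its contents.
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        text = fenced.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Inside the loop, you could call `parse_model_json(result)` and break on the first value that is not `None`.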

Step 8: Display the Extracted Information

if extracted_info:
    print("Extracted Information:")
    print(json.dumps(extracted_info, indent=2))
else:
    print("Objective not met with the available content.")

Conclusion

In this article, we’ve explored how to integrate OpenAI’s new o1 reasoning models into your applications to perform complex tasks like crawling a website and extracting specific information. The o1 models showcase impressive capabilities in reasoning and problem-solving, making them valuable tools for developers tackling challenging AI tasks.

Whether you’re working on advanced coding problems, mathematical computations, or intricate scientific queries, the o1 models can significantly enhance your application’s reasoning abilities.

Happy coding!

About the Author

Eric Ciarla (@ericciarla)

Eric Ciarla is the Chief Operating Officer (COO) of Firecrawl and leads marketing. He also worked on Mendable.ai and sold it to companies like Snapchat, Coinbase, and MongoDB. He previously worked as a Data Scientist at Ford and Fracta. Eric also co-founded SideGuide, a tool for learning code within VS Code with 50,000 users.