Building a Trend Detection System with AI in TypeScript: A Step-by-Step Guide
Introduction
In this comprehensive guide, we’ll explore the development of a sophisticated social media trend detection system built with TypeScript and powered by AI. You’ll learn how to create a robust solution that monitors social platforms and news websites, analyzes emerging trends, and delivers them in real-time as Slack messages.
Before we dive into the technical details and implementation steps, watch a video preview of the project:
While the video demonstrates the project detecting AI-related trends from specific sources, you have the flexibility to customize it for monitoring any topics or themes from your preferred websites and Twitter accounts. If that sounds interesting, let’s set up your development environment for running the project locally.
Note: Before starting this project, ensure you have the following prerequisites installed:
- Node.js (version 16 or higher)
- npm (Node Package Manager)
- A code editor like VS Code
- Git for version control
- A Slack workspace with admin privileges
- X Developer Account (for X API access)
- Basic knowledge of TypeScript and Node.js
Project Setup
We start by cloning the project’s GitHub repository, which is maintained by Eric Ciarla, co-founder of Firecrawl:
git clone https://github.com/ericciarla/trendFinder
cd trendFinder
Next, install dependencies:
npm install
Then, configure your .env file:
cp .env.example .env
# Edit .env with your configuration
If you open .env.example, you will see that our app depends on four core services:
- Slack Webhook - For sending notifications about detected trends
- X (Twitter) API - For monitoring tweets and engagement metrics
- Together AI - For analyzing content and detecting trends with LLMs
- Firecrawl API - For scraping and monitoring web content
To run the project locally, you will need to obtain the necessary URLs and API keys from these services. Below, you will see some instructions for setting up each required service and obtaining the necessary credentials.
Obtaining API Tokens
X (Twitter) Bearer Token
The X API is a crucial component of our trend detection system. It allows us to monitor prominent accounts in real-time, track engagement metrics, and identify emerging topics. The Bearer Token provides secure authentication for making API requests through your own X developer account. Here are the instructions to get your token:
- Go to Twitter Developer Portal
- Create a developer account if needed
- Create a new project and app (free plan accounts already have an app ready)
- Navigate to “Keys and Tokens”
- Generate/copy your Bearer Token (OAuth 2.0)
- Add the token to your .env file
Firecrawl API Key
Firecrawl serves as our primary web content extraction engine, offering several key advantages for trend detection:
- AI-Powered Content Extraction: Uses natural language understanding instead of brittle HTML selectors, ensuring reliable trend detection even when websites change.
- Automated Content Discovery: Automatically processes entire website sections, ideal for monitoring news sites and blogs
- Multiple Output Formats: Supports structured data, markdown, and plain text formats for seamless integration with Together AI
- Built-in Rate Limiting: Handles request management automatically, ensuring stable monitoring
Since Firecrawl is a scraping engine, you will need an API key to connect to it through its TypeScript dependency:
- Visit Firecrawl
- Create an account
- Navigate to your dashboard
- Generate and copy your API key
- Add the key to your .env file
Together AI Token
Together AI powers the intelligence layer of our trend detection system:
- Natural Language Processing: Analyzes scraped content to identify emerging trends and patterns
- Sentiment Analysis: Evaluates public sentiment and engagement around potential trends
- Content Summarization: Generates concise summaries of trends for Slack notifications
To get an API token, follow these steps:
- Visit Together AI
- Sign up for an account
- Navigate to API settings/dashboard
- Generate and copy your API key
- Add the key to your .env file
Setting Up Slack Webhook
Finally, you will need a Slack webhook URL to receive real-time notifications about emerging trends. When our system runs, it scrapes the provided list of sources (X accounts and websites), detects trends related to our specified topics, summarizes their contents, and delivers them as a Slack message through the webhook.
To create a webhook for your account, follow these steps:
- Create a Slack Workspace (log in if you already have one)
- Visit slack.com
- Click “Create a new workspace”
- Follow the setup wizard to create your workspace
- Verify your email address
- Create a Slack App
- Go to api.slack.com/apps
- Click “Create New App”
- Choose “From scratch”
- Name your app (e.g., “Trend Finder”)
- Select your workspace
- Click “Create App”
- Enable Incoming Webhooks
- In your app’s settings, click “Incoming Webhooks”
- Toggle “Activate Incoming Webhooks” to On
- Click “Add New Webhook to Workspace”
- Configure the Webhook
- Choose the channel where you want notifications to appear
- Click “Allow”
- You’ll see your new webhook URL in the list
- Copy the Webhook URL (it starts with https://hooks.slack.com/services/)
- Add to Environment Variables
- Open your .env file
- Add your webhook URL:
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
- Test the Webhook (Optional)
- You can test your webhook using curl:
curl -X POST -H 'Content-type: application/json' --data '{"text":"Hello from Trend Finder!"}' YOUR_WEBHOOK_URL
First project run
Once the Node dependencies are installed and environment variables are configured, you can launch the app with a single command:
npm run start
The command runs the entire pipeline and, upon completion, sends a Slack message to your workspace with the trends it found. By default, the project is configured to watch AI trends. Once we explore each component, you will see how to change this default behavior.
System Architecture Overview
In this section, let’s break down the main components of our system.
Core Components
1. Entry Point (src/index.ts)
The application’s entry point is minimal and focused, setting up either a one-time execution or a scheduled cron job. It imports the main controller and can be configured to run on a schedule (currently commented out but set for 5 PM daily).
2. Cron Controller (src/controllers/cron.ts)
The controller orchestrates the entire workflow in a sequential process:
- Fetches source configurations
- Scrapes content from sources
- Generates an AI-analyzed draft
- Sends the results to Slack
3. Source Management (src/services/getCronSources.ts)
This service manages the content sources, supporting two types of inputs:
- Websites (these will be scraped with Firecrawl)
- Twitter/X accounts (scraped with the X Developer API)
The service checks for the required API keys and filters the source list so that requests are only attempted for sources we have credentials for. The configuration includes multiple AI news websites and one X account. Several other prominent AI news X accounts are listed but commented out, because the X Developer API free tier restricts scraping to one account every 15 minutes.
This is where you would add your own sources if you wish to monitor a trend other than AI.
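For example, a hypothetical edit to that file for tracking a different niche could replace the default entries with your own websites and a single X account (the URLs and handle below are placeholders, not part of the original project):

// Hypothetical edit inside src/services/getCronSources.ts — swap in your own sources.
const sources = [
  { identifier: "https://example-webdev-blog.com/articles" },
  { identifier: "https://another-frontend-news-site.example.com/" },
  { identifier: "https://x.com/your_favorite_account" }, // requires an X API bearer token
];
return sources.map((source) => source.identifier);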
4. Content Scraping (src/services/scrapeSources.ts)
A robust scraping service that:
- Handles Twitter/X API integration for social media content
- Uses Firecrawl for web page content extraction
- Implements strong typing and structured extraction with Zod schemas
- Normalizes data from different sources into a consistent format
It is in this file that the topic of interest is specified as “AI”. To change it, update lines 21 and 96.
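As a rough sketch, and assuming those lines correspond to the Zod schema description and the Firecrawl prompt shown later in this guide, switching the topic to, say, climate tech would look something like this (the topic string is only an example):

// Hypothetical edits inside src/services/scrapeSources.ts (z and StorySchema are already defined there).
const StoriesSchema = z.object({
  stories: z
    .array(StorySchema)
    .describe("A list of today's climate-tech-related stories"), // was: AI or LLM-related
});

// The opening instruction of the Firecrawl prompt would change the same way:
const promptOpening =
  "Return only today's climate-tech related story or post headlines and links in JSON format from the page content.";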
5. Draft Generation (src/services/generateDraft.ts)
The AI analysis component that:
- Uses Together AI’s Llama 3.1 model
- Processes raw content through structured prompts
- Implements JSON schema validation
- Formats content into readable Slack messages
This script has several parts that make it tailored for watching AI trends. You would need to change those parts as well to choose a different topic.
6. Notification Service (src/services/sendDraft.ts)
A straightforward service that delivers the processed content to Slack via webhooks, with proper error handling and logging.
Infrastructure
The application is built with robust infrastructure and development tools to ensure reliability and maintainability:
Docker Support
The application includes comprehensive Docker support with:
- Multi-stage builds for optimization
- Environment variable management
- Docker Compose configuration for easy deployment
Configuration Management
The system uses:
- Environment variables for sensitive configuration
- TypeScript for type safety
- Proper error handling throughout the pipeline
Key Features
- Modular Architecture: Each component is self-contained and follows single-responsibility principles.
- Type Safety: Comprehensive TypeScript implementation with Zod schemas for runtime validation.
- Error Handling: Robust error handling at each step of the pipeline.
- Scalability: Docker support enables easy deployment and scaling.
- API Integration: Supports multiple data sources with extensible architecture.
- AI Analysis: Leverages advanced AI models for content analysis.
Development Tools
The project uses modern development tools:
- TypeScript for type safety
- Nodemon for development hot-reloading
- Docker for containerization
- Environment variable management
- Proper logging throughout the system
This architecture allows for easy maintenance, testing, and extension of functionality while maintaining robust error handling and type safety throughout the application pipeline.
In-depth Project Breakdown
In this section, we will analyze each component of the project in detail, breaking down the implementation steps and technical considerations for each major feature.
1. Specifying the resources to scrape
In src/services/getCronSources.ts, we start by importing dotenv:
import dotenv from "dotenv";
dotenv.config();
This allows the application to securely load configuration values from a .env file, which is a common practice for managing sensitive information like API keys and credentials.
Then, we define a new function called getCronSources:
export async function getCronSources() {...}
The function is declared async so it fits the asynchronous pipeline of the controller, which awaits each step; in its current form it simply checks environment variables and assembles the list of sources.
In the function body, we start a parent try-catch block:
export async function getCronSources() {
try {
console.log("Fetching sources...");
// Check for required API keys
const hasXApiKey = !!process.env.X_API_BEARER_TOKEN;
const hasFirecrawlKey = !!process.env.FIRECRAWL_API_KEY;
... // continued below
The code above performs important validation by checking for required API keys. It uses the double exclamation mark (!!) operator to convert the environment variables into boolean values, making it easy to verify if both the X API bearer token and Firecrawl API key are present. This validation step is crucial before attempting to make any API calls to ensure the application has proper authentication credentials.
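As a quick illustration of the double-negation pattern (the variable names mirror the ones above):

// !! coerces any value to a strict boolean:
const token: string | undefined = process.env.X_API_BEARER_TOKEN;
const hasToken = !!token; // true for a non-empty string, false for undefined or ""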
// ... continuation of the above block
// Filter sources based on available API keys
const sources = [
// High priority sources (Only 1 x account due to free plan rate limits)
...(hasFirecrawlKey ? [
{ identifier: 'https://www.firecrawl.dev/blog' },
{ identifier: 'https://openai.com/news/' },
{ identifier: 'https://www.anthropic.com/news' },
{ identifier: 'https://news.ycombinator.com/' },
{ identifier: 'https://www.reuters.com/technology/artificial-intelligence/' },
{ identifier: 'https://simonwillison.net/' },
{ identifier: 'https://buttondown.com/ainews/archive/' },
] : []),
...(hasXApiKey ? [
{ identifier: 'https://x.com/skirano' },
] : []),
];
return sources.map(source => source.identifier);
} catch (error) {
console.error(error);
}
}
The code uses the ternary operator (? :) to conditionally include sources based on available API keys. For each check, if the condition before the ? is true (e.g. hasFirecrawlKey is true), the array of sources after the ? is included; otherwise, the empty array after the : is used instead.
This conditional logic ensures we only try to fetch from sources for which we have valid API credentials. The spread operator (...) is used to flatten these conditional arrays into a single sources array.
For error handling, the entire function is wrapped in a try-catch block. If any error occurs during execution, it will be caught and logged to the console via console.error(). This prevents the application from crashing if there are issues with the environment variables. Note that in the error case the function returns undefined, which is why the controller later invokes it with a non-null assertion (cronSources!).
2. Scraping specified resources with X and Firecrawl
Inside src/services/scrapeSources.ts, we write the functionality to scrape the resources specified in getCronSources.ts with Firecrawl and the X API.
The script starts with the following imports and setup:
import FirecrawlApp from "@mendable/firecrawl-js";
import dotenv from "dotenv";
import { z } from "zod";
dotenv.config();
These imports provide essential functionality for the scraping service:
- FirecrawlApp: A JavaScript client for interacting with the Firecrawl API to scrape web content
- dotenv: For loading environment variables from a .env file
- zod: A TypeScript-first schema validation library
The dotenv.config() call loads environment variables at runtime, making them accessible via process.env. This is important since we’ll need API keys and other configuration stored in environment variables.
// Initialize Firecrawl
const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
// 1. Define the schema for our expected JSON
const StorySchema = z.object({
headline: z.string().describe("Story or post headline"),
link: z.string().describe("A link to the post or story"),
date_posted: z.string().describe("The date the story or post was published"),
});
const StoriesSchema = z.object({
stories: z
.array(StorySchema)
.describe("A list of today's AI or LLM-related stories"),
});
The code above initializes Firecrawl with an API key and defines two Zod schemas for validating the data structure we expect to receive from our scraping operations.
The StorySchema defines the shape of individual story objects, with three required string fields:
- headline: The title or headline of the story/post
- link: URL linking to the full content
- date_posted: Publication timestamp
The StoriesSchema wraps this in an array, expecting multiple story objects within a “stories” property. This schema will be used by Firecrawl’s scraping engine to format its output according to our needs.
The .describe() method calls on each field are essential: they provide semantic descriptions that Firecrawl’s AI engine uses to intelligently identify and extract the correct data from web pages. By understanding these descriptions, the AI can automatically determine the appropriate HTML elements and CSS selectors to target when scraping content.
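The same schema can also be used on its own to validate any payload at runtime. Here is a minimal sketch using Zod's safeParse with a made-up object, assuming it lives in the same file where StoriesSchema is defined:

// Minimal sketch: validating a hand-made payload against StoriesSchema.
const candidate = {
  stories: [
    {
      headline: "Example headline",
      link: "https://example.com/story",
      date_posted: "2025-01-01",
    },
  ],
};

const result = StoriesSchema.safeParse(candidate);
if (result.success) {
  console.log(`Validated ${result.data.stories.length} stories`);
} else {
  console.error("Payload did not match StoriesSchema:", result.error.issues);
}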
export async function scrapeSources(sources: string[]) {
// ... continued below
Then, we start a function scrapeSources that takes an array of source URLs as input and will handle the scraping of content from each provided source.
const num_sources = sources.length;
console.log(`Scraping ${num_sources} sources...`);
let combinedText: { stories: any[] } = { stories: [] };
// Configure these if you want to toggle behavior
const useTwitter = true;
const useScrape = true;
// continued below ...
The code above sets up a few key variables in the body of the function:
- num_sources tracks how many URLs we’re processing
- combinedText initializes an empty stories array to collect all scraped stories
- Two boolean flags control which scraping methods to use: useTwitter enables Twitter API integration and useScrape enables direct web scraping
These variables will be used throughout the rest of the scraping process to control behavior and aggregate results.
// ... continuation of above
for (const source of sources) {
// --- 1) Handle x.com (Twitter) sources ---
if (source.includes("x.com")) {
if (useTwitter) {
const usernameMatch = source.match(/x\.com\/([^\/]+)/);
if (usernameMatch) {
const username = usernameMatch[1];
// Build the search query for tweets
const query = `from:${username} has:media -is:retweet -is:reply`;
const encodedQuery = encodeURIComponent(query);
// Get tweets from the last 24 hours
const startTime = new Date(
Date.now() - 24 * 60 * 60 * 1000
).toISOString();
const encodedStartTime = encodeURIComponent(startTime);
// x.com API URL
const apiUrl = `https://api.x.com/2/tweets/search/recent?query=${encodedQuery}&max_results=10&start_time=${encodedStartTime}`;
// Fetch recent tweets from the Twitter API
const response = await fetch(apiUrl, {
headers: {
Authorization: `Bearer ${process.env.X_API_BEARER_TOKEN}`,
},
});
// Continued below...
Next, we make a request to the Twitter API to fetch recent tweets from a specific user. Let’s break down what’s happening:
- We check if the source URL contains “x.com” and if Twitter integration is enabled
- We extract the username from the URL using regex
- We construct a search query that:
- Gets tweets from that user
- Only includes tweets with media
- Excludes retweets and replies
- We calculate a timestamp from 24 hours ago to limit results
- We build the API URL with the encoded query parameters
- Finally, we make the authenticated request using the bearer token
// ... continuation of above
if (!response.ok) {
throw new Error(
`Failed to fetch tweets for ${username}: ${response.statusText}`,
);
}
After making the API request, we check if the response was successful. If not, we throw an error with details about what went wrong, including the username and the status text from the response.
const tweets = await response.json();
if (tweets.meta?.result_count === 0) {
console.log(`No tweets found for username ${username}.`);
} else if (Array.isArray(tweets.data)) {
console.log(`Tweets found from username ${username}`);
const stories = tweets.data.map((tweet: any) => {
return {
headline: tweet.text,
link: `https://x.com/i/status/${tweet.id}`,
date_posted: startTime,
};
});
combinedText.stories.push(...stories);
} else {
console.error(
"Expected tweets.data to be an array:",
tweets.data
);
}
}
}
}
// Continued below...
After parsing the tweets, we map them into story objects that contain:
- The tweet text as the headline
- A link to the original tweet
- The start of the 24-hour search window as the date_posted value (the request does not return per-tweet timestamps here)
These story objects are then added to our combinedText.stories array, which aggregates content from multiple sources.
If no tweets are found, we log a message. If there’s an unexpected response format where tweets.data isn’t an array, we log an error with the actual data received.
The code handles all edge cases gracefully while maintaining a clean data structure for downstream processing.
// ... continuation of above
// --- 2) Handle all other sources with Firecrawl extract ---
else {
if (useScrape) {
// Firecrawl will both scrape and extract for you
// Provide a prompt that instructs Firecrawl what to extract
const currentDate = new Date().toLocaleDateString();
const promptForFirecrawl = `
Return only today's AI or LLM related story or post headlines and links in JSON format from the page content.
They must be posted today, ${currentDate}. The format should be:
{
"stories": [
{
"headline": "headline1",
"link": "link1",
"date_posted": "YYYY-MM-DD"
},
...
]
}
If there are no AI or LLM stories from today, return {"stories": []}.
The source link is ${source}.
If a story link is not absolute, prepend ${source} to make it absolute.
Return only pure JSON in the specified format (no extra text, no markdown, no \`\`\`).
`;
// continued below ...
The prompt instructs Firecrawl to extract AI/LLM related stories from the current day only. It specifies the exact JSON format required for the response, with each story containing a headline, link and date posted. The prompt ensures links are absolute by having Firecrawl prepend the source URL if needed. For clean parsing, it explicitly requests pure JSON output without any formatting or extra text.
// Use app.extract(...) directly
const scrapeResult = await app.extract(
[source],
{
prompt: promptForFirecrawl,
schema: StoriesSchema, // The Zod schema for expected JSON
}
);
if (!scrapeResult.success) {
throw new Error(`Failed to scrape: ${scrapeResult.error}`);
}
// The structured data
const todayStories = scrapeResult.data;
console.log(`Found ${todayStories.stories.length} stories from ${source}`);
combinedText.stories.push(...todayStories.stories);
}
}
}
// Continued below ...
The code above implements the core scraping functionality:
- It constructs a prompt for Firecrawl that specifies exactly what content to extract
- The prompt requests AI/LLM headlines from the current day only
- It defines the exact JSON structure expected in the response
- It handles relative URLs by having Firecrawl convert them to absolute
- The extracted data is validated against a Zod schema
- Valid results are accumulated into the combinedText array
- Error handling ensures failed scrapes don’t crash the process
// ... continuation of above
// Return the combined stories from all sources
const rawStories = combinedText.stories;
console.log(rawStories);
return rawStories;
}
// End of script
Finally, this code returns the raw stories array containing all the scraped headlines and content from the various sources. The stories can then be processed further for trend analysis and summarization.
3. Synthesizing scraped contents into a summary
Inside src/services/generateDraft.ts, we write the functionality to convert the raw stories scraped in the previous script into a summary message that will later be sent to Slack.
The script starts with the following imports:
import dotenv from "dotenv";
import Together from "together-ai";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
dotenv.config();
The script imports several key dependencies:
- dotenv: For loading environment variables from a .env file
- Together: The Together.ai client library for making API calls
- z from zod: A TypeScript-first schema validation library
- zodToJsonSchema: A utility to convert Zod schemas to JSON Schema format
/**
* Generate a post draft with trending ideas based on raw tweets.
*/
export async function generateDraft(rawStories: string) {
console.log(`Generating a post draft with raw stories (${rawStories.length} characters)...`)
// continued below ...
The generateDraft function takes the raw stories as input and processes them to identify key trends and generate a summary. First, it prints a log message indicating the size of the input by showing the character count.
// ... continuation of above
try {
// Initialize Together client
const together = new Together();
// Define the schema for our response
const DraftPostSchema = z.object({
interestingTweetsOrStories: z.array(z.object({
story_or_tweet_link: z.string().describe("The direct link to the tweet or story"),
description: z.string().describe("A short sentence describing what's interesting about the tweet or story")
}))
}).describe("Draft post schema with interesting tweets or stories for AI developers.");
// Convert our Zod schema to JSON Schema
const jsonSchema = zodToJsonSchema(DraftPostSchema, {
name: 'DraftPostSchema',
nameStrategy: 'title'
});
// Create a date string if you need it in the post header
const currentDate = new Date().toLocaleDateString('en-US', {
timeZone: 'America/New_York',
month: 'numeric',
day: 'numeric',
});
// continued below ...
In this block, we set up the core functionality for generating the draft post. We initialize the Together AI client, which will be used for making API calls. We then define a Zod schema that specifies the expected structure of our response: an array of interesting tweets/stories where each item has a link and a description. This schema is converted to JSON Schema format, which will help enforce the output structure. Finally, we create a formatted date string in US month/day format that can be used in the post header.
// ...continuation of above
// Use Together’s chat completion with the Llama 3.1 model
const completion = await together.chat.completions.create({
model: "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
messages: [
{
role: "system",
content: `You are given a list of raw AI and LLM-related tweets sourced from X/Twitter.
Only respond in valid JSON that matches the provided schema (no extra keys).
`,
},
{
role: "user",
content: `Your task is to find interesting trends, launches, or interesting examples from the tweets or stories.
For each tweet or story, provide a 'story_or_tweet_link' and a one-sentence 'description'.
Return all relevant tweets or stories as separate objects.
Aim to pick at least 10 tweets or stories unless there are fewer than 10 available. If there are less than 10 tweets or stories, return ALL of them. Here are the raw tweets or stories you can pick from:\n\n${rawStories}\n\n`,
},
],
// Tell Together to strictly enforce JSON output that matches our schema
// @ts-ignore
response_format: { type: "json_object", schema: jsonSchema },
});
// continued below ...
In this block, we make the API call to Together AI using their chat completions endpoint with the Llama 3.1 model. The system prompt instructs the model to only output valid JSON matching our schema. The user prompt provides the actual task: finding interesting trends, launches, and examples from the raw tweets/stories. We request at least 10 items (or all available if there are fewer than 10) and pass in the raw content. The response_format parameter enforces strict JSON output matching our defined schema.
The completion response will contain structured JSON data that we can parse and use to generate our draft post. Each item will have a link to the original tweet/story and a concise description of what makes it noteworthy.
// Check if we got a content payload in the first choice
const rawJSON = completion?.choices?.[0]?.message?.content;
if (!rawJSON) {
console.log("No JSON output returned from Together.");
return "No output.";
}
console.log(rawJSON);
// Parse the JSON to match our schema
const parsedResponse = JSON.parse(rawJSON);
// Construct the final post
const header = `🚀 AI and LLM Trends on X for ${currentDate}\n\n`;
const draft_post = header + parsedResponse.interestingTweetsOrStories
.map((tweetOrStory: any) => `• ${tweetOrStory.description}\n ${tweetOrStory.story_or_tweet_link}`)
.join('\n\n');
return draft_post;
} catch (error) {
console.error("Error generating draft post", error);
return "Error generating draft post.";
}
}
// End of script
This code block shows the final part of our script where we handle the Together AI API response. We first check if we received valid JSON content in the response. If not, we log an error and return early.
If we have valid JSON, we parse it into a structured object matching our schema. Then we construct the final post by adding a header with the current date and mapping over the interesting tweets/stories to create bullet points. Each bullet point contains the description and link.
The script includes error handling to catch and log any issues that occur during execution. If there’s an error, it returns a generic error message rather than failing silently.
This completes the core functionality of our trend finding script. The next sections will cover setting up notifications, scheduling, and deployment.
4. Setting up a notification system with Slack
Inside src/services/sendDraft.ts, we write the functionality to send the composed final post as a Slack message through a webhook:
import axios from "axios";
import dotenv from "dotenv";
dotenv.config();
export async function sendDraft(draft_post: string) {
try {
const response = await axios.post(
process.env.SLACK_WEBHOOK_URL || "",
{
text: draft_post,
},
{
headers: {
"Content-Type": "application/json",
},
},
);
return `Success sending draft to webhook at ${new Date().toISOString()}`;
} catch (error) {
console.log("error sending draft to webhook");
console.log(error);
}
}
This script sets up a Slack notification system by creating a sendDraft function that takes a draft post as input and sends it to a configured Slack webhook URL. The function uses axios to make a POST request to the webhook with the draft text. It includes error handling to log any issues that occur during the sending process. The webhook URL is loaded from environment variables using dotenv for security. On success, it returns a timestamp of when the draft was sent.
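As a quick sanity check, the service can be exercised on its own from a scratch file, assuming SLACK_WEBHOOK_URL is set in your .env (the file path below is hypothetical):

// Hypothetical scratch file, e.g. src/testSendDraft.ts
import { sendDraft } from "./services/sendDraft";

(async () => {
  const confirmation = await sendDraft("Test message from Trend Finder 🚀");
  console.log(confirmation); // "Success sending draft to webhook at <ISO timestamp>" on success
})();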
5. Writing a script to execute the system with Cron
The cron.ts file contains the main execution logic for our trend finding system. It exports a handleCron function that orchestrates the entire workflow:
// src/controllers/cron.ts
import { scrapeSources } from "../services/scrapeSources";
import { getCronSources } from "../services/getCronSources";
import { generateDraft } from "../services/generateDraft";
import { sendDraft } from "../services/sendDraft";
export const handleCron = async (): Promise<void> => {
try {
const cronSources = await getCronSources();
const rawStories = await scrapeSources(cronSources!);
const rawStoriesString = JSON.stringify(rawStories);
const draftPost = await generateDraft(rawStoriesString);
const result = await sendDraft(draftPost!);
console.log(result);
} catch (error) {
console.error(error);
}
};
First, it retrieves the list of sources to scrape by calling getCronSources(). Then it scrapes those sources using scrapeSources() to get the raw story data. This raw data is stringified into JSON format.
Next, it generates a draft post from the story data by passing it to generateDraft(). Finally, it sends the draft to Slack using sendDraft() and logs the result.
The function includes error handling to catch and log any issues that occur during execution. This script ties together all the individual services we created to form a complete automated workflow.
6. Creating a project entrypoint
The src/index.ts file serves as the main entry point for our application. It imports the handleCron function from our cron controller and sets up the execution flow.
The file uses node-cron for scheduling and dotenv for environment variable management. The main function provides a simple way to run the draft generation process manually.
There’s also a commented-out cron schedule that can be uncommented to run the job automatically at 5 PM daily (0 17 * * *).
import { handleCron } from "./controllers/cron";
import cron from "node-cron";
import dotenv from "dotenv";
dotenv.config();
async function main() {
console.log(`Starting process to generate draft...`);
await handleCron();
}
main();
// If you want to run the job on a schedule (5 PM daily), uncomment the following lines:
//cron.schedule(`0 17 * * *`, async () => {
// console.log(`Starting process to generate draft...`);
// await handleCron();
//});
When you run npm run start, this script is executed.
At this point, the project is ready for local use. You can modify the topic configurations at any time to track different subjects and generate Slack summaries on demand. While running locally is useful for testing, we’ll explore an even more powerful automation option using GitHub Actions in the next section.
7. Deploying the project with GitHub Actions
Now that we have our project working locally, let’s take it to the next level by automating it with GitHub Actions. GitHub Actions is a powerful CI/CD platform that allows us to automate workflows directly from our GitHub repository. Instead of running our trend finder manually or setting up a server to host it, we can leverage GitHub’s infrastructure to run our script on a schedule, completely free for public repositories. Let’s set it up.
First, create a new file in your repository at .github/workflows/trend-finder.yml:
mkdir -p .github/workflows
touch .github/workflows/trend-finder.yml
Then, paste the following contents:
name: Run Trend Finder
on:
schedule:
- cron: "0 17 * * *" # Runs at 5 PM UTC daily
workflow_dispatch: # Allows manual trigger from GitHub UI
jobs:
find-trends:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Node.js
uses: actions/setup-node@v3
with:
node-version: "18"
- name: Install dependencies
run: npm install
- name: Run trend finder
env:
X_API_BEARER_TOKEN: ${{ secrets.X_API_BEARER_TOKEN }}
FIRECRAWL_API_KEY: ${{ secrets.FIRECRAWL_API_KEY }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
run: npm run start
This workflow configuration does several important things:
- Scheduling: The on.schedule section sets up automatic daily runs at 5 PM UTC
- Manual Triggers: workflow_dispatch allows you to run the workflow manually from GitHub’s UI
- Environment: Uses Ubuntu as the runner environment
- Setup: Configures Node.js and installs dependencies
- Secrets: Securely passes API keys and tokens from GitHub Secrets to the application
To set up the secrets in your GitHub repository:
- Go to your repository’s Settings
- Click on “Secrets and variables” → “Actions”
- Add each required secret:
X_API_BEARER_TOKEN
FIRECRAWL_API_KEY
TOGETHER_API_KEY
SLACK_WEBHOOK_URL
The workflow will now run automatically every day at 5 PM UTC, scraping your configured sources and sending trend updates to your Slack channel. You can also trigger it manually:
- Go to your repository’s “Actions” tab
- Select “Run Trend Finder” workflow
- Click “Run workflow”
Some key benefits of using GitHub Actions:
- Zero Infrastructure: No need to maintain servers or worry about uptime
- Cost Effective: Free for public repositories (2000 minutes/month)
- Version Controlled: Your automation configuration lives with your code
- Easy Monitoring: Built-in logs and status checks
- Flexible Scheduling: Easy to modify run times or add multiple schedules
Next Steps
Now that your trend finder is fully automated, here are some ways to extend it:
- Custom Topics: Modify the scraping configurations to track different topics
- Additional Sources: Add more websites or social media accounts to monitor
- Enhanced Analysis: Customize the AI prompts for different types of trend analysis
- Multiple Channels: Set up different Slack channels for different topic categories
- Metrics: Add monitoring for successful runs and trend detection rates
The complete project provides a robust foundation for automated trend detection that you can build upon based on your specific needs.
Troubleshooting
If you encounter issues with the GitHub Actions workflow:
- Check Logs: Review the workflow run logs in the Actions tab
- Verify Secrets: Ensure all secrets are properly set and not expired
- Rate Limits: Monitor API rate limits, especially for the X API
- Timeout Issues: Consider breaking up large scraping jobs if runs timeout
- Dependencies: Keep Node.js dependencies updated to latest stable versions
For additional help, check the project’s GitHub Issues or create a new one with specific details about any problems you encounter.
Limitations of Free Tier Tools Used
While this project uses several free tier services to minimize costs, there are some limitations to be aware of:
-
X API Rate Limits
- Limited to 1 account scrape request per 15-minute window
- Some advanced filtering features not available
-
GitHub Actions Minutes
- 2,000 minutes/month for public repositories
- 3,000 minutes/month for private repositories
- Additional minutes require a paid plan
-
Together AI Free Credits
- $1 in free credits for new accounts
- 600 requests per minute
-
Firecrawl API Limits
- 500 requests/month on free plan
To work within these constraints:
- Carefully plan scraping intervals
- Implement caching where possible (see the sketch after this list)
- Monitor usage to avoid hitting limits
- Consider paid tiers for production use
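For instance, a very small file-based cache keyed by source URL could skip re-scraping a source that was fetched recently during local or long-running deployments. This is a sketch, not part of the original project; the file name and TTL are arbitrary:

// cache.ts — sketch of a simple JSON-file cache keyed by source URL (not part of the original project).
import fs from "fs";

const CACHE_FILE = ".scrape-cache.json";
const TTL_MS = 6 * 60 * 60 * 1000; // re-scrape a given source at most every 6 hours

type CacheEntry = { fetchedAt: number; stories: unknown[] };

function loadCache(): Record<string, CacheEntry> {
  try {
    return JSON.parse(fs.readFileSync(CACHE_FILE, "utf8"));
  } catch {
    return {}; // missing or unreadable cache file: start fresh
  }
}

export function getCached(source: string): unknown[] | null {
  const entry = loadCache()[source];
  return entry && Date.now() - entry.fetchedAt < TTL_MS ? entry.stories : null;
}

export function putCached(source: string, stories: unknown[]): void {
  const cache = loadCache();
  cache[source] = { fetchedAt: Date.now(), stories };
  fs.writeFileSync(CACHE_FILE, JSON.stringify(cache, null, 2));
}

Inside scrapeSources, you would then check getCached(source) before hitting the network and call putCached(source, stories) after a successful scrape.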
For most personal or small team use cases, the free tiers provide sufficient capacity. However, larger scale deployments may require upgrading to paid plans for higher limits and additional features.
Conclusion
You’ve now built and deployed a fully automated trend detection system that leverages AI, web scraping, and cloud automation. This solution provides real-time insights into emerging trends across your chosen sources, delivered directly to Slack. With the foundation in place, you can easily customize and expand the system to match your specific trend monitoring needs. The combination of GitHub Actions for automation, Firecrawl for AI web scraping, Together AI for analysis, and Slack for notifications creates a powerful, maintainable solution that will help you stay ahead of relevant trends in your field.
About the Author
Bex is a Top 10 AI writer on Medium and a Kaggle Master with over 15k followers. He loves writing detailed guides, tutorials, and notebooks on complex data science and machine learning topics.