Firecrawl

Arcade Optimized, BYOC, Pro

Description: Enable agents to scrape, crawl, and map websites.

Author: Arcade

Auth: API Key

The Arcade Firecrawl Server provides a pre-built set of tools for interacting with websites. These tools make it easy to build agents and AI apps that can:

Scrape web pages
Crawl websites
Map website structures
Retrieve crawl status and data
Cancel ongoing crawls

Available Tools

These tools are currently available in the Arcade Firecrawl Server.

Tool Name	Description
Firecrawl.ScrapeUrl	Scrape a URL and return data in specified formats.
Firecrawl.CrawlWebsite	Crawl a website and return crawl status and data.
Firecrawl.GetCrawlStatus	Retrieve the status of a crawl job.
Firecrawl.GetCrawlData	Retrieve data from a completed crawl job.
Firecrawl.CancelCrawl	Cancel an ongoing crawl job.
Firecrawl.MapWebsite	Map a website from a single URL to a map of the entire website.

If you need to perform an action that’s not listed here, you can get in touch with us to request a new tool, or create your own tools.

Firecrawl.ScrapeUrl

Scrape a URL and return data in specified formats.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

url (string, required) The URL to scrape.
formats (enum (Formats), optional) The format of the scraped web page. Defaults to Formats.MARKDOWN.
only_main_content (bool, optional) Only return the main content of the page. Defaults to True.
include_tags (list, optional) List of tags to include in the output.
exclude_tags (list, optional) List of tags to exclude from the output.
wait_for (int, optional) Delay in milliseconds before fetching content. Defaults to 10.
timeout (int, optional) Timeout in milliseconds for the request. Defaults to 30000.
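
The parameters above can be assembled into an input payload before the tool is called. The following is a minimal sketch, not the Arcade client API: the helper name `build_scrape_input` is hypothetical, and how the resulting dict is submitted to Firecrawl.ScrapeUrl depends on the client you use. Optional parameters left unset are omitted so that the documented defaults apply.

```python
from typing import Any, Optional


def build_scrape_input(
    url: str,
    only_main_content: Optional[bool] = None,
    include_tags: Optional[list[str]] = None,
    exclude_tags: Optional[list[str]] = None,
    wait_for: Optional[int] = None,
    timeout: Optional[int] = None,
) -> dict[str, Any]:
    """Build an input dict for Firecrawl.ScrapeUrl (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    # Both values are expressed in milliseconds, so negatives make no sense.
    for name, value in (("wait_for", wait_for), ("timeout", timeout)):
        if value is not None and value < 0:
            raise ValueError(f"{name} must be a non-negative number of milliseconds")

    optional = {
        "only_main_content": only_main_content,
        "include_tags": include_tags,
        "exclude_tags": exclude_tags,
        "wait_for": wait_for,
        "timeout": timeout,
    }
    payload: dict[str, Any] = {"url": url}
    # Only send what the caller set explicitly; the rest falls back to defaults.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


scrape_input = build_scrape_input(
    "https://example.com", exclude_tags=["nav"], timeout=15000
)
```

Here `scrape_input` contains only `url`, `exclude_tags`, and `timeout`, so `formats`, `only_main_content`, and `wait_for` keep the defaults listed above.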

Firecrawl.CrawlWebsite

Crawl a website and return crawl status and data.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

url (string, required) The URL to crawl.
exclude_paths (list, optional) URL patterns to exclude from the crawl.
include_paths (list, optional) URL patterns to include in the crawl.
max_depth (int, optional) Maximum depth to crawl. Defaults to 2.
ignore_sitemap (bool, optional) Ignore the website sitemap. Defaults to True.
limit (int, optional) Limit the number of pages to crawl. Defaults to 10.
allow_backward_links (bool, optional) Enable navigation to previously linked pages. Defaults to False.
allow_external_links (bool, optional) Allow following links to external websites. Defaults to False.
webhook (string, optional) URL to send a POST request when the crawl is started, updated, and completed.
async_crawl (bool, optional) Run the crawl asynchronously. Defaults to True.
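
As an illustration of how these parameters and their defaults fit together, the sketch below models the Firecrawl.CrawlWebsite input as a dataclass. The class name `CrawlInput` and its `to_payload` method are hypothetical and are not part of the Arcade or Firecrawl libraries; only the field names and default values come from the list above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class CrawlInput:
    """Input for Firecrawl.CrawlWebsite (hypothetical container class)."""

    url: str
    exclude_paths: Optional[list[str]] = None
    include_paths: Optional[list[str]] = None
    max_depth: int = 2
    ignore_sitemap: bool = True
    limit: int = 10
    allow_backward_links: bool = False
    allow_external_links: bool = False
    webhook: Optional[str] = None
    async_crawl: bool = True

    def to_payload(self) -> dict[str, Any]:
        """Return the fields as a dict, dropping parameters that were not set."""
        return {key: value for key, value in asdict(self).items() if value is not None}


crawl_payload = CrawlInput("https://example.com", limit=25).to_payload()
```

Keeping the defaults in one place makes it easy to see which values a call overrides. In this example only `limit` differs from the documented defaults.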

Firecrawl.GetCrawlStatus

Retrieve the status of a crawl job.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

crawl_id (string, required) The ID of the crawl job.

Firecrawl.GetCrawlData

Retrieve data from a completed crawl job.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

crawl_id (string, required) The ID of the crawl job.
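
When a crawl runs asynchronously, a caller typically polls Firecrawl.GetCrawlStatus until the job finishes and then requests the data with the same `crawl_id`. The sketch below shows one way to structure that loop. It rests on several assumptions that this page does not confirm: the status result is treated as a dict with a `status` key, the terminal status strings are guesses, and `get_status` is a hypothetical callable that you would implement as a wrapper around the actual tool call.

```python
import time
from typing import Any, Callable

# Assumed status strings; this page does not list the possible values.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_crawl(
    get_status: Callable[[str], dict[str, Any]],
    crawl_id: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll get_status(crawl_id) until the job reports a terminal status."""
    for attempt in range(max_polls):
        result = get_status(crawl_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # pause before the next poll
    raise TimeoutError(f"crawl {crawl_id} did not finish after {max_polls} polls")
```

Passing `sleep` as an argument keeps the loop easy to test without real delays. Bounding the number of polls avoids waiting forever on a job that never completes.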

Firecrawl.CancelCrawl

Cancel an ongoing crawl job.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

crawl_id (string, required) The ID of the asynchronous crawl job to cancel.

Firecrawl.MapWebsite

Map a website from a single URL to a map of the entire website.

Auth:

Environment Variables Required:
- FIRECRAWL_API_KEY: Your Firecrawl API key.

Parameters

url (string, required) The base URL to start crawling from.
search (string, optional) Search query to use for mapping.
ignore_sitemap (bool, optional) Ignore the website sitemap. Defaults to True.
include_subdomains (bool, optional) Include subdomains of the website. Defaults to False.
limit (int, optional) Maximum number of links to return. Defaults to 5000.
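
A small sketch of preparing the input for Firecrawl.MapWebsite, using the defaults listed above. The function name `build_map_input` is hypothetical, and the check that `limit` is positive is an assumption added for illustration rather than a documented constraint.

```python
from typing import Any, Optional


def build_map_input(
    url: str,
    search: Optional[str] = None,
    ignore_sitemap: bool = True,
    include_subdomains: bool = False,
    limit: int = 5000,
) -> dict[str, Any]:
    """Build an input dict for Firecrawl.MapWebsite (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if limit < 1:
        # Assumed sanity check; the documentation does not state a minimum.
        raise ValueError("limit must be at least 1")

    payload: dict[str, Any] = {
        "url": url,
        "ignore_sitemap": ignore_sitemap,
        "include_subdomains": include_subdomains,
        "limit": limit,
    }
    if search is not None:
        payload["search"] = search  # only sent when a query was provided
    return payload


map_input = build_map_input("https://example.com", search="pricing", limit=100)
```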

Auth

The Arcade Web Server uses Firecrawl to scrape, crawl, and map websites.

Global Environment Variables:

FIRECRAWL_API_KEY: Your Firecrawl API key.
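
Because every tool depends on this variable, it can help to fail early when it is missing. The following is a minimal sketch of such a check, assuming the key is read from the process environment. The function name `require_firecrawl_key` is hypothetical, and the `env` parameter exists only so the check can be exercised without changing real environment variables.

```python
import os
from typing import Mapping, Optional


def require_firecrawl_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return FIRECRAWL_API_KEY, or raise if it is unset or blank."""
    source = os.environ if env is None else env
    key = source.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FIRECRAWL_API_KEY is not set; export it before starting the server"
        )
    return key
```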

Get Building

Use tools hosted on Arcade Cloud

Arcade tools are hosted by our cloud platform and ready to be used in your agents. Learn how.

Self Host Arcade tools

Arcade tools can be self-hosted on your own infrastructure. Learn more about self-hosting.

pip install arcade_firecrawl