Tavily Crawl (BETA)
Traverse a site like a graph starting from a base URL.
Tavily Crawl is currently in invite-only beta. If you are interested in joining the beta, please email support@tavily.com.
Authorizations
Bearer authentication header in the form Bearer <token>, where <token> is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
Body
The root URL to begin the crawl.
"https://docs.tavily.com"
Max depth of the crawl. Defines how far from the base URL the crawler can explore.
x >= 1
Max number of links to follow per level of the tree (i.e., per page).
x >= 1
Total number of links the crawler will process before stopping.
x >= 1
Natural language instructions for the crawler.
"Python SDK"
Regex patterns to select only URLs with specific path patterns (e.g., /docs/.*
, /api/v1.*
).
Regex patterns to select crawling to specific domains or subdomains (e.g., ^docs\.example\.com$
).
Whether to allow following links that go to external domains.
Filter URLs using predefined categories like 'Documentation', 'Blog', 'API', etc.
Careers
, Blog
, Documentation
, About
, Pricing
, Community
, Developers
, Contact
, Media
Advanced extraction retrieves more data, including tables and embedded content, with higher success but may increase latency.
basic
, advanced