What You’ll Learn

  • How to use Tavily Map to discover all URLs on a domain without extracting content
  • When to use Map vs Crawl (speed vs depth)
  • How to combine Map + Extract for targeted content retrieval
  • How to filter results with path and domain patterns

How Does It Work?

Tavily Map returns a list of URLs discovered from a starting URL. Unlike Crawl, it does not extract page content; it only discovers the site's structure. That makes it faster and cheaper than Crawl when you need to see what is on a site before deciding which pages to process.

Feature  | Map                           | Crawl
Returns  | URL list only                 | URLs + full page content
Speed    | Fast (seconds)                | Slower (depends on page count)
Cost     | Lower                         | Higher
Best for | Site discovery, URL filtering | Content extraction, RAG pipelines

Getting Started

Get your Tavily API key

1. Install the Tavily Python SDK

uv venv
uv pip install tavily-python

2. Set up your client

import os
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

3. Map a website

import os
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

response = client.map(url="https://docs.tavily.com")

print(f"Found {len(response['results'])} URLs")
for url in response["results"][:10]:
    print(url)

4. Output

Found 21 URLs
https://docs.tavily.com/
https://docs.tavily.com/changelog
https://docs.tavily.com/welcome
https://docs.tavily.com/documentation/api-credits
https://docs.tavily.com/documentation/help
...
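
Once the URL list is in hand, a quick summary by top-level path can show how the site is organized before you decide what to filter. The following is a minimal sketch using only the standard library; `summarize_sections` is a hypothetical helper, not part of the Tavily SDK, and the sample list simply reuses the URLs printed above.

```python
from collections import Counter
from urllib.parse import urlparse


def summarize_sections(urls):
    """Count URLs per top-level path segment ("/" stands for the site root)."""
    counts = Counter()
    for url in urls:
        path = urlparse(url).path.strip("/")
        section = "/" + path.split("/")[0] if path else "/"
        counts[section] += 1
    return counts


# Sample taken from the output above; in practice, pass response["results"].
sample_urls = [
    "https://docs.tavily.com/",
    "https://docs.tavily.com/changelog",
    "https://docs.tavily.com/welcome",
    "https://docs.tavily.com/documentation/api-credits",
    "https://docs.tavily.com/documentation/help",
]

for section, count in summarize_sections(sample_urls).most_common():
    print(f"{section}: {count}")
```

Sections with many URLs are natural candidates for a path filter.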

Filtering with Path Patterns

Use select_paths and exclude_paths to focus the map on specific sections of a site. Both accept lists of regex patterns that are matched against URL paths.

import os
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

response = client.map(
    url="https://docs.tavily.com",
    select_paths=["/documentation/api-reference/.*", "/sdk/.*"],
    exclude_paths=["/changelog/.*"],
    max_depth=2,
    allow_external=False,
)

You can also use instructions for natural-language guidance:

import os
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

response = client.map(
    url="https://docs.tavily.com",
    instructions="Find pages related to the Python SDK",
    allow_external=False,
)
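
Before spending an API call, it can help to try path patterns locally against a few known URLs. The sketch below approximates the select/exclude logic with Python's `re` module. It assumes patterns are tested against the URL path with `re.search`; the service's actual matching rules may differ. `preview_path_filter` and the example.com URLs are hypothetical.

```python
import re
from urllib.parse import urlparse


def preview_path_filter(urls, select_paths=None, exclude_paths=None):
    """Approximate locally which URLs a select/exclude pattern pair keeps.

    Assumption: each pattern is applied to the URL path with re.search.
    """
    select = [re.compile(p) for p in (select_paths or [])]
    exclude = [re.compile(p) for p in (exclude_paths or [])]
    kept = []
    for url in urls:
        path = urlparse(url).path or "/"
        # With select patterns present, a URL must match at least one.
        if select and not any(rx.search(path) for rx in select):
            continue
        # Any exclude match drops the URL.
        if any(rx.search(path) for rx in exclude):
            continue
        kept.append(url)
    return kept


# Made-up URLs for illustration only.
candidates = [
    "https://example.com/documentation/api-reference/search",
    "https://example.com/sdk/python",
    "https://example.com/changelog/v2",
    "https://example.com/welcome",
]

kept = preview_path_filter(
    candidates,
    select_paths=["/documentation/api-reference/.*", "/sdk/.*"],
    exclude_paths=["/changelog/.*"],
)
for url in kept:
    print(url)
```

A local preview like this only catches pattern mistakes; the mapper itself decides which pages are reachable.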

Map + Extract: Targeted Content Retrieval

Map is most useful when combined with Extract: first discover the site structure, then extract only the pages you care about.

import os
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

map_response = client.map(
    url="https://docs.tavily.com",
    select_paths=["/documentation/api-reference/endpoint/.*"],
    max_depth=2,
    allow_external=False,
)

api_urls = map_response["results"][:5]

extract_response = client.extract(
    urls=api_urls,
    extract_depth="advanced",
)

for result in extract_response["results"]:
    print(f"\n--- {result['url']} ---")
    print(result["raw_content"][:300])

This two-step approach lets you process only relevant pages instead of crawling an entire site.
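
When the mapped list is longer than a handful of URLs, the extract step can be split into batches. This is a rough sketch, not the SDK's own batching: `chunked` and `extract_in_batches` are hypothetical helpers, the batch size of 20 is an assumed per-request limit, and the `failed_results` key is an assumption about the Extract response shape. Check the Extract API reference for both.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def extract_in_batches(client, urls, batch_size=20, **extract_kwargs):
    """Call client.extract once per batch and merge the responses.

    batch_size=20 is an assumed per-request cap, not a documented value here.
    """
    results, failed = [], []
    for batch in chunked(urls, batch_size):
        response = client.extract(urls=batch, **extract_kwargs)
        results.extend(response.get("results", []))
        # Assumption: URLs that could not be extracted are listed under
        # "failed_results".
        failed.extend(response.get("failed_results", []))
    return results, failed
```

With the client and `map_response` from the example above, a call might look like `results, failed = extract_in_batches(client, map_response["results"], extract_depth="advanced")`.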

Critical Knobs

max_depth
  • Default: 1
  • Higher values discover more pages but take longer

limit
  • Default: 50
  • Total URL cap before stopping

select_paths / exclude_paths
  • Regex patterns to include or exclude URL paths
  • Example: "/docs/.*" to target docs, "/blog/.*" to skip blog posts

instructions
  • Natural-language guidance for the mapper
  • Use when regex patterns aren’t enough and you need semantic filtering
  • Example: "Find pages related to the Python SDK"

For the complete parameter list, see the Map API reference.
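
One way to keep these knobs in one place is a small helper that checks them locally and returns keyword arguments for `client.map`. This is a sketch under assumptions: `build_map_kwargs` is hypothetical, the `limit` name for the URL cap is assumed, and patterns are checked with Python's `re` syntax, which may not match the service's regex dialect exactly.

```python
import re


def build_map_kwargs(url, max_depth=1, limit=50, select_paths=None,
                     exclude_paths=None, instructions=None):
    """Validate the knobs locally and return keyword arguments for client.map."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    if limit < 1:
        raise ValueError("limit must be at least 1")

    # Catch malformed patterns early instead of after a network round trip.
    for pattern in list(select_paths or []) + list(exclude_paths or []):
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"invalid path pattern {pattern!r}: {exc}") from exc

    kwargs = {"url": url, "max_depth": max_depth, "limit": limit}
    if select_paths:
        kwargs["select_paths"] = list(select_paths)
    if exclude_paths:
        kwargs["exclude_paths"] = list(exclude_paths)
    if instructions:
        kwargs["instructions"] = instructions
    return kwargs


map_kwargs = build_map_kwargs(
    "https://docs.tavily.com",
    max_depth=2,
    select_paths=["/docs/.*"],
    exclude_paths=["/blog/.*"],
)
print(map_kwargs)
```

The result can then be passed along as `client.map(**map_kwargs)`.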

Next Steps

Map API Reference

Full parameter list, response schema, and interactive playground.

Extract API Tutorial

Learn Extract in depth: batch processing, query-focused extraction, and more.

Python SDK Reference

Python client methods, async support, and type details.

JavaScript SDK Reference

JavaScript/TypeScript client methods and usage.