Instantiating a client
To interact with Tavily in Python, you must instantiate a client with your API key. For greater flexibility, we provide both a synchronous and an asynchronous client class. Once you have instantiated a client, call one of our supported methods (detailed below) to access the API.
Synchronous Client
Asynchronous Client
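A minimal sketch of creating both clients, assuming the `tavily-python` package is installed. The `TAVILY_API_KEY` variable name and the helper function names are assumptions made for illustration; they are not defined on this page.

```python
import os


def read_api_key(env=os.environ):
    """Return the API key from the environment (the variable name is an assumption)."""
    key = env.get("TAVILY_API_KEY")
    if not key:
        raise RuntimeError("Set TAVILY_API_KEY before creating a client")
    return key


def make_sync_client(api_key):
    # Imported lazily so this module loads even without tavily-python installed.
    from tavily import TavilyClient
    return TavilyClient(api_key=api_key)


def make_async_client(api_key):
    # The asynchronous client exposes the same methods as coroutines.
    from tavily import AsyncTavilyClient
    return AsyncTavilyClient(api_key=api_key)
```

With a valid key in the environment, `make_sync_client(read_api_key())` would return a client ready for the methods described below.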
Proxies
If you would like to specify a proxy to be used when making requests, you can do so by passing in a proxy parameter on client instantiation. Proxy configuration is available in both the synchronous and asynchronous clients. Alternatively, you can set the `TAVILY_HTTP_PROXY` and `TAVILY_HTTPS_PROXY` variables in your environment file.
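The sketch below shows one way to collect proxy settings from the two environment variables named above. It assumes the client accepts a `proxies` keyword holding a scheme-to-URL mapping; that keyword name and both helper names are assumptions, so check them against your installed SDK version.

```python
import os


def proxies_from_env(env=os.environ):
    """Collect proxy URLs from the two variables named above, skipping unset ones."""
    mapping = {
        "http": env.get("TAVILY_HTTP_PROXY"),
        "https": env.get("TAVILY_HTTPS_PROXY"),
    }
    return {scheme: url for scheme, url in mapping.items() if url}


def make_client_with_proxies(api_key, proxies):
    # Assumption: the keyword is `proxies` and takes a scheme -> URL mapping.
    from tavily import TavilyClient
    return TavilyClient(api_key=api_key, proxies=proxies)
```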
Tavily Search
NEW! Try our interactive API Playground to see each parameter in action, and generate ready-to-use Python snippets.
You can access Tavily Search in Python through the client’s `search` function.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| query (required) | str | The query to run a search on. | — |
| auto_parameters | bool | When `auto_parameters` is enabled, Tavily automatically configures search parameters based on your query’s content and intent. You can still set other parameters manually, and your explicit values will override the automatic ones. The parameters `include_answer`, `include_raw_content`, and `max_results` must always be set manually, as they directly affect response size. Note: `search_depth` may be automatically set to "advanced" when it’s likely to improve results. This uses 2 API credits per request. To avoid the extra cost, you can explicitly set `search_depth` to "basic". | False |
| search_depth | str | The depth of the search. It can be "basic" or "advanced". "advanced" search is tailored to retrieve the most relevant sources and `content` snippets for your query, while "basic" search provides generic content snippets from each source. | "basic" |
| topic | str | The category of the search. Determines which agent will be used. Supported values are "general", "news" and "finance". | "general" |
| time_range | str | The time range back from the current date based on publish date or last updated date. Accepted values include "day", "week", "month", "year" or shorthand values "d", "w", "m", "y". | — |
| start_date | str | Will return all results after the specified start date based on publish date or last updated date. Required to be written in the format YYYY-MM-DD. | — |
| end_date | str | Will return all results before the specified end date based on publish date or last updated date. Required to be written in the format YYYY-MM-DD. | — |
| max_results | int | The maximum number of search results to return. It must be between 0 and 20. | 5 |
| chunks_per_source | int | Chunks are short content snippets (maximum 500 characters each) pulled directly from the source. Use `chunks_per_source` to define the maximum number of relevant chunks returned per source and to control the `content` length. Chunks will appear in the `content` field as: `<chunk 1> [...] <chunk 2> [...] <chunk 3>`. Available only when `search_depth` is "advanced". | 3 |
| include_images | bool | Include a list of query-related images in the response. | False |
| include_image_descriptions | bool | Include a list of query-related images and their descriptions in the response. | False |
| include_answer | bool or str | Include an answer to the query generated by an LLM based on search results. A "basic" (or `True`) answer is quick but less detailed; an "advanced" answer is more detailed. | False |
| include_raw_content | bool or str | Include the cleaned and parsed HTML content of each search result. "markdown" or `True` returns search result content in markdown format. "text" returns the plain text from the results and may increase latency. | False |
| include_domains | list[str] | A list of domains to specifically include in the search results. Maximum 300 domains. | [] |
| exclude_domains | list[str] | A list of domains to specifically exclude from the search results. Maximum 150 domains. | [] |
| country | str | Boost search results from a specific country. This will prioritize content from the selected country in the search results. Available only if `topic` is "general". | — |
| timeout | int | A timeout to be used in requests to the Tavily API. | 60 |
| include_favicon | bool | Whether to include the favicon URL for each result. | False |
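As a rough illustration of the constraints in the table above, the sketch below validates a few parameters locally before calling `search`. The helper names `build_search_kwargs` and `run_search` are hypothetical, and the checks mirror only what the table states; the API performs its own validation.

```python
def build_search_kwargs(query, search_depth="basic", max_results=5, **extra):
    """Check a few documented constraints and return keyword arguments for search()."""
    if search_depth not in ("basic", "advanced"):
        raise ValueError('search_depth must be "basic" or "advanced"')
    if not 0 <= max_results <= 20:
        raise ValueError("max_results must be between 0 and 20")
    if "chunks_per_source" in extra and search_depth != "advanced":
        raise ValueError('chunks_per_source requires search_depth="advanced"')
    return {
        "query": query,
        "search_depth": search_depth,
        "max_results": max_results,
        **extra,
    }


def run_search(client, **kwargs):
    # `client` is expected to be a Tavily client; any object with .search() works.
    return client.search(**kwargs)
```

For example, `run_search(client, **build_search_kwargs("latest AI news", topic="news", max_results=3))` would issue a news search limited to three results.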
Response format
The response object you receive will be in the following format:
| Key | Type | Description |
|---|---|---|
| results | list[Result] | A list of sorted search results ranked by relevancy. |
| query | str | Your search query. |
| response_time | float | Your search result response time. |
| answer (optional) | str | The answer to your search query, generated by an LLM based on Tavily’s search results. This is only available if `include_answer` is set to `True`. |
| images (optional) | list[str] or list[ImageResult] | This is only available if `include_images` is set to `True`. A list of query-related image URLs. If `include_image_descriptions` is set to `True`, each entry will be an `ImageResult`. |
| request_id | str | A unique request identifier you can share with customer support to help resolve issues with specific requests. |
Results
| Key | Type | Description | 
|---|---|---|
| title | str | The title of the search result. | 
| url | str | The URL of the search result. | 
| content | str | The most query-related content from the scraped URL. Tavily uses proprietary AI to extract the most relevant content based on context quality and size. | 
| score | float | The relevance score of the search result. | 
| raw_content (optional) | str | The parsed and cleaned HTML content of the site. This is only available if `include_raw_content` is set to `True`. |
| published_date (optional) | str | The publication date of the source. This is only available if the search `topic` is set to "news". |
| favicon(optional) | str | The favicon URL for the search result. | 
Image Results
If `include_image_descriptions` is set to `True`, each image in the `images` list will be in the following `ImageResult` format:
| Key | Type | Description |
|---|---|---|
| url | str | The URL of the image. |
| description | str | An LLM-generated description of the image. |
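The response is a plain dictionary, so it can be processed with ordinary Python. The sketch below, which uses hypothetical helper names, filters results by `score` and normalizes the `images` list, whose entries are either URL strings or `ImageResult` dictionaries depending on `include_image_descriptions`.

```python
def summarize_results(response, min_score=0.0):
    """Return (title, url, score) tuples for results at or above min_score."""
    return [
        (r["title"], r["url"], r["score"])
        for r in response.get("results", [])
        if r["score"] >= min_score
    ]


def image_urls(response):
    """Return image URLs whether entries are plain strings or ImageResult dicts."""
    return [
        img["url"] if isinstance(img, dict) else img
        for img in response.get("images", [])
    ]
```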
Example
Request
Response
Tavily Extract
You can access Tavily Extract in Python through the client’s `extract` function.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| urls (required) | str or list[str] | The URL (or URLs) you want to extract. If a list is provided, it must not contain more than 20 URLs. | — |
| include_images | bool | Include a list of images extracted from the URLs in the response. | False |
| extract_depth | str | The depth of the extraction process. You may experience higher latency with "advanced" extraction, but it offers a higher success rate and retrieves more data from the URL (e.g., tables, embedded content). "basic" extraction costs 1 API Credit per 5 successful URL extractions, while "advanced" extraction costs 2 API Credits per 5 successful URL extractions. | "basic" |
| format | str | The format of the extracted web page content. "markdown" returns content in markdown format. "text" returns plain text and may increase latency. | "markdown" |
| timeout | float | A timeout to be used in requests to the Tavily API. Maximum time in seconds to wait for the URL extraction before timing out. Must be between 1.0 and 60.0 seconds. If not specified, default timeouts are applied based on `extract_depth`: 10 seconds for basic extraction and 30 seconds for advanced extraction. | None |
| include_favicon | bool | Whether to include the favicon URL for each result. | False |
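Because a single `extract` call accepts at most 20 URLs, longer lists need to be split. The following sketch shows one possible batching approach; `batch_urls` and `extract_all` are hypothetical helpers, and `client` is assumed to be an already instantiated Tavily client.

```python
def batch_urls(urls, batch_size=20):
    """Split a URL (or list of URLs) into lists of at most batch_size items."""
    if isinstance(urls, str):
        urls = [urls]
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


def extract_all(client, urls, **kwargs):
    """Call extract() once per batch and merge successes and failures."""
    results, failed = [], []
    for batch in batch_urls(urls):
        response = client.extract(urls=batch, **kwargs)
        results.extend(response.get("results", []))
        failed.extend(response.get("failed_results", []))
    return results, failed
```

Extra keyword arguments such as `extract_depth="advanced"` are forwarded unchanged to every batch.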
Response format
The response object you receive will be in the following format:
| Key | Type | Description |
|---|---|---|
| results | list[SuccessfulResult] | A list of extracted content. | 
| failed_results | list[FailedResult] | A list of URLs that could not be processed. | 
| response_time | float | The search result response time. | 
| request_id | str | A unique request identifier you can share with customer support to help resolve issues with specific requests. | 
Successful Results
Each successful result in the `results` list will be in the following `SuccessfulResult` format:
| Key | Type | Description | 
|---|---|---|
| url | str | The URL of the webpage. | 
| raw_content | str | The raw content extracted. | 
| images (optional) | list[str] | This is only available if `include_images` is set to `True`. A list of extracted image URLs. |
| favicon(optional) | str | The favicon URL for the search result. | 
Failed Results
Each failed result in the `failed_results` list will be in the following `FailedResult` format:
| Key | Type | Description | 
|---|---|---|
| url | str | The URL that failed. | 
| error | str | An error message describing why it could not be processed. | 
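One way to consume an extract response is to index the successes and failures by URL. This is a small sketch with a hypothetical helper name; it relies only on the keys listed in the tables above.

```python
def split_extract_response(response):
    """Return ({url: raw_content}, {url: error}) from an extract response."""
    contents = {r["url"]: r["raw_content"] for r in response.get("results", [])}
    errors = {f["url"]: f["error"] for f in response.get("failed_results", [])}
    return contents, errors
```

The `errors` mapping can then be logged or used to decide which URLs to retry.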
Example
Request
Response
Tavily Crawl
Our agent-first crawl endpoint is currently in open beta. Please report any issues you encounter on our community page.
You can access Tavily Crawl in Python through the client’s `crawl` function.
Parameters
| Parameter | Type | Description | Default | 
|---|---|---|---|
| url(required) | str | The root URL to begin the crawl. | — | 
| max_depth | int | Max depth of the crawl. Defines how far from the base URL the crawler can explore. | 1 | 
| max_breadth | int | Max number of links to follow per level of the tree (i.e., per page). | 20 | 
| limit | int | Total number of links the crawler will process before stopping. | 50 | 
| instructions | str | Natural language instructions for the crawler. | — | 
| select_paths | list[str] | Regex patterns to select only URLs with specific path patterns (e.g., "/docs/.*", "/api/v1.*"). | None | 
| select_domains | list[str] | Regex patterns to restrict crawling to specific domains or subdomains (e.g., "^docs\.example\.com$"). | None | 
| exclude_paths | list[str] | Regex patterns to exclude URLs with specific path patterns (e.g., "/private/.*", "/admin/.*"). | None | 
| exclude_domains | list[str] | Regex patterns to exclude specific domains or subdomains from crawling (e.g., "^private\.example\.com$"). | None | 
| allow_external | bool | Whether to allow following links that go to external domains. | True | 
| include_images | bool | Whether to extract image URLs from the crawled pages. | False | 
| extract_depth | str | Advanced extraction retrieves more data, including tables and embedded content, with higher success but may increase latency. Options: "basic" or "advanced". | "basic" | 
| format | str | The format of the extracted web page content. "markdown" returns content in markdown format. "text" returns plain text and may increase latency. | "markdown" | 
| include_favicon | bool | Whether to include the favicon URL for each result. | False | 
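To get a feel for how `select_paths` and `exclude_paths` patterns behave, they can be previewed locally before starting a crawl. The sketch below assumes patterns are applied to the URL path with `re.search` and that exclusions take precedence; the crawler's actual matching rules may differ. The helper names and the chosen `max_depth` and `limit` values are illustrative only.

```python
import re
from urllib.parse import urlparse


def path_allowed(url, select_paths=None, exclude_paths=None):
    """Local preview of path filtering (assumed semantics, not the crawler's own)."""
    path = urlparse(url).path
    if exclude_paths and any(re.search(p, path) for p in exclude_paths):
        return False
    if select_paths:
        return any(re.search(p, path) for p in select_paths)
    return True


def crawl_docs(client, root_url):
    # `client` is assumed to be an instantiated Tavily client.
    return client.crawl(
        url=root_url,
        max_depth=2,
        limit=50,
        select_paths=["/docs/.*"],
        exclude_paths=["/private/.*"],
    )
```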
Response format
The response object you receive will be in the following format:
| Key | Type | Description |
|---|---|---|
| base_url | str | The URL you started the crawl from. | 
| results | list[Result] | A list of crawled pages. | 
| response_time | float | The crawl response time. | 
| request_id | str | A unique request identifier you can share with customer support to help resolve issues with specific requests. | 
Results
Each successful result in the `results` list will be in the following `Result` format:
| Key | Type | Description | 
|---|---|---|
| url | str | The URL of the webpage. | 
| raw_content | str | The raw content extracted. | 
| images | list[str] | Image URLs extracted from the page. | 
| favicon(optional) | str | The favicon URL for the search result. | 
Example
Request
Response
Tavily Map
Our agent-first mapping endpoint is currently in open beta. Please report any issues you encounter on our community page.
You can access Tavily Map in Python through the client’s `map` function.
Parameters
| Parameter | Type | Description | Default | 
|---|---|---|---|
| url(required) | str | The root URL to begin the mapping. | — | 
| max_depth | int | Max depth of the mapping. Defines how far from the base URL the crawler can explore. | 1 | 
| max_breadth | int | Max number of links to follow per level of the tree (i.e., per page). | 20 | 
| limit | int | Total number of links the crawler will process before stopping. | 50 | 
| instructions | str | Natural language instructions for the crawler. | — | 
| select_paths | list[str] | Regex patterns to select only URLs with specific path patterns (e.g., "/docs/.*", "/api/v1.*"). | None | 
| select_domains | list[str] | Regex patterns to restrict crawling to specific domains or subdomains (e.g., "^docs\.example\.com$"). | None | 
| exclude_paths | list[str] | Regex patterns to exclude URLs with specific path patterns (e.g., "/private/.*", "/admin/.*"). | None | 
| exclude_domains | list[str] | Regex patterns to exclude specific domains or subdomains from crawling (e.g., "^private\.example\.com$"). | None | 
| allow_external | bool | Whether to allow following links that go to external domains. | True | 
Response format
The response object you receive will be in the following format:
| Key | Type | Description |
|---|---|---|
| base_url | str | The URL you started the mapping from. | 
| results | list[str] | A list of URLs that were discovered during the mapping. | 
| response_time | float | The mapping response time. | 
| request_id | str | A unique request identifier you can share with customer support to help resolve issues with specific requests. | 
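Since the map response's `results` key is a flat list of URL strings, a common next step is to organize it. The sketch below groups discovered URLs by their first path segment; `group_by_section` is a hypothetical helper, not part of the SDK.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(map_response):
    """Group discovered URLs by first path segment ("/" for the site root)."""
    groups = defaultdict(list)
    for url in map_response.get("results", []):
        segments = [s for s in urlparse(url).path.split("/") if s]
        groups[segments[0] if segments else "/"].append(url)
    return dict(groups)
```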
Example
Request
Response
Tavily Hybrid RAG
Tavily Hybrid RAG is an extension of the Tavily Search API built to retrieve relevant data from both the web and an existing database collection. This way, a RAG agent can combine web sources and locally available data to perform its tasks. Additionally, data queried from the web that is not yet in the database can optionally be inserted into it. This will allow similar searches in the future to be answered faster, without the need to query the web again.
Parameters
The `TavilyHybridClient` class is your gateway to Tavily Hybrid RAG. There are a few important parameters to keep in mind when you are instantiating a Tavily Hybrid Client.
| Parameter | Type | Description | Default |
|---|---|---|---|
| api_key | str | Your Tavily API Key. | |
| db_provider | str | Your database provider. Currently, only "mongodb" is supported. | |
| collection | str | A reference to the MongoDB collection that will be used for local search. | |
| embeddings_field (optional) | str | The name of the field that stores the embeddings in the specified collection. This field MUST be the same one used in the specified index. This will also be used when inserting web search results in the database using our default function. | "embeddings" |
| content_field (optional) | str | The name of the field that stores the text content in the specified collection. This will also be used when inserting web search results in the database using our default function. | "content" |
| embedding_function (optional) | function | A custom embedding function (if you want to use one). The function must take in a `list[str]` corresponding to the list of strings to be embedded, as well as an additional string defining the type of document. It must return a `list[list[float]]`, one embedding per input string. If no function is provided, defaults to Cohere’s Embed. Keep in mind that you shouldn’t mix different embeddings in the same database collection. | |
| ranking_function (optional) | function | A custom ranking function (if you want to use one). If no function is provided, defaults to Cohere’s Rerank. It should return an ordered `list[dict]` where the documents are sorted by decreasing relevancy to your query. Each returned document will have two properties: `content`, which is a `str`, and `score`, which is a `float`. The function MUST accept the following parameters: `query` (`str`): the query you are executing. When your ranking function is called during Hybrid RAG, the `query` parameter of your `search` call (more details below) will be passed as `query`. `documents` (`List[Dict]`): the list of documents that are returned by your Hybrid RAG call and that you want to sort. Each document will have two properties: `content`, which is a `str`, and `score`, which is a `float`. `top_n` (`int`): the number of results you want to return after ranking. When your ranking function is called during Hybrid RAG, the `max_results` value will be passed as `top_n`. | |
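The signatures described in the table can be sketched as follows. Both functions are toy stand-ins written for illustration: the embedding is a deterministic hash-based vector with no semantic meaning, and the ranking simply reorders by the existing `score`. A real setup would call an embedding or reranking model instead.

```python
import hashlib


def toy_embedding_function(texts, doc_type):
    """Return one small vector per input string (illustration only, not semantic).

    `doc_type` is the additional string describing the document type; this toy
    version accepts it to match the required signature but does not use it.
    """
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([byte / 255.0 for byte in digest[:8]])
    return vectors


def score_ranking_function(query, documents, top_n):
    """Order documents by their existing score, highest first, and keep top_n."""
    ranked = sorted(documents, key=lambda doc: doc["score"], reverse=True)
    return ranked[:top_n]
```

These would be passed as `embedding_function=toy_embedding_function` and `ranking_function=score_ranking_function` when instantiating the client.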
Methods
search(query, max_results=10, max_local=None, max_foreign=None, save_foreign=False, **kwargs)
Performs a Tavily Hybrid RAG query and returns the retrieved documents as a list[dict] where the documents are sorted by decreasing relevancy to your query. Each returned document will have three properties - content (str), score (float), and origin, which is either local or foreign.
| Parameter | Type | Description | Default |
|---|---|---|---|
| query | str | The query you want to search for. | — |
| max_results | int | The maximum number of total search results to return. | 10 |
| max_local | int | The maximum number of local search results to return. | None, which defaults to `max_results`. |
| max_foreign | int | The maximum number of web search results to return. | None, which defaults to `max_results`. |
| save_foreign | Union[bool, function] | Save documents from the web search in the local database. If `True` is passed, our default saving function (which only saves the content `str` and the embedding `list[float]`) will be used. If `False` is passed, no web search result documents will be saved in the local database. If a function is passed, that function MUST take in a `dict` as a parameter, and return another `dict`. The input `dict` contains all properties of the returned Tavily result object. The output `dict` is the final document that will be inserted in the database. You are free to add to it any fields that are supported by the database, as well as remove any of the default ones. If this function returns `None`, the document will not be saved in the database. | False |
The following Tavily Search parameters can also be passed as keyword arguments: `search_depth`, `topic`, `include_raw_content`, `include_domains`, `exclude_domains`.
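A custom `save_foreign` function could look like the sketch below. The 0.5 score threshold and the `site_title` and `site_url` field names are assumptions chosen for illustration; the input keys used here (`content`, `title`, `url`, `score`) are those of a Tavily search result.

```python
def save_document(result):
    """Return the document to insert, or None to skip saving this result."""
    if result.get("score", 0.0) < 0.5:  # Threshold is an arbitrary example value.
        return None
    return {
        "content": result["content"],
        # Extra fields beyond the defaults (hypothetical names).
        "site_title": result.get("title"),
        "site_url": result.get("url"),
    }
```

Passing `save_foreign=save_document` to `search` would then store only the higher-scoring web results.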
Setup
MongoDB setup
You will need to have a MongoDB collection with a vector search index. You can follow the MongoDB Documentation to learn how to set this up.
Cohere API Key
By default, embedding and ranking use the Cohere API, our recommended option. Unless you want to provide a custom embedding and ranking function, you’ll need to get an API key from Cohere and set it as an environment variable named `CO_API_KEY`.
If you decide to stick with Cohere, please note that you’ll need to install the Cohere Python package as well (for example, with `pip install cohere`).
Tavily Hybrid RAG Client setup
Once you are done setting up your database, you’ll need to create a MongoDB Client as well as a Tavily Hybrid RAG Client. A minimal setup would look like this:
Usage
Once you create the proper clients, you can easily start searching. A few simple examples are shown below. They assume you’ve followed earlier steps. You can use most of the Tavily Search parameters with Tavily Hybrid RAG as well.
Simple Tavily Hybrid RAG example
This example will look for context about Leo Messi on the web and in the local database. Here, we get 5 sources, both from our database and from the web, but we want to exclude unwanted-domain.com from our web search results.
Note: `max_local` and `max_foreign` can exceed `max_results`, but only the top `max_results` results will be returned.
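The query described above can be sketched as follows, assuming `hybrid_rag` is a `TavilyHybridClient` created as outlined in the setup section. The query string and the helper names are illustrative; `split_by_origin` uses the `origin` property that each returned document carries.

```python
def search_messi(hybrid_rag):
    """Fetch up to 5 documents, excluding one domain from the web portion."""
    return hybrid_rag.search(
        "Who is Leo Messi?",
        max_results=5,
        exclude_domains=["unwanted-domain.com"],
    )


def split_by_origin(documents):
    """Separate returned documents into (local, foreign) lists."""
    local = [doc for doc in documents if doc["origin"] == "local"]
    foreign = [doc for doc in documents if doc["origin"] == "foreign"]
    return local, foreign
```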