Learn how to optimize your queries, refine search filters, and leverage advanced parameters for better performance.
For efficient processing, keep your query concise: under 400 characters. Think of it as a query for an agent performing web search, not a long-form prompt. Queries that exceed the limit are rejected with an error.
If your query is complex or covers multiple topics, consider breaking it into smaller, more focused sub-queries and sending them as separate requests.
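A minimal sketch of a client-side guard for the 400-character limit described above, applied to a list of focused sub-queries before they are sent as separate requests. The helper name `validate_queries` and the constant are hypothetical and not part of any Tavily SDK; the example queries are placeholders.

```python
MAX_QUERY_CHARS = 400  # limit stated in the guidance above


def validate_queries(queries, limit=MAX_QUERY_CHARS):
    """Return the queries unchanged, or raise ValueError if one is too long.

    Uses a conservative reading of "under 400 characters": a query of
    exactly `limit` characters is also rejected.
    """
    for query in queries:
        if len(query) >= limit:
            raise ValueError(
                f"Query has {len(query)} characters; keep it under {limit}."
            )
    return list(queries)


# One broad question split into focused sub-queries (placeholder text).
sub_queries = [
    "EU battery recycling regulations",
    "battery recycling companies in Europe",
]
checked = validate_queries(sub_queries)
print(checked)
```

Checking locally avoids spending a request on a query that would be rejected anyway.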
max_results (Limiting the number of results)
Limits the number of results returned. The default is 5.
content (NLP-based snippet)
When search_depth is set to advanced, it extracts content closely aligned with your query, surfacing the most valuable sections of a web page rather than a generic summary. Additionally, it uses chunks_per_source to determine the number of content chunks to return per source.
search_depth=advanced (Ideal for higher relevance in search results)
Setting include_raw_content to true can improve retrieval precision and makes it more likely that the desired number of chunks_per_source is returned.
time_range (Filtering by Date)
include_raw_content (Extracted web content)
Set to true to return the full extracted content of the web page, which is useful for deeper content analysis. However, the recommended approach for extracting web page content is a two-step process.
For more information on this two-step process, please refer to the Best Practices for the Extract API.
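The parameters discussed above can be collected into a single request payload. The sketch below is illustrative only: `build_search_params` is a hypothetical helper, the `"basic"` depth value is an assumption (the text above only names `advanced`), and the exact payload shape the API expects should be checked against the API reference.

```python
from typing import Any, Optional


def build_search_params(
    query: str,
    *,
    max_results: int = 5,                      # default stated above
    search_depth: str = "basic",               # "basic" is an assumed value
    chunks_per_source: Optional[int] = None,
    include_raw_content: bool = False,
) -> dict[str, Any]:
    """Assemble search parameters using the names from this guide."""
    if search_depth not in ("basic", "advanced"):
        raise ValueError("search_depth must be 'basic' or 'advanced'")
    # The guide describes chunks_per_source only in the advanced-depth context.
    if chunks_per_source is not None and search_depth != "advanced":
        raise ValueError("chunks_per_source requires search_depth='advanced'")

    params: dict[str, Any] = {
        "query": query,
        "max_results": max_results,
        "search_depth": search_depth,
        "include_raw_content": include_raw_content,
    }
    if chunks_per_source is not None:
        params["chunks_per_source"] = chunks_per_source
    return params


params = build_search_params(
    "battery recycling regulations",
    search_depth="advanced",
    chunks_per_source=3,
    include_raw_content=True,
)
print(params)
```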
topic=news (Filtering news sources)
Results include published_date metadata.
days specifies the number of days back from the current date to include in the results. The default is 3.
include_domains (Restricting searches to specific domains)
Curate the include_domains list and make sure the domains in it are relevant to your search query.
exclude_domains (Excluding specific domains)
Curate the exclude_domains list to ensure you only exclude domains that are truly irrelevant to your query.
Example: Limit to U.S.-based websites (.com domain):
Example: Exclude Icelandic websites (.is domain):
Example: Restrict search to .com but exclude example.com:
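The three domain filters described above might be expressed as follows. The wildcard pattern strings (`*.com`, `*.is`) are an assumption about what the API accepts, and `domain_allowed` is a hypothetical local helper that only illustrates the intended include/exclude semantics; it is not how the service itself evaluates the lists.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Parameter fragments for the three cases (pattern syntax is an assumption).
com_only = {"include_domains": ["*.com"]}
no_iceland = {"exclude_domains": ["*.is"]}
com_without_example = {
    "include_domains": ["*.com"],
    "exclude_domains": ["example.com"],
}


def domain_allowed(url, include_domains=(), exclude_domains=()):
    """Return True if the URL's host passes the include/exclude lists."""
    host = urlparse(url).hostname or ""
    # Exclusions win over inclusions.
    if any(fnmatch(host, pattern) for pattern in exclude_domains):
        return False
    # An empty include list means "no restriction".
    if not include_domains:
        return True
    return any(fnmatch(host, pattern) for pattern in include_domains)


print(domain_allowed("https://news.acme.com/story", **com_without_example))
print(domain_allowed("https://example.com/page", **com_without_example))
```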
Use async/await to ensure non-blocking API requests.
Initialize AsyncTavilyClient once and reuse it for multiple requests.
Use asyncio.gather for handling multiple queries concurrently.
Example:
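A minimal sketch of the pattern above: one client instance, reused for every request, with the queries dispatched concurrently through `asyncio.gather`. To keep the example runnable without an API key, it uses a hypothetical `StubAsyncClient` whose response shape is made up; in real use the `client` argument would be an `AsyncTavilyClient` instance, assuming it exposes an awaitable `search(query)` method.

```python
import asyncio


class StubAsyncClient:
    """Hypothetical stand-in for a real async search client."""

    async def search(self, query: str, **kwargs) -> dict:
        await asyncio.sleep(0)  # yield to the event loop, like real I/O would
        return {"query": query, "results": []}


async def search_many(client, queries):
    """Run all queries concurrently on one shared client."""
    tasks = [client.search(query) for query in queries]
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    client = StubAsyncClient()  # create once, reuse for every request
    responses = asyncio.run(
        search_many(client, ["solar storage costs", "wind turbine recycling"])
    )
    for response in responses:
        print(response["query"], len(response["results"]))
```

Because `asyncio.gather` preserves input order, each response can be paired with its query by position.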
When working with Tavily’s Search API, refining search results through post-processing techniques can significantly enhance the relevance of the retrieved information.
One of the most effective ways to refine search results is by using a combination of LLMs and deterministic keyword filtering.
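A minimal sketch of the deterministic half of that combination: a keyword filter that runs before any LLM step, so that only plausible candidates are passed on. The function name, the sample results, and the assumption that each result is a dict with `title` and `content` keys are illustrative; the LLM call itself is left out.

```python
def keyword_filter(results, keywords, fields=("title", "content")):
    """Keep results whose selected fields mention at least one keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for result in results:
        # Join the chosen fields into one lowercase string for matching.
        text = " ".join(str(result.get(field) or "") for field in fields).lower()
        if any(keyword in text for keyword in lowered):
            kept.append(result)
    return kept


# Placeholder results shaped like search hits (assumed structure).
sample_results = [
    {"title": "Battery recycling rules", "content": "New targets for reuse."},
    {"title": "Football scores", "content": "Weekend results."},
    {"title": "Grid storage", "content": "Recycling of lithium cells."},
]

candidates = keyword_filter(sample_results, ["recycling"])
print([result["title"] for result in candidates])
```

The surviving candidates could then be handed to an LLM for finer-grained judgment, which keeps the more expensive step focused on a smaller set.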
By applying keyword filters before or after processing results with an LLM, you can:
Tavily’s Search API provides rich metadata that can be leveraged to refine and prioritize search results. By incorporating metadata into post-processing logic, you can improve precision in selecting the most relevant content.
title: Helps identify articles that are more likely to be relevant based on their headlines. Filtering results by keyword occurrences in the title can improve result relevancy.
raw_content: Provides the extracted content from the web page, allowing deeper analysis. If the content does not provide enough information, raw content can be useful for further filtering and ranking. You can also use the Extract API with a two-step extraction process. For more information, see Best Practices for Extract API.
score: Represents the relevancy between the query and the retrieved content snippet. Higher scores typically indicate better matches.
content: Offers a general summary of the webpage, providing a quick way to gauge relevance without processing the full content. When search_depth is set to advanced, the content is more closely aligned with the query, offering valuable insights.
By leveraging these metadata elements, you can:
score Parameter
Tavily assigns a score to each search result, indicating how well the content aligns with the query. This score helps in ranking and selecting the most relevant results.
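As a rough illustration of score-based ranking, the sketch below drops results under a cutoff and returns the rest in descending score order. The threshold of 0.5, the helper name `top_results`, and the sample scores are arbitrary assumptions; a suitable cutoff depends on your queries and should be tuned empirically.

```python
def top_results(results, min_score=0.5, limit=3):
    """Filter by a minimum score, then return the best `limit` results."""
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)[:limit]


# Placeholder results; only the `score` field name comes from the guide.
scored_results = [
    {"title": "A", "score": 0.42},
    {"title": "B", "score": 0.91},
    {"title": "C", "score": 0.67},
    {"title": "D", "score": 0.55},
]

best = top_results(scored_results, min_score=0.5, limit=2)
print([(r["title"], r["score"]) for r in best])
```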
What does the score mean?
score is a numerical measure of relevance between the content and the query.
In addition to leveraging LLMs and metadata for refining search results, Python’s re.search and re.findall functions can play a crucial role in post-processing by allowing you to parse and extract specific data from the raw_content. These functions enable pattern-based filtering and extraction, enhancing the precision and relevance of the processed results.
re.search and re.findall
re.search: Scans the content for the first occurrence of a specified pattern and returns a match object (or None if nothing matches), which can be used to extract specific parts of the text.
Example:
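A small illustration of `re.search` on a made-up string standing in for `raw_content`. The sample text and the `Location:` label are placeholders, not real API output.

```python
import re

# Placeholder text standing in for a result's raw_content.
raw_content = "Company: Tavily, Location: New York"

# Capture everything after the "Location: " label.
match = re.search(r"Location: (.+)", raw_content)

# re.search returns None when nothing matches, so guard before using it.
location = match.group(1) if match else None
print(location)  # New York
```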
re.findall: Returns a list of all non-overlapping matches of a pattern in the content, making it suitable for extracting multiple instances of a pattern.
Example:
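A small illustration of `re.findall` collecting every email-like token from a made-up string. The pattern is deliberately simple and is not a full email validator; the sample text is a placeholder.

```python
import re

# Placeholder text standing in for a result's raw_content.
raw_content = "Contact sales@example.com or support@example.org for details"

# A simple email-like pattern; good enough for rough extraction only.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", raw_content)
print(emails)  # ['sales@example.com', 'support@example.org']
```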
Content Filtering: Use re.search to identify sections or specific patterns in content (e.g., dates, locations, company names).
Data Extraction: Use re.findall to extract multiple instances of specific data points (e.g., phone numbers, emails).
Improving Relevance: Apply regex patterns to remove irrelevant content, ensuring that only the most pertinent information remains.
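For the relevance use case, one possible approach is to strip lines that look like page chrome before further processing. The phrases in the pattern below and the helper name `strip_boilerplate` are illustrative assumptions; real pages will need their own patterns.

```python
import re

# Lines starting with these phrases are treated as page chrome (assumed list).
BOILERPLATE = re.compile(
    r"^(?:share|subscribe|sign in|accept cookies)\b.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_boilerplate(text: str) -> str:
    """Remove boilerplate lines and collapse the blank lines left behind."""
    cleaned = BOILERPLATE.sub("", text)
    return re.sub(r"\n{2,}", "\n", cleaned).strip()


page_text = (
    "Battery recycling rules tighten\n"
    "Share this article\n"
    "New targets take effect next year.\n"
    "Subscribe to our newsletter"
)
print(strip_boilerplate(page_text))
```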
By leveraging post-processing techniques such as LLM-assisted filtering, metadata analysis, and score-based ranking, along with regex-based data extraction, you can optimize Tavily’s Search API results for better relevance. Incorporating these methods into your workflow will help you extract high-quality insights tailored to your needs.