What You’ll Build

A research agent that merges your internal data (from a vector store, database, or any retrieval system) with live web research from Tavily. You provide a simple RAG function — the agent identifies gaps in your internal knowledge and fills them with web data, producing a comprehensive report with citations.

Why Hybrid Research?

Your internal data is your competitive edge — customer records, product specs, domain expertise. But it’s never complete. Markets shift, competitors launch products, and your knowledge base can’t keep up. The hybrid approach gives you:
  • Grounded answers rooted in your proprietary data
  • Complete coverage with real-time web context
  • Enrichment opportunities by storing relevant web findings back into your knowledge base

Modes

hybrid_research supports two modes: "fast" and "multi_agent". Fast mode is best for quick answers, lower latency, and cost-sensitive applications. It runs four steps:
  1. Query your internal RAG
  2. Generate subqueries based on what’s missing
  3. Parallel web search with deduplication
  4. Synthesize everything into a report
from tavily_agent_toolkit import hybrid_research, ModelConfig, ModelObject

# vector_store is your existing retrieval client (e.g. a vector store)
def my_rag(query: str) -> str:
    results = vector_store.similarity_search(query, k=5)
    return "\n".join([doc.page_content for doc in results])

result = await hybrid_research(
    api_key="tvly-xxx",
    query="What's our competitor's current pricing strategy?",
    model_config=ModelConfig(
        model=ModelObject(model="openai:gpt-5.2")
    ),
    internal_rag_function=my_rag,
    mode="fast",
)

print(result["report"])
print(f"Sources: {len(result['web_sources'])} web pages")
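Step 3 above merges parallel web searches and drops duplicate hits. A minimal sketch of URL-based deduplication, assuming each search returns a list of dicts with a "url" key (this is an illustration, not the toolkit's actual implementation):

```python
def dedup_by_url(result_batches: list[list[dict]]) -> list[dict]:
    """Merge parallel search result batches, keeping the first hit per URL."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in result_batches:
        for result in batch:
            if result["url"] not in seen:
                seen.add(result["url"])
                merged.append(result)
    return merged

# Two subqueries returned overlapping results; only unique URLs survive.
batches = [
    [{"url": "https://a.example", "title": "A"}],
    [{"url": "https://a.example", "title": "A"}, {"url": "https://b.example", "title": "B"}],
]
print(len(dedup_by_url(batches)))  # 2
```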

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str` | Required | Tavily API key |
| `query` | `str` | Required | The research question |
| `model_config` | `ModelConfig` | Required | Your LLM configuration |
| `internal_rag_function` | `Callable` | Required | Function that takes a query string and returns relevant context |
| `mode` | `str` | `"fast"` | `"fast"` or `"multi_agent"` |
| `output_schema` | `OutputSchema` | `None` | Pydantic model for structured output |
| `research_synthesis_prompt` | `str` | `None` | Custom instructions for how the report is structured |

Return Value

{
    "report": str | BaseModel,  # Synthesized report (or structured output)
    "web_sources": [            # Sources used from web research
        {"title": "...", "url": "..."},
        ...
    ]
}
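A common next step is turning web_sources into a numbered citation list appended to the report. A small sketch, using a hand-written stand-in for the return value (the report text and source below are made-up examples):

```python
# Stand-in for an actual hybrid_research() return value.
result = {
    "report": "Competitor X moved to usage-based pricing in Q3. [1]",
    "web_sources": [
        {"title": "Competitor X Pricing Update", "url": "https://example.com/pricing"},
    ],
}

def render_citations(web_sources: list[dict]) -> str:
    """Format sources as a numbered citation list."""
    return "\n".join(
        f"[{i}] {s['title']} - {s['url']}"
        for i, s in enumerate(web_sources, start=1)
    )

print(result["report"])
print(render_citations(result["web_sources"]))
```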

Structured Output

Use output_schema to get consistent, parseable results:
from pydantic import Field
from tavily_agent_toolkit import OutputSchema

class CompetitorAnalysis(OutputSchema):
    company_name: str = Field(description="Name of the competitor")
    products: list[str] = Field(description="Main products or services")
    pricing: str = Field(description="Pricing strategy or model")
    strengths: list[str] = Field(description="Key competitive strengths")
    weaknesses: list[str] = Field(description="Known weaknesses or gaps")

result = await hybrid_research(
    api_key="tvly-xxx",
    query="Analyze Perplexity as a competitor",
    model_config=ModelConfig(
        model=ModelObject(model="groq:openai/gpt-oss-120b")
    ),
    internal_rag_function=my_rag,
    mode="fast",
    output_schema=CompetitorAnalysis,
)

analysis = CompetitorAnalysis.model_validate_json(result["report"])
print(f"Strengths: {analysis.strengths}")
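If the structured report arrives as a JSON string, it should carry exactly the fields declared on the schema. A dependency-free sketch of sanity-checking such a payload with the stdlib; all values below are made up, and in practice `model_validate_json` does this validation for you:

```python
import json

# Hypothetical example of what a structured report string could contain.
raw_report = json.dumps({
    "company_name": "ExampleCo",
    "products": ["Search API", "Answer engine"],
    "pricing": "Usage-based with a free tier",
    "strengths": ["Fast retrieval"],
    "weaknesses": ["Few enterprise integrations"],
})

analysis = json.loads(raw_report)
# The fields declared on CompetitorAnalysis above.
required = {"company_name", "products", "pricing", "strengths", "weaknesses"}
assert required.issubset(analysis), "structured output missing fields"
print(analysis["strengths"])  # ['Fast retrieval']
```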

Custom Synthesis

Guide how the report is structured with research_synthesis_prompt:
result = await hybrid_research(
    api_key="tvly-xxx",
    query="What are the latest developments in AI agents?",
    model_config=ModelConfig(
        model=ModelObject(model="groq:llama-3.3-70b-versatile")
    ),
    internal_rag_function=my_rag,
    mode="fast",
    research_synthesis_prompt="""
    Structure the report as:
    1. Executive Summary (2-3 sentences)
    2. Key Developments (bullet points)
    3. Impact on Our Product (specific recommendations)
    4. Sources

    Keep it under 500 words. Focus on actionable insights.
    """,
)

Data Enrichment Pattern

When your agent searches the web to fill knowledge gaps, those results are relevant to your users — otherwise the agent wouldn’t have needed them. This creates a flywheel:
  1. Agent queries internal data and finds gaps
  2. Agent searches the web to fill gaps
  3. Web results get synthesized into the answer
  4. Store those web results internally for future queries
result = await hybrid_research(...)

# store_in_knowledge_base is your own ingestion function;
# original_query is the query you passed to hybrid_research
for source in result["web_sources"]:
    store_in_knowledge_base(
        url=source["url"],
        title=source["title"],
        query_context=original_query,
    )
Over time, your knowledge base grows with exactly the information your users need.
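A minimal sketch of what store_in_knowledge_base could look like, using an in-memory dict keyed by URL so re-enriching the same page is a no-op. This is a hypothetical stand-in; a real version would chunk, embed, and upsert the page into your vector store:

```python
knowledge_base: dict[str, dict] = {}  # keyed by URL to skip duplicates

def store_in_knowledge_base(url: str, title: str, query_context: str) -> bool:
    """Store a web source once; return False if it was already enriched."""
    if url in knowledge_base:
        return False
    knowledge_base[url] = {"title": title, "query_context": query_context}
    return True

store_in_knowledge_base(
    url="https://example.com/pricing",
    title="Competitor Pricing Update",
    query_context="What's our competitor's current pricing strategy?",
)
print(len(knowledge_base))  # 1
```

Keeping query_context alongside the page lets you later trace which user questions drove each enrichment.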

Implementing Your RAG Function

The internal_rag_function is simple: take a query, return relevant context as a string.
def my_rag(query: str) -> str:
    results = your_retrieval_method(query)
    return "\n\n".join([
        f"Source: {r.source}\n{r.content}"
        for r in results
    ])
Tips:
  • Return 3-10 relevant chunks — enough context without overwhelming
  • Include source metadata (file names, URLs, doc IDs) for traceability
  • The hybrid researcher handles the rest: gap detection, web search, synthesis
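Putting the tips together, here is a self-contained toy my_rag over an in-memory corpus, ranked by naive keyword overlap. The corpus and scoring are illustrative only; in practice you would call your vector store's similarity search instead:

```python
CORPUS = [
    {"source": "notes/pricing.md", "content": "Our enterprise tier is priced per seat."},
    {"source": "notes/roadmap.md", "content": "Q3 roadmap focuses on agent tooling."},
]

def my_rag(query: str, k: int = 5) -> str:
    """Return the top-k chunks by keyword overlap, with source metadata."""
    terms = set(query.lower().split())
    scored = sorted(
        CORPUS,
        key=lambda doc: len(terms & set(doc["content"].lower().split())),
        reverse=True,
    )
    return "\n\n".join(
        f"Source: {doc['source']}\n{doc['content']}" for doc in scored[:k]
    )

print(my_rag("What is our pricing?"))
```

Note that each chunk is prefixed with its source, so the synthesized report can cite internal material as well as web pages.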

Next Steps

Tools Reference

Deep dive into search_and_answer, search_dedup, and the other retrieval primitives.

Chatbot

See how the chatbot routes between quick search and deep research.