Documentation Index
Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
Use this file to discover all available pages before exploring further.
Introduction
Cartesia Line is an SDK for building low-latency voice agents. Pairing Line with Tavily gives your voice agent live web access — use Tavily Search for fast, voice-friendly lookups and Tavily Extract for deep-dives into specific pages.
A complete reference implementation lives in the Cartesia Line repo.
Step-by-Step Integration Guide
Step 1: Install Required Packages
```shell
uv venv
uv add cartesia-line tavily-python loguru python-dotenv
```
Step 2: Set Up API Keys
Create a `.env` file:

```
TAVILY_API_KEY=tvly-your-api-key
OPENAI_API_KEY=your-openai-api-key
```
Wrap Tavily’s `AsyncTavilyClient` in two `@loopback_tool` functions so the voice agent can call them mid-conversation. Reusing a single `AsyncTavilyClient` across calls keeps the underlying HTTP session warm, which matters for latency on a live call.
```python
from typing import Annotated, Optional

from loguru import logger
from tavily import AsyncTavilyClient

from line.llm_agent import ToolEnv, loopback_tool

EXTRACT_MAX_CHARS = 3000


class TavilyTools:
    def __init__(self, api_key: str):
        self._client = AsyncTavilyClient(
            api_key=api_key,
            client_source="cartesia-line-agent",
        )

    @loopback_tool
    async def web_search(
        self,
        ctx: ToolEnv,
        query: Annotated[str, "The search query. Be specific and include key terms."],
        time_range: Annotated[
            Optional[str],
            "Optional time filter: 'day', 'week', 'month', or 'year'.",
        ] = None,
    ) -> str:
        """Search the web for current information."""
        kwargs: dict = {"query": query, "search_depth": "fast", "max_results": 5}
        if time_range is not None:
            kwargs["time_range"] = time_range
        response = await self._client.search(**kwargs)
        results = response.get("results", [])
        if not results:
            return "No relevant information found."
        parts = [f"Search Results for: '{query}'\n"]
        for i, result in enumerate(results):
            score = result.get("score", 0)
            parts.append(f"\n--- Source {i + 1}: {result['title']} (relevance: {score:.2f}) ---\n")
            if result.get("content"):
                parts.append(f"{result['content']}\n")
            parts.append(f"URL: {result['url']}\n")
        return "".join(parts)

    @loopback_tool
    async def web_extract(
        self,
        ctx: ToolEnv,
        url: Annotated[str, "The URL to extract content from."],
    ) -> str:
        """Extract the full content of a webpage given its URL."""
        response = await self._client.extract(urls=[url])
        results = response.get("results", [])
        if not results:
            failed = response.get("failed_results", [])
            if failed:
                return f"Extraction failed for {url}: {failed[0].get('error', 'unknown error')}"
            return "No content could be extracted from that URL."
        raw_content = results[0].get("raw_content", "")
        if not raw_content:
            return "The page was reached but no readable content was found."
        if len(raw_content) > EXTRACT_MAX_CHARS:
            raw_content = raw_content[:EXTRACT_MAX_CHARS] + "\n\n[Content truncated]"
        return f"Extracted content from {url}:\n\n{raw_content}"
```
Now wire the tools into an agent and expose them through a `VoiceAgentApp`:

```python
import os
from datetime import datetime

from line.llm_agent import LlmAgent, LlmConfig, end_call
from line.voice_agent_app import AgentEnv, CallRequest, VoiceAgentApp

SYSTEM_PROMPT = """Today is {today}. You are a fast research assistant on a live voice call.
Use `web_search` for current events, facts, prices, or anything that needs fresh data.
Use `web_extract` only when a search snippet is too thin — pass it a URL from a prior search.
Lead with the answer. Keep replies to two or three sentences unless asked for more.
This is a voice call: speak in plain sentences, no markdown, no lists, no special characters."""


async def get_agent(env: AgentEnv, call_request: CallRequest):
    api_key = os.environ["TAVILY_API_KEY"]
    tavily = TavilyTools(api_key=api_key)
    return LlmAgent(
        model="openai/gpt-5.4-mini",
        api_key=os.environ["OPENAI_API_KEY"],
        tools=[tavily.web_search, tavily.web_extract, end_call],
        config=LlmConfig(
            system_prompt=SYSTEM_PROMPT.format(today=datetime.now().strftime("%Y-%m-%d")),
            introduction="Hey! I'm your research assistant. Ask me anything.",
            max_tokens=600,
            temperature=0.7,
        ),
    )


app = VoiceAgentApp(get_agent=get_agent)

if __name__ == "__main__":
    app.run()
```
Run the agent and connect to it:
```shell
uv run main.py
# in another terminal
cartesia chat 8000
```
Choosing a Search Depth
Voice agents are latency-sensitive. Tavily exposes four search depths — for live calls, we recommend using fast or ultra-fast.
| Depth | Latency | Content Type | Cost | Best For |
|---|---|---|---|---|
| `ultra-fast` | Lowest | NLP summary per URL | 1 credit | Voice agents, real-time chat |
| `fast` | Low | Reranked chunks per URL | 1 credit | Chunk-based results with low latency |
| `basic` | Medium | NLP summary per URL | 1 credit | General-purpose search |
| `advanced` | Higher | Reranked chunks per URL | 2 credits | Precision-critical queries |
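The depth is currently hard-coded as `"fast"` inside `web_search`. One way to make it a single, validated knob is to assemble the search arguments in a small helper. This is a sketch: `build_search_kwargs` is an illustrative function, not part of the Tavily SDK.

```python
# Sketch: validate the depth and assemble search arguments in one place.
# `build_search_kwargs` is an illustrative helper, not a Tavily SDK call.
ALLOWED_DEPTHS = {"ultra-fast", "fast", "basic", "advanced"}


def build_search_kwargs(query: str, search_depth: str = "fast", max_results: int = 5) -> dict:
    if search_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"search_depth must be one of {sorted(ALLOWED_DEPTHS)}")
    return {"query": query, "search_depth": search_depth, "max_results": max_results}
```

Switching the agent to `ultra-fast` then becomes a one-argument change at the call site, passed through as `await self._client.search(**kwargs)`.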
Additional Parameters
Extend `web_search` with any of Tavily’s search parameters:
- `time_range` — "day", "week", "month", or "year" for recency filtering
- `include_domains` / `exclude_domains` — restrict or block specific sources
- `include_answer` — "basic" or "advanced" for an LLM-generated answer alongside results
See the Search API reference and the Python SDK reference for the full parameter list.
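As a sketch of how these could be threaded through without sending `None` values to the API, optional parameters can be merged into the kwargs dict only when set. The helper name `with_optional_params` is illustrative, not part of the SDK.

```python
from typing import Optional, Sequence


# Sketch: merge optional Tavily search parameters into a base kwargs dict,
# dropping anything unset so None never reaches the API call.
# `with_optional_params` is an illustrative helper, not a Tavily SDK call.
def with_optional_params(
    base: dict,
    time_range: Optional[str] = None,
    include_domains: Optional[Sequence[str]] = None,
    exclude_domains: Optional[Sequence[str]] = None,
    include_answer: Optional[str] = None,
) -> dict:
    extras = {
        "time_range": time_range,
        "include_domains": list(include_domains) if include_domains else None,
        "exclude_domains": list(exclude_domains) if exclude_domains else None,
        "include_answer": include_answer,
    }
    return {**base, **{k: v for k, v in extras.items() if v is not None}}
```

The result can be passed straight to `await self._client.search(**kwargs)`, mirroring the `time_range` handling already in `web_search`.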
For `web_extract`, the most useful knobs are:
- `extract_depth` — "basic" (default) or "advanced" for tables and embedded content
- `format` — "markdown" (default) or "text"
See the Extract API reference for more.
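Even with `format="text"`, extracted pages can run long, and the hard cut at `EXTRACT_MAX_CHARS` in `web_extract` above can clip mid-sentence, which sounds bad when read aloud. A sentence-aware variant of that cap, as a sketch (the helper is illustrative):

```python
EXTRACT_MAX_CHARS = 3000


# Sketch: trim extracted text at the last sentence boundary under the limit,
# so the agent never reads half a sentence aloud. Illustrative helper only.
def truncate_for_voice(raw: str, limit: int = EXTRACT_MAX_CHARS) -> str:
    if len(raw) <= limit:
        return raw
    cut = raw[:limit]
    end = cut.rfind(". ")
    if end != -1:
        cut = cut[: end + 1]  # keep the final period, drop the fragment
    return cut + "\n\n[Content truncated]"
```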
Benefits of Tavily + Cartesia
- Voice-optimized latency: `fast` and `ultra-fast` search depths keep round-trips short enough for live conversation.
- Fresh context: Voice agents can answer questions about today’s news, prices, and events without retraining.
- Targeted deep-dives: Providing URLs to `web_extract` allows the agent to pull full-page content when a snippet isn’t enough.