The system operates through a two-step process:
1. Website Crawling & Vectorization:
Use Tavily’s crawling endpoint to extract and map the content of a website starting from its URL, then embed that content into a MongoDB Atlas vector index for retrieval.
2. Intelligent Q&A Interface:
Query your crawled data through a conversational agent that provides citation-backed answers while maintaining conversation history and context. The agent distinguishes informational questions, which require vector search, from conversational queries, which it answers from general knowledge.
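The two steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the `pages` dictionary stands in for crawl output, the word-count `embed` function stands in for OpenAI embeddings, and the in-memory list stands in for a MongoDB Atlas vector index. All function names and the example URLs are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    url: str         # source page, kept so answers can cite it
    text: str
    vector: Counter  # toy term-frequency vector (stand-in for an embedding)


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def index_pages(pages: dict[str, str], chunk_size: int = 40) -> list[Chunk]:
    """Step 1: split each crawled page into fixed-size word windows and embed them."""
    chunks = []
    for url, content in pages.items():
        words = content.split()
        for start in range(0, len(words), chunk_size):
            text = " ".join(words[start:start + chunk_size])
            chunks.append(Chunk(url, text, embed(text)))
    return chunks


def retrieve(index: list[Chunk], question: str, k: int = 2) -> list[Chunk]:
    """Step 2: rank chunks by similarity to the question and return the top k."""
    q = embed(question)
    ranked = sorted(index, key=lambda c: cosine(q, c.vector), reverse=True)
    return ranked[:k]


# Hypothetical crawl output: URL -> extracted text.
pages = {
    "https://example.com/pricing": "The pro plan costs 30 dollars per month and includes priority support.",
    "https://example.com/about": "The company was founded in 2020 and builds search tools for agents.",
}
index = index_pages(pages)
top = retrieve(index, "How much does the pro plan cost?", k=1)
citations = [c.url for c in top]  # URLs the agent would cite alongside its answer
```

Keeping the source URL on every chunk is what makes citation-backed answers possible: whatever text is retrieved can always be traced back to the page it came from.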
Try Our Crawl to RAG Use Case
Step 1: Get Your API Key
Step 2: Chat with Tavily
Step 3: Read The Open Source Code
Features
- Advanced Web Crawling: Deep website content extraction using Tavily’s crawling API
- Vector Search: MongoDB Atlas vector search with OpenAI embeddings for semantic content retrieval
- Smart Question Routing: Automatic detection of informational vs. conversational queries
- Persistent Memory: Conversation history and context preservation using LangGraph-MongoDB checkpointing
- Session Management: Thread-based conversational persistence and vector store management
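To make question routing and thread-based memory concrete, here is a rough sketch under stated assumptions. The keyword heuristic in `route` is an assumption for illustration only, since this section does not describe how the agent actually classifies queries, and `ThreadMemory` is a hypothetical in-memory stand-in for LangGraph-MongoDB checkpointing, not its API.

```python
from collections import defaultdict

# Assumed cue words; the real agent's routing logic is not described in this section.
INFO_CUES = ("what", "how", "when", "where", "which", "who", "why", "does", "list", "explain")


def route(message: str) -> str:
    """Return 'vector_search' for informational questions, else 'conversational'."""
    words = message.lower().strip("?!. ").split()
    if words and (words[0] in INFO_CUES or message.strip().endswith("?")):
        return "vector_search"
    return "conversational"


class ThreadMemory:
    """In-memory stand-in for a checkpointer: one message history per thread ID."""

    def __init__(self) -> None:
        self._threads = defaultdict(list)

    def append(self, thread_id: str, role: str, content: str) -> None:
        self._threads[thread_id].append({"role": role, "content": content})

    def history(self, thread_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate stored history by accident.
        return list(self._threads[thread_id])


memory = ThreadMemory()
for msg in ["Thanks, that helps!", "What plans does the site offer?"]:
    memory.append("thread-1", "user", msg)

routes = [route(m["content"]) for m in memory.history("thread-1")]
```

Keying history by thread ID keeps separate conversations isolated, so a follow-up question in one thread is interpreted only against that thread's earlier messages.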