Exa RAG Integration
Quick Reference
| Topic | When to Use | Reference |
|-------|-------------|-----------|
| LangChain | Building RAG chains with LangChain | langchain.md |
| LlamaIndex | Using Exa as a LlamaIndex data source | llamaindex.md |
| Vercel AI SDK | Adding web search to Next.js AI apps | vercel-ai.md |
| MCP & Tools | Claude MCP server, OpenAI tools, function calling | mcp-tools.md |
Essential Patterns
LangChain Retriever
```python
from langchain_exa import ExaSearchRetriever

retriever = ExaSearchRetriever(
    exa_api_key="your-key",
    k=5,              # number of results to retrieve
    highlights=True,  # request highlight snippets
)

docs = retriever.invoke("latest AI research papers")
```
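Once documents come back, they usually need to be packed into a prompt together with their sources. The following is a minimal sketch of that step, not part of any library: `Doc` is a hypothetical stand-in that mirrors the `page_content` and `metadata` attributes of a LangChain document, `format_context` is an illustrative helper, and the assumption that the source URL lives under `metadata["url"]` should be checked against the documents your retriever actually returns.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs, max_chars=500):
    """Build a numbered context block and a matching numbered source list."""
    blocks, sources = [], []
    for i, doc in enumerate(docs, start=1):
        # Assumption: the source URL is stored under metadata["url"].
        url = doc.metadata.get("url", "unknown source")
        # Truncate each snippet so one long page cannot crowd out the others.
        snippet = doc.page_content.strip()[:max_chars]
        blocks.append(f"[{i}] {snippet}")
        sources.append(f"[{i}] {url}")
    return "\n\n".join(blocks), "\n".join(sources)


docs = [
    Doc("Highlights about model A.", {"url": "https://example.com/a"}),
    Doc("Highlights about model B.", {"url": "https://example.com/b"}),
]
context, sources = format_context(docs)
print(context)
print(sources)
```

Numbering the snippets and the sources with the same index lets the model cite `[1]` or `[2]` in its answer, and lets you map each citation back to a URL afterwards.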
LlamaIndex Reader
```python
from llama_index.readers.web import ExaReader

reader = ExaReader(api_key="your-key")

documents = reader.load_data(
    query="machine learning best practices",
    num_results=10,
)
```
Vercel AI SDK Tool
```typescript
import { exa } from "@agentic/exa";
// Import the `openai` provider instance, which is what the call below uses.
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

const result = await generateText({
  model: openai("gpt-4"),
  tools: { search: exa.searchAndContents },
  prompt: "Search for the latest TypeScript features",
});
```
OpenAI-Compatible Endpoint
```python
from openai import OpenAI

# Point the standard OpenAI client at Exa's endpoint.
client = OpenAI(
    base_url="https://api.exa.ai/v1",
    api_key="your-exa-key",
)

response = client.chat.completions.create(
    model="exa",
    messages=[{"role": "user", "content": "What are the latest AI trends?"}],
)
```
Integration Selection
| Framework | Best For | Key Feature |
|-----------|----------|-------------|
| LangChain | Complex chains, agents | ExaSearchRetriever, tool integration |
| LlamaIndex | Document indexing, Q&A | ExaReader, query engines |
| Vercel AI SDK | Next.js apps, streaming | Tool definitions, edge-ready |
| OpenAI Compat | Drop-in replacement | Minimal code changes |
| Claude MCP | Claude Desktop, Claude Code | Native tool calling |
Common Mistakes
- Not using highlights for RAG - Full text wastes context; use `highlights=True` for relevant snippets
- Missing source attribution - Always include `result.url` in citations for grounded responses
- Ignoring summaries - `summary=True` provides concise context without full page overhead
- Over-fetching results - Start with 3-5 results; more isn't always better for RAG quality
- Not filtering domains - Use `include_domains` to limit to authoritative sources
- Skipping date filters - For current events, always add `start_published_date` to avoid stale info
- Forgetting async patterns - Use async retrievers in production for better throughput
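Several of these points (small result counts, domain and date filters, async calls) can be combined in one place. The sketch below is illustrative only: `build_search_filters`, `search_many`, and `fake_search` are hypothetical helpers, not Exa or framework APIs. It assumes the search call accepts `num_results`, `include_domains`, and `start_published_date` as keyword arguments and that a date-only ISO 8601 string is accepted for the date filter; verify both against the client you use.

```python
import asyncio
from datetime import datetime, timedelta, timezone


def build_search_filters(days_back=30, domains=None, num_results=5, now=None):
    """Return keyword arguments for a date- and domain-filtered search."""
    now = now or datetime.now(timezone.utc)
    filters = {
        "num_results": num_results,
        # Oldest acceptable publication date (assumed format: YYYY-MM-DD).
        "start_published_date": (now - timedelta(days=days_back)).date().isoformat(),
    }
    if domains:
        filters["include_domains"] = list(domains)
    return filters


async def search_many(queries, search_fn, max_concurrency=3):
    """Run search_fn for every query, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(query):
        async with semaphore:
            return await search_fn(query)

    # gather returns results in the same order as the input queries.
    return await asyncio.gather(*(run_one(q) for q in queries))


async def fake_search(query):
    """Hypothetical stand-in for an async retriever call."""
    await asyncio.sleep(0)
    return f"results for {query}"


filters = build_search_filters(
    days_back=7,
    domains=["arxiv.org"],
    now=datetime(2024, 1, 8, tzinfo=timezone.utc),
)
results = asyncio.run(search_many(["rag", "agents"], fake_search))
print(filters)
print(results)
```

In real use, `fake_search` would be replaced by your retriever's async entry point, with the filters passed through to it. The semaphore keeps the fan-out bounded so a burst of queries does not exceed rate limits.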