# 📚 RAG Fundamentals
RAG lets an AI answer questions based on your own data: documents, knowledge bases, databases.
## What is RAG?

### RAG Definition

Retrieval-Augmented Generation combines two steps:
- Retrieval: find relevant information in a knowledge base
- Generation: use an LLM to generate a response grounded in that information
```mermaid
graph LR
Q[User Query] --> E[Embed Query]
E --> S[Search Vector DB]
S --> R[Relevant Chunks]
R --> C[Context + Query]
C --> L[LLM]
L --> A[Answer]
```

## Why RAG?
### LLM Limitations
| Problem | RAG Solution |
|---|---|
| Knowledge cutoff | Real-time data access |
| Hallucinations | Grounded in documents |
| No private data | Access your docs |
| Generic answers | Specific context |
### Use Cases
- Internal wiki chatbot: Answer questions about company docs
- Customer support: Product documentation Q&A
- Legal/compliance: Search contracts, policies
- Research: Query academic papers
## Core Concepts

### 1. Embeddings
Convert text into vectors (lists of numbers):
```text
"Machine learning is amazing"
→ [0.12, -0.34, 0.78, ...]   (1536 dimensions)
```

Similar texts produce similar vectors.
### 2. Vector Database

Store and search embeddings:
| Database | Type | Features |
|---|---|---|
| Pinecone | Cloud | Managed, scalable |
| Supabase Vector | Cloud | PostgreSQL-based |
| Qdrant | Self-hosted | Fast, efficient |
| ChromaDB | Local | Easy setup |
### 3. Chunking

Split documents into smaller pieces:
```text
Long Document (10,000 words)
→ Chunk 1 (500 words)
→ Chunk 2 (500 words)
→ ...
→ Chunk 20 (500 words)
```
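A naive character-based splitter with overlap can be sketched in a few lines (simplified on purpose; n8n's text splitter nodes also respect separators such as paragraphs and sentences):

```javascript
// Split text into fixed-size chunks, overlapping so that context
// is not lost at chunk boundaries.
function chunkText(text, chunkSize = 500, overlap = 50) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap; // step forward, keeping `overlap` characters
  }
  return chunks;
}

const chunks = chunkText(longDocument); // `longDocument` is your loaded text
```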
### 4. Similarity Search

Find the most relevant chunks:
```text
Query: "What's the return policy?"

Results (by similarity):
1. "Returns accepted within 30 days..."  (0.92)
2. "For refunds, customers must..."      (0.87)
3. "Shipping policy states..."           (0.65)
```
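Under the hood, the vector store scores every stored chunk against the query vector and returns the best matches. An in-memory version of that ranking (a sketch; reuses the `cosine` helper from the embeddings example and assumes each chunk already carries its embedding):

```javascript
// Rank stored chunks by cosine similarity to the query embedding
// and keep only the top K results.
function topK(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosine(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```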
## RAG in n8n

### Available Nodes
- Document Loaders: PDF, web, text files
- Text Splitters: Chunk documents
- Embeddings: OpenAI, Hugging Face
- Vector Stores: Pinecone, Supabase, Qdrant
- Retrievers: Search vector stores
### Basic RAG Workflow

```mermaid
graph TD
subgraph "Indexing (One-time)"
D[Documents] --> L[Load]
L --> S[Split]
S --> E[Embed]
E --> V[(Vector Store)]
end
subgraph "Query (Each request)"
Q[Query] --> QE[Embed Query]
QE --> R[Retrieve]
R --> |Top K chunks| C[Combine]
C --> LLM[Generate]
LLM --> A[Answer]
end
V --> R
```

## Implementation

### Step 1: Document Indexing

**Workflow: Index Documents**

```text
File Trigger (new file)
  ↓
Document Loader (PDF/Text)
  ↓
Text Splitter
  ↓
Embeddings (OpenAI)
  ↓
Vector Store (Insert)
```

**Text Splitter Config:**
```javascript
Chunk Size: 500      // characters
Chunk Overlap: 50    // overlap between chunks
Separator: "\n\n"    // split on paragraphs
```

**Embeddings Node:**
```javascript
Model: text-embedding-3-small   // cheaper
// or text-embedding-3-large    // better quality
```

### Step 2: Query Pipeline
**Workflow: Answer Questions**

```text
Webhook (question)
  ↓
Embeddings (query)
  ↓
Vector Store (search)
  ↓
Format Context
  ↓
OpenAI (generate)
  ↓
Return Answer
```

**Vector Store Search:**
```javascript
Top K: 5                    // retrieve the 5 most similar chunks
Similarity Threshold: 0.7
```

**Format Context:**
```javascript
const chunks = $input.all();

const context = chunks
  .map(c => c.json.text)
  .join('\n\n---\n\n');

return [{
  json: {
    context,
    question: $json.question
  }
}];
```

**OpenAI Prompt:**
```javascript
System: `You are a helpful assistant. Answer questions based ONLY on the provided context. If the answer is not in the context, say "I don't have that information."

Context:
${context}
`

User: `Question: ${question}`
```
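Wired into a direct API call, the same prompt looks like this (a sketch with the official `openai` SDK instead of the n8n OpenAI node; the model name is a placeholder for whichever chat model you use):

```javascript
import OpenAI from "openai";

const openai = new OpenAI();

// `context` and `question` come from the Format Context step
const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini", // placeholder: any chat model works here
  messages: [
    {
      role: "system",
      content:
        "You are a helpful assistant. Answer questions based ONLY on the provided context. " +
        'If the answer is not in the context, say "I don\'t have that information."\n\n' +
        `Context:\n${context}`,
    },
    { role: "user", content: `Question: ${question}` },
  ],
});

console.log(completion.choices[0].message.content);
```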
## Advanced RAG Techniques

### 1. Metadata Filtering

Add metadata to chunks:
```javascript
{
  text: "Return policy content...",
  metadata: {
    source: "policy.pdf",
    category: "returns",
    date: "2024-01-15"
  }
}

// Query with filter
Filter: { category: "returns" }
```
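With Pinecone, for instance, the filter travels with the query (a sketch; Pinecone metadata filters use MongoDB-style operators such as `$eq`):

```javascript
// Only consider chunks whose metadata.category equals "returns"
const results = await pinecone.query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true,
  filter: { category: { $eq: "returns" } },
});
```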
### 2. Hybrid Search

Combine vector search + keyword search:

```javascript
// Vector search: semantic similarity
// Keyword search: exact matches

// Combine results with weighted scoring
```
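One simple way to blend the two result lists is a weighted score per chunk (a sketch; assumes both lists carry normalized 0-1 scores, and the 0.7/0.3 split is just a starting point to tune):

```javascript
// Merge vector and keyword results by id, blending their scores.
function hybridMerge(vectorResults, keywordResults, alpha = 0.7) {
  const byId = new Map();
  for (const r of vectorResults) {
    byId.set(r.id, { ...r, score: alpha * r.score });
  }
  for (const r of keywordResults) {
    const keywordPart = (1 - alpha) * r.score;
    const existing = byId.get(r.id);
    if (existing) existing.score += keywordPart;
    else byId.set(r.id, { ...r, score: keywordPart });
  }
  return [...byId.values()].sort((a, b) => b.score - a.score);
}
```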
### 3. Re-ranking

Use another model to re-rank the results:

```javascript
// After vector search
Chunks: [A, B, C, D, E]

// Re-rank with a cross-encoder
Re-ranked: [C, A, E, B, D]   // C is most relevant
```
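In code, re-ranking means scoring each (query, chunk) pair with a stronger model and re-sorting (a sketch; `scoreRelevance` is a hypothetical helper wrapping whatever cross-encoder or reranking API you use, such as Cohere Rerank):

```javascript
// Re-sort retrieved chunks by a pairwise relevance score.
// `scoreRelevance(query, text)` is a hypothetical helper that calls
// your cross-encoder / reranking API and returns a number.
async function rerank(query, chunks) {
  const scored = await Promise.all(
    chunks.map(async (chunk) => ({
      ...chunk,
      rerankScore: await scoreRelevance(query, chunk.text),
    }))
  );
  return scored.sort((a, b) => b.rerankScore - a.rerankScore);
}
```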
## n8n Vector Store Nodes

### Pinecone

```javascript
// Setup
Index Name: my-knowledge-base
Namespace: company-docs
Dimensions: 1536   // OpenAI embeddings

// Insert
await pinecone.upsert([{
  id: "doc-123",
  values: embeddings,
  metadata: { source: "policy.pdf" }
}]);

// Query
const results = await pinecone.query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true
});
```

### Supabase Vector
```javascript
// Uses the pgvector extension
// Integrates with Supabase Auth/RLS

// Insert
await supabase
  .from('documents')
  .insert({
    content: chunk.text,
    embedding: chunk.vector,
    metadata: chunk.metadata
  });

// Query
const { data } = await supabase.rpc('match_documents', {
  query_embedding: queryVector,
  match_count: 5
});
```

## Best Practices
### RAG Tips
- Chunk size matters - Too small = no context, too large = noise
- Overlap chunks - Preserve context across splits
- Include sources - Show where info came from
- Test queries - Evaluate retrieval quality
- Update regularly - Keep knowledge base current
- Handle no-results - fall back gracefully (see the sketch below)
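For the last tip, a minimal fallback in an n8n Code node could look like this (a sketch; assumes each retrieved item carries a `score` field from the vector store):

```javascript
// If no chunk clears the similarity threshold, answer gracefully
// instead of letting the LLM guess.
const MIN_SCORE = 0.7;
const chunks = $input.all().filter((c) => c.json.score >= MIN_SCORE);

if (chunks.length === 0) {
  return [{
    json: { answer: "I don't have that information. Try rephrasing your question." }
  }];
}

return chunks;
```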
## Evaluation

### Metrics
- Retrieval: Are correct chunks retrieved?
- Generation: Is answer accurate and helpful?
- Relevance: Does answer address the question?
```javascript
// Simple evaluation
const isRelevant = answer.includes(expectedInfo);
const isAccurate = !answer.includes("I don't know");
const hasSource = answer.includes("According to");
```
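These string checks are crude; a slightly more systematic pass runs a small test set through the pipeline and tallies the results (a sketch; `askRag` is a hypothetical helper that POSTs to your query webhook and returns the answer string):

```javascript
// Run a small test set through the RAG pipeline and tally pass rates.
const testSet = [
  { question: "What's the return policy?", expected: "30 days" },
  { question: "How do I request a refund?", expected: "refund" },
];

let passed = 0;
for (const { question, expected } of testSet) {
  const answer = await askRag(question); // hypothetical webhook helper
  if (answer.includes(expected)) passed++;
}
console.log(`Passed ${passed}/${testSet.length} test questions`);
```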
## Practice Exercise

### Hands-on Exercise

Build a Company Wiki Chatbot:
- Collect 5-10 documents (policies, FAQs)
- Create indexing workflow
- Create query workflow
- Test with various questions
- Evaluate answer quality
Target: a chatbot that answers questions accurately from your documents.
## Next Up

Next lesson: Vector Database Setup - a deep dive into Pinecone/Supabase.
