
Cloud Vector Databases

Pinecone, Weaviate, and how to choose a vector database for production RAG


🎯 Lesson Objectives


ChromaDB is great for development, but production calls for cloud-managed solutions. This lesson covers Pinecone, Weaviate, and how to choose the right vector database.

After this lesson, you will be able to:

✅ Pinecone: setup, indexing, querying
✅ Weaviate: hybrid search, schema design
✅ Compare vector DBs and choose one for your use case


🛠️ Pinecone


Setup

python.py
# pip install pinecone-client
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="rag-knowledge-base",
    dimension=1536,  # OpenAI text-embedding-3-small
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Connect to index
index = pc.Index("rag-knowledge-base")
print(index.describe_index_stats())
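
Note that create_index fails if an index with the same name already exists, so setup scripts are usually made re-runnable with a guard. A minimal sketch, assuming the current Pinecone client where list_indexes().names() returns the existing index names:

python.py
# Re-runnable setup: create the index only if it doesn't exist yet
# (assumes list_indexes().names() returns the existing index names)
if "rag-knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        name="rag-knowledge-base",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )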

Upsert Vectors

python.py
from openai import OpenAI

openai_client = OpenAI()

def get_embedding(text, model="text-embedding-3-small"):
    response = openai_client.embeddings.create(input=text, model=model)
    return response.data[0].embedding

# Prepare documents
documents = [
    {"id": "doc1", "text": "RAG architecture combines retrieval and generation.", "source": "rag-guide"},
    {"id": "doc2", "text": "Pinecone is a managed vector database service.", "source": "pinecone-docs"},
    {"id": "doc3", "text": "Embeddings represent text as numerical vectors.", "source": "ml-basics"},
]

# Upsert with embeddings + metadata
vectors = []
for doc in documents:
    embedding = get_embedding(doc["text"])
    vectors.append({
        "id": doc["id"],
        "values": embedding,
        "metadata": {
            "text": doc["text"],
            "source": doc["source"]
        }
    })

index.upsert(vectors=vectors)
print(f"Upserted {len(vectors)} vectors")

Query

python.py
# Semantic search
query = "How does RAG work?"
query_embedding = get_embedding(query)

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in results['matches']:
    print(f"Score: {match['score']:.4f}")
    print(f"Text: {match['metadata']['text']}")
    print(f"Source: {match['metadata']['source']}")
    print()
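
Because each vector was upserted with metadata, a query can also be narrowed with a metadata filter using Pinecone's MongoDB-style operators. A short sketch building on the query above:

python.py
# Only consider vectors whose metadata matches the filter
results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={"source": {"$eq": "rag-guide"}}
)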

Namespaces

python.py
# Namespaces = logical partitions (multi-tenant)
# Upsert to specific namespace
index.upsert(vectors=vectors, namespace="company-a")
index.upsert(vectors=vectors, namespace="company-b")

# Query within namespace
results = index.query(
    vector=query_embedding,
    top_k=5,
    namespace="company-a",
    include_metadata=True
)
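
Namespaces also make per-tenant maintenance simple: you can drop one tenant's data without touching the others. A small sketch using the standard delete call:

python.py
# Delete everything for one tenant; other namespaces are untouched
index.delete(delete_all=True, namespace="company-b")

# Per-namespace vector counts show up in the index stats
print(index.describe_index_stats())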

Checkpoint

Do you know how to set up Pinecone, upsert vectors, and query with namespaces?


🛠️ Weaviate


Setup

python.py
# pip install weaviate-client
import weaviate
from weaviate.classes.config import Configure, Property, DataType

# Connect to Weaviate Cloud
# (if you use an OpenAI vectorizer below, also pass the OpenAI key,
#  e.g. headers={"X-OpenAI-Api-Key": "..."})
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="your-cluster-url",
    auth_credentials=weaviate.auth.AuthApiKey("your-api-key")
)

# Or local Docker instance
# client = weaviate.connect_to_local()

Schema & Collection

python.py
# Create collection with vectorizer
client.collections.create(
    name="Document",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"
    ),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
        Property(name="page_number", data_type=DataType.INT),
    ]
)

collection = client.collections.get("Document")

Add & Query

python.py
# Add objects (auto-vectorized)
collection.data.insert({
    "content": "RAG pipelines use vector search to find relevant documents.",
    "source": "rag-handbook",
    "category": "architecture",
    "page_number": 15
})

# Batch insert
# (documents = a list of dicts whose keys match the collection properties above)
with collection.batch.dynamic() as batch:
    for doc in documents:
        batch.add_object(properties=doc)

# Semantic search
response = collection.query.near_text(
    query="vector database comparison",
    limit=5,
    return_metadata=["distance"]
)

for obj in response.objects:
    print(f"Distance: {obj.metadata.distance:.4f}")
    print(f"Content: {obj.properties['content'][:100]}")
Hybrid Search (Keyword + Semantic)

python.py
# Weaviate's killer feature: hybrid search
response = collection.query.hybrid(
    query="RAG chunking strategy",
    alpha=0.5,  # 0 = pure keyword (BM25), 1 = pure vector
    limit=5,
    return_metadata=["score"]
)

for obj in response.objects:
    print(f"Score: {obj.metadata.score:.4f}")
    print(f"Content: {obj.properties['content'][:100]}")
    print()
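
alpha is the main tuning knob, and the best value depends on your data. A quick sweep like the sketch below makes its effect visible (0 = pure BM25, 1 = pure vector):

python.py
# Compare how the keyword/vector balance changes the top result
for alpha in (0.0, 0.5, 1.0):
    response = collection.query.hybrid(query="RAG chunking strategy", alpha=alpha, limit=1)
    top = response.objects[0].properties["content"][:80] if response.objects else "(no result)"
    print(f"alpha={alpha}: {top}")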

Checkpoint

Do you understand Weaviate hybrid search and how to create a schema/collection?


📊 Vector DB Comparison


Feature Matrix

| Feature       | ChromaDB       | Pinecone         | Weaviate            | Qdrant             |
|---------------|----------------|------------------|---------------------|--------------------|
| Type          | Local/embedded | Cloud managed    | Self-host/cloud     | Self-host/cloud    |
| Pricing       | Free           | Free tier + paid | Free (open-source)  | Free (open-source) |
| Setup         | pip install    | API key          | Docker/cloud        | Docker/cloud       |
| Hybrid Search | No             | No               | Yes (BM25 + vector) | Yes                |
| Scalability   | Small-medium   | High             | High                | High               |
| Best For      | Prototyping    | Production SaaS  | Hybrid search       | Performance        |

Decision Matrix

| Use Case                  | Recommended     | Why                                    |
|---------------------------|-----------------|----------------------------------------|
| Learning/prototyping      | ChromaDB        | Zero setup, free                       |
| Multi-tenant SaaS         | Pinecone        | Namespaces, managed                    |
| Keyword + semantic search | Weaviate        | Native hybrid search                   |
| High performance on-prem  | Qdrant          | Rust-based, fast                       |
| Cost-sensitive production | Weaviate/Qdrant | Open-source, self-host                 |
| Vietnamese text search    | Weaviate        | Hybrid search handles Vietnamese well  |

Embedding Model Comparison

| Model                   | Dims | Speed   | Quality | Cost | Vietnamese |
|-------------------------|------|---------|---------|------|------------|
| text-embedding-3-small  | 1536 | Fast    | Good    | $    | Good       |
| text-embedding-3-large  | 3072 | Medium  | Best    | $$   | Best       |
| all-MiniLM-L6-v2        | 384  | Fastest | OK      | Free | Limited    |
| multilingual-e5-large   | 1024 | Medium  | Great   | Free | Great      |
| paraphrase-multilingual | 384  | Fast    | Good    | Free | Good       |
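
The open-source models in the table run locally via the sentence-transformers library, which suits cost-sensitive or self-hosted setups. A minimal sketch, assuming `pip install sentence-transformers` (E5 models expect "query:"/"passage:" prefixes):

python.py
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# Free multilingual model from the table above (1024-dim vectors)
model = SentenceTransformer("intfloat/multilingual-e5-large")

vectors = model.encode([
    "passage: RAG architecture combines retrieval and generation.",
    "passage: Pinecone is a managed vector database service.",
])
print(vectors.shape)  # e.g. (2, 1024)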

Checkpoint

Can you compare the cloud vector databases and choose the right one for your use case?


💻 LangChain Integration


ChromaDB with LangChain

python.py
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create vector store
vectorstore = Chroma.from_texts(
    texts=["doc1 content", "doc2 content", "doc3 content"],
    embedding=embeddings,
    metadatas=[{"source": "a"}, {"source": "b"}, {"source": "c"}],
    persist_directory="./chroma_langchain"
)

# Search
docs = vectorstore.similarity_search("query text", k=3)
for doc in docs:
    print(doc.page_content[:100])

Pinecone with LangChain

python.py
from langchain_pinecone import PineconeVectorStore

vectorstore = PineconeVectorStore(
    index_name="rag-knowledge-base",
    embedding=embeddings,
    pinecone_api_key="your-key"
)

# Add documents
vectorstore.add_texts(
    texts=["content1", "content2"],
    metadatas=[{"source": "a"}, {"source": "b"}]
)

# Search
docs = vectorstore.similarity_search("query", k=5)

As Retriever (for RAG chain)

python.py
# Convert to retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5}
)

# Or MMR (Maximal Marginal Relevance) for diversity
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 5, "fetch_k": 20, "lambda_mult": 0.7}
)

# Use in RAG chain
relevant_docs = retriever.invoke("How to implement RAG?")
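
As a rough end-to-end sketch, the retriever can be wired into an LCEL chain with a prompt and a chat model (the model name and prompt below are just placeholders):

python.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name

def format_docs(docs):
    # Join retrieved chunks into one context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("How to implement RAG?"))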

Checkpoint

Do you know how to integrate vector databases with LangChain and use MMR retrieval?


🎯 Summary


📝 Quiz

  1. What are Pinecone namespaces used for?

    • Speeding up queries
    • Logical partitioning (multi-tenant, separating data)
    • Backing up data
    • Replacing metadata
  2. What does Weaviate hybrid search combine?

    • Two vector models
    • BM25 keyword search + vector semantic search
    • Semantic + graph search
    • It isn't really "hybrid"
  3. When should you choose a self-hosted vector DB?

    • Cost-sensitive, need control over data, on-premise requirements
    • When you are just starting to learn
    • It is always better than cloud
    • When you have little data

Key Takeaways

  1. Pinecone — easiest managed cloud option, good for SaaS
  2. Weaviate — best hybrid search, open-source
  3. Choose based on: scale, cost, search type, hosting preference
  4. LangChain — unified API for all vector stores
  5. MMR retrieval — balances relevance and diversity

Self-Check Questions

  1. What are Pinecone namespaces for, and when should you use them?
  2. How does Weaviate hybrid search combine BM25 and vector search?
  3. When should you choose a self-hosted vector database instead of a cloud-managed service?
  4. How does MMR (Maximal Marginal Relevance) retrieval work, and why is it needed?

🎉 Great job! You have completed the Cloud Vector Databases lesson!

Up next: let's explore Document Loaders & Formats in the next lesson!


🚀 Next Lesson

Document Loaders & Formats — load PDF, Word, and web documents, and handle multiple document formats!