✍️ Text Generation with LangChain

In this lesson, we will learn how to use LangChain to generate text with LLMs such as OpenAI models, Claude, and local models.

What is LangChain?

LangChain is a framework for building LLM applications from the following components:

Diagram

graph LR
    P[Prompts] --> L[LLMs]
    L --> C[Chains]
    C --> A[Agents]
    A --> T[Tools]

LangChain Components

Prompts: Templates for the model input
LLMs: Large Language Models
Chains: Connect multiple components
Agents: LLMs with decision making
Tools: External capabilities

Setup

Bash

pip install langchain langchain-openai langchain-anthropic python-dotenv

Python

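import os
from dotenv import load_dotenv

# Load variables from a local .env file into the process environment
load_dotenv()

# Set API keys only if they are not already defined, so that values
# loaded from .env are not overwritten by these placeholders
os.environ.setdefault("OPENAI_API_KEY", "your-api-key")
os.environ.setdefault("ANTHROPIC_API_KEY", "your-api-key")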
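A missing or empty key usually surfaces only when the first request is sent. As a minimal sketch, assuming the keys are expected in the environment (for example loaded from `.env`), a small helper can fail early with a readable message. The name `require_env` is hypothetical and is not part of LangChain or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before creating a model is one way to catch configuration problems at startup.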

Basic Text Generation

OpenAI Models

Python

from langchain_openai import ChatOpenAI

# Initialize the model
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000
)

# Simple generation: invoke() returns a message; the text is in .content
response = llm.invoke("Write a short paragraph about AI")
print(response.content)

Claude Models

Python

from langchain_anthropic import ChatAnthropic

# Same interface as ChatOpenAI, different provider
llm = ChatAnthropic(
    model="claude-3-5-sonnet-20241022",
    temperature=0.7
)

response = llm.invoke("Explain machine learning to a beginner")
print(response.content)

Prompt Templates

Basic Template

Python

from langchain_core.prompts import PromptTemplate

# Create the template; {topic} and {question} are filled in later
template = PromptTemplate.from_template("""
You are an expert in {topic}.

Answer the following question in detail:
{question}

Answer:
""")

# Fill in the variables to get the final prompt string
prompt = template.format(
    topic="machine learning",
    question="How does Deep Learning differ from traditional ML?"
)

# Generate (uses the llm created above)
response = llm.invoke(prompt)
print(response.content)
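Conceptually, filling a prompt template is close to Python's format-style substitution. The sketch below uses only the standard library to show the idea and to report missing variables before any model call is made. It is an illustration, not LangChain's implementation; `placeholder_names`, `fill`, and `DEMO_TEMPLATE` are hypothetical names.

```python
from string import Formatter


def placeholder_names(template: str) -> list[str]:
    """List the {placeholder} names in a format-style template, in order."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def fill(template: str, **values: str) -> str:
    """Format the template, failing with a readable message if a value is missing."""
    missing = [name for name in placeholder_names(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return template.format(**values)


DEMO_TEMPLATE = "You are an expert in {topic}.\nQuestion: {question}\nAnswer:"
print(placeholder_names(DEMO_TEMPLATE))
print(fill(DEMO_TEMPLATE, topic="machine learning", question="What is overfitting?"))
```

Checking the variable names up front makes a forgotten input show up as a clear error rather than a confusing model response.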

Chat Prompt Template

Python

from langchain_core.prompts import ChatPromptTemplate

# Each tuple is (role, message text); {domain} and {question} are variables
template = ChatPromptTemplate.from_messages([
    ("system", "You are an AI assistant specializing in {domain}. Answer in Vietnamese."),
    ("human", "{question}")
])

messages = template.format_messages(
    domain="data science",
    question="How do I handle missing values?"
)

response = llm.invoke(messages)
print(response.content)

LCEL (LangChain Expression Language)

Chain with the Pipe Operator

Python

from langchain_core.output_parsers import StrOutputParser

# Create the chain with |: prompt template -> model -> plain string
chain = template | llm | StrOutputParser()

# Run the chain; the dict keys must match the template variables
result = chain.invoke({
    "domain": "Python programming",
    "question": "How do list comprehensions work?"
})
print(result)
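To build intuition for what `|` does, here is a toy sketch of operator-based composition in plain Python. It is not LangChain's implementation; the `Step` class and the fake components are hypothetical stand-ins that only mimic the roles of the template, the model, and the parser.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Fake components; no real model is called here
fake_template = Step(lambda d: f"[{d['domain']}] {d['question']}")
fake_llm = Step(lambda prompt: {"content": f"Answer to: {prompt}"})
fake_parser = Step(lambda message: message["content"])

demo_chain = fake_template | fake_llm | fake_parser
print(demo_chain.invoke({"domain": "Python", "question": "What is a list?"}))
```

The design point is that each step only needs an `invoke` method, and the output of one step becomes the input of the next.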

Multiple Steps

Python

# Step 1: Generate ideas
idea_template = ChatPromptTemplate.from_messages([
    ("system", "You are a creative writer."),
    ("human", "Suggest 3 ideas for a blog post about {topic}")
])

# Step 2: Expand an idea
expand_template = ChatPromptTemplate.from_messages([
    ("system", "You are a professional content writer."),
    ("human", "Write a detailed outline for the following idea:\n{idea}")
])

# Build one chain per step
idea_chain = idea_template | llm | StrOutputParser()
expand_chain = expand_template | llm | StrOutputParser()

# Run: the output of step 1 becomes the input of step 2
ideas = idea_chain.invoke({"topic": "AI in education"})
outline = expand_chain.invoke({"idea": ideas})
print(outline)
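The second step above receives all three ideas as one string. If only one idea should be expanded, the text can be split first. The following is a rough sketch that assumes the model returns one idea per line in a numbered list (`1.` or `1)`), which is not guaranteed. `split_numbered_ideas` is a hypothetical helper, and continuation lines of multi-line ideas are ignored.

```python
import re


def split_numbered_ideas(text: str) -> list[str]:
    """Extract the items of a numbered list, one item per line."""
    found = []
    for line in text.splitlines():
        # Match lines such as "1. idea" or "2) idea"
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            found.append(match.group(1))
    return found


sample_output = (
    "1. AI tutors for every student\n"
    "2. Automated grading\n"
    "3) Personalized study plans"
)
sample_ideas = split_numbered_ideas(sample_output)
print(sample_ideas)
```

A single item could then be passed on, for example `expand_chain.invoke({"idea": split_numbered_ideas(ideas)[0]})`, after checking that the list is not empty.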

Structured Output

Pydantic Models

Python

from pydantic import BaseModel, Field
from typing import List

# The field descriptions are passed to the model as part of the schema
class BlogPost(BaseModel):
    title: str = Field(description="Title of the post")
    summary: str = Field(description="Short summary")
    sections: List[str] = Field(description="Main sections")
    keywords: List[str] = Field(description="SEO keywords")

# Use structured output: the result is a BlogPost instance, not raw text
structured_llm = llm.with_structured_output(BlogPost)

result = structured_llm.invoke(
    "Create the structure of a blog post about Machine Learning basics"
)

print(f"Title: {result.title}")
print(f"Summary: {result.summary}")
print(f"Sections: {result.sections}")

JSON Output

Python

from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()

template = ChatPromptTemplate.from_messages([
    ("system", "Answer in JSON format with the fields: title, content, tags"),
    ("human", "{query}")
])

chain = template | llm | parser

result = chain.invoke({"query": "Write about Python basics"})
# result is a dict
print(result)
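Asking for JSON in the prompt does not guarantee that every requested field is present. As a minimal sketch, the parsed dict can be checked before it is used. `EXPECTED_TYPES` and `validate_post` are hypothetical names, and the expected types (strings for `title` and `content`, a list for `tags`) are an assumption about what the prompt intends.

```python
import json

# Assumed shape of the reply: field name -> expected Python type
EXPECTED_TYPES = {"title": str, "content": str, "tags": list}


def validate_post(data: dict) -> dict:
    """Return data unchanged if it has the expected fields, else raise ValueError."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    if problems:
        raise ValueError("; ".join(problems))
    return data


# A hand-written reply standing in for real model output
good_reply = json.loads('{"title": "Python basics", "content": "...", "tags": ["python"]}')
print(validate_post(good_reply)["title"])
```

For stricter guarantees, the Pydantic approach from the previous section is usually the more robust option.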

Streaming

Basic Streaming

Python

# Stream the response: print each chunk as soon as it arrives
for chunk in llm.stream("Write a poem about spring"):
    print(chunk.content, end="", flush=True)
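Printing chunks as they arrive discards the full text. A common pattern is to collect the chunks while printing them. The sketch below uses a fake generator so it runs without an API key; `fake_stream` and `print_and_collect` are hypothetical names.

```python
from typing import Iterable, Iterator


def fake_stream(text: str, size: int = 5) -> Iterator[str]:
    """Hypothetical stand-in for a model stream: yields the text in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the full text at the end."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()
    return "".join(parts)


full_text = print_and_collect(fake_stream("Spring arrives on a quiet breeze."))
```

With a real model, the same function could be fed `(chunk.content for chunk in llm.stream(...))`, assuming each `chunk.content` is a string.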

Async Streaming

Python

import asyncio

async def stream_response():
    # astream() is the async counterpart of stream()
    async for chunk in llm.astream("Explain quantum computing"):
        print(chunk.content, end="", flush=True)

asyncio.run(stream_response())
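One reason to use the async API is to handle several streams at the same time. The sketch below shows that pattern with a fake async generator in place of a real model; `fake_astream`, `collect`, and `main` are hypothetical names, and the delay is made up.

```python
import asyncio
from typing import AsyncIterator


async def fake_astream(text: str, delay: float = 0.01) -> AsyncIterator[str]:
    """Hypothetical stand-in for an async model stream: yields words with a delay."""
    for word in text.split():
        await asyncio.sleep(delay)
        yield word + " "


async def collect(text: str) -> str:
    """Consume one stream and return the joined text."""
    parts = [chunk async for chunk in fake_astream(text)]
    return "".join(parts).strip()


async def main() -> list[str]:
    # gather() runs both streams concurrently and keeps the input order
    return await asyncio.gather(
        collect("first answer here"),
        collect("second answer"),
    )


results = asyncio.run(main())
print(results)
```

Inside an environment that already runs an event loop, such as a notebook, `await main()` would typically be used instead of `asyncio.run(main())`.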

Content Generation Use Cases

1. Blog Generator

Python

blog_template = ChatPromptTemplate.from_messages([
    ("system", """You are a professional content writer.
    Write a blog post with this format:
    - A catchy title
    - An introduction hook
    - 3-5 main sections with headings
    - A conclusion with a call-to-action
    Use a friendly, easy-to-understand tone."""),
    ("human", "Write a blog post about: {topic}\nAudience: {audience}\nLength: {length} words")
])

blog_chain = blog_template | llm | StrOutputParser()

blog = blog_chain.invoke({
    "topic": "How to learn programming effectively",
    "audience": "beginners",
    "length": "800"
})

2. Email Writer

Python

email_template = ChatPromptTemplate.from_messages([
    ("system", """You are an expert at writing professional emails.
    Write the email in a {tone} tone.
    Format: Subject, Greeting, Body, Closing"""),
    ("human", "Write a {type} email to {recipient} about {topic}")
])

email_chain = email_template | llm | StrOutputParser()

email = email_chain.invoke({
    "tone": "formal",
    "type": "request",
    "recipient": "my manager",
    "topic": "taking 3 days of leave"
})

3. Code Documentation

Python

doc_template = ChatPromptTemplate.from_messages([
    ("system", """You are a technical writer.
    Create documentation for the code with:
    - A description of the function/class
    - Parameters and return values
    - Examples
    - Notes if needed"""),
    ("human", "Document the following code:\n```python\n{code}\n```")
])

# The code to document is passed as plain text; it is never executed here
code = """
def calculate_metrics(y_true, y_pred, average='weighted'):
    precision = precision_score(y_true, y_pred, average=average)
    recall = recall_score(y_true, y_pred, average=average)
    f1 = f1_score(y_true, y_pred, average=average)
    return {'precision': precision, 'recall': recall, 'f1': f1}
"""

doc_chain = doc_template | llm | StrOutputParser()
documentation = doc_chain.invoke({"code": code})

Best Practices

Text Generation Tips

Be specific in prompts
Set an appropriate temperature:
- 0.0-0.3: Factual, consistent
- 0.5-0.7: Balanced
- 0.8-1.0: Creative
Use system prompts to set context
Validate output with structured output
Handle errors gracefully
Monitor token usage để control costs
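To see why a low temperature gives more consistent output, it helps to look at the textbook softmax-with-temperature formula. The sketch below applies it to made-up scores for three candidate tokens. Providers may implement sampling differently (for example, a temperature of 0 is typically handled as a special case), so treat this as an illustration only; `softmax_with_temperature` and `demo_logits` are hypothetical names.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


demo_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(demo_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all probability goes to the top-scoring token, while higher values spread it across the alternatives, which is what makes the output more varied.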

Error Handling

Python

from langchain_core.exceptions import OutputParserException

def safe_generate(chain, input_data, retries=3):
    """Invoke the chain up to `retries` times; return None if every attempt fails."""
    for attempt in range(retries):
        try:
            return chain.invoke(input_data)
        except OutputParserException as e:
            # The model replied, but the output could not be parsed
            print(f"Parse error (attempt {attempt + 1}): {e}")
        except Exception as e:
            # Anything else, e.g. network or API errors
            print(f"Error (attempt {attempt + 1}): {e}")

    return None

# Usage: idea_chain (defined earlier) expects a "topic" input
result = safe_generate(idea_chain, {"topic": "AI"})
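Retrying immediately can make rate-limit errors worse. A common refinement is to wait longer after each failure. The sketch below is one possible version using exponential backoff; `retry_with_backoff` and its parameters are hypothetical, and the `sleep` argument exists only so the waiting can be replaced in tests.

```python
import time
from typing import Any, Callable


def retry_with_backoff(
    func: Callable[[], Any],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call func, waiting base_delay * 2**attempt between failed attempts."""
    last_error = None
    for attempt in range(retries):
        try:
            return func()
        except Exception as error:  # narrow this to the errors you expect
            last_error = error
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"All {retries} attempts failed") from last_error
```

A chain call can be wrapped as `retry_with_backoff(lambda: chain.invoke(input_data))`. Unlike `safe_generate`, this version raises after the last attempt so that failures are not silently turned into `None`.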

Practice Exercise

Hands-on Exercise

Build a Content Generator:

Create a multi-purpose content generator
Support these types:
- Blog posts
- Social media posts
- Email templates
- Product descriptions
Implement streaming output
Add structured output validation

Target: A generator that can produce many different types of content
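One possible starting point for the exercise is a lookup table that maps each content type to a prompt. The sketch below covers only the prompt selection, leaving the model call, streaming, and validation to you. The prompt wording and the names `CONTENT_PROMPTS` and `build_prompt` are placeholders, not a required design.

```python
# Placeholder prompts, one per supported content type
CONTENT_PROMPTS = {
    "blog": "Write a blog post about {topic} for {audience}.",
    "social": "Write a short social media post about {topic} for {audience}.",
    "email": "Write an email template about {topic} for {audience}.",
    "product": "Write a product description of {topic} for {audience}.",
}


def build_prompt(content_type: str, topic: str, audience: str) -> str:
    """Pick the prompt for a content type and fill in its variables."""
    if content_type not in CONTENT_PROMPTS:
        supported = ", ".join(sorted(CONTENT_PROMPTS))
        raise ValueError(f"Unknown content type {content_type!r}; supported: {supported}")
    return CONTENT_PROMPTS[content_type].format(topic=topic, audience=audience)


print(build_prompt("social", topic="a Python course", audience="students"))
```

Keeping the prompts in one table makes it straightforward to add a new content type without touching the generation code.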
