Planning & Self-Reflection

🎯 Lesson Objectives

Simple agents react one step at a time. Advanced agents plan ahead, execute against that plan, and correct themselves when something goes wrong.

After this lesson, you will understand:

✅ Plan-and-Execute pattern
✅ Replanning when the context changes
✅ Self-reflection and self-correction
✅ Critique-revise loop

📐 Plan-and-Execute Pattern

Concept

Example

ReAct Agent:      Think → Act → Observe → Think → Act → ... (step-by-step)

Plan-and-Execute: Plan ALL steps → Execute step 1 → Execute step 2 → ...
                  If plan needs change → Replan → Continue

Plan-and-Execute tends to suit complex multi-step tasks better, because the whole sequence of steps is laid out before execution starts and can be adjusted as a unit.
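
To make the contrast concrete, here is a minimal pure-Python sketch of the two control loops. It makes no LLM calls: `run_tool`, `fake_decide`, and `fake_plan` are hypothetical stand-ins invented for this illustration, not part of any library.

```python
from typing import Callable, Optional

def run_tool(action: str) -> str:
    """Hypothetical tool call: just echoes what it was asked to do."""
    return f"result of {action}"

def react_loop(decide: Callable[[list], Optional[str]], max_steps: int = 10) -> list:
    """ReAct style: choose the next action only after seeing the last observation."""
    observations = []
    for _ in range(max_steps):
        action = decide(observations)            # Think
        if action is None:                       # The agent decides it is done
            break
        observations.append(run_tool(action))    # Act + Observe
    return observations

def plan_and_execute(plan: Callable[[str], list], goal: str) -> list:
    """Plan-and-Execute style: commit to every step up front, then run them in order."""
    steps = plan(goal)                           # Plan ALL steps once
    return [run_tool(step) for step in steps]

# Hypothetical stand-ins for the LLM (assumptions for this sketch).
def fake_decide(observations: list) -> Optional[str]:
    script = ["search", "calculate"]
    return script[len(observations)] if len(observations) < len(script) else None

def fake_plan(goal: str) -> list:
    return ["search", "calculate", "write"]

react_result = react_loop(fake_decide)
plan_result = plan_and_execute(fake_plan, "demo goal")
print(react_result)
print(plan_result)
```

The difference is where the decision happens: `react_loop` calls `decide` once per step, while `plan_and_execute` calls `plan` once for the whole task.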

Planner

python.py

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# temperature=0 keeps the generated plan as repeatable as the model allows
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

planner_prompt = ChatPromptTemplate.from_template("""
You are a planning agent. Create a step-by-step plan to accomplish the goal.

Goal: {goal}

Available tools:
- search_web: Search for information online
- calculate: Do math calculations
- write_document: Create a document
- send_email: Send email

Rules:
1. Break the goal into 3-7 concrete steps
2. Each step should use exactly ONE tool
3. Steps should be in logical order
4. Be specific about what each step does

Output format:
Step 1: [tool_name] - [description]
Step 2: [tool_name] - [description]
...
""")

def create_plan(goal):
    """Ask the LLM for a plan and return its "Step N: ..." lines as a list."""
    chain = planner_prompt | llm
    result = chain.invoke({"goal": goal})

    # Keep only the step lines; strip each line so indented steps still match
    steps = []
    for line in result.content.strip().split("\n"):
        line = line.strip()
        if line.startswith("Step"):
            steps.append(line)

    return steps

# Example
plan = create_plan("Research AI market size 2025 and create a summary report")
for step in plan:
    print(step)
# Example output (the actual steps vary by model and run):
# Step 1: search_web - Search for "AI market size 2025 report"
# Step 2: search_web - Search for "AI market growth predictions 2025"
# Step 3: calculate - Calculate CAGR from collected data
# Step 4: write_document - Create summary report with findings
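
The planner returns plain strings, so a mistyped or unknown tool name only surfaces at execution time. As a sketch, the step lines could be parsed into structured records and checked against the tool list from the prompt. `parse_plan`, `STEP_PATTERN`, and `ALLOWED_TOOLS` are hypothetical names introduced here, and the sketch assumes the model follows the `Step N: tool_name - description` format.

```python
import re

# Tool names taken from the planner prompt above.
ALLOWED_TOOLS = {"search_web", "calculate", "write_document", "send_email"}

# Matches lines such as "Step 2: search_web - Look up growth forecasts".
STEP_PATTERN = re.compile(r"^Step\s+(\d+):\s*\[?(\w+)\]?\s*-\s*(.+)$")

def parse_plan(text: str) -> list:
    """Turn raw planner output into a list of {index, tool, description} dicts."""
    steps = []
    for raw_line in text.splitlines():
        match = STEP_PATTERN.match(raw_line.strip())
        if not match:
            continue  # Skip preamble, blank lines, and anything off-format
        index, tool, description = match.groups()
        if tool not in ALLOWED_TOOLS:
            raise ValueError(f"Step {index} uses unknown tool: {tool}")
        steps.append({
            "index": int(index),
            "tool": tool,
            "description": description.strip(),
        })
    return steps

sample_output = """Here is the plan:
Step 1: search_web - Search for "AI market size 2025 report"
Step 2: calculate - Calculate CAGR from collected data
"""
parsed = parse_plan(sample_output)
print(parsed)
```

Raising on an unknown tool is one possible design choice; an agent could instead record the problem and hand it to a replanning step.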

Executor

python.py

import operator
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END

# Reuses llm and create_plan from the Planner section.

class PlanExecuteState(TypedDict):
    messages: Annotated[list, operator.add]  # Appended to on every update
    goal: str
    plan: list       # List of steps
    current_step: int
    results: dict    # Step index -> result
    final_output: str
    # Declared here so replan_node and reflect_node (later sections) can use them
    last_error: str
    needs_revision: bool

def plan_node(state):
    """Create initial plan."""
    goal = state["goal"]
    plan = create_plan(goal)
    return {
        "plan": plan,
        "current_step": 0,
        "results": {},
        "messages": [{"role": "system", "content": f"Created plan with {len(plan)} steps"}]
    }

def execute_node(state):
    """Execute current step."""
    plan = state["plan"]
    step_idx = state["current_step"]

    if step_idx >= len(plan):
        return {"messages": [{"role": "system", "content": "All steps completed"}]}

    current_step = plan[step_idx]

    # Execute the step (simplified: the LLM simulates the tool call)
    result = llm.invoke(
        f"Execute this step: {current_step}\n"
        f"Previous results: {state.get('results', {})}"
    ).content

    # Copy before writing so the incoming state is not mutated in place
    results = dict(state.get("results", {}))
    results[step_idx] = result

    return {
        "current_step": step_idx + 1,
        "results": results,
        "messages": [{"role": "system", "content": f"Completed step {step_idx + 1}"}]
    }

def should_continue(state):
    """Check if more steps to execute."""
    if state["current_step"] >= len(state["plan"]):
        return "summarize"
    return "execute"

def summarize_node(state):
    """Create final output from all results."""
    results = state["results"]
    goal = state["goal"]

    summary = llm.invoke(
        f"Summarize the results for goal: {goal}\nResults: {results}"
    ).content

    return {"final_output": summary}

# Build graph
graph = StateGraph(PlanExecuteState)
graph.add_node("plan", plan_node)
graph.add_node("execute", execute_node)
graph.add_node("summarize", summarize_node)

graph.set_entry_point("plan")
graph.add_edge("plan", "execute")
# After each step, either loop back to execute or move on to the summary
graph.add_conditional_edges("execute", should_continue, {
    "execute": "execute",
    "summarize": "summarize"
})
graph.add_edge("summarize", END)

app = graph.compile()

# Run
result = app.invoke({
    "goal": "Research AI market size 2025 and create a summary report",
    "messages": []
})
print(result["final_output"])
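
The executor above is simplified: it asks the LLM to "execute" each step instead of calling a tool. The sketch below shows one way a step line could be dispatched to a real function. The stub tools, `TOOLS`, `run_step`, and `run_plan` are hypothetical names for this illustration, and the code assumes steps follow the `Step N: tool_name - description` format.

```python
from typing import Callable, Dict

# Hypothetical stub tools; real ones would call a search API, a calculator, etc.
def search_web(query: str) -> str:
    return f"[search results for: {query}]"

def calculate(expression: str) -> str:
    return f"[calculation for: {expression}]"

TOOLS: Dict[str, Callable[[str], str]] = {
    "search_web": search_web,
    "calculate": calculate,
}

def run_step(step: str) -> str:
    """Dispatch one 'Step N: tool_name - description' line to the matching tool."""
    _, _, body = step.partition(":")              # Drop the "Step N" prefix
    tool_name, _, description = body.partition(" - ")
    tool_name = tool_name.strip()
    if tool_name not in TOOLS:
        # Report the failure as data so a later replanning step can react to it
        return f"ERROR: unknown tool '{tool_name}'"
    return TOOLS[tool_name](description.strip())

def run_plan(plan: list) -> dict:
    """Run every step in order, collecting results keyed by step index."""
    results = {}
    for index, step in enumerate(plan):
        results[index] = run_step(step)
    return results

demo_results = run_plan([
    "Step 1: search_web - AI market size 2025",
    "Step 2: send_fax - Fax the report",
])
print(demo_results)
```

Returning an error string rather than raising keeps the loop running, which leaves room for the agent to store the message (for example in a `last_error` field) and decide whether to replan.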

Checkpoint

Do you understand the Plan-and-Execute pattern and how it differs from ReAct?

🔍 Replanning

Dynamic Replanning

python.py

# Reuses ChatPromptTemplate and llm from the Planner section.
replan_prompt = ChatPromptTemplate.from_template("""
Original goal: {goal}
Original plan: {plan}
Completed steps: {completed}
Current results: {results}
Issue encountered: {issue}

Should we:
1. CONTINUE with the current plan
2. REPLAN with adjusted steps

Answer with a single word on the first line: CONTINUE or REPLAN.
If REPLAN, provide the NEW remaining steps, one per line, as:
Step N: [tool_name] - [description]
""")

def replan_node(state):
    """Replan when execution hits an issue."""
    completed = state["plan"][:state["current_step"]]

    chain = replan_prompt | llm
    result = chain.invoke({
        "goal": state["goal"],
        "plan": state["plan"],
        "completed": completed,
        "results": state["results"],
        "issue": state.get("last_error", "No specific issue")
    })

    content = result.content.strip()
    # Read the decision from the first line only: a reply can mention both
    # keywords, so a substring check on the whole text is unreliable
    first_line = content.split("\n")[0].upper() if content else ""

    if "REPLAN" in first_line:
        # Parse new remaining steps
        new_steps = []
        for line in content.split("\n"):
            line = line.strip()
            if line.startswith("Step"):
                new_steps.append(line)

        # Only replace the plan if the model actually supplied new steps;
        # otherwise the plan would be cut down to the completed steps alone
        if new_steps:
            return {
                "plan": completed + new_steps,  # Keep completed steps + new remaining
                "messages": [{"role": "system", "content": f"Replanned: {len(new_steps)} new steps"}]
            }

    return {"messages": [{"role": "system", "content": "Continuing with current plan"}]}
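
Checking for a keyword anywhere in the reply is fragile, because an answer that chooses one option may still mention the other. As a sketch, the parsing can be pulled into a small function that is testable without an LLM. It assumes the model is told to start its reply with `DECISION: CONTINUE` or `DECISION: REPLAN`; that first-line format, like the names `parse_replan_response` and `merge_plan`, is an assumption of this example and not part of the prompt above.

```python
import re

def parse_replan_response(content: str) -> tuple:
    """Return (decision, new_steps) from a reply that starts with 'DECISION: ...'."""
    lines = [line.strip() for line in content.strip().splitlines() if line.strip()]
    if not lines:
        return "CONTINUE", []
    match = re.match(r"^DECISION:\s*(CONTINUE|REPLAN)\b", lines[0], re.IGNORECASE)
    # Fall back to the conservative option when the first line is off-format
    decision = match.group(1).upper() if match else "CONTINUE"
    new_steps = [line for line in lines[1:] if line.startswith("Step")]
    if decision == "REPLAN" and not new_steps:
        decision = "CONTINUE"  # A replan without steps would leave nothing to run
    return decision, new_steps

def merge_plan(plan: list, current_step: int, new_steps: list) -> list:
    """Keep the steps already completed and swap in the new remaining steps."""
    return plan[:current_step] + new_steps

old_plan = [
    "Step 1: search_web - A",
    "Step 2: calculate - B",
    "Step 3: write_document - C",
]
response = """DECISION: REPLAN
The search returned no data, so CONTINUE is not viable.
Step 2: search_web - Try an alternative source
Step 3: write_document - Summarize what was found
"""
decision, steps = parse_replan_response(response)
new_plan = merge_plan(old_plan, 1, steps)
print(decision, new_plan)
```

Note that the sample reply mentions CONTINUE in its explanation, yet the decision is still read correctly because only the first line is inspected.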

Checkpoint

Do you understand when and how an agent replans?

🔍 Self-Reflection

Reflection Pattern

python.py

import re

# Reuses ChatPromptTemplate and llm from the Planner section.
reflection_prompt = ChatPromptTemplate.from_template("""
Review the agent's output for quality and correctness.

Task: {task}
Output: {output}

Evaluate:
1. Is the output correct and complete?
2. Are there any errors or hallucinations?
3. Is anything missing?

Response format:
QUALITY: [GOOD/NEEDS_IMPROVEMENT/POOR]
ISSUES: [list any issues found]
SUGGESTIONS: [how to improve]
""")

def reflect_node(state):
    """Self-reflect on output quality."""
    chain = reflection_prompt | llm
    reflection = chain.invoke({
        "task": state["goal"],
        "output": state.get("final_output", "")
    })

    content = reflection.content

    # Read the verdict from the QUALITY line only; the ISSUES or SUGGESTIONS
    # text may contain words like "POOR" without that being the verdict
    match = re.search(r"QUALITY:\s*\[?(\w+)", content)
    verdict = match.group(1).upper() if match else "NEEDS_IMPROVEMENT"

    # Assumes the graph's state schema declares a needs_revision field
    if verdict != "GOOD":
        return {
            "messages": [{"role": "system", "content": f"Reflection: {content}"}],
            "needs_revision": True
        }

    return {
        "messages": [{"role": "system", "content": "Reflection: Output looks good"}],
        "needs_revision": False
    }
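
The reflection prompt asks for three labelled fields, so the reply can be parsed field by field rather than scanned for keywords. The sketch below is one way to do that without an LLM in the loop. `parse_reflection` and `VALID_VERDICTS` are hypothetical names, and the code assumes each field fits on a single line.

```python
import re

VALID_VERDICTS = {"GOOD", "NEEDS_IMPROVEMENT", "POOR"}

def parse_reflection(content: str) -> dict:
    """Extract the QUALITY, ISSUES, and SUGGESTIONS fields from a reflection."""
    fields = {}
    for name in ("QUALITY", "ISSUES", "SUGGESTIONS"):
        # [ \t]* instead of \s* so an empty field does not swallow the next line
        match = re.search(rf"^{name}:[ \t]*(.*)$", content, re.MULTILINE)
        fields[name.lower()] = match.group(1).strip() if match else ""
    verdict = fields["quality"].strip("[]").upper()
    # Treat a missing or unrecognized verdict as needing revision, to fail safe
    fields["quality"] = verdict if verdict in VALID_VERDICTS else "NEEDS_IMPROVEMENT"
    fields["needs_revision"] = fields["quality"] != "GOOD"
    return fields

sample = """QUALITY: GOOD
ISSUES: None. An earlier draft was POOR, but this one is fine.
SUGGESTIONS: Add a source for the market figure.
"""
parsed_reflection = parse_reflection(sample)
print(parsed_reflection)
```

In the sample, the word POOR appears in the ISSUES text, but the verdict is still GOOD because only the QUALITY line decides it.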

Critique-Revise Loop

python.py

import operator
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END

# Reuses llm from the Planner section.

class ReflectiveState(TypedDict):
    messages: Annotated[list, operator.add]
    task: str
    draft: str
    critique: str
    revision_count: int
    max_revisions: int
    final_output: str

def generate_node(state):
    """Generate initial draft."""
    result = llm.invoke(f"Complete this task:\n{state['task']}").content
    return {
        "draft": result,
        "revision_count": state.get("revision_count", 0)
    }

def critique_node(state):
    """Critique the draft."""
    critique = llm.invoke(
        f"Critically review this output. Be specific about issues.\n"
        f"If nothing needs to change, reply with exactly: NO MAJOR ISSUES\n"
        f"Task: {state['task']}\n"
        f"Output: {state['draft']}"
    ).content
    return {"critique": critique}

def revise_node(state):
    """Revise based on critique."""
    revised = llm.invoke(
        f"Revise this output based on the critique:\n"
        f"Original: {state['draft']}\n"
        f"Critique: {state['critique']}\n"
        f"Provide an improved version."
    ).content
    return {
        "draft": revised,
        "revision_count": state.get("revision_count", 0) + 1
    }

def should_revise(state):
    """Decide if more revisions needed."""
    # Hard cap so the loop always terminates, even if the critic never approves
    if state.get("revision_count", 0) >= state.get("max_revisions", 2):
        return "finalize"

    # Stop early when the critic returns the agreed approval phrase
    critique = state.get("critique", "")
    if "no major issues" in critique.lower():
        return "finalize"

    return "revise"

def finalize_node(state):
    """Finalize the output."""
    return {"final_output": state["draft"]}

# Build critique-revise graph
graph = StateGraph(ReflectiveState)
graph.add_node("generate", generate_node)
graph.add_node("critique", critique_node)
graph.add_node("revise", revise_node)
graph.add_node("finalize", finalize_node)

graph.set_entry_point("generate")
graph.add_edge("generate", "critique")
graph.add_conditional_edges("critique", should_revise, {
    "revise": "revise",
    "finalize": "finalize"
})
graph.add_edge("revise", "critique")  # Critique again after revision
graph.add_edge("finalize", END)

app = graph.compile()

# Run
result = app.invoke({
    "task": "Write a product description for a Vietnamese coffee brand",
    "messages": [],
    "revision_count": 0,
    "max_revisions": 2
})
print(result["final_output"])
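
The control flow of the critique-revise graph can also be read as a plain loop, which makes its two exit conditions easier to check. The sketch below uses hypothetical stub functions in place of the LLM calls and treats the literal reply `APPROVED` as the approval signal. Both are assumptions of this example.

```python
from typing import Callable

def critique_revise(
    draft: str,
    critique: Callable[[str], str],
    revise: Callable[[str, str], str],
    max_revisions: int = 2,
) -> tuple:
    """Alternate critique and revision until the critic approves or the budget runs out."""
    revision_count = 0
    while True:
        feedback = critique(draft)
        approved = feedback.strip().upper() == "APPROVED"
        if approved or revision_count >= max_revisions:
            return draft, revision_count
        draft = revise(draft, feedback)
        revision_count += 1

# Hypothetical stand-ins for the LLM calls: this critic wants at least 3 words.
def stub_critique(draft: str) -> str:
    return "APPROVED" if len(draft.split()) >= 3 else "Too short, add detail."

def stub_revise(draft: str, feedback: str) -> str:
    return draft + " coffee"

final_draft, used = critique_revise("Bold Vietnamese", stub_critique, stub_revise)
print(final_draft, used)

# A critic that never approves still terminates after max_revisions.
capped_draft, capped_used = critique_revise(
    "Bold", lambda d: "Not yet.", stub_revise, max_revisions=2
)
print(capped_draft, capped_used)
```

The revision cap is what guarantees termination; the approval check only lets the loop stop earlier.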

Checkpoint

Do you understand the critique-revise loop and how an agent improves its own output?

📐 Combined: Plan-Execute-Reflect

Plan-Execute-Reflect Workflow

[Workflow diagram. Nodes: 📝 Plan, ⚡ Execute, 🔍 Reflect, 🔄 Replan, ✅ Finalize, ✏️ Revise]
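
Only the node names of the workflow are listed here, so the wiring between them has to be assumed. The sketch below shows one plausible routing in pure Python: Plan → Execute → Reflect, then Replan on a failed step, Revise on weak output, and Finalize otherwise. The edges, the counters, and all stub nodes are assumptions made for illustration, not the lesson's definitive graph.

```python
# Fixed edges in this assumed wiring; "reflect" is routed dynamically below.
NEXT = {"plan": "execute", "execute": "reflect", "replan": "execute", "revise": "reflect"}

def route_after_reflect(state: dict) -> str:
    """Pick the next node after reflection (assumed routing rules)."""
    if state.get("last_error"):
        # A failed step suggests the plan itself is wrong, so rebuild it
        return "replan" if state["replan_count"] < state["max_replans"] else "finalize"
    if state.get("needs_revision"):
        # The plan worked but the output is weak, so polish the draft
        return "revise" if state["revision_count"] < state["max_revisions"] else "finalize"
    return "finalize"

def run_workflow(state: dict, nodes: dict, max_hops: int = 20) -> dict:
    """Walk the nodes from "plan" until "finalize" (or until max_hops is reached)."""
    path, current = [], "plan"
    for _ in range(max_hops):
        path.append(current)
        state.update(nodes[current](state))
        if current == "finalize":
            break
        current = route_after_reflect(state) if current == "reflect" else NEXT[current]
    state["path"] = path
    return state

# Hypothetical stub nodes standing in for the LLM-backed ones.
def plan(state):
    return {"plan": ["step A"], "replan_count": 0, "revision_count": 0}

def execute(state):
    failed = state["replan_count"] == 0   # Fail once to exercise the replan branch
    return {"last_error": "tool timeout" if failed else "",
            "draft": "" if failed else "draft v1"}

def reflect(state):
    # Ask for exactly one revision of a non-empty draft, then accept it
    return {"needs_revision": bool(state["draft"]) and state["revision_count"] == 0}

def replan(state):
    return {"plan": ["step B"], "replan_count": state["replan_count"] + 1}

def revise(state):
    return {"draft": state["draft"] + " (revised)",
            "revision_count": state["revision_count"] + 1}

def finalize(state):
    return {"final_output": state["draft"]}

nodes = {"plan": plan, "execute": execute, "reflect": reflect,
         "replan": replan, "revise": revise, "finalize": finalize}
final_state = run_workflow({"max_replans": 1, "max_revisions": 2}, nodes)
print(final_state["path"])
print(final_state["final_output"])
```

Separate budgets for replanning and revising keep each loop bounded on its own, so a persistently failing tool cannot consume the revision budget and vice versa.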

Checkpoint

Do you understand how to combine Plan-Execute-Reflect into a complete workflow?
