Planning & Self-Reflection

Plan-and-Execute agents and Self-Reflection for autonomous systems

🎯 Lesson objectives

Simple agents react one step at a time. Advanced agents plan first, execute according to the plan, and correct themselves when something goes wrong.

After this lesson, you will understand:

✅ The Plan-and-Execute pattern
✅ Replanning when the context changes
✅ Self-reflection and self-correction
✅ The critique-revise loop

📐 Plan-and-Execute Pattern

Concept

Example
ReAct Agent: Think → Act → Observe → Think → Act → ... (step-by-step)

Plan-and-Execute: Plan ALL steps → Execute step 1 → Execute step 2 → ...
                  If plan needs change → Replan → Continue

Plan-and-Execute tends to work better for complex multi-step tasks, because the whole plan is laid out before any step runs.
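To make the contrast concrete, here is a minimal sketch in plain Python with no LLM involved. The `fake_plan`, `fake_act`, and `fake_think` helpers are hypothetical stand-ins for model and tool calls; only the difference in control flow is the point.

```python
from typing import Optional


def fake_plan(goal: str) -> list:
    """Hypothetical planner: returns every step before anything runs."""
    return [f"search: {goal}", "calculate: growth rate", "write: report"]


def fake_act(step: str) -> str:
    """Hypothetical tool call: just echoes what it was asked to do."""
    return f"done({step})"


def fake_think(goal: str, observations: list) -> Optional[str]:
    """Hypothetical ReAct reasoning: picks ONE next action, or None to stop."""
    remaining = fake_plan(goal)[len(observations):]
    return remaining[0] if remaining else None


def react_loop(goal: str) -> list:
    """ReAct: decide the next action only after seeing the last observation."""
    observations = []
    while True:
        action = fake_think(goal, observations)
        if action is None:
            return observations
        observations.append(fake_act(action))


def plan_and_execute(goal: str) -> list:
    """Plan-and-Execute: commit to the full plan first, then run it in order."""
    plan = fake_plan(goal)
    return [fake_act(step) for step in plan]


trace_react = react_loop("AI market size")
trace_plan = plan_and_execute("AI market size")
print(trace_react == trace_plan)  # True: same work here, different control flow
```

With these deterministic stubs both loops do the same work. The difference shows up with a real model: ReAct makes one reasoning call per step, while Plan-and-Execute makes one planning call up front and only revisits the plan when something goes wrong.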

Planner

python.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

planner_prompt = ChatPromptTemplate.from_template("""
You are a planning agent. Create a step-by-step plan to accomplish the goal.

Goal: {goal}

Available tools:
- search_web: Search for information online
- calculate: Do math calculations
- write_document: Create a document
- send_email: Send email

Rules:
1. Break the goal into 3-7 concrete steps
2. Each step should use exactly ONE tool
3. Steps should be in logical order
4. Be specific about what each step does

Output format:
Step 1: [tool_name] - [description]
Step 2: [tool_name] - [description]
...
""")

def create_plan(goal: str) -> list[str]:
    """Ask the LLM for a plan and return its "Step N: ..." lines."""
    chain = planner_prompt | llm
    result = chain.invoke({"goal": goal})

    # Keep only the step lines; strip() tolerates indented output
    steps = []
    for line in result.content.strip().split("\n"):
        line = line.strip()
        if line.startswith("Step"):
            steps.append(line)

    return steps

# Example (actual model output varies between runs)
plan = create_plan("Research AI market size 2025 and create a summary report")
for step in plan:
    print(step)
# Step 1: search_web - Search for "AI market size 2025 report"
# Step 2: search_web - Search for "AI market growth predictions 2025"
# Step 3: calculate - Calculate CAGR from collected data
# Step 4: write_document - Create summary report with findings
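The planner returns each step as a raw string, so anything downstream has to pick the tool name out of the text again. A minimal sketch of parsing the `Step N: tool - description` format into structured records follows; the `PlanStep` type and `parse_plan` helper are illustrative names, not part of the lesson's code.

```python
import re
from typing import NamedTuple


class PlanStep(NamedTuple):
    index: int
    tool: str
    description: str


# Matches lines such as "Step 2: search_web - Search for growth forecasts".
# Optional square brackets around the tool name are tolerated.
STEP_PATTERN = re.compile(r"^\s*Step\s+(\d+)\s*:\s*\[?(\w+)\]?\s*-\s*(.+?)\s*$")


def parse_plan(text: str) -> list:
    """Turn raw planner output into structured steps, skipping other lines."""
    steps = []
    for line in text.splitlines():
        match = STEP_PATTERN.match(line)
        if match:
            steps.append(PlanStep(int(match.group(1)), match.group(2), match.group(3)))
    return steps


sample = """Here is the plan:
Step 1: search_web - Search for "AI market size 2025 report"
  Step 2: calculate - Calculate CAGR from collected data
Step 3: [write_document] - Create summary report with findings
"""
parsed_steps = parse_plan(sample)
for step in parsed_steps:
    print(step.index, step.tool, step.description)
```

Structured steps make it possible to reject a plan that names a tool the agent does not have before any step runs.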

Executor

python.py
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
import operator

# Reuses llm and create_plan from the Planner section

class PlanExecuteState(TypedDict):
    messages: Annotated[list, operator.add]  # appended to on every update
    goal: str
    plan: list           # List of steps
    current_step: int
    results: dict        # Step index -> result
    final_output: str
    last_error: str      # read by replan_node later in this lesson
    needs_revision: bool # written by reflect_node later in this lesson

def plan_node(state):
    """Create initial plan."""
    goal = state["goal"]
    plan = create_plan(goal)
    return {
        "plan": plan,
        "current_step": 0,
        "results": {},
        "messages": [{"role": "system", "content": f"Created plan with {len(plan)} steps"}]
    }

def execute_node(state):
    """Execute current step."""
    plan = state["plan"]
    step_idx = state["current_step"]

    if step_idx >= len(plan):
        return {"messages": [{"role": "system", "content": "All steps completed"}]}

    current_step = plan[step_idx]

    # Execute the step (simplified: the LLM stands in for a real tool call)
    result = llm.invoke(
        f"Execute this step: {current_step}\n"
        f"Previous results: {state.get('results', {})}"
    ).content

    # Copy before updating so the incoming state is not mutated in place
    results = dict(state.get("results", {}))
    results[step_idx] = result

    return {
        "current_step": step_idx + 1,
        "results": results,
        "messages": [{"role": "system", "content": f"Completed step {step_idx + 1}"}]
    }

def should_continue(state):
    """Check if more steps to execute."""
    if state["current_step"] >= len(state["plan"]):
        return "summarize"
    return "execute"

def summarize_node(state):
    """Create final output from all results."""
    results = state["results"]
    goal = state["goal"]

    summary = llm.invoke(
        f"Summarize the results for goal: {goal}\nResults: {results}"
    ).content

    return {"final_output": summary}

# Build graph
graph = StateGraph(PlanExecuteState)
graph.add_node("plan", plan_node)
graph.add_node("execute", execute_node)
graph.add_node("summarize", summarize_node)

graph.set_entry_point("plan")
graph.add_edge("plan", "execute")
graph.add_conditional_edges("execute", should_continue, {
    "execute": "execute",      # loop until every step has run
    "summarize": "summarize"
})
graph.add_edge("summarize", END)

app = graph.compile()
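The executor above asks the LLM to "execute" each step, which the code itself marks as simplified. A hedged sketch of dispatching a step to an actual function instead is shown below. The tool names come from the planner prompt, but the `TOOLS` registry, the stub implementations, and `run_step` are assumptions made for illustration.

```python
from typing import Callable, Dict


# Hypothetical stub tools; real ones would call a search API, a calculator, etc.
def search_web(query: str) -> str:
    return f"[search results for: {query}]"


def calculate(expression: str) -> str:
    return f"[calculated: {expression}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search_web": search_web,
    "calculate": calculate,
}


def run_step(step: str) -> str:
    """Dispatch a 'Step N: tool - description' line to the matching tool."""
    _, _, rest = step.partition(":")                 # drop the "Step N" prefix
    tool_name, _, description = rest.partition("-")  # split tool from description
    tool = TOOLS.get(tool_name.strip())
    if tool is None:
        raise ValueError(f"Unknown tool in step: {step!r}")
    return tool(description.strip())


print(run_step("Step 1: search_web - AI market size 2025"))
# [search results for: AI market size 2025]
```

Raising on an unknown tool, rather than silently skipping the step, gives the agent a concrete error it can act on.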

Checkpoint

Do you now understand the Plan-and-Execute pattern and how it differs from ReAct?

🔍 Replanning

Dynamic Replanning

python.py
# Reuses ChatPromptTemplate and llm from the Planner section
replan_prompt = ChatPromptTemplate.from_template("""
Original goal: {goal}
Original plan: {plan}
Completed steps: {completed}
Current results: {results}
Issue encountered: {issue}

Should we:
1. CONTINUE with the current plan
2. REPLAN with adjusted steps

Start your answer with "DECISION: CONTINUE" or "DECISION: REPLAN".
If REPLAN, provide the NEW remaining steps, one per line, as:
Step N: [tool_name] - [description]
""")

def replan_node(state):
    """Replan when execution hits an issue."""
    done = state["current_step"]
    chain = replan_prompt | llm
    result = chain.invoke({
        "goal": state["goal"],
        "plan": state["plan"],
        "completed": state["plan"][:done],  # the finished steps, not just a count
        "results": state["results"],
        "issue": state.get("last_error", "No specific issue")
    })

    content = result.content

    # Match the explicit decision line; a bare "REPLAN" substring check would
    # also fire on answers such as "CONTINUE, no need to REPLAN"
    if "DECISION: REPLAN" in content.upper():
        # Parse new remaining steps
        new_steps = [
            line.strip() for line in content.split("\n")
            if line.strip().startswith("Step")
        ]

        # Only swap plans if the model actually supplied new steps
        if new_steps:
            # Keep completed steps + new remaining
            new_plan = state["plan"][:done] + new_steps
            return {
                "plan": new_plan,
                "messages": [{"role": "system", "content": f"Replanned: {len(new_steps)} new steps"}]
            }

    return {"messages": [{"role": "system", "content": "Continuing with current plan"}]}
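The replan node reads `last_error` from the state, but nothing in the lesson's code sets it or routes to the node. One way this might be wired is sketched below in plain Python: a wrapper records the failure instead of crashing, and a router sends the graph to `replan` when an error is present. The names `safe_execute`, `route_after_execute`, and `flaky_tool` are hypothetical.

```python
from typing import Callable


def safe_execute(state: dict, run_step: Callable[[str], str]) -> dict:
    """Run the current step; on failure, record the error instead of crashing."""
    idx = state["current_step"]
    step = state["plan"][idx]
    try:
        result = run_step(step)
    except Exception as exc:  # broad on purpose: any tool failure should trigger a replan
        return {"last_error": f"{step}: {exc}"}
    return {
        "current_step": idx + 1,
        "results": {**state.get("results", {}), idx: result},
        "last_error": "",  # clear any earlier error after a successful step
    }


def route_after_execute(state: dict) -> str:
    """Pick the next node: replan on error, otherwise continue or summarize."""
    if state.get("last_error"):
        return "replan"
    if state["current_step"] >= len(state["plan"]):
        return "summarize"
    return "execute"


def flaky_tool(step: str) -> str:
    """Hypothetical tool that fails on email steps."""
    if "send_email" in step:
        raise RuntimeError("SMTP server unreachable")
    return "ok"


state = {"plan": ["Step 1: send_email - Send report"], "current_step": 0, "results": {}}
state.update(safe_execute(state, flaky_tool))
print(route_after_execute(state))  # replan
```

In a LangGraph graph, a router like `route_after_execute` would be passed to `add_conditional_edges` in place of `should_continue`, with the `replan` node wired back to `execute`. A cap on the number of replans is worth adding for the same reason revisions are capped.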

Checkpoint

Do you now understand when and how an agent replans?

🔍 Self-Reflection

Reflection Pattern

python.py
import re

# Reuses ChatPromptTemplate and llm from the Planner section
reflection_prompt = ChatPromptTemplate.from_template("""
Review the agent's output for quality and correctness.

Task: {task}
Output: {output}

Evaluate:
1. Is the output correct and complete?
2. Are there any errors or hallucinations?
3. Is anything missing?

Response format:
QUALITY: [GOOD/NEEDS_IMPROVEMENT/POOR]
ISSUES: [list any issues found]
SUGGESTIONS: [how to improve]
""")

def reflect_node(state):
    """Self-reflect on output quality."""
    chain = reflection_prompt | llm
    reflection = chain.invoke({
        "task": state["goal"],
        "output": state.get("final_output", "")
    })

    content = reflection.content

    # Read the verdict from the QUALITY line only, so that words in the
    # ISSUES or SUGGESTIONS text cannot flip the decision
    match = re.search(r"QUALITY:\s*\[?(GOOD|NEEDS_IMPROVEMENT|POOR)", content)
    verdict = match.group(1) if match else "NEEDS_IMPROVEMENT"  # unparseable: be cautious

    # Note: needs_revision must be a key in the graph's state schema
    if verdict != "GOOD":
        return {
            "messages": [{"role": "system", "content": f"Reflection: {content}"}],
            "needs_revision": True
        }

    return {
        "messages": [{"role": "system", "content": "Reflection: Output looks good"}],
        "needs_revision": False
    }
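The reflection prompt asks for three labelled fields, yet the node only uses the verdict. A small sketch of extracting all three, so that the issues and suggestions can be fed into a later revision step, might look like this. The `parse_reflection` helper is an illustrative name, and falling back to `NEEDS_IMPROVEMENT` for an unrecognised verdict is a design assumption.

```python
import re

VERDICTS = {"GOOD", "NEEDS_IMPROVEMENT", "POOR"}


def parse_reflection(text: str) -> dict:
    """Extract QUALITY, ISSUES, and SUGGESTIONS from the reviewer's reply."""
    fields = {}
    for name in ("QUALITY", "ISSUES", "SUGGESTIONS"):
        # Capture everything after "NAME:" up to the next field label or the end.
        match = re.search(
            rf"{name}:\s*(.*?)(?=\n(?:QUALITY|ISSUES|SUGGESTIONS):|\Z)",
            text,
            flags=re.DOTALL,
        )
        fields[name.lower()] = match.group(1).strip() if match else ""
    # Normalise the verdict; treat anything unrecognised as needing revision.
    verdict = fields["quality"].strip("[] ").upper()
    fields["quality"] = verdict if verdict in VERDICTS else "NEEDS_IMPROVEMENT"
    fields["needs_revision"] = fields["quality"] != "GOOD"
    return fields


sample = """QUALITY: NEEDS_IMPROVEMENT
ISSUES: The 2025 figure has no source.
The growth rate is not explained.
SUGGESTIONS: Cite the source and show the CAGR formula."""

parsed = parse_reflection(sample)
print(parsed["quality"], parsed["needs_revision"])  # NEEDS_IMPROVEMENT True
```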

Critique-Revise Loop

python.py
# Reuses llm from the Planner section and the imports from the Executor section
class ReflectiveState(TypedDict):
    messages: Annotated[list, operator.add]
    task: str
    draft: str
    critique: str
    revision_count: int
    max_revisions: int
    final_output: str

def generate_node(state):
    """Generate initial draft."""
    result = llm.invoke(f"Complete this task:\n{state['task']}").content
    return {
        "draft": result,
        "revision_count": state.get("revision_count", 0)
    }

def critique_node(state):
    """Critique the draft."""
    critique = llm.invoke(
        f"Critically review this output. Be specific about issues.\n"
        f"If there are no major issues, reply with exactly: NO MAJOR ISSUES\n"
        f"Task: {state['task']}\n"
        f"Output: {state['draft']}"
    ).content
    return {"critique": critique}

def revise_node(state):
    """Revise based on critique."""
    revised = llm.invoke(
        f"Revise this output based on the critique:\n"
        f"Task: {state['task']}\n"
        f"Original: {state['draft']}\n"
        f"Critique: {state['critique']}\n"
        f"Provide an improved version."
    ).content
    return {
        "draft": revised,
        "revision_count": state.get("revision_count", 0) + 1
    }

def should_revise(state):
    """Decide if more revisions needed."""
    # Hard stop: never exceed the revision budget
    if state.get("revision_count", 0) >= state.get("max_revisions", 2):
        return "finalize"

    # The critique prompt asks for this exact phrase when the draft is fine
    critique = state.get("critique", "")
    if "no major issues" in critique.lower():
        return "finalize"

    return "revise"

def finalize_node(state):
    """Finalize the output."""
    return {"final_output": state["draft"]}

# Build critique-revise graph
graph = StateGraph(ReflectiveState)
graph.add_node("generate", generate_node)
graph.add_node("critique", critique_node)
graph.add_node("revise", revise_node)
graph.add_node("finalize", finalize_node)

graph.set_entry_point("generate")
graph.add_edge("generate", "critique")
graph.add_conditional_edges("critique", should_revise, {
    "revise": "revise",
    "finalize": "finalize"
})
graph.add_edge("revise", "critique")  # Critique again after revision
graph.add_edge("finalize", END)

app = graph.compile()

# Run
result = app.invoke({
    "task": "Write a product description for a Vietnamese coffee brand",
    "messages": [],
    "revision_count": 0,
    "max_revisions": 2
})
print(result["final_output"])
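Stripped of the graph machinery, the critique-revise loop is a bounded `while` loop. The sketch below shows that core logic with deterministic stubs in place of the LLM calls; `critique_revise`, `length_critic`, and `padding_reviser` are hypothetical names, and treating an empty critique as "no major issues" is an assumption of this sketch.

```python
from typing import Callable, Tuple


def critique_revise(
    draft: str,
    critic: Callable[[str], str],
    reviser: Callable[[str, str], str],
    max_revisions: int = 2,
) -> Tuple[str, int]:
    """Revise until the critic is satisfied or the revision budget runs out."""
    revisions = 0
    while revisions < max_revisions:
        critique = critic(draft)
        if critique == "":  # an empty critique means "no major issues"
            break
        draft = reviser(draft, critique)
        revisions += 1
    return draft, revisions


# Hypothetical stubs standing in for the LLM calls.
def length_critic(text: str) -> str:
    return "" if len(text.split()) >= 5 else "Too short: add more detail."


def padding_reviser(text: str, critique: str) -> str:
    return text + " with rich aroma"


final, used = critique_revise("Robusta coffee", length_critic, padding_reviser)
print(final, used)  # Robusta coffee with rich aroma 1
```

If the critic is never satisfied, the loop still stops after `max_revisions` passes, which is what keeps an over-critical reviewer from running up unbounded model calls.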

Checkpoint

Do you now understand the critique-revise loop and how an agent improves its own output?

📐 Combined: Plan-Execute-Reflect

Plan-Execute-Reflect Workflow

[Workflow diagram with nodes: 📝 Plan, Execute, 🔍 Reflect, 🔄 Replan, ✏️ Revise, Finalize]
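The connections between these nodes are not spelled out in the text, so the following is only one plausible wiring, written as a plain-Python routing function. The `next_node` name, the `reflected` flag, and the order of the checks are all assumptions made for illustration.

```python
def next_node(state: dict) -> str:
    """Assumed routing for a Plan-Execute-Reflect graph (one possible design)."""
    if state.get("last_error"):
        return "replan"    # execution failed: adjust the plan
    if state["current_step"] < len(state["plan"]):
        return "execute"   # steps remain: keep executing
    if not state.get("reflected"):
        return "reflect"   # all steps done: review the output once
    if state.get("needs_revision") and state["revision_count"] < state["max_revisions"]:
        return "revise"    # reviewer found issues and budget remains
    return "finalize"


base = {"plan": ["a", "b"], "current_step": 2, "revision_count": 0, "max_revisions": 2}
print(next_node({**base, "current_step": 1}))                           # execute
print(next_node({**base, "last_error": "timeout"}))                     # replan
print(next_node(base))                                                  # reflect
print(next_node({**base, "reflected": True, "needs_revision": True}))   # revise
print(next_node({**base, "reflected": True, "needs_revision": False}))  # finalize
```

Checking for errors first means a failed step is handled before any further execution, and the revision budget is checked before every revise so the loop is always bounded.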

Checkpoint

Do you now understand how Plan, Execute, and Reflect combine into a complete workflow?

🎯 Summary

📝 Quiz

  1. How does Plan-and-Execute differ from ReAct?

    • It plans ALL steps first, then executes; this suits complex multi-step tasks better
    • It is faster
    • It is simpler
    • There is no difference
  2. When should an agent replan?

    • When an execution result differs from what was expected, or an error occurs
    • At every step
    • Only when the user asks
    • Never
  3. How does self-reflection help an agent?

    • It detects errors in the output and improves it through a critique-revise loop
    • It runs faster
    • It uses fewer tokens
    • It has no effect

Key Takeaways

  1. Plan-and-Execute: often a better fit than ReAct for complex multi-step tasks
  2. Replanning: adapt the remaining steps when results differ from expectations or a step fails
  3. Self-Reflection: the agent reviews its own output
  4. Critique-Revise: an iterative improvement loop
  5. Max revisions: always set a limit to avoid infinite loops

Self-check questions

  1. How does the Plan-and-Execute pattern differ from the ReAct pattern?
  2. When does an agent need to replan during execution?
  3. How does self-reflection help an agent improve its output?
  4. Why should the critique-revise loop have a max_revisions limit?

🎉 Well done! You have completed the Planning & Self-Reflection lesson!

Next: Multi-Agent Systems, where multiple agents cooperate to solve problems!
