Error Handling & Monitoring
Production workflows MUST handle errors gracefully. This final lesson covers error handling, logging, monitoring, and a capstone project.
🎯 Learning objectives
- Error handling patterns trong n8n
- Retry logic
- Logging & alerting
- Production best practices
- Capstone: Build complete automation system
Checkpoint
Why is error handling mandatory for production workflows?
❌ Error Types in n8n
1.1 Common Errors
| Error | Cause | Fix |
|---|---|---|
| Connection errors | API down, network issue | Retry with backoff |
| Auth errors | Expired token, wrong key | Refresh credentials |
| Rate limit | Too many requests | Slow down, batch |
| Data errors | Missing field, wrong type | Validate before processing |
| Timeout | Slow API, large payload | Increase timeout, chunk data |
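The table above can be turned into a small lookup for a Code node. This is a minimal sketch: the function name `classifyError` and the mapping from HTTP status codes and Node.js error codes to the five categories are assumptions for illustration, not something n8n provides.

```javascript
// Hypothetical helper: maps an HTTP status or Node.js error code to one of
// the five categories in the table. The status-to-category mapping is an
// assumption; adjust it to the APIs you actually call.
function classifyError({ status, code } = {}) {
  if (status === 401 || status === 403) {
    return { type: 'auth', fix: 'Refresh credentials' };
  }
  if (status === 429) {
    return { type: 'rate_limit', fix: 'Slow down, batch' };
  }
  if (status === 408 || status === 504 || code === 'ETIMEDOUT') {
    return { type: 'timeout', fix: 'Increase timeout, chunk data' };
  }
  if (status === 400 || status === 422) {
    return { type: 'data', fix: 'Validate before processing' };
  }
  if (status >= 500 || ['ECONNREFUSED', 'ECONNRESET', 'ENOTFOUND'].includes(code)) {
    return { type: 'connection', fix: 'Retry with backoff' };
  }
  return { type: 'unknown', fix: 'Inspect the execution log' };
}
```

A downstream IF or Switch node could then branch on the returned `type` instead of parsing raw error messages.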
1.2 Error Flow in n8n

    Node executes → Success → Next node
          ↓
        Error → Error output (if configured)
          ↓
         OR → Workflow stops (default)

Checkpoint
List five common error types in n8n and how to fix each one.
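For the "Data errors" row (missing field, wrong type), the suggested fix is to validate before processing. One way to sketch that check is below; the `orderSchema` fields and the `validateRecord` name are hypothetical and only illustrate the idea.

```javascript
// Hypothetical schema: field name -> expected `typeof` result.
const orderSchema = { id: 'string', amount: 'number', email: 'string' };

// Returns every problem found, so one log entry can describe the whole record.
function validateRecord(record, schema) {
  const problems = [];
  for (const [field, expectedType] of Object.entries(schema)) {
    const value = record[field];
    if (value === undefined || value === null) {
      problems.push(`missing field: ${field}`);
    } else if (typeof value !== expectedType) {
      problems.push(`wrong type for ${field}: expected ${expectedType}, got ${typeof value}`);
    }
  }
  return { valid: problems.length === 0, problems };
}
```

Collecting all problems rather than stopping at the first one keeps the error log useful when a source sends several bad fields at once.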
🛡️ Error Handling Patterns
2.1 Continue on Fail (Node Setting)

    Node Settings → Always Output Data → ON
    Node Settings → Continue on Fail → ON

    When an error occurs:
    - Workflow continues
    - Error info available in $json.error
    - Next node decides what to do

2.2 Error Output (Branching)
Error output contains:

    {
      "error": {
        "message": "Request failed with status code 429",
        "stack": "..."
      }
    }

2.3 IF Node for Error Routing

    HTTP Request (Continue on Fail)
    → IF ($json.error exists?)
      → TRUE: Log error → Slack alert → Stop
      → FALSE: Process data → Continue

2.4 Error Workflow (Global)

    Settings → Error Workflow → Select workflow

    This workflow runs when ANY execution fails:
    Error Trigger → Format error → Slack alert → Email admin

Checkpoint
How do "Continue on Fail" and "Error Workflow" differ? When should you use each one?
🔄 Retry Logic
3.1 Built-in Retry

    Node Settings → Retry on Fail → ON
    - Max Retries: 3
    - Wait Between (ms): 1000
    - Exponential Backoff: ON (1s, 2s, 4s)

3.2 Custom Retry with Code

    // Code node: custom retry with exponential backoff.
    // Assumes a global fetch is available in the Code node sandbox; if it is
    // not, swap in the HTTP helper your n8n version exposes.
    const maxRetries = 3;
    const baseDelay = 1000; // ms

    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

    async function fetchWithRetry(url, options) {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        const isLastAttempt = attempt === maxRetries - 1;
        const delay = baseDelay * Math.pow(2, attempt); // 1s, 2s, 4s

        try {
          const response = await fetch(url, options);

          if (response.status === 429 && !isLastAttempt) {
            // Rate limited: wait, then retry
            await sleep(delay);
            continue;
          }

          if (!response.ok) {
            throw new Error(`HTTP ${response.status}`);
          }

          return await response.json();
        } catch (error) {
          // Out of attempts: surface the last error instead of returning undefined
          if (isLastAttempt) throw error;
          await sleep(delay);
        }
      }
    }

    const data = await fetchWithRetry(
      'https://api.example.com/data',
      { headers: { 'Authorization': `Bearer ${$env.API_KEY}` } }
    );

    // A Code node returns an array of items
    return [{ json: data }];

3.3 Dead Letter Queue

    When all retries are exhausted and the call still fails → save the item to a "dead letter" store for manual review

    HTTP Request (3 retries)
    → Still fails
    → Google Sheets (log failed item + error details)
    → Slack alert "Manual review needed"

Checkpoint
What does exponential backoff mean? When should you use a Dead Letter Queue?
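To make the two ideas in this section concrete, here is a minimal sketch of the backoff schedule and of the row that might be written to the dead-letter sheet. The column names in `toDeadLetterRow` are illustrative assumptions, not an n8n or Google Sheets format.

```javascript
// Delay before each retry: baseDelayMs * 2^attempt (1s, 2s, 4s by default).
function backoffDelays(maxRetries = 3, baseDelayMs = 1000) {
  return Array.from({ length: maxRetries }, (_, attempt) => baseDelayMs * 2 ** attempt);
}

// Hypothetical dead-letter row: keeps the payload so the item can be replayed
// by hand once the underlying problem is fixed.
function toDeadLetterRow(item, error, attempts, now = new Date()) {
  return {
    failed_at: now.toISOString(),
    attempts,
    error_message: error.message,
    payload: JSON.stringify(item),
    status: 'needs_manual_review',
  };
}
```

Storing the serialized payload alongside the error is the design choice that makes manual review practical: the reviewer sees what failed and can resubmit it without hunting through execution history.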
📋 Logging
4.1 Execution Log
n8n automatically logs every execution:
- Settings → Executions → View all past runs
- Filter by: Success, Error, date range, workflow
- Click execution → See data at each node
4.2 Custom Logging

    Google Sheets "Automation Log":
    | Timestamp | Workflow | Status | Details | Duration | Execution ID |

    Log node (Set):
    - timestamp: {{ $now.format('yyyy-MM-dd HH:mm:ss') }}
    - workflow: {{ $workflow.name }}
    - status: "success" / "error"
    - details: {{ $json.message || 'OK' }}
    - duration: time elapsed since a start timestamp captured at the first node
    - execution_id: {{ $execution.id }}

4.3 Structured Logging Pattern

    Every workflow:
    1. Start → Log "Started"
    2. Key steps → Log progress
    3. Success → Log "Completed"
    4. Error → Log "Failed" + error details

    Log to:
    - Google Sheets (simple, visual)
    - Airtable (structured, filterable)
    - Database (production, queryable)
    - File (local debugging)

Checkpoint
Which steps make up the structured logging pattern? Where should logs be stored in production?
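The four log points in the structured logging pattern (started, progress, completed, failed) can share one entry shape. The sketch below is an assumption about what such a shape might look like; `buildLogEntry` and its field names are hypothetical.

```javascript
// The four statuses mirror the steps of the structured logging pattern.
const STATUSES = ['started', 'progress', 'completed', 'failed'];

function buildLogEntry({ workflow, status, details = 'OK', startedAt, now = new Date() }) {
  if (!STATUSES.includes(status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  return {
    timestamp: now.toISOString(),
    workflow,
    status,
    details,
    // Duration is derived from timestamps; null until a start time is known.
    duration_ms: startedAt ? now.getTime() - startedAt.getTime() : null,
  };
}
```

Because every entry has the same keys, the same object can be appended to a sheet, an Airtable base, or a database table without reshaping.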
📡 Monitoring & Alerting
5.1 Health Check Workflow

    Schedule (every 5 min)
    → HTTP Request (check API endpoints)
    → IF (status !== 200)
      → Slack #alerts: "🔴 API down: {url}"
      → Email on-call
    → ELSE
      → Log "✅ All systems operational"

5.2 Error Rate Monitoring

    Schedule (every hour)
    → n8n API: Get executions (last hour)
    → Code: Calculate error rate
      - Total executions
      - Failed executions
      - Error rate %
    → IF (error rate > 10%)
      → Slack alert: "⚠️ Error rate {rate}% in last hour"
    → Log metrics

5.3 SLA Monitoring

    Track: "Are our automations running on time?"

    Schedule (every morning)
    → Check: Did "Daily Report" workflow run successfully yesterday?
    → Check: Did "Data Sync" workflow complete within 10 min?
    → IF any SLA breached
      → Alert team
      → Log SLA breach

Checkpoint
How often does the Health Check Workflow run? What does the workflow do when an API is down?
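The "Code: Calculate error rate" step in 5.2 might look like the sketch below. It assumes each execution has already been reduced to an object with a `status` of `'success'` or `'error'`; the real n8n API response has more fields and should be mapped into this shape first. The function names are hypothetical.

```javascript
// Assumed input shape: [{ status: 'success' | 'error' }, ...]
function errorRate(executions) {
  const total = executions.length;
  const failed = executions.filter((e) => e.status === 'error').length;
  // Guard against division by zero when nothing ran in the window.
  const ratePercent = total === 0 ? 0 : (failed * 100) / total;
  return { total, failed, ratePercent };
}

// Mirrors the "IF (error rate > 10%)" branch; the threshold is configurable.
function shouldAlert(stats, thresholdPercent = 10) {
  return stats.ratePercent > thresholdPercent;
}
```

Returning the raw counts together with the percentage lets the "Log metrics" step store all three values from section 5.2 in one row.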
🏭 Production Best Practices
6.1 Workflow Checklist

    Pre-Deploy:
    □ Error handling on every HTTP/API node
    □ Retry logic for external calls
    □ Input validation
    □ Credentials stored properly (not hardcoded)
    □ Timeouts configured
    □ Logging at key steps
    □ Error workflow connected
    □ Tested with edge cases
    □ Documented (notes on nodes)

    Post-Deploy:
    □ Monitor first 24h
    □ Check error logs daily
    □ Review execution times weekly
    □ Update credentials before expiry
    □ Backup workflow JSON monthly

6.2 Naming Conventions

    Workflows:
      [Team] | [Frequency] | Description
      e.g., "Sales | Daily | Revenue Report"
      e.g., "HR | Webhook | New Employee Onboarding"

    Nodes:
      Action + Target
      e.g., "Get Sales Data", "Filter Active", "Send Slack Alert"

    Credentials:
      Service + Environment
      e.g., "Google Sheets - Production", "OpenAI - Dev"

6.3 Version Control

    Export workflow JSON regularly:
    - Settings → Download → Save as .json
    - Commit to Git repository
    - Tag versions: v1.0, v1.1, etc.

    Benefits:
    - Roll back when something breaks
    - Code review for workflow changes
    - Team collaboration

6.4 Security Practices

    ✅ Store secrets in Credentials, not in the workflow
    ✅ Limit webhook access (auth, IP whitelist)
    ✅ Use read-only API keys where possible
    ✅ Rotate API keys quarterly
    ✅ Audit who has n8n access
    ✅ Enable 2FA on the n8n instance

    ❌ Share API keys over Slack/email
    ❌ Use admin credentials for automations
    ❌ Skip error handling "because it works"

Checkpoint
List the five most important items in the Pre-Deploy checklist for production workflows.
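A naming convention is easier to keep when it is checked automatically, for example against exported workflow JSON before committing. The sketch below validates the "Team | Frequency | Description" pattern used by the workflow examples in 6.2. The `FREQUENCIES` allow-list and the `checkWorkflowName` name are assumptions for illustration.

```javascript
// Hypothetical allow-list; extend it with the trigger types your team uses.
const FREQUENCIES = ['Hourly', 'Daily', 'Weekly', 'Monthly', 'Webhook'];

function checkWorkflowName(name) {
  const parts = name.split('|').map((part) => part.trim());
  if (parts.length !== 3 || parts.some((part) => part === '')) {
    return { ok: false, reason: 'expected "Team | Frequency | Description"' };
  }
  const [team, frequency, description] = parts;
  if (!FREQUENCIES.includes(frequency)) {
    return { ok: false, reason: `unknown frequency: ${frequency}` };
  }
  return { ok: true, team, frequency, description };
}
```

Returning a `reason` instead of a bare boolean makes the check usable as a review comment or a pre-commit message.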
🎓 Capstone Project
7.1 Build: Automated Business Dashboard
Objective: Build a complete automation system that collects, processes, and reports business data.
7.2 Requirements
| # | Feature | Nodes Used |
|---|---|---|
| 1 | Collect data from 3 sources | Google Sheets, HTTP Request, Airtable |
| 2 | Clean & transform data | Set, Filter, Code |
| 3 | Merge data sources | Merge node |
| 4 | Calculate KPIs | Code node (aggregation) |
| 5 | Generate report | Code (format), Set |
| 6 | Distribute via 2 channels | Slack + Email |
| 7 | Error handling | Error workflow, retry, logging |
| 8 | Run on a schedule | Schedule Trigger |
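Requirement 4 (Calculate KPIs in a Code node) is an aggregation over the merged items. As a starting point, a revenue KPI could be sketched as below. The order shape `{ category, amount }`, the `previousTotal` argument, and the function name `revenueKpis` are assumptions; your sources will likely use different field names.

```javascript
// Hypothetical order shape: { category: string, amount: number }.
function revenueKpis(orders, previousTotal) {
  const byCategory = {};
  let total = 0;
  for (const { category, amount } of orders) {
    total += amount;
    byCategory[category] = (byCategory[category] || 0) + amount;
  }
  // Growth is undefined without a positive baseline, so report null instead.
  const growthPercent =
    previousTotal > 0 ? ((total - previousTotal) * 100) / previousTotal : null;
  return { total, byCategory, growthPercent };
}
```

In a Code node, the orders would typically be read with `$input.all()` and mapped to their `json` payloads before being passed in.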
7.3 Architecture
[Figure: Automated Business Dashboard architecture diagram]
7.4 KPIs to Calculate

    📈 Revenue: Total, by category, growth %
    📦 Inventory: Stock levels, low stock alerts
    👥 Customers: New, returning, churn rate
    📊 Top Products: By revenue, by quantity
    ⚠️ Alerts: Low stock, high returns, unusual activity

7.5 Rubric
| Criteria | Points |
|---|---|
| Data collection (3 sources working) | 20 |
| Data transformation (clean, merge, calculate) | 20 |
| Report quality (formatted, readable) | 15 |
| Multi-channel delivery (Slack + Email) | 15 |
| Error handling (graceful, with alerts) | 15 |
| Logging & monitoring | 10 |
| Code quality (named nodes, notes) | 5 |
| Total | 100 |
7.6 Extension Challenges
- Interactive: Add a Slack slash command, `/report weekly`
- Smart alerts: Use AI to analyze trends and alert on anomalies
- Dashboard: Send data to Google Data Studio (now Looker Studio)
- Multi-timezone: Schedule reports for teams in different time zones
📝 Course Summary
Covered in 12 lessons:
| Module | Lesson | Topic |
|---|---|---|
| Fundamentals | 01 | n8n Overview |
| Fundamentals | 02 | Workflow Automation Concepts |
| Fundamentals | 03 | Interface & Basic Nodes |
| Fundamentals | 04 | Google Workspace Integration |
| Integrations | 05 | Trigger Types |
| Integrations | 06 | Notion & Airtable |
| Integrations | 07 | Slack & Discord |
| Integrations | 08 | HTTP Requests & APIs |
| Data | 09 | Expressions & Variables |
| Data | 10 | Data Transformation |
| Data | 11 | Code Node |
| Production | 12 | Error Handling & Monitoring |
Skills Acquired

    ✅ Build automated workflows with n8n
    ✅ Connect Google, Notion, Airtable, Slack, Discord
    ✅ Use Schedule, Webhook, and App triggers
    ✅ Call any API with HTTP Request node
    ✅ Write expressions for dynamic data
    ✅ Transform data (merge, filter, sort, aggregate)
    ✅ Write custom JavaScript in Code node
    ✅ Handle errors and monitor production workflows

🎯 What's Next?

    📚 Recommended learning paths:
    1. n8n Advanced → Sub-workflows, custom nodes, self-hosting
    2. AI Automation → Connect GPT/Claude to n8n workflows
    3. Database Integration → PostgreSQL, MongoDB, MySQL with n8n
    4. DevOps Automation → CI/CD, deployment, monitoring
    5. Business Process → CRM, billing, HR automation

Congratulations on completing the n8n Basics course! 🎉
Checkpoint
How many features does the Capstone Project require? Describe the overall architecture of the Automated Business Dashboard.
