Theory
55 minutes
Lesson 14/15

Scaling n8n

Scale n8n for high workloads: queue mode, horizontal scaling, load balancing, and high availability

As workflow executions grow, a single instance is no longer enough. This lesson covers queue mode, horizontal scaling, and a high-availability setup for handling production workloads.

Scaling Overview

When to Scale:

Text
SCALING INDICATORS
──────────────────

👁️ Watch for:
├── Response time increasing
├── Executions taking longer
├── Queue backing up
├── CPU consistently > 80%
├── Memory pressure
└── Timeouts/failures increasing

📊 Metrics thresholds:
├── > 1000 executions/hour
├── > 50 concurrent workflows
├── > 10 long-running workflows
└── Peak webhook traffic spikes
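
The thresholds above can be turned into a quick check. The following is a minimal sketch, not an n8n feature: `scaling_signals` is a hypothetical helper, and the numbers passed to it are assumed to come from your own monitoring.

```shell
#!/usr/bin/env bash
# scaling_signals EXEC_PER_HOUR CONCURRENT LONG_RUNNING CPU_PERCENT
# Hypothetical helper: prints every rule-of-thumb threshold that is exceeded.
# Exit status is 0 when at least one threshold is exceeded, 1 otherwise.
scaling_signals() {
  local per_hour="$1" concurrent="$2" long_running="$3" cpu="$4"
  local hits=0
  if [ "$per_hour" -gt 1000 ]; then echo "executions/hour > 1000"; hits=$((hits + 1)); fi
  if [ "$concurrent" -gt 50 ]; then echo "concurrent workflows > 50"; hits=$((hits + 1)); fi
  if [ "$long_running" -gt 10 ]; then echo "long-running workflows > 10"; hits=$((hits + 1)); fi
  if [ "$cpu" -gt 80 ]; then echo "CPU > 80%"; hits=$((hits + 1)); fi
  [ "$hits" -gt 0 ]
}

# Example with made-up numbers: the executions/hour and CPU thresholds are exceeded.
scaling_signals 1500 20 3 85
```

A single exceeded threshold is a reason to look closer, not an automatic trigger to scale; the sustained trend matters more than one sample.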

Scaling Strategies:

Text
SCALING OPTIONS
───────────────

VERTICAL (Scale Up)
├── More CPU
├── More RAM
├── Faster disk
├── Simpler setup
├── Limited ceiling
└── Single point of failure

HORIZONTAL (Scale Out)
├── Multiple instances
├── Load balancing
├── Queue mode required
├── Higher ceiling
├── More complex
└── Better availability

Queue Mode Architecture

How Queue Mode Works:

Text
QUEUE MODE ARCHITECTURE
───────────────────────

           ┌──────────────┐
           │   Trigger    │
           │  (Webhook/   │
           │   Schedule)  │
           └──────┬───────┘
                  │
                  ▼
           ┌──────────────┐
           │   Main n8n   │
           │ (Queue jobs) │
           └──────┬───────┘
                  │
                  ▼
           ┌──────────────┐
           │    Redis     │
           │   (Queue)    │
           └──────┬───────┘
                  │
       ┌──────────┼──────────┐
       ▼          ▼          ▼
  ┌─────────┐┌─────────┐┌─────────┐
  │ Worker 1││ Worker 2││ Worker 3│
  └─────────┘└─────────┘└─────────┘
       │          │          │
       └──────────┼──────────┘
                  ▼
           ┌──────────────┐
           │  PostgreSQL  │
           │  (Results)   │
           └──────────────┘

Enable Queue Mode:

Bash
# All instances share the same database, Redis, and N8N_ENCRYPTION_KEY.

# Main instance (receives triggers, serves the UI, enqueues jobs)
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis
QUEUE_BULL_REDIS_PORT=6379
QUEUE_HEALTH_CHECK_ACTIVE=true

# Worker instances (process executions); start them with `n8n worker`
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis
QUEUE_BULL_REDIS_PORT=6379
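
Before starting a worker it can help to confirm the queue settings are actually present in its environment. This is a minimal sketch: `queue_env_ok` is a hypothetical helper, not part of n8n, and it only checks the queue variables used in this lesson.

```shell
#!/usr/bin/env bash
# queue_env_ok: hypothetical preflight check for the queue-mode variables.
# Prints each problem it finds; exit status 0 means the environment looks ready.
queue_env_ok() {
  local problems=0 name value
  if [ "${EXECUTIONS_MODE:-}" != "queue" ]; then
    echo "EXECUTIONS_MODE must be set to 'queue'"
    problems=1
  fi
  for name in QUEUE_BULL_REDIS_HOST QUEUE_BULL_REDIS_PORT; do
    eval "value=\${$name:-}"   # portable indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "$name is not set"
      problems=1
    fi
  done
  [ "$problems" -eq 0 ]
}

# Typical use in an entrypoint script (not executed here):
#   queue_env_ok && exec n8n worker
```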

Docker Compose: Queue Mode

Complete Stack:

yaml
version: '3.8'

services:
  # PostgreSQL Database
  postgres:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: n8n
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U n8n"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Redis Queue
  redis:
    image: redis:7-alpine
    restart: always
    # noeviction: Redis must never evict queue keys under memory pressure
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy noeviction
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Main n8n (handles triggers, UI)
  n8n-main:
    image: n8nio/n8n:latest
    restart: always
    environment:
      # Database
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}

      # Queue mode
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_HEALTH_CHECK_ACTIVE=true

      # Encryption
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}

      # URLs
      - N8N_HOST=n8n.yourdomain.com
      - N8N_PROTOCOL=https
      - WEBHOOK_URL=https://n8n.yourdomain.com

      # Metrics
      - N8N_METRICS=true
    ports:
      - "5678:5678"
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - n8n_data:/home/node/.n8n

  # Worker 1
  n8n-worker-1:
    image: n8nio/n8n:latest
    restart: always
    command: worker
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      # Worker-specific
      - QUEUE_BULL_REDIS_TIMEOUT_THRESHOLD=30000
      - N8N_CONCURRENCY_PRODUCTION_LIMIT=10
    depends_on:
      - postgres
      - redis
      - n8n-main
    volumes:
      - n8n_data:/home/node/.n8n

  # Worker 2
  n8n-worker-2:
    image: n8nio/n8n:latest
    restart: always
    command: worker
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - QUEUE_BULL_REDIS_TIMEOUT_THRESHOLD=30000
      - N8N_CONCURRENCY_PRODUCTION_LIMIT=10
    depends_on:
      - postgres
      - redis
      - n8n-main
    volumes:
      - n8n_data:/home/node/.n8n

volumes:
  postgres_data:
  redis_data:
  n8n_data:

Worker Configuration

Concurrency Settings:

Bash
# Concurrent executions per worker
# Default: -1 (unset); workers then use `n8n worker --concurrency` (default 10)
N8N_CONCURRENCY_PRODUCTION_LIMIT=10

# How long (ms) n8n keeps retrying an unreachable Redis before exiting
QUEUE_BULL_REDIS_TIMEOUT_THRESHOLD=30000

# Job timeouts
EXECUTIONS_TIMEOUT=3600 # 1 hour max

# Retry settings
EXECUTIONS_RETRY_MAX_COUNT=5

Worker Resource Allocation:

yaml
# docker-compose with resource limits
n8n-worker-1:
  image: n8nio/n8n:latest
  command: worker
  deploy:
    resources:
      limits:
        cpus: '2.0'
        memory: 2G
      reservations:
        cpus: '1.0'
        memory: 1G
  # ... other config
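
Concurrency and the memory limit interact: every concurrent execution needs memory on top of what the idle worker process already uses. A rough sizing sketch follows. `max_safe_concurrency` is a hypothetical helper, and the baseline and per-execution figures are assumptions you would need to measure for your own workflows.

```shell
#!/usr/bin/env bash
# max_safe_concurrency LIMIT_MB BASELINE_MB PER_EXECUTION_MB
# Hypothetical helper: how many executions fit under a container memory limit,
# assuming a fixed idle baseline plus a fixed cost per running execution.
max_safe_concurrency() {
  local limit_mb="$1" baseline_mb="$2" per_execution_mb="$3"
  if [ "$limit_mb" -le "$baseline_mb" ]; then
    echo 0
  else
    echo $(( (limit_mb - baseline_mb) / per_execution_mb ))
  fi
}

# Assumed figures: 2G limit, 448 MB idle worker, 160 MB per execution.
max_safe_concurrency 2048 448 160   # prints 10
```

If the result is lower than the configured concurrency, either raise the limit or lower the concurrency before the worker gets killed for exceeding its memory limit.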

Scale Workers Dynamically:

Bash
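# --scale needs a single service named n8n-worker in docker-compose.yml
# (instead of n8n-worker-1 and n8n-worker-2) with no fixed container_name.

# Scale to 5 workers
docker-compose up -d --scale n8n-worker=5

# Check worker count
docker-compose ps | grep worker

# Scale down
docker-compose up -d --scale n8n-worker=2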

Load Balancing

Nginx Load Balancer:

nginx
# /etc/nginx/nginx.conf

# Several main instances behind a load balancer require n8n's multi-main setup.
upstream n8n_cluster {
    least_conn; # omit this line to use the default round-robin

    server n8n-main-1:5678;
    server n8n-main-2:5678;
    server n8n-main-3:5678;

    # Idle keepalive connections kept open to the upstream servers
    # (this is connection reuse, not a health check)
    keepalive 32;
}

server {
    listen 443 ssl http2;
    server_name n8n.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/n8n.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/n8n.yourdomain.com/privkey.pem;

    location / {
        proxy_pass http://n8n_cluster;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts for long executions
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
        proxy_read_timeout 300;
    }
}

Sticky Sessions (if needed):

nginx
upstream n8n_cluster {
    ip_hash; # Sticky sessions by IP

    server n8n-main-1:5678;
    server n8n-main-2:5678;
}

High Availability Setup

HA Architecture:

Text
HIGH AVAILABILITY SETUP
───────────────────────

                ┌─────────────┐
                │    Load     │
                │  Balancer   │
                └──────┬──────┘
                       │
       ┌───────────────┼───────────────┐
       ▼               ▼               ▼
  ┌─────────┐     ┌─────────┐     ┌─────────┐
  │ Main 1  │     │ Main 2  │     │ Main 3  │
  └────┬────┘     └────┬────┘     └────┬────┘
       │               │               │
       └───────────────┼───────────────┘
                       │
                 ┌─────┴─────┐
                 │   Redis   │
                 │  Cluster  │
                 └─────┬─────┘
                       │
       ┌───────────────┼───────────────┐
       ▼               ▼               ▼
  ┌─────────┐     ┌─────────┐     ┌─────────┐
  │Worker 1 │     │Worker 2 │     │Worker 3 │
  └────┬────┘     └────┬────┘     └────┬────┘
       │               │               │
       └───────────────┼───────────────┘
                       │
                 ┌─────┴─────┐
                 │ PostgreSQL│
                 │  Primary  │
                 └─────┬─────┘
                       │
                 ┌─────┴─────┐
                 │ PostgreSQL│
                 │  Replica  │
                 └───────────┘

Redis Sentinel Setup:

yaml
# Redis Sentinel for HA
services:
  redis-master:
    image: redis:7-alpine
    command: redis-server --appendonly yes

  redis-slave-1:
    image: redis:7-alpine
    command: redis-server --replicaof redis-master 6379

  redis-slave-2:
    image: redis:7-alpine
    command: redis-server --replicaof redis-master 6379

  # Run at least three Sentinels (repeat this service as redis-sentinel-2 and
  # redis-sentinel-3, each with its own writable config file) so that the
  # quorum of 2 in sentinel.conf can be reached.
  redis-sentinel-1:
    image: redis:7-alpine
    command: redis-sentinel /etc/redis/sentinel.conf
    volumes:
      - ./sentinel.conf:/etc/redis/sentinel.conf

Sentinel Config:

conf
# sentinel.conf
# Needed because the master is addressed by hostname (Redis 6.2+)
sentinel resolve-hostnames yes
sentinel monitor mymaster redis-master 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
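
The `2` at the end of the `sentinel monitor` line is the quorum: how many Sentinels must agree that the master is down. Actually carrying out a failover additionally needs a majority of all Sentinels to be reachable. A small sketch of that arithmetic (the function names are hypothetical, for illustration only):

```shell
#!/usr/bin/env bash
# sentinel_majority TOTAL: Sentinels that must be reachable to authorize a failover.
sentinel_majority() {
  local total="$1"
  echo $(( total / 2 + 1 ))
}

# sentinel_tolerated_failures TOTAL: Sentinels that may fail while failover stays possible.
sentinel_tolerated_failures() {
  local total="$1"
  echo $(( total - (total / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "$n sentinel(s): majority $(sentinel_majority "$n"), tolerates $(sentinel_tolerated_failures "$n") failure(s)"
done
```

One or two Sentinels tolerate no Sentinel failure at all, which is why three is the usual minimum.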

Kubernetes Deployment

n8n Kubernetes Manifest:

yaml
# n8n-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-main
spec:
  replicas: 2
  selector:
    matchLabels:
      app: n8n
      component: main
  template:
    metadata:
      labels:
        app: n8n
        component: main
    spec:
      containers:
        - name: n8n
          image: n8nio/n8n:latest
          ports:
            - containerPort: 5678
          env:
            - name: DB_TYPE
              value: "postgresdb"
            - name: DB_POSTGRESDB_HOST
              value: "postgres-service"
            - name: EXECUTIONS_MODE
              value: "queue"
            - name: QUEUE_BULL_REDIS_HOST
              value: "redis-service"
            - name: N8N_ENCRYPTION_KEY
              valueFrom:
                secretKeyRef:
                  name: n8n-secrets
                  key: encryption-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 5678
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /healthz
              port: 5678
            initialDelaySeconds: 5
            periodSeconds: 5
---
# Worker deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: n8n
      component: worker
  template:
    metadata:
      labels:
        app: n8n
        component: worker
    spec:
      containers:
        - name: n8n-worker
          image: n8nio/n8n:latest
          command: ["n8n", "worker"]
          env:
            - name: DB_TYPE
              value: "postgresdb"
            - name: EXECUTIONS_MODE
              value: "queue"
            # ... same env vars as main
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2"

Horizontal Pod Autoscaler:

yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
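
To get a feel for how the autoscaler reacts to these targets, the core of its calculation is desired = ceil(current replicas × observed utilization / target utilization), clamped to the min/max replica bounds. The sketch below reproduces only that step for a single metric. `hpa_desired_replicas` is a hypothetical helper, and it ignores the tolerance band and stabilization behaviour that the real controller applies.

```shell
#!/usr/bin/env bash
# hpa_desired_replicas CURRENT OBSERVED_PCT TARGET_PCT MIN MAX
# Simplified single-metric version of the HPA replica calculation.
hpa_desired_replicas() {
  local current="$1" observed="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * observed / target
  local desired=$(( (current * observed + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

# 3 workers at 90% CPU against the 70% target above, bounds 2..10
hpa_desired_replicas 3 90 70 2 10   # prints 4
```

When several metrics are configured, as in the manifest above, the autoscaler computes a value per metric and uses the largest one.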

Performance Optimization

Queue Optimization:

Bash
# Increase concurrency for light workflows
N8N_CONCURRENCY_PRODUCTION_LIMIT=20

# Decrease for heavy/memory-intensive workflows
N8N_CONCURRENCY_PRODUCTION_LIMIT=5

# Job timeout (prevent stuck jobs)
EXECUTIONS_TIMEOUT=1800 # 30 minutes
EXECUTIONS_TIMEOUT_MAX=7200 # 2 hours absolute max

Redis Optimization:

Bash
# redis.conf
maxmemory 2gb
# noeviction: a queue backend must never silently drop job keys
maxmemory-policy noeviction

# Persistence
appendonly yes
appendfsync everysec

# Connections
tcp-keepalive 300
timeout 0

# Performance
io-threads 4
io-threads-do-reads yes

PostgreSQL Optimization:

SQL
-- Connection pooling
-- Use PgBouncer for many workers

-- Indexes for execution queries
CREATE INDEX CONCURRENTLY idx_execution_started
ON execution_entity("startedAt" DESC);

CREATE INDEX CONCURRENTLY idx_execution_workflow_status
ON execution_entity("workflowId", finished);

-- Vacuum settings
ALTER TABLE execution_entity SET (autovacuum_vacuum_scale_factor = 0.05);

Monitoring Scaled Setup

Queue Metrics:

Bash
#!/bin/bash
# queue-stats.sh

# Check Redis queue depth
# wait and active are Redis lists (LLEN); delayed and failed are sorted sets (ZCARD)
docker exec redis redis-cli LLEN bull:jobs:wait
docker exec redis redis-cli LLEN bull:jobs:active
docker exec redis redis-cli ZCARD bull:jobs:delayed
docker exec redis redis-cli ZCARD bull:jobs:failed
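
Raw depth numbers are easier to act on once they are mapped to a status. A minimal sketch: `queue_level` is a hypothetical helper, and the warning and critical thresholds are assumptions to tune against your normal load.

```shell
#!/usr/bin/env bash
# queue_level DEPTH WARN CRIT: prints ok, warn, or crit for a queue depth.
queue_level() {
  local depth="$1" warn="$2" crit="$3"
  if [ "$depth" -ge "$crit" ]; then
    echo "crit"
  elif [ "$depth" -ge "$warn" ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

# Example wiring (not executed here; assumes the container is named `redis`):
#   depth=$(docker exec redis redis-cli LLEN bull:jobs:wait)
#   queue_level "$depth" 100 1000
queue_level 250 100 1000   # prints warn
```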

Worker Health:

Bash
#!/bin/bash
# worker-health.sh

echo "=== Worker Status ==="
docker-compose ps | grep worker

echo ""
echo "=== Worker Logs (last errors) ==="
docker-compose logs --tail=50 n8n-worker-1 | grep -i error

echo ""
echo "=== Queue Depth ==="
docker exec redis redis-cli LLEN bull:jobs:wait

Grafana Dashboard Additions:

Text
Queue Panels:
├── Queue Depth (wait + active)
├── Processing Rate (jobs/sec)
├── Worker Count
├── Job Duration Distribution
└── Failed Jobs Rate

Troubleshooting Scale Issues

Queue Backlog Growing:

Text
Problem: Jobs queuing faster than processing

Solutions:
1. Add more workers
2. Increase concurrency per worker
3. Optimize slow workflows
4. Check for stuck jobs
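
Whether adding workers is enough can be estimated before changing anything. The sketch below assumes a steady arrival rate and a uniform average job duration, which real workloads only approximate. `backlog_drain_minutes` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# backlog_drain_minutes BACKLOG ARRIVALS_PER_MIN WORKERS CONCURRENCY AVG_SECONDS
# Prints the minutes needed to clear the backlog, or "never" when jobs arrive
# at least as fast as the workers can process them.
backlog_drain_minutes() {
  local backlog="$1" arrivals_per_min="$2" workers="$3" concurrency="$4" avg_seconds="$5"
  local rate=$(( workers * concurrency * 60 / avg_seconds ))   # jobs processed per minute
  local net=$(( rate - arrivals_per_min ))                      # backlog reduction per minute
  if [ "$net" -le 0 ]; then
    echo "never"
  else
    echo $(( (backlog + net - 1) / net ))                       # rounded up
  fi
}

# 600 queued jobs, 30 new jobs/min, concurrency 10, 20 s per job
backlog_drain_minutes 600 30 1 10 20   # one worker only keeps pace: prints never
backlog_drain_minutes 600 30 2 10 20   # two workers: prints 20
```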

Workers Crashing:

Text
Problem: Workers running out of memory

Solutions:
1. Reduce concurrency
2. Add resource limits
3. Check for memory leaks in workflows
4. Split heavy workflows

Database Bottleneck:

Text
Problem: DB connections maxed out

Solutions:
1. Implement connection pooling
2. Reduce worker count
3. Add read replicas
4. Optimize queries
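
A quick budget shows when the connection limit becomes the bottleneck. This sketch assumes every n8n process (main or worker) holds up to a fixed number of pooled connections; that pool size, like the helper name `db_connection_headroom`, is an assumption for illustration and should be checked against your own configuration.

```shell
#!/usr/bin/env bash
# db_connection_headroom MAX_CONNECTIONS MAINS WORKERS POOL_SIZE
# Prints how many connections remain; a negative number means the limit is exceeded.
db_connection_headroom() {
  local max_connections="$1" mains="$2" workers="$3" pool_size="$4"
  echo $(( max_connections - (mains + workers) * pool_size ))
}

# PostgreSQL's default max_connections is 100; pool size 10 is an assumed value.
db_connection_headroom 100 1 3 10    # prints 60
db_connection_headroom 100 1 12 10   # prints -30
```

A negative result is the point where a pooler such as PgBouncer, or fewer workers, becomes necessary.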

Hands-On Exercise

Scaling Challenge

Implement scaled n8n:

  1. Set up queue mode with Redis
  2. Deploy 3 workers
  3. Configure load balancing (nginx)
  4. Test failover (stop one worker)
  5. Monitor queue depth
  6. Load test with many webhook requests
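
For step 6, a simple request generator is enough to watch the queue fill and drain. This is a minimal sketch, not a benchmarking tool: `send_webhooks` is a hypothetical helper, the webhook path `/webhook/load-test` is a placeholder for one of your own workflows, and all requests are fired in parallel, so keep the count modest.

```shell
#!/usr/bin/env bash
# send_webhooks URL COUNT
# POSTs COUNT small JSON payloads to URL and prints one HTTP status code per request.
# With DRY_RUN=1 it only prints the commands it would run.
send_webhooks() {
  local url="$1" count="$2" i=1
  while [ "$i" -le "$count" ]; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "curl -X POST $url -d '{\"n\":$i}'"
    else
      curl -s -o /dev/null -w '%{http_code}\n' -X POST \
        -H 'Content-Type: application/json' \
        -d "{\"n\":$i}" "$url" &
    fi
    i=$((i + 1))
  done
  wait   # block until all background requests have finished
}

# Dry run against a placeholder URL; drop DRY_RUN=1 to send real requests.
DRY_RUN=1 send_webhooks "http://localhost:5678/webhook/load-test" 3
```

While it runs, watch the queue depth and worker logs from the monitoring section to see how quickly the workers catch up.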

Master n8n scaling! 🚀

Scaling Checklist

Production Checklist

Before going to scaled production:

  • Queue mode enabled and tested
  • Redis properly configured with persistence
  • Workers have resource limits
  • Load balancer health checks working
  • Database connection pooling in place
  • Monitoring covers all components
  • Backup strategy includes Redis
  • Failover tested and documented

Key Takeaways

Remember
  • 📊 Queue mode required - For horizontal scaling
  • 🔄 Workers are stateless - Scale easily
  • ⚖️ Balance resources - Don't over-provision
  • 🏥 Health checks - Essential for HA
  • 📈 Monitor everything - Queue, workers, database

Course Summary

Congratulations! You have completed the n8n Deployment course. You can now:

  • ✅ Deploy n8n with Docker/Cloud platforms
  • ✅ Configure production environment
  • ✅ Secure your instance properly
  • ✅ Manage users and permissions
  • ✅ Monitor and alert on issues
  • ✅ Scale for production workloads

Next steps: Deploy your production n8n and build amazing automations! 🎉