Stable Diffusion and ComfyUI

🎯 Lesson Objectives


Stable Diffusion is one of the most capable open-source models for image generation. ComfyUI provides a node-based interface for building complex image workflows.

After this lesson, you will be able to:

✅ Understand the Stable Diffusion architecture (U-Net, VAE, CLIP)
✅ Use SDXL from Python with the diffusers library
✅ Get started with ComfyUI and workflow-based generation
✅ Run Image-to-Image generation and use LoRA fine-tuned adapters


🔍 Stable Diffusion Architecture


Key Concepts

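Latent Space: A compressed representation of the image, where the diffusion process runs
U-Net: Neural network that predicts the noise to remove at each denoising step
VAE: Encoder/decoder that converts between pixels and latents
CLIP: Text encoder that connects the text prompt to the image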
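
To make "compressed" concrete, here is a minimal sketch of how large the latent is compared to the pixel image. It assumes the commonly documented Stable Diffusion VAE layout (8x spatial downscaling, 4 latent channels); the function names are hypothetical and used only for illustration.

```python
def latent_shape(width, height, channels=4, downscale=8):
    """Shape (channels, height, width) of the latent for a given image size.

    Assumes an 8x downscaling VAE with 4 latent channels.
    """
    if width % downscale or height % downscale:
        raise ValueError("width and height must be multiples of the downscale factor")
    return (channels, height // downscale, width // downscale)


def compression_ratio(width, height):
    """How many RGB pixel values map onto one latent value."""
    pixel_values = 3 * width * height
    c, h, w = latent_shape(width, height)
    return pixel_values / (c * h * w)


shape = latent_shape(1024, 1024)
ratio = compression_ratio(1024, 1024)
print(shape, ratio)
```

Under these assumptions the U-Net works on far fewer values than the pixel image contains, which is a large part of why latent diffusion is practical on consumer GPUs.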

Checkpoint

Do you understand the roles of the U-Net, VAE, and CLIP in Stable Diffusion?


💻 Stable Diffusion with Python


Basic Generation

python.py

from diffusers import StableDiffusionXLPipeline
import torch

# Load the SDXL base model in half precision to reduce GPU memory use
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # requires a CUDA-capable GPU

# Generate one 1024x1024 image from the prompt
image = pipe(
    prompt="Professional headshot photo, studio lighting, clean background",
    negative_prompt="cartoon, illustration, low quality, blurry",
    num_inference_steps=30,
    guidance_scale=7.5,
    width=1024,
    height=1024
).images[0]

image.save("headshot.png")

Important Parameters

python.py

# Guidance scale: how closely generation follows the prompt
# Low (1-5): Creative, loose interpretation
# Medium (6-8): Balanced
# High (9-15): Strict adherence to prompt

# Steps: number of denoising steps
# 20-30: Fast, good quality
# 30-50: High quality
# 50+: Diminishing returns

# Scheduler: the sampling algorithm
# (assumes `pipe` is the pipeline created in Basic Generation)
from diffusers import DPMSolverMultistepScheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config
)
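
To show why a higher guidance scale means stricter prompt adherence, here is a minimal sketch of classifier-free guidance, the technique that guidance_scale controls. At each step the model predicts noise twice, once with the prompt and once without, and the two predictions are combined. Plain Python lists stand in for tensors, and the function name is hypothetical.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    guided = uncond + guidance_scale * (cond - uncond)
    A scale of 1.0 returns the conditioned prediction unchanged; larger
    values push the result further in the direction of the prompt.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]


# Toy two-value "predictions" for illustration only
uncond = [0.0, 1.0]
cond = [1.0, 3.0]
for scale in (1.0, 2.0, 7.5):
    print(scale, apply_guidance(uncond, cond, scale))
```

Very high scales amplify the difference between the two predictions, which is one reason images can look oversaturated or distorted when guidance_scale is pushed well above the usual range.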

Checkpoint

Have you tried generating images with the SDXL pipeline and adjusting its parameters?


🛠️ ComfyUI Overview


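ComfyUI is a node-based GUI for Stable Diffusion: each step of generation (loading a model, encoding a prompt, sampling, decoding) is a node, and nodes are wired together into a workflow.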


Installing ComfyUI

Bash

# Clone repository
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Install dependencies
pip install -r requirements.txt

# Download models into ComfyUI/models/checkpoints/
# Run
python main.py

API Integration

python.py

import requests

COMFYUI_URL = "http://localhost:8188"

# Basic workflow in ComfyUI's API format: node id -> node definition.
# Only the KSampler node is shown here. The nodes it links to
# ("4", "5", "6", "7") must also be defined before the server accepts it.
workflow = {
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 42,
            "steps": 30,
            "cfg": 7.5,
            "sampler_name": "dpmpp_2m",
            "scheduler": "karras",
            "denoise": 1.0,
            "model": ["4", 0],  # link: [source node id, output index]
            "positive": ["6", 0],
            "negative": ["7", 0],
            "latent_image": ["5", 0]
        }
    }
}

# Queue prompt
response = requests.post(
    f"{COMFYUI_URL}/prompt",
    json={"prompt": workflow}
)
response.raise_for_status()  # fail loudly if the workflow was rejected
prompt_id = response.json()["prompt_id"]
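
The KSampler node above links to nodes "4" through "7". A minimal sketch of a complete text-to-image workflow follows, assuming the node class names and output indices of a default ComfyUI install (CheckpointLoaderSimple, EmptyLatentImage, CLIPTextEncode, VAEDecode, SaveImage). The helper names, the checkpoint file name, and the filename prefix are assumptions for illustration; verify them against your own installation.

```python
def build_txt2img_workflow(positive, negative, ckpt_name, seed=42):
    """Return a text-to-image workflow dict in ComfyUI's API format."""
    return {
        "3": {"class_type": "KSampler", "inputs": {
            "seed": seed, "steps": 30, "cfg": 7.5,
            "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1.0,
            "model": ["4", 0], "positive": ["6", 0],
            "negative": ["7", 0], "latent_image": ["5", 0]}},
        # Assumed outputs of the loader: 0 = model, 1 = CLIP, 2 = VAE
        "4": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "5": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "6": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["4", 1]}},
        "7": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["4", 1]}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"filename_prefix": "lesson", "images": ["8", 0]}},
    }


def missing_links(workflow):
    """List (node_id, input_name, target_id) for links to undefined nodes."""
    missing = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            is_link = (isinstance(value, list) and len(value) == 2
                       and isinstance(value[0], str))
            if is_link and value[0] not in workflow:
                missing.append((node_id, name, value[0]))
    return missing


full_workflow = build_txt2img_workflow(
    positive="Professional headshot photo, studio lighting",
    negative="cartoon, illustration, low quality, blurry",
    ckpt_name="sd_xl_base_1.0.safetensors",  # hypothetical file in models/checkpoints/
)
print(missing_links(full_workflow))
```

Checking links locally before posting to /prompt catches the most common mistake with hand-written workflows, a reference to a node id that was never defined.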

Checkpoint

Have you installed ComfyUI and experimented with a node-based workflow?


🎨 Image-to-Image


python.py

import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from PIL import Image

img2img = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
)
img2img = img2img.to("cuda")

# Load the starting image and match SDXL's native resolution
init_image = Image.open("sketch.png").convert("RGB").resize((1024, 1024))

result = img2img(
    prompt="Professional illustration, detailed, vibrant colors",
    image=init_image,
    strength=0.7,  # 0=no change, 1=complete regeneration
    guidance_scale=7.5,
    num_inference_steps=30
).images[0]

result.save("refined.png")
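
The strength parameter also changes how many denoising steps actually run. A minimal sketch follows, assuming the behavior of diffusers img2img pipelines at the time of writing: noise is added to the input image up to a point in the schedule set by strength, and only the remaining steps are executed. The function name is hypothetical.

```python
def img2img_steps(num_inference_steps, strength):
    """Return (steps_run, steps_skipped) for an img2img call.

    Assumes the pipeline runs int(num_inference_steps * strength) steps
    and skips the earliest, noisiest part of the schedule.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    return steps_run, num_inference_steps - steps_run


for s in (0.25, 0.5, 1.0):
    print(s, img2img_steps(30, s))
```

This is why a low strength is both faster and more faithful to the input: fewer steps run, and they start from an image that is only lightly noised.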

Checkpoint

Do you understand how to use the img2img pipeline and its strength parameter?


📐 LoRA and Custom Models


python.py

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
)

# Load LoRA adapter on top of the base model weights
pipe.load_lora_weights("path/to/lora/model.safetensors")
pipe = pipe.to("cuda")

# Generate with LoRA style
image = pipe(
    prompt="photo of product on white background, professional lighting",
    num_inference_steps=30
).images[0]
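
To illustrate why LoRA files are small compared to full checkpoints, here is a minimal sketch of the low-rank idea behind LoRA: instead of storing a new weight matrix W, the adapter stores two thin matrices B and A whose product is added to the frozen W. Nested Python lists stand in for tensors; the function names and the example layer size are assumptions for illustration, not values read from any specific model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Parameters in a full weight matrix vs. its LoRA factors B and A."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def merge_lora(W, A, B, scale=1.0):
    """Return W + scale * (B @ A).

    W is d_out x d_in, B is d_out x r, A is r x d_in.
    """
    rank = len(A)
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(len(W[0]))
        ]
        for i in range(len(W))
    ]


full, lora = lora_param_counts(1280, 1280, rank=8)
print(full, lora)

W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r, with r = 1
A = [[3.0, 4.0]]     # r x d_in
print(merge_lora(W, A, B, scale=0.5))
```

The scale factor plays the role of an adapter strength: at 0 the base model is unchanged, and larger values apply more of the learned style.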

Checkpoint

Do you understand how to load and use LoRA adapters to customize style?


🎯 Summary


Best Practices

SD Tips

Prompt engineering: Be specific and detailed, and use style keywords
Negative prompts: List what you do not want in the image
Seed: Fix the seed for reproducible results
CFG scale: 7-8 is a good starting point for most use cases
Steps: 25-30 is usually enough for SDXL
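
The seed tip works because the seed determines the random noise that generation starts from. The sketch below uses Python's standard random module as a stand-in for the real noise tensor, so it only illustrates the principle; the function name is hypothetical. With diffusers, the equivalent is passing generator=torch.Generator("cuda").manual_seed(42) to the pipeline call.

```python
import random


def initial_noise(seed, size=4):
    """Stand-in for the starting latent noise: Gaussian samples from a seeded RNG."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(42)
same_b = initial_noise(42)
different = initial_noise(43)

print(same_a == same_b)     # same seed, same starting noise
print(same_a == different)  # different seed, different starting noise
```

Keeping the seed fixed while changing one parameter at a time (prompt, CFG scale, steps) makes it much easier to see what each change actually does.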

Practice Exercises

Hands-on Exercise

Set up Stable Diffusion locally or use an API
Generate 5 images in different styles
Try img2img to turn a sketch into an illustration
Explore ComfyUI workflows

Bonus: Find and use a LoRA model for a specific style

Self-Check Questions

How does Stable Diffusion differ from DALL-E 3 in architecture, cost, and degree of control?
What is a LoRA adapter, and how does it help customize the style of generated images?
How do the guidance_scale and num_inference_steps parameters affect output image quality?
How does the img2img pipeline work, and when should you use it instead of text-to-image?

🎉 Well done! You have completed the Stable Diffusion and ComfyUI lesson!

Up next: advanced prompting techniques for producing high-quality, consistent images.


🚀 Next Lesson

Advanced Prompting Techniques →
