Data Visualization Fundamentals
Before learning Tableau or Power BI, you need a solid grasp of visualization principles. A chart can look polished and still mislead completely if it uses the wrong chart type.
🎯 Objectives
- Understand why visualization matters
- Learn the principles for choosing the right chart
- Apply color theory effectively
- Tell stories with data
1. Why Visualization Matters
1.1. Numbers vs. Pictures

    Raw data:
    Q1: 150, Q2: 180, Q3: 210, Q4: 195

    Observation from the numbers: "Q3 is the highest"

    Observations from a chart:
    - Q3 is the highest
    - Uptrend from Q1 to Q3
    - Q4 dips slightly (potential concern)
    - Growth rate is slowing

1.2. Anscombe's Quartet
Four datasets share nearly identical summary statistics (mean, variance, correlation) yet look completely different when plotted.

    Dataset 1: Linear relationship
    Dataset 2: Quadratic relationship
    Dataset 3: Perfect line + 1 outlier
    Dataset 4: Vertical line + 1 outlier

    Statistics: Nearly identical
    Visualization: Completely different stories!

Lesson: Always visualize the data; do not rely on statistics alone.
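The claim about matching statistics can be checked directly. The sketch below recomputes the mean, sample variance, and Pearson correlation for Anscombe's published data using only the Python standard library. The helper names `pearson_r` and `summarize` are illustrative choices, not part of any library.

```python
from statistics import mean, variance

# Anscombe's quartet (Anscombe, 1973). Datasets 1-3 share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the summary statistics named in the text, rounded for display."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "var_x": round(variance(xs), 2),
        "var_y": round(variance(ys), 2),
        "r": round(pearson_r(xs, ys), 2),
    }


for name, (xs, ys) in QUARTET.items():
    print(name, summarize(xs, ys))
```

Every dataset reports a mean of 9 for x, a mean of about 7.5 for y, and a correlation of about 0.82, even though their scatter plots look nothing alike.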
1.3. Cognitive Load

    Table with 1000 rows
    → Brain: "Overload, cannot process"

    Chart summary
    → Brain: "Got it in 3 seconds"

Visualization helps with:
- ✅ Fast pattern recognition
- ✅ Easy outlier detection
- ✅ Intuitive comparison
- ✅ Clear trend identification
2. Choosing the Right Chart
2.1. Decision Framework

    What do you want to show?
     │
     ├── Comparison → Bar/Column Chart
     │
     ├── Trend over time → Line Chart
     │
     ├── Part-to-whole → Pie/Donut (≤5 parts)
     │
     ├── Distribution → Histogram/Box Plot
     │
     ├── Relationship → Scatter Plot
     │
     └── Composition → Stacked Bar/Area

2.2. Chart Selection Guide
| Purpose | Best Charts | Avoid |
|---|---|---|
| Compare categories | Bar, Column | Pie (many categories) |
| Show trend | Line, Area | Pie, Bar |
| Part of whole | Pie (≤5), Treemap, Stacked | Line |
| Distribution | Histogram, Box, Violin | Bar |
| Correlation | Scatter, Bubble | Line, Bar |
| Geographic | Map, Choropleth | Bar |
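The guide above can be encoded as a small lookup, which is handy when chart choices need to be consistent across a team or a script. This is only a sketch: the purpose keys, the `CHART_GUIDE` dictionary, and the `recommend_charts` function are hypothetical names, and the only rule applied beyond the table is its own "Pie (≤5)" limit.

```python
# Mapping taken from the selection guide above: purpose -> preferred charts.
CHART_GUIDE = {
    "comparison": ["bar", "column"],
    "trend": ["line", "area"],
    "part_of_whole": ["pie", "treemap", "stacked bar"],
    "distribution": ["histogram", "box plot", "violin plot"],
    "correlation": ["scatter", "bubble"],
    "geographic": ["map", "choropleth"],
}

MAX_PIE_PARTS = 5  # the guide's "Pie (≤5)" rule


def recommend_charts(purpose, n_categories=None):
    """Return candidate chart types, dropping pie when there are too many slices."""
    if purpose not in CHART_GUIDE:
        raise ValueError(f"unknown purpose: {purpose!r}")
    charts = list(CHART_GUIDE[purpose])
    if n_categories is not None and n_categories > MAX_PIE_PARTS:
        charts = [c for c in charts if c != "pie"]
    return charts


print(recommend_charts("part_of_whole", n_categories=4))
print(recommend_charts("part_of_whole", n_categories=12))
```

With 12 categories the pie option disappears from the part-to-whole candidates, leaving the treemap and stacked bar.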
2.3. Common Mistakes
❌ Mistake 1: Pie chart with too many categories

    Bad: Pie with 15 slices → Unreadable
    Good: Horizontal bar chart → Easy comparison

❌ Mistake 2: 3D charts

    Bad: 3D Pie → Distorts proportions
    Good: 2D variants → Accurate reading

❌ Mistake 3: Dual-axis abuse

    Bad: 2 Y-axes with different scales → Misleading
    Good: Separate charts or normalized data

❌ Mistake 4: Truncated Y-axis

    Bad: Y-axis starts at 95 → Small changes look large
    Good: Y-axis starts at 0, or the truncation is clearly labeled

3. Color Theory for Data
3.1. Color Purposes
| Purpose | Color Type | Example |
|---|---|---|
| Categorical | Distinct colors | Products: Blue, Red, Green |
| Sequential | Light to dark | Revenue: Light blue → Dark blue |
| Diverging | Two-directional | Profit/Loss: Red ← Gray → Green |
| Highlight | Accent color | One bar highlighted in orange |
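Two of the color types in the table, sequential and highlight, are easy to generate in code. The sketch below is one possible approach, not a standard API: the function names are made up, the light and dark blues are the sequential endpoints listed in section 3.3, the gray `#bdbdbd` is an arbitrary choice, and plain linear interpolation in RGB is a simplification of how designed palettes are built.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of three 0-255 integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert three 0-255 integers back to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sequential_palette(n, light="#deebf7", dark="#08519c"):
    """Return n colors from light to dark using linear interpolation in RGB."""
    if n < 2:
        raise ValueError("need at least 2 colors")
    lo, hi = hex_to_rgb(light), hex_to_rgb(dark)
    palette = []
    for i in range(n):
        t = i / (n - 1)  # 0.0 at the light end, 1.0 at the dark end
        palette.append(rgb_to_hex(tuple(round(a + (b - a) * t) for a, b in zip(lo, hi))))
    return palette


def highlight_palette(n, focus, accent="#ff7f0e", context="#bdbdbd"):
    """Gray for every item except the one at index `focus`, which gets the accent."""
    return [accent if i == focus else context for i in range(n)]


print(sequential_palette(4))
print(highlight_palette(5, focus=2))
```

The returned lists can be passed to most plotting libraries as per-bar or per-series colors.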
3.2. Color Best Practices
DO:

    ✅ Limit to 5-7 colors max
    ✅ Use colorblind-friendly palettes
    ✅ Consistent colors across dashboard
    ✅ Gray for context, color for focus

DON'T:

    ❌ Rainbow colors (hard to distinguish)
    ❌ Red = Good, Green = Bad (contradicts the usual convention)
    ❌ Same color, different meanings
    ❌ Bright colors for background

3.3. Accessible Color Palettes
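
    Categorical palette:
    #1f77b4 (Blue)
    #ff7f0e (Orange)
    #2ca02c (Green)
    #d62728 (Red)
    #9467bd (Purple)
    #8c564b (Brown)

    Sequential (single hue):
    #deebf7 → #9ecae1 → #3182bd → #08519c

    Diverging:
    #d7191c → #fdae61 → #ffffbf → #a6d96a → #1a9641
    (Red)              (Yellow)              (Green)

Note: palettes that pair red with green, including the categorical and diverging sets above, can be hard to read for viewers with red-green color blindness. Check them with the tools in 3.4 before treating them as colorblind-safe.
3.4. Testing Accessibility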
Tools for checking:
- Coblis Color Blindness Simulator
- Viz Palette (Tableau)
- Color Oracle (desktop app)
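The tools above simulate color blindness. A complementary check that is easy to script is the WCAG contrast ratio between a chart color and its background. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio; the function names are illustrative, and a good contrast score does not by itself show that two hues are distinguishable for colorblind viewers.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color, following the WCAG 2.x definition."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding for each channel.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Palette from section 3.3, checked against a white background.
for color in ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"]:
    print(color, round(contrast_ratio(color, "#ffffff"), 2))
```

WCAG 2.1 asks for a ratio of at least 3:1 for graphical elements, so any color scoring below that against the background is a candidate for darkening or for a direct label.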
4. Design Principles
4.1. Data-Ink Ratio
Concept: Maximize data, minimize non-data ink.

    Bad (low data-ink ratio):
    ┌──────────────────────────────────────┐
    │ ████ Heavy gridlines                 │
    │ ████ 3D effects                      │
    │ ████ Decorative elements             │
    │ ████ Redundant labels                │
    └──────────────────────────────────────┘

    Good (high data-ink ratio):
    ┌──────────────────────────────────────┐
    │                                      │
    │ ▬▬▬ Clean, minimal design            │
    │ ▬▬▬▬▬ Focus on data                  │
    │ ▬▬▬▬▬▬▬                              │
    └──────────────────────────────────────┘

4.2. Gestalt Principles
Proximity: Place related items close together

    [Chart A] [Chart B]   ← Related metrics

    [Chart C]             ← Different category

Similarity: Same color/shape = Same category

    ● Sales Q1  ● Sales Q2  ● Sales Q3   ← Blue dots
    ■ Costs Q1  ■ Costs Q2  ■ Costs Q3   ← Red squares

Enclosure: Border groups related info

    ┌─────────────────┐
    │ Revenue Section │
    │ [Chart] [KPI]   │
    └─────────────────┘

4.3. Visual Hierarchy

    Most Important
         ↓
    ████████████  Large, bold, top position
         ↓
    ████████      Medium, prominent
         ↓
    ████          Smaller, supporting
         ↓
    Least Important

5. Data Storytelling
5.1. Story Structure

    1. SETUP (Context)
       "Q3 revenue was $2.5M..."

    2. CONFLICT (Problem/Opportunity)
       "...but growth slowed to 5% vs 15% Q2"

    3. RESOLUTION (Insight/Action)
       "Analysis shows: Marketing spend dropped 20%
        Recommendation: Increase budget by $50K"

5.2. Annotation Techniques

    Chart with annotations:

            Revenue Trend
        │
    150 ┤                 ★ Campaign launched
        │               ╱
    100 ┤          ←─ Seasonal dip
        │        ╱
     50 ┤──────╱
        └────────────────────────
         J  F  M  A  M  J  J  A

Annotation types:
- Callouts: Highlight specific points
- Reference lines: Targets, averages
- Trend lines: Show direction
- Notes: Context, caveats
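As a rough illustration of these annotation types, the sketch below draws a line chart with a reference line (the average), a callout at the peak, and a small note. It assumes matplotlib is installed. The monthly values are made up for the example, and `build_annotations` and `draw_annotated_chart` are hypothetical helper names.

```python
MONTHS = ["J", "F", "M", "A", "M", "J", "J", "A"]
REVENUE = [50, 50, 55, 90, 100, 95, 130, 150]  # hypothetical values


def build_annotations(values):
    """Pick what to annotate: the average (reference line) and the peak (callout)."""
    average = sum(values) / len(values)
    peak_index = max(range(len(values)), key=lambda i: values[i])
    return {"average": average, "peak_index": peak_index}


def draw_annotated_chart(path="revenue_trend.png"):
    """Render the chart to an image file and return the file path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    notes = build_annotations(REVENUE)
    xs = list(range(len(REVENUE)))
    fig, ax = plt.subplots()
    ax.plot(xs, REVENUE, color="#1f77b4")
    ax.set_xticks(xs)
    ax.set_xticklabels(MONTHS)
    ax.set_title("Revenue Trend")

    # Reference line: the average over the period.
    ax.axhline(notes["average"], color="gray", linestyle="--", label="Average")

    # Callout: an arrow pointing at the peak.
    peak = notes["peak_index"]
    ax.annotate("Campaign launched", xy=(peak, REVENUE[peak]),
                xytext=(peak - 3, REVENUE[peak]),
                arrowprops={"arrowstyle": "->"})

    # Note: context or caveats, kept small and out of the way.
    fig.text(0.01, 0.01, "Illustrative data only", fontsize=8)

    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path


print(build_annotations(REVENUE))
```

Keeping the "what to annotate" logic separate from the drawing code makes it easy to reuse the same annotations in another charting tool.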
5.3. Dashboard Flow

    ┌─────────────────────────────────────────────────────┐
    │                        TITLE                        │
    │               Key insight in subtitle               │
    ├─────────────────┬───────────────┬───────────────────┤
    │                 │               │                   │
    │     KPI 1       │     KPI 2     │       KPI 3       │  ← At-a-glance
    │   $2.5M ▲12%    │   150K ▼3%    │       95% ▬       │
    │                 │               │                   │
    ├─────────────────┴───────────────┴───────────────────┤
    │                                                     │
    │                 MAIN VISUALIZATION                  │  ← Core story
    │                                                     │
    │                                                     │
    ├────────────────────────┬────────────────────────────┤
    │                        │                            │
    │   Supporting Chart 1   │     Supporting Chart 2     │  ← Details
    │                        │                            │
    ├────────────────────────┴────────────────────────────┤
    │ Filters: Date | Region | Product                    │  ← Interactivity
    └─────────────────────────────────────────────────────┘

6. Pre-attentive Attributes
Attributes the Brain Processes Instantly
| Attribute | Best For |
|---|---|
| Position | Comparing values |
| Length | Quantities |
| Color hue | Categories |
| Color intensity | Magnitude |
| Size | Relative amounts |
| Orientation | Direction |
Using Pre-attentive Cues

    Make important data POP:

    Before: ■ ■ ■ ■ ■ ■ ■ ■  (all same)

    After:  ■ ■ ■ ■ █ ■ ■ ■  (target highlighted)
                    ↑
               "This one!"

7. Common Visualization Mistakes
7.1. The Hall of Shame
1. Misleading Y-axis

    Bad:                    Good:
    100│      ▲             100│
     98│     ╱               50│         slight
     96│___╱                  0│___▬▬▬▬  increase
    Looks like 50%          Actually 4%
    increase

2. Cherry-picked timeframe

    Show only good months
    → "Sales are up!"

    Show full year
    → "Actually, we're down 10% YoY"

3. Wrong chart type

    Pie chart for:
    - More than 5 categories
    - Values that don't sum to 100%
    - Showing trends

4. Overcrowded design

    10 charts on 1 screen
    → Information overload
    → No clear message

7.2. How to Avoid
- ✅ Ask: "What's the ONE thing I want viewers to understand?"
- ✅ Simplify: Remove anything that doesn't support the message
- ✅ Test: Show to someone unfamiliar, ask what they see
- ✅ Iterate: First draft is rarely final
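The misleading Y-axis case from 7.1 can also be caught with a quick calculation before a chart ships. The sketch below uses Edward Tufte's "lie factor", the size of the effect shown in the graphic divided by the size of the effect in the data. It assumes bars are drawn from the axis baseline, takes the values 96 and 100 from the 7.1 example, and borrows the baseline of 95 from the truncated-axis mistake in 2.3. The function names are illustrative.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def lie_factor(old, new, axis_start=0.0):
    """Tufte's lie factor: effect shown in the graphic / effect in the data.

    Bars are assumed to start at `axis_start`, so the drawn height of a
    value is (value - axis_start). A factor near 1 means an honest chart.
    """
    if old <= axis_start or new <= axis_start:
        raise ValueError("values must lie above the axis baseline")
    if old == new:
        raise ValueError("no change in the data to compare against")
    shown = percent_change(old - axis_start, new - axis_start)
    actual = percent_change(old, new)
    return shown / actual


print(round(percent_change(96, 100), 1))          # real change in percent
print(round(lie_factor(96, 100, axis_start=95)))  # truncated axis
print(round(lie_factor(96, 100, axis_start=0)))   # axis starting at zero
```

With the axis starting at 95, the second bar is drawn five times as tall as the first although the data rose by only about 4%, which gives a lie factor of roughly 96. Starting the axis at 0 brings it back to 1.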
8. Hands-on Exercise
Exercise: Critique This Dashboard

    ┌────────────────────────────────────────┐
    │ 3D Pie Chart with 12 slices            │
    │ Rainbow colors                         │
    │ Legend far from chart                  │
    │ No title                               │
    │ Y-axis starts at 1000                  │
    │ Dual axis with different scales        │
    └────────────────────────────────────────┘

List 6 problems and suggest fixes.
Exercise: Choose the Right Chart
For each scenario, select the best chart type:
- "Compare revenue across 10 regions"
- "Show how market share changed over 5 years"
- "Display correlation between ad spend and sales"
- "Break down expenses by category"
- "Show distribution of customer ages"
📝 Quiz
- When should you NOT use a pie chart?
  - Showing part-to-whole
  - Comparing more than 5 categories
  - There are 3 segments
  - Showing percentages
- What does a high data-ink ratio mean?
  - Using many colors
  - A more complex chart
  - Maximum data, minimum decoration
  - A 3D chart
- When do you use a sequential color palette?
  - Distinct categories
  - Values ranging from low to high
  - Positive vs. negative
  - Random selection
- Which pre-attentive attributes does the brain process fastest?
  - Text labels
  - Footnotes
  - Color and position
  - 3D effects
🎯 Key Takeaways
- Chart selection matters - Wrong chart = Wrong message
- Less is more - High data-ink ratio
- Color with purpose - Categorical, sequential, diverging
- Tell a story - Setup, conflict, resolution
- Design for humans - Use pre-attentive attributes
🚀 Next Lesson
Tableau Fundamentals - Get started with Tableau Desktop and create your first visualization!
