Data Storytelling & Executive Reporting

Data Storytelling and Visualization

🎯 Learning Objectives

Avg. 5 min

After this lesson, you will be able to:

Build effective data stories with the SCQA framework
Choose appropriate chart types and design professional visualizations
Write insights using the Observation + Implication + Action formula
Create executive reports with a KPI framework and traffic light indicators
Present to executives following the 5-minute rule
Design an automated report generation pipeline

Lesson Information

⏱️ Duration: 2.5 hours | 📊 Level: Advanced | 🛠️ Tools: Python, Matplotlib, Jinja2

Task 0

📖 Key Terms

Avg. 5 min

Term	Vietnamese	Description
Data Storytelling	Kể chuyện bằng dữ liệu	Combining Data + Visuals + Narrative to create impact
SCQA	SCQA Framework	Situation → Complication → Question → Answer
Pyramid Principle	Nguyên tắc Kim tự tháp	Start with the conclusion, then support it with evidence
Narrative Arc	Cung kịch bản	Setup → Conflict → Resolution structure
KPI	Chỉ số hiệu suất	Key Performance Indicator: a metric tied to strategy
Traffic Light	Đèn giao thông	🟢🟡🔴 indicators for quick status assessment
Executive Summary	Tóm tắt điều hành	One-page overview: what, so what, now what
Sparkline	Đường spark	A miniature chart embedded in text to show a trend at a glance
Vanity Metric	Chỉ số phù phiếm	A metric that looks good but is not actionable
North Star Metric	Chỉ số Ngôi Sao Bắc	The single metric that represents the core value delivered to users
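
The Sparkline entry can be illustrated in a few lines of Python. This is only a minimal sketch: the `sparkline` helper and the sample numbers are hypothetical, and the approach (scaling each value onto one of eight Unicode block characters) is one common way to do it, not a standard library feature.

```python
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Render a sequence of numbers as a one-line text chart."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no trend to show; use a mid-height block throughout.
        return BLOCKS[3] * len(values)
    # Scale each value to an index between 0 and 7, then pick the matching block.
    return "".join(
        BLOCKS[round((v - lo) / span * (len(BLOCKS) - 1))] for v in values
    )

monthly_sales = [120, 135, 128, 142, 138, 165]  # illustrative numbers
print(f"Sales trend: {sparkline(monthly_sales)}")
```

Because the result is plain text, it can sit inside a sentence or a table cell of a report without needing an image.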

Checkpoint

Data Storytelling = Data + Visuals + Narrative. Remove any one of the three pillars and the impact drops significantly. Can you tell the difference between a "report" and a "data story"?

Task 1

👥 Audience & Story Structure

Avg. 5 min

Audience Segmentation

👥Audience Types

👔Executives

💡Want: Bottom line impact, strategic insights

⏱️Time: 2-5 minutes

❓Question: So what? What should we do?

📊Analysts / Data Team

💡Want: Methodology, details, validation

⏱️Time: 15-30 minutes

❓Question: How did you get this? Is it correct?

🎯Operations / Managers

💡Want: Actionable steps, team impact

⏱️Time: 5-10 minutes

❓Question: What do we need to change?

Tailoring Your Message

Python

# Same data, different presentations for different audiences

# For Executives
"""
Executive Summary:
Revenue grew 15% ($2.3M) year-over-year, exceeding target by 3%.
Key driver: New product line contributed 40% of growth.
Recommendation: Increase marketing budget by $500K for Q2.
"""

# For Operations Team
"""
Q1 Performance:
- Total revenue: $17.6M (+15% YoY)
- Average order value: $127 (+8%)
- Order volume: 138,500 (+6.5%)
Action items:
1. Scale fulfillment capacity by 10%
2. Prioritize inventory for Product Line X
"""

# For Data Team
"""
Analysis Details:
- Period: Q1 2024 vs Q1 2023, seasonally adjusted
- Confidence: 95% CI [14.2%, 15.8%]
- Caveats: Excludes one-time enterprise deal ($400K)
"""

SCQA Framework

Example

S - SITUATION (Context)
    "Currently, our customer churn rate is 8%..."

C - COMPLICATION (Problem)
    "...but this has increased 40% in the last quarter,
     threatening annual revenue by $2M."

Q - QUESTION (What we need to solve)
    "What's driving this increase and how can we
     reduce churn back to historical levels?"

A - ANSWER (Your insight + recommendation)
    "Analysis shows 60% of churners cite pricing.
     Introducing a loyalty tier could reduce churn by 25%,
     saving $500K."
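
The four SCQA parts can also be kept as structured data so that a draft with a missing part is easy to spot. The sketch below is an illustration only: the `SCQAStory` class and its methods are hypothetical names, and the text is taken from the churn example above.

```python
from dataclasses import dataclass

@dataclass
class SCQAStory:
    situation: str
    complication: str
    question: str
    answer: str

    def missing_parts(self):
        """Return the names of any sections that were left empty."""
        return [name for name, text in vars(self).items() if not text.strip()]

    def render(self):
        """Join the four parts in Situation -> Complication -> Question -> Answer order."""
        parts = [self.situation, self.complication, self.question, self.answer]
        return " ".join(p.strip() for p in parts)

story = SCQAStory(
    situation="Our customer churn rate is currently 8%.",
    complication="It has increased 40% in the last quarter, threatening $2M in annual revenue.",
    question="What is driving the increase, and how do we get back to historical levels?",
    answer="60% of churners cite pricing; a loyalty tier could reduce churn by 25%, saving $500K.",
)

print(story.missing_parts())  # an empty list means all four parts are present
print(story.render())
```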

Pyramid Principle

🎯Key Message (So what?)

📌Supporting Point 1

📊Data Evidence

📌Supporting Point 2

📊Data Evidence

📌Supporting Point 3

📊Data Evidence
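
The pyramid above can be written down as a small nested structure and rendered top-down, so the key message always comes first. This is a rough sketch under assumptions: the `render_pyramid` helper is hypothetical, and the content reuses the churn figures from the SCQA example purely for illustration.

```python
pyramid = {
    "key_message": "Churn is costing $2M annually",
    "supporting_points": [
        {"point": "Churn increased 40% in the last quarter",
         "evidence": "Churn rate now at 8%"},
        {"point": "Pricing is the main driver",
         "evidence": "60% of churners cite pricing"},
        {"point": "A loyalty tier could reduce churn by 25%",
         "evidence": "Estimated saving: $500K"},
    ],
}

def render_pyramid(p):
    """Render the key message first, then each supporting point with its evidence."""
    lines = [p["key_message"]]
    for i, sp in enumerate(p["supporting_points"], start=1):
        lines.append(f"  {i}. {sp['point']}")
        lines.append(f"     Evidence: {sp['evidence']}")
    return "\n".join(lines)

print(render_pyramid(pyramid))
```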

Three-Act Structure

Python

# Act 1: Setup (Where are we?)
"""
Context: Our e-commerce platform serves 500K active customers.
         Average order value is $85 with 2.3 orders per customer/year.
"""

# Act 2: Conflict (What's the problem?)
"""
Challenge: Q4 data shows concerning trends:
- Cart abandonment increased from 68% to 76%
- Mobile conversion dropped 15%
Root cause: Checkout flow has 7 steps vs. industry standard of 3-4.
"""

# Act 3: Resolution (What should we do?)
"""
Solution: Redesign checkout to 4 steps
Expected impact: Reduce abandonment by 10pp, recover $2.5M annually.
Timeline: 6 weeks development, 2 weeks testing.
"""

Don't Bury the Lead

❌ Bad: Methodology → Data → Analysis → Findings → Recommendation (at the very end!)

✅ Good: Recommendation + Impact → Key Findings → Evidence → Methodology (appendix)

Checkpoint

Lead with the conclusion (Pyramid Principle), structure the story with SCQA, and always tailor depth and language to the audience type. When presenting a churn analysis to the CEO, what should you start with?

Task 2

📊 Visualization Best Practices

Avg. 5 min

Chart Selection Guide

Example

COMPARISON        → Bar Chart (categories), Line Chart (over time)
COMPOSITION       → Pie/Donut (parts of whole), Stacked Area (over time)
DISTRIBUTION      → Histogram, Box Plot, Scatter Plot
RELATIONSHIP      → Scatter Plot, Heatmap (correlation)
TREND             → Line Chart, Multi-line, with Reference Line
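
The guide above is essentially a lookup table, so it can be encoded as one. The following is a minimal sketch; `CHART_GUIDE` and `suggest_charts` are hypothetical names, and the mapping simply restates the guide rather than adding new rules.

```python
CHART_GUIDE = {
    "comparison": ["bar chart (categories)", "line chart (over time)"],
    "composition": ["pie/donut (parts of a whole)", "stacked area (over time)"],
    "distribution": ["histogram", "box plot", "scatter plot"],
    "relationship": ["scatter plot", "heatmap (correlation)"],
    "trend": ["line chart", "multi-line chart", "line chart with reference line"],
}

def suggest_charts(goal):
    """Return the chart types the guide recommends for an analytical goal."""
    key = goal.strip().lower()
    if key not in CHART_GUIDE:
        raise ValueError(f"Unknown goal {goal!r}; expected one of {sorted(CHART_GUIDE)}")
    return CHART_GUIDE[key]

print(suggest_charts("Comparison"))
```

Starting from the goal ("what do I want the viewer to compare?") rather than from the chart type keeps the choice tied to the message.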

Before and After

Python

import matplotlib.pyplot as plt
import numpy as np

categories = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']
values = [23, 45, 12, 38, 29]

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# ❌ BAD: Pie chart for comparison
axes[0].pie(values, labels=categories, autopct='%1.1f%%')
axes[0].set_title('❌ BAD: Pie Chart for Comparison')

# ✅ GOOD: Sorted bar chart
sorted_idx = np.argsort(values)[::-1]
sorted_cats = [categories[i] for i in sorted_idx]
sorted_vals = [values[i] for i in sorted_idx]

bars = axes[1].barh(sorted_cats, sorted_vals, color='steelblue')
axes[1].set_xlabel('Sales ($K)')
axes[1].set_title('✅ GOOD: Sorted Bar Chart')
axes[1].bar_label(bars, fmt='$%dK')
axes[1].invert_yaxis()
plt.tight_layout()
plt.show()

Focus Attention

Python

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 135, 128, 142, 138, 165]
target = 150

# ❌ BAD: No visual hierarchy
axes[0].plot(months, sales, 'o-', linewidth=2)
axes[0].axhline(target, linestyle='--', label='Target')
axes[0].set_title('❌ BAD: No Visual Hierarchy')
axes[0].legend()

# ✅ GOOD: Highlight the insight
# Months below target are muted in gray; months at or above target stand out in green.
colors = ['gray' if v < target else 'green' for v in sales]
axes[1].bar(months, sales, color=colors)
axes[1].axhline(target, color='red', linestyle='--', label='Target: $150K')
# June ($165K) is 10% above the $150K target, so the annotation says exactly that.
axes[1].annotate('Above target: +10%', xy=(5, 165), xytext=(4, 175),
                 fontsize=11, fontweight='bold', color='green',
                 arrowprops=dict(arrowstyle='->', color='green'))
axes[1].set_ylabel('Sales ($K)')
axes[1].set_title('✅ GOOD: Highlight Key Insight')
axes[1].legend()
plt.tight_layout()
plt.show()

Checkpoint

A visualization must highlight the insight, not just show the data: use color, annotation, and sorting to guide the viewer's eye to the key message. When should you use a pie chart, and when should you avoid one?

Task 3

✍️ Writing Insights & Presentations

Avg. 5 min

The Insight Formula

Example

INSIGHT = OBSERVATION + IMPLICATION + ACTION

❌ Weak: "Sales increased 15% in Q4."

✅ Strong: "Sales increased 15% in Q4, driven primarily by
           the holiday promotion (contributing 60% of growth).
           To sustain this, extend the promotion into Q1
           with a 'New Year' theme."
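
The formula lends itself to a simple completeness check before an insight goes into a report. The sketch below is one possible way to do that; the `Insight` class and its method names are hypothetical, and the text reuses the Q4 sales example above.

```python
from dataclasses import dataclass

@dataclass
class Insight:
    observation: str       # what happened
    implication: str = ""  # why it matters
    action: str = ""       # what to do next

    def missing_parts(self):
        """List which of the three parts are still empty."""
        return [name for name, text in vars(self).items() if not text.strip()]

    def render(self):
        """Join the non-empty parts into a single statement."""
        return " ".join(t.strip() for t in vars(self).values() if t.strip())

weak = Insight("Sales increased 15% in Q4.")
strong = Insight(
    observation="Sales increased 15% in Q4.",
    implication="The holiday promotion drove 60% of that growth.",
    action="Extend the promotion into Q1 with a 'New Year' theme.",
)

print(weak.missing_parts())  # names of the parts the weak version lacks
print(strong.render())
```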

Headline Hierarchy

🔑Level 1: KEY FINDING (Lead with insight)

📋Level 2: Supporting Evidence

📊Level 3: Details / Data

💡Example: Churn Is Costing $2M Annually

📊Churn rate jumped 40% in Q4 (8.2% vs 5.9%)

👥2,400 customers lost vs prior year

✅Loyalty program → 25% reduction → 10x ROI

Avoiding Common Pitfalls

Python

# ❌ DON'T: Use jargon
bad = "The YoY delta in the LTV:CAC ratio indicates deteriorating unit economics..."

# ✅ DO: Use plain language
good = """
We're spending more to acquire customers who are worth less:
- Cost to acquire a customer: up 23%
- Revenue per customer: down 15%
Bottom line: Each new customer takes 3 months longer to become profitable.
"""

Effective Slide Structure

Section	Content
HEADLINE	States the insight, not just the topic. ✅ "Q4 Sales Beat Target by 15%" rather than ❌ "Q4 Sales Performance"
VISUAL	A chart that supports the main insight
KEY TAKEAWAYS	The 3 most important bullet points
SOURCE	Data source and time period (e.g., Sales_Master database, Dec 2024)

Checkpoint

Every insight needs three parts: what happened (observation), why it matters (implication), and what to do (action). Without an action it is a report; with an action it is a story. How would you turn "Revenue grew 15%" into an actionable insight?

Task 4

📈 Executive Report Framework

Avg. 5 min

KPI Selection & Dashboard

Python

import pandas as pd

# Each KPI records its current value, target, previous-period value, display unit,
# and whether a higher number is good (revenue) or bad (churn, CAC).
kpis = {
    'Revenue': {'current': 2.5, 'target': 2.3, 'previous': 2.1, 'unit': 'M', 'higher_is_better': True},
    'Customer Count': {'current': 12500, 'target': 13000, 'previous': 11800, 'unit': '', 'higher_is_better': True},
    'Churn Rate': {'current': 5.2, 'target': 4.0, 'previous': 4.8, 'unit': '%', 'higher_is_better': False},
    'NPS': {'current': 42, 'target': 50, 'previous': 38, 'unit': '', 'higher_is_better': True},
    'CAC': {'current': 125, 'target': 100, 'previous': 145, 'unit': '$', 'higher_is_better': False}
}

def format_value(value, unit):
    """Place a currency symbol before the number and any other unit after it."""
    return f"{unit}{value}" if unit == '$' else f"{value}{unit}"

def calculate_kpi_status(kpi_data):
    """Build a summary table comparing each KPI with its target and previous period."""
    results = []
    for name, data in kpi_data.items():
        current, target, previous = data['current'], data['target'], data['previous']
        higher_better = data['higher_is_better']

        # Percentage gap to target and change versus the previous period
        vs_target = ((current - target) / target) * 100
        vs_previous = ((current - previous) / previous) * 100

        # ✅ target met; ⚠️ missed by less than 10%; ❌ missed by 10% or more
        on_track = current >= target if higher_better else current <= target
        if on_track:
            status = '✅'
        elif abs(vs_target) < 10:
            status = '⚠️'
        else:
            status = '❌'

        results.append({
            'KPI': name,
            'Current': format_value(current, data['unit']),
            'Target': format_value(target, data['unit']),
            'vs Target': f"{'+' if vs_target > 0 else ''}{vs_target:.1f}%",
            'vs Previous': f"{'+' if vs_previous > 0 else ''}{vs_previous:.1f}%",
            'Status': status
        })
    return pd.DataFrame(results)

print("Executive KPI Summary")
print("=" * 70)
print(calculate_kpi_status(kpis).to_string(index=False))

Report Templates

Weekly Flash Report Template

📊 Key Metrics at a Glance:

Metric	Actual	Target	Status
Revenue	$2.5M	$2.3M	✅
Orders	12,500	11,000	✅
Conversion	3.2%	3.5%	⚠️

🔴 Key Issues: Mobile conversion down 15%, Support tickets up 20%

🟢 Wins This Week: Record revenue day on Thursday

📋 Key Actions for Next Week: Deploy mobile checkout fix (Owner: Product)

KPI Hierarchy

North Star Metric (e.g., Monthly Active Users) → Revenue / Customers / Engagement / Efficiency / Quality → Specific sub-metrics per category
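
The hierarchy above can be represented as a nested structure and printed as an indented tree. This is a sketch under assumptions: the `render_tree` helper is hypothetical, and the grouping of sub-metrics under each category is an illustrative guess, not a prescribed taxonomy.

```python
kpi_tree = {
    "Monthly Active Users": {  # North Star Metric (the example named in the text)
        # Category -> sub-metrics; this grouping is illustrative only.
        "Revenue": ["Average order value", "Order volume"],
        "Customers": ["Churn Rate", "CAC"],
        "Quality": ["NPS"],
    }
}

def render_tree(node, depth=0):
    """Return indented lines for a nested dict whose leaves are lists of metric names."""
    lines = []
    if isinstance(node, dict):
        for name, children in node.items():
            lines.append("  " * depth + name)
            lines.extend(render_tree(children, depth + 1))
    else:
        # A list of leaf sub-metrics
        lines.extend("  " * depth + leaf for leaf in node)
    return lines

print("\n".join(render_tree(kpi_tree)))
```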

Checkpoint

Executive reports answer four questions: Are we on track? What's changed? What's at risk? What should we do? Everything else goes in the appendix. How many pages should a Weekly Flash Report be?

Task 5

🎤 Presenting to Executives

Avg. 5 min

The 5-Minute Rule

📰Minute 1: The Headline

📊Minute 2: The Evidence

💡Minute 3: The Insight

✅Minute 4: The Recommendation

🙋Minute 5: The Ask

Traffic Light Indicators

Python

def create_status_indicator(value, target, higher_is_better=True):
    """Return a traffic-light emoji and label for a metric compared with its target."""
    # Express the gap so that a positive percentage always means "better than target".
    if higher_is_better:
        pct = (value - target) / target * 100
    else:
        pct = (target - value) / target * 100

    if pct >= 0:
        return '🟢', 'On Track'   # target met or beaten
    elif pct >= -10:
        return '🟡', 'Monitor'    # within 10% of target
    else:
        return '🔴', 'At Risk'    # more than 10% off target

metrics = [
    ('Revenue', 2.5, 2.3, True),
    ('Customers', 12500, 13000, True),
    ('Churn Rate', 5.2, 4.0, False),
    ('CAC', 125, 100, False),
    ('NPS', 42, 50, True)
]

print("KPI Status Summary")
print("=" * 50)
for name, actual, target, higher_better in metrics:
    light, status = create_status_indicator(actual, target, higher_better)
    print(f"{light} {name:15s} | Actual: {actual:>8} | Target: {target:>8} | {status}")

Handling Questions

Example

Q: "What's driving this?"
→ Have root cause analysis ready

Q: "How confident are you?"
→ "80% confident. Best case: +20%, worst case: +5%"

Q: "What are the risks?"
→ Top 3 risks + mitigation plans

Q: "What happens if we do nothing?"
→ Quantify the cost of inaction
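
The confidence answer above pairs a best case with a worst case. One way to prepare it is to convert the percentage scenarios into absolute figures before the meeting. A minimal sketch follows; the `scenario_range` helper and the $2.5M baseline are illustrative assumptions, not results from an actual analysis.

```python
def scenario_range(baseline, worst_pct, best_pct):
    """Convert worst-case and best-case percentage changes into absolute values."""
    worst = baseline * (1 + worst_pct / 100)
    best = baseline * (1 + best_pct / 100)
    return worst, best

baseline_revenue = 2_500_000  # hypothetical baseline, in dollars
worst, best = scenario_range(baseline_revenue, worst_pct=5, best_pct=20)
print(f"Worst case: ${worst:,.0f} | Best case: ${best:,.0f}")
```

Executives tend to react to dollar amounts more readily than to percentages, so having both forms ready helps when the follow-up question comes.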

Executive Dashboard

Python

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 10))
fig.suptitle('Q1 2024 Executive Dashboard', fontsize=16, fontweight='bold', y=0.98)
# 3 rows x 4 columns: big numbers on top, charts in the middle, text panels at the bottom
gs = fig.add_gridspec(3, 4, hspace=0.3, wspace=0.3)

# Row 1: Big Numbers
for i, (label, value, change, color) in enumerate([
    ('Revenue', '$8.2M', '+15% vs LY', 'green'),
    ('Customers', '15,200', '+22% vs LY', 'steelblue'),
    ('NPS', '48', '-2 vs target', 'orange'),
    ('Margin', '68%', '+3pp vs LY', 'green')
]):
    ax = fig.add_subplot(gs[0, i])
    ax.text(0.5, 0.7, value, fontsize=28, fontweight='bold', ha='center', va='center', color=color)
    ax.text(0.5, 0.3, f'{label}\n{change}', fontsize=10, ha='center', va='center', color='gray')
    ax.axis('off')

# Row 2: Trend
ax5 = fig.add_subplot(gs[1, :2])
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
actual = [2.1, 2.3, 2.5, 2.6, 2.8, 3.0]
target = [2.0, 2.2, 2.4, 2.5, 2.7, 2.9]
ax5.fill_between(months, actual, alpha=0.3, color='green')
ax5.plot(months, actual, 'go-', linewidth=2, label='Actual')
ax5.plot(months, target, 'k--', label='Target')
ax5.set_ylabel('Revenue ($M)')
ax5.set_title('Revenue Trend', fontweight='bold')
ax5.legend(loc='upper left')

# Row 2: Segment breakdown
ax6 = fig.add_subplot(gs[1, 2:])
segments = ['Enterprise', 'Mid-Market', 'SMB', 'Consumer']
values = [3.5, 2.8, 1.5, 0.4]
colors = ['#2ecc71', '#3498db', '#9b59b6', '#e74c3c']
bars = ax6.barh(segments, values, color=colors)
ax6.invert_yaxis()  # put the largest segment at the top
ax6.set_xlabel('Revenue ($M)')
ax6.set_title('Revenue by Segment', fontweight='bold')
ax6.bar_label(bars, fmt='$%.1fM')

# Row 3: Issues & Actions
# Note: the emoji in these titles only render if the active font includes them.
for i, (title, items) in enumerate([
    ('🔴 KEY ISSUES', ['CAC increased 18%', 'Support backlog at 72hrs', 'Enterprise deal slipped to Q2']),
    ('🟢 KEY ACTIONS', ['Launch loyalty program → +5% retention', 'Deploy mobile app v2 → +10% engagement', 'Close 3 enterprise deals → +$800K'])
]):
    ax = fig.add_subplot(gs[2, i*2:(i+1)*2])
    ax.axis('off')
    text = f"{title}\n" + "\n".join([f"  {j+1}. {item}" for j, item in enumerate(items)])
    ax.text(0.05, 0.95, text, fontsize=10, fontfamily='monospace',
            verticalalignment='top', transform=ax.transAxes)

plt.savefig('executive_dashboard.png', dpi=150, bbox_inches='tight', facecolor='white')
plt.show()

Checkpoint

Start with the conclusion, present the evidence within 5 minutes, and always have backup slides for deep-dive questions. Never spend more than 1 minute on methodology. When the CEO asks "What happens if we do nothing?", how should you answer?

Task 6

🤖 Automated Report Generation

Avg. 5 min

Jinja2 Templates

Python

from jinja2 import Template
from datetime import datetime

# trim_blocks=True drops the newline after each {% ... %} tag,
# so the loops do not leave blank lines in the rendered report.
report_template = Template("""
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                    {{ title }}
                    {{ date }}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

KEY METRICS
{% for metric in metrics %}
{{ metric.status }} {{ metric.name }}: {{ metric.value }} (Target: {{ metric.target }})
{% endfor %}

HIGHLIGHTS
{% for highlight in highlights %}
• {{ highlight }}
{% endfor %}

ACTION ITEMS
{% for action in actions %}
{{ loop.index }}. {{ action.task }} (Owner: {{ action.owner }}, Due: {{ action.due }})
{% endfor %}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
""", trim_blocks=True)

data = {
    'title': 'WEEKLY SALES REPORT',
    'date': datetime.now().strftime('%Y-%m-%d'),
    'metrics': [
        {'name': 'Revenue', 'value': '$2.5M', 'target': '$2.3M', 'status': '✅'},
        {'name': 'Orders', 'value': '12,500', 'target': '11,000', 'status': '✅'},
        {'name': 'Conversion', 'value': '3.2%', 'target': '3.5%', 'status': '⚠️'},
    ],
    'highlights': [
        'Record Thursday revenue (+45% vs average)',
        'Mobile app orders grew 25% MoM'
    ],
    'actions': [
        {'task': 'Review mobile checkout UX', 'owner': 'Product', 'due': '3/22'},
        {'task': 'Launch retargeting campaign', 'owner': 'Marketing', 'due': '3/20'},
    ]
}

print(report_template.render(**data))

Complete Reporting Pipeline

Python

import logging

import numpy as np
import pandas as pd

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

class AutomatedReportingSystem:
    """Runs the Extract -> Calculate -> Generate steps of a report in order."""

    def __init__(self, config):
        self.config = config
        self.data = {}
        self.metrics = {}

    def extract_data(self):
        logger.info("Extracting data...")
        # In production: query database, APIs. Here: 90 days of random sample data.
        self.data['revenue'] = pd.DataFrame({
            'date': pd.date_range(start='2024-01-01', periods=90, freq='D'),
            'revenue': np.random.uniform(70000, 100000, 90),
            'orders': np.random.randint(300, 500, 90)
        })
        logger.info("Data extraction complete")

    def calculate_metrics(self):
        logger.info("Calculating metrics...")
        rev_df = self.data['revenue']
        self.metrics = {
            'total_revenue': rev_df['revenue'].sum(),
            'avg_daily_revenue': rev_df['revenue'].mean(),
            'total_orders': int(rev_df['orders'].sum()),
            'avg_order_value': rev_df['revenue'].sum() / rev_df['orders'].sum()
        }
        logger.info("Metrics calculation complete")

    def generate_report(self):
        logger.info("Generating report...")
        m = self.metrics
        return f"""
Total Revenue: ${m['total_revenue']:,.0f}
Average Daily: ${m['avg_daily_revenue']:,.0f}
Total Orders: {m['total_orders']:,}
AOV: ${m['avg_order_value']:.2f}
"""

    def run_pipeline(self):
        logger.info("Starting pipeline...")
        self.extract_data()
        self.calculate_metrics()
        report = self.generate_report()
        logger.info("Pipeline completed!")
        return report

# Run
system = AutomatedReportingSystem({'db': 'example'})
print(system.run_pipeline())

Scheduling

Python

# Schedule configuration
schedule_config = {
    'daily_flash': {'time': '07:00', 'recipients': ['ceo@company.com']},
    'weekly_summary': {'day': 'Monday', 'time': '08:00', 'recipients': ['leadership@company.com']},
    'monthly_review': {'day': 1, 'time': '09:00', 'recipients': ['board@company.com']}
}

# APScheduler example
"""
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

scheduler = BackgroundScheduler()
scheduler.add_job(daily_report, CronTrigger(hour=7, minute=0), id='daily')
scheduler.add_job(weekly_report, CronTrigger(day_of_week='mon', hour=8), id='weekly')
scheduler.start()
"""
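
To make the idea of a daily trigger concrete, the sketch below computes when a job configured with an 'HH:MM' time would next fire. It uses only the standard library and is an illustration of the concept, not a description of how APScheduler works internally; `next_daily_run` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def next_daily_run(now, time_str):
    """Return the next datetime at which a daily job set for 'HH:MM' should fire."""
    hour, minute = map(int, time_str.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate

now = datetime(2024, 3, 18, 6, 30)  # fixed example timestamp
print(next_daily_run(now, "07:00"))
```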

Pipeline Pattern

Extract (data sources) → Transform (calculate metrics) → Generate (template rendering) → Deliver (email/Slack) → Monitor (logging/alerts)
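
The five stages can be sketched as plain functions chained together. This is a minimal illustration under assumptions: the function names are hypothetical, the input rows are made up, and delivery is stubbed by appending to a list rather than sending email or a Slack message.

```python
import logging

logger = logging.getLogger("report_pipeline")

def extract():
    # Stand-in for a database or API query; the rows are invented sample data.
    return [{"revenue": 80_000, "orders": 400}, {"revenue": 95_000, "orders": 450}]

def transform(rows):
    total_revenue = sum(r["revenue"] for r in rows)
    total_orders = sum(r["orders"] for r in rows)
    return {
        "total_revenue": total_revenue,
        "total_orders": total_orders,
        "avg_order_value": total_revenue / total_orders,
    }

def generate(metrics):
    return (
        f"Total Revenue: ${metrics['total_revenue']:,.0f}\n"
        f"Total Orders: {metrics['total_orders']:,}\n"
        f"AOV: ${metrics['avg_order_value']:.2f}"
    )

def deliver(report, outbox):
    # Stand-in for email/Slack delivery: the report is only appended to a list.
    outbox.append(report)

def run_pipeline(outbox):
    """Run Extract -> Transform -> Generate -> Deliver and log the outcome (Monitor)."""
    try:
        deliver(generate(transform(extract())), outbox)
        logger.info("Report delivered")
        return True
    except Exception:
        # Monitor: record the failure with its traceback so an alert can be raised.
        logger.exception("Report pipeline failed")
        return False

outbox = []
if run_pipeline(outbox):
    print(outbox[0])
```

Keeping each stage as a separate function means a stage can be tested or swapped (for example, a different delivery channel) without touching the others.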

Checkpoint

Automated reports reduce manual work, ensure consistency, and deliver insights at the right time. Pipeline pattern: Extract → Transform → Generate → Deliver. Why use Jinja2 templates rather than f-strings for reports?

Task 7

📋 Summary

Avg. 5 min

What You Learned

Topic	Key Content
Audience	Executive, Technical, Operations: tailor the message accordingly
Story Structure	SCQA, Pyramid Principle, Three-Act narrative
Visualization	Right chart type, focus attention, highlight insights
Writing	Observation + Implication + Action, avoid jargon
Executive Reports	KPI framework, traffic lights, flash/monthly/QBR templates
Presenting	5-minute rule, prepare for questions, lead with conclusion
Automation	Jinja2 templates, scheduling, Extract → Transform → Generate → Deliver

Self-Check Questions

What are the parts of the SCQA framework?
How does an executive report differ from a technical report?
What does "Observation + Implication + Action" mean?
What do Jinja2 templates automate?

Completed!

You have mastered Data Storytelling & Executive Reporting, the most important skill for turning analysis into business impact. Data does not speak for itself; you have to tell its story.

Next lesson: Automated Reports & Portfolio Capstone
