Data Storytelling & Executive Reporting
🎯 Learning Objectives
- Build effective data stories with the SCQA framework
- Choose appropriate chart types and design professional visualizations
- Write insights using the formula Observation + Implication + Action
- Create executive reports with a KPI framework and traffic light indicators
- Present to executives following the 5-minute rule
- Design an automated report generation pipeline
⏱️ Duration: 2.5 hours | 📊 Level: Advanced | 🛠️ Tools: Python, Matplotlib, Jinja2
📖 Key Terms
| Term | Vietnamese | Description |
|---|---|---|
| Data Storytelling | Kể chuyện bằng dữ liệu | Combines data, visuals, and narrative to create impact |
| SCQA | SCQA Framework | Situation → Complication → Question → Answer |
| Pyramid Principle | Nguyên tắc Kim tự tháp | Start with the conclusion, support it with evidence |
| Narrative Arc | Cung kịch bản | Setup → Conflict → Resolution structure |
| KPI | Chỉ số hiệu suất | Key Performance Indicator: a metric tied to strategy |
| Traffic Light | Đèn giao thông | 🟢🟡🔴 indicators for quick status assessment |
| Executive Summary | Tóm tắt điều hành | 1-page overview: what, so what, now what |
| Sparkline | Đường spark | Mini chart embedded in text for quick trend visualization |
| Vanity Metric | Chỉ số phù phiếm | Metric that looks good but is not actionable |
| North Star Metric | Chỉ số Ngôi Sao Bắc | Single metric that represents the core value delivered to users |
Checkpoint
Data Storytelling = Data + Visuals + Narrative. Missing any one of these pillars significantly reduces impact. Can you tell the difference between a "report" and a "data story"?
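The Sparkline entry in the key terms table can be made concrete with a few lines of code. This is a minimal sketch, assuming a Unicode-capable terminal; the helper name `sparkline` and the sample values are illustrative, not part of any library.

```python
# Hypothetical helper: render a numeric series as a Unicode sparkline.
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each value to one of eight block characters by relative height."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: draw every point at the same mid-height block
        return BLOCKS[3] * len(values)
    last = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * last)] for v in values)

monthly_sales = [120, 135, 128, 142, 138, 165]  # illustrative values
print(f"Sales trend: {sparkline(monthly_sales)}")
```

A sparkline like this fits inside a sentence or a table cell, which is why it suits executive summaries where a full chart would take too much space.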
👥 Audience & Story Structure
Audience Segmentation
Tailoring Your Message
```python
# Same data, different presentations for different audiences

# For Executives
"""
Executive Summary:
Revenue grew 15% ($2.3M) year-over-year, exceeding target by 3%.
Key driver: New product line contributed 40% of growth.
Recommendation: Increase marketing budget by $500K for Q2.
"""

# For Operations Team
"""
Q1 Performance:
- Total revenue: $17.6M (+15% YoY)
- Average order value: $127 (+8%)
- Order volume: 138,500 (+6.5%)
Action items:
1. Scale fulfillment capacity by 10%
2. Prioritize inventory for Product Line X
"""

# For Data Team
"""
Analysis Details:
- Period: Q1 2024 vs Q1 2023, seasonally adjusted
- Confidence: 95% CI [14.2%, 15.8%]
- Caveats: Excludes one-time enterprise deal ($400K)
"""
```
SCQA Framework
```text
S - SITUATION (Context)
    "Currently, our customer churn rate is 8%..."

C - COMPLICATION (Problem)
    "...but this has increased 40% in the last quarter,
    threatening annual revenue by $2M."

Q - QUESTION (What we need to solve)
    "What's driving this increase and how can we
    reduce churn back to historical levels?"

A - ANSWER (Your insight + recommendation)
    "Analysis shows 60% of churners cite pricing.
    Introducing a loyalty tier could reduce churn by 25%,
    saving $500K."
```
Pyramid Principle
Three-Act Structure
```python
# Act 1: Setup (Where are we?)
"""
Context: Our e-commerce platform serves 500K active customers.
         Average order value is $85 with 2.3 orders per customer/year.
"""

# Act 2: Conflict (What's the problem?)
"""
Challenge: Q4 data shows concerning trends:
- Cart abandonment increased from 68% to 76%
- Mobile conversion dropped 15%
Root cause: Checkout flow has 7 steps vs. industry standard of 3-4.
"""

# Act 3: Resolution (What should we do?)
"""
Solution: Redesign checkout to 4 steps
Expected impact: Reduce abandonment by 10pp, recover $2.5M annually.
Timeline: 6 weeks development, 2 weeks testing.
"""
```
❌ Bad: Methodology → Data → Analysis → Findings → Recommendation (at the very end!)
✅ Good: Recommendation + Impact → Key Findings → Evidence → Methodology (appendix)
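The Good ordering above can be enforced mechanically when a report is assembled from named sections. This is a rough sketch; the section keys and the function `order_for_executives` are hypothetical names chosen for illustration, and the sample text is borrowed from the audience examples earlier in the lesson.

```python
# Hypothetical section names; the ranking follows the "Good" sequence above.
PYRAMID_ORDER = ["recommendation", "key_findings", "evidence", "methodology"]

def order_for_executives(sections):
    """Return (name, text) pairs with the conclusion first.

    Sections not listed in PYRAMID_ORDER are placed at the end,
    keeping their original relative order (sorted() is stable).
    """
    rank = {name: i for i, name in enumerate(PYRAMID_ORDER)}
    return sorted(sections.items(),
                  key=lambda item: rank.get(item[0], len(PYRAMID_ORDER)))

# A draft written in the "Bad" order: methodology first, recommendation last
draft = {
    "methodology": "Q1 2024 vs Q1 2023, seasonally adjusted.",
    "evidence": "95% CI [14.2%, 15.8%].",
    "key_findings": "New product line contributed 40% of growth.",
    "recommendation": "Increase marketing budget by $500K for Q2.",
}

for name, text in order_for_executives(draft):
    print(f"{name.upper()}: {text}")
```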
Checkpoint
Lead with the conclusion (Pyramid Principle), structure with SCQA, and always tailor depth and language to the audience type. When presenting a churn analysis to the CEO, what should you start with?
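One way to keep all four SCQA parts together is a small data structure. The sketch below is only an illustration: the class `SCQA` and its method names are assumptions, and the text comes from the churn example in this section.

```python
from dataclasses import dataclass

@dataclass
class SCQA:
    """Holds the four parts of an SCQA story (hypothetical helper class)."""
    situation: str
    complication: str
    question: str
    answer: str

    def narrative(self):
        """Full story in S -> C -> Q -> A order, one part per line."""
        return "\n".join([self.situation, self.complication,
                          self.question, self.answer])

    def executive_opening(self):
        """Pyramid Principle: lead with the answer, then give the context."""
        return f"{self.answer} (Context: {self.complication})"

story = SCQA(
    situation="Currently, our customer churn rate is 8%.",
    complication="Churn has increased 40% in the last quarter, threatening annual revenue by $2M.",
    question="What's driving this increase and how can we reduce churn back to historical levels?",
    answer="Analysis shows 60% of churners cite pricing; a loyalty tier could reduce churn by 25%, saving $500K.",
)

print(story.executive_opening())
```

Keeping the answer as its own field makes it easy to open with it for an executive audience while still telling the full S → C → Q → A story in a written report.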
📊 Visualization Best Practices
Chart Selection Guide
```text
COMPARISON   → Bar Chart (categories), Line Chart (over time)
COMPOSITION  → Pie/Donut (parts of whole), Stacked Area (over time)
DISTRIBUTION → Histogram, Box Plot, Scatter Plot
RELATIONSHIP → Scatter Plot, Heatmap (correlation)
TREND        → Line Chart, Multi-line, with Reference Line
```
Before and After
```python
import matplotlib.pyplot as plt
import numpy as np

categories = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']
values = [23, 45, 12, 38, 29]

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# ❌ BAD: Pie chart for comparison
axes[0].pie(values, labels=categories, autopct='%1.1f%%')
axes[0].set_title('❌ BAD: Pie Chart for Comparison')

# ✅ GOOD: Sorted bar chart
sorted_idx = np.argsort(values)[::-1]
sorted_cats = [categories[i] for i in sorted_idx]
sorted_vals = [values[i] for i in sorted_idx]

bars = axes[1].barh(sorted_cats, sorted_vals, color='steelblue')
axes[1].set_xlabel('Sales ($K)')
axes[1].set_title('✅ GOOD: Sorted Bar Chart')
axes[1].bar_label(bars, fmt='$%dK')
axes[1].invert_yaxis()
plt.tight_layout()
plt.show()
```
Focus Attention
```python
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 135, 128, 142, 138, 165]
target = 150

# ❌ BAD: No visual hierarchy
axes[0].plot(months, sales, 'o-', linewidth=2)
axes[0].axhline(target, linestyle='--', label='Target')
axes[0].set_title('❌ BAD: No Visual Hierarchy')
axes[0].legend()

# ✅ GOOD: Highlight the insight
colors = ['gray' if v < target else 'green' for v in sales]
axes[1].bar(months, sales, color=colors)
axes[1].axhline(target, color='red', linestyle='--', label='Target: $150K')
# June sales of 165 are 10% above the 150 target
axes[1].annotate('Hit target! ↑10%', xy=(5, 165), xytext=(4, 175),
                 fontsize=11, fontweight='bold', color='green',
                 arrowprops=dict(arrowstyle='->', color='green'))
axes[1].set_ylabel('Sales ($K)')
axes[1].set_title('✅ GOOD: Highlight Key Insight')
axes[1].legend()
plt.tight_layout()
plt.show()
```
Checkpoint
A visualization must highlight the insight, not just show the data: use color, annotation, and sorting to guide the viewer's eye to the key message. When should you use a pie chart, and when should you avoid one?
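The chart selection guide from this section can also be encoded as a simple lookup, which is handy when chart choice has to be made inside a reporting script. This is a minimal sketch; `CHART_GUIDE` and `suggest_charts` are hypothetical names, and the entries mirror the guide as written.

```python
# Analytical purpose -> suggested chart types, copied from the selection guide
CHART_GUIDE = {
    "comparison": ["Bar Chart (categories)", "Line Chart (over time)"],
    "composition": ["Pie/Donut (parts of whole)", "Stacked Area (over time)"],
    "distribution": ["Histogram", "Box Plot", "Scatter Plot"],
    "relationship": ["Scatter Plot", "Heatmap (correlation)"],
    "trend": ["Line Chart", "Multi-line", "with Reference Line"],
}

def suggest_charts(purpose):
    """Return the suggested chart types for a purpose (case-insensitive)."""
    key = purpose.strip().lower()
    if key not in CHART_GUIDE:
        raise ValueError(
            f"Unknown purpose {purpose!r}; expected one of {sorted(CHART_GUIDE)}"
        )
    return CHART_GUIDE[key]

print(suggest_charts("Comparison"))
```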
✍️ Writing Insights & Presentations
The Insight Formula
```text
INSIGHT = OBSERVATION + IMPLICATION + ACTION

❌ Weak: "Sales increased 15% in Q4."

✅ Strong: "Sales increased 15% in Q4, driven primarily by
   the holiday promotion (contributing 60% of growth).
   To sustain this, extend the promotion into Q1
   with a 'New Year' theme."
```
Headline Hierarchy
Avoiding Common Pitfalls
```python
# ❌ DON'T: Use jargon
bad = "The YoY delta in the LTV:CAC ratio indicates deteriorating unit economics..."

# ✅ DO: Use plain language
good = """
We're spending more to acquire customers who are worth less:
- Cost to acquire a customer: up 23%
- Revenue per customer: down 15%
Bottom line: Each new customer takes 3 months longer to become profitable.
"""
```
Effective Slide Structure
| Section | Content |
|---|---|
| HEADLINE | States the insight (not just the topic). ✅ "Q4 Sales Beat Target by 15%" instead of ❌ "Q4 Sales Performance" |
| VISUAL | Chart that supports the main insight |
| KEY TAKEAWAYS | The 3 most important bullet points |
| SOURCE | Data source and time period (e.g., Sales_Master database, Dec 2024) |
Checkpoint
Every insight needs three parts: what happened (observation), why it matters (implication), and what to do (action). Without an action it is a report; with an action it is a story. How would you turn "Revenue grew 15%" into an actionable insight?
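The three-part formula lends itself to a simple completeness check before an insight goes into a report. The sketch below assumes a hypothetical helper, `build_insight`, that refuses to produce an insight when any part is missing; the example text reuses the Q4 sales insight from this section.

```python
def build_insight(observation, implication, action):
    """Join the three parts into one insight; reject incomplete input."""
    parts = {
        "observation": observation,
        "implication": implication,
        "action": action,
    }
    # An empty or whitespace-only part means the insight is not finished yet
    missing = [name for name, text in parts.items() if not text or not text.strip()]
    if missing:
        raise ValueError(f"Insight is incomplete, missing: {', '.join(missing)}")
    return " ".join(text.strip() for text in parts.values())

insight = build_insight(
    observation="Sales increased 15% in Q4.",
    implication="The holiday promotion contributed 60% of that growth.",
    action="To sustain this, extend the promotion into Q1 with a 'New Year' theme.",
)
print(insight)
```

Raising an error for a missing action is a deliberate choice in this sketch: it turns the "no action means it is only a report" rule into something a pipeline can enforce.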
📈 Executive Report Framework
KPI Selection & Dashboard
```python
import pandas as pd
import numpy as np

kpis = {
    'Revenue': {'current': 2.5, 'target': 2.3, 'previous': 2.1, 'unit': 'M', 'higher_is_better': True},
    'Customer Count': {'current': 12500, 'target': 13000, 'previous': 11800, 'unit': '', 'higher_is_better': True},
    'Churn Rate': {'current': 5.2, 'target': 4.0, 'previous': 4.8, 'unit': '%', 'higher_is_better': False},
    'NPS': {'current': 42, 'target': 50, 'previous': 38, 'unit': '', 'higher_is_better': True},
    'CAC': {'current': 125, 'target': 100, 'previous': 145, 'unit': '$', 'higher_is_better': False}
}

def calculate_kpi_status(kpi_data):
    results = []
    for name, data in kpi_data.items():
        current, target, previous = data['current'], data['target'], data['previous']
        higher_better = data['higher_is_better']

        vs_target = ((current - target) / target) * 100
        vs_previous = ((current - previous) / previous) * 100

        on_track = current >= target if higher_better else current <= target
        status = '✅' if on_track else '⚠️' if abs(vs_target) < 10 else '❌'

        results.append({
            'KPI': name, 'Current': f"{current}{data['unit']}",
            'Target': f"{target}{data['unit']}",
            'vs Target': f"{'+' if vs_target > 0 else ''}{vs_target:.1f}%",
            'vs Previous': f"{'+' if vs_previous > 0 else ''}{vs_previous:.1f}%",
            'Status': status
        })
    return pd.DataFrame(results)

print("Executive KPI Summary")
print("=" * 70)
print(calculate_kpi_status(kpis).to_string(index=False))
```
Report Templates
📊 Key Metrics at a Glance:
| Metric | Actual | Target | Status |
|---|---|---|---|
| Revenue | $2.5M | $2.3M | ✅ |
| Orders | 12,500 | 11,000 | ✅ |
| Conversion | 3.2% | 3.5% | ⚠️ |
🔴 Key Issues: Mobile conversion down 15%, Support tickets up 20%
🟢 Wins This Week: Record revenue day on Thursday
📋 Key Actions for Next Week: Deploy mobile checkout fix (Owner: Product)
North Star Metric (e.g., Monthly Active Users) → Revenue / Customers / Engagement / Efficiency / Quality → Specific sub-metrics per category
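The hierarchy above (North Star Metric → categories → sub-metrics) maps naturally onto a nested dictionary. In this sketch the category names come from the line above, while the sub-metrics under each category are assumptions picked from metrics used elsewhere in this lesson; `render_kpi_tree` is a hypothetical helper.

```python
# Category names are from the hierarchy above; sub-metrics are illustrative.
KPI_TREE = {
    "north_star": "Monthly Active Users",
    "categories": {
        "Revenue": ["Total revenue", "Average order value"],
        "Customers": ["Customer count", "Churn rate"],
        "Engagement": ["Orders per customer"],
        "Efficiency": ["CAC"],
        "Quality": ["NPS"],
    },
}

def render_kpi_tree(tree):
    """Render the KPI hierarchy as an indented text outline."""
    lines = [f"North Star: {tree['north_star']}"]
    for category, sub_metrics in tree["categories"].items():
        lines.append(f"  {category}")
        lines.extend(f"    - {metric}" for metric in sub_metrics)
    return "\n".join(lines)

print(render_kpi_tree(KPI_TREE))
```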
Checkpoint
Executive reports answer 4 questions: Are we on track? What's changed? What's at risk? What should we do? Everything else goes in the appendix. How many pages should a Weekly Flash Report be?
🎤 Presenting to Executives
The 5-Minute Rule
Traffic Light Indicators
```python
def create_status_indicator(value, target, higher_is_better=True):
    if higher_is_better:
        pct = (value - target) / target * 100
    else:
        pct = (target - value) / target * 100

    if pct >= 0:
        return '🟢', 'On Track'
    elif pct >= -10:
        return '🟡', 'Monitor'
    else:
        return '🔴', 'At Risk'

metrics = [
    ('Revenue', 2.5, 2.3, True),
    ('Customers', 12500, 13000, True),
    ('Churn Rate', 5.2, 4.0, False),
    ('CAC', 125, 100, False),
    ('NPS', 42, 50, True)
]

print("KPI Status Summary")
print("=" * 50)
for name, actual, target, higher_better in metrics:
    light, status = create_status_indicator(actual, target, higher_better)
    print(f"{light} {name:15s} | Actual: {actual:>8} | Target: {target:>8} | {status}")
```
Handling Questions
```text
Q: "What's driving this?"
→ Have root cause analysis ready

Q: "How confident are you?"
→ "80% confident. Best case: +20%, worst case: +5%"

Q: "What are the risks?"
→ Top 3 risks + mitigation plans

Q: "What happens if we do nothing?"
→ Quantify the cost of inaction
```
Executive Dashboard
```python
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(figsize=(16, 10))
fig.suptitle('Q1 2024 Executive Dashboard', fontsize=16, fontweight='bold', y=0.98)
gs = fig.add_gridspec(3, 4, hspace=0.3, wspace=0.3)

# Row 1: Big Numbers
for i, (label, value, change, color) in enumerate([
    ('Revenue', '$8.2M', '+15% vs LY', 'green'),
    ('Customers', '15,200', '+22% vs LY', 'steelblue'),
    ('NPS', '48', '-2 vs target', 'orange'),
    ('Margin', '68%', '+3pp vs LY', 'green')
]):
    ax = fig.add_subplot(gs[0, i])
    ax.text(0.5, 0.7, value, fontsize=28, fontweight='bold', ha='center', va='center', color=color)
    ax.text(0.5, 0.3, f'{label}\n{change}', fontsize=10, ha='center', va='center', color='gray')
    ax.axis('off')

# Row 2: Trend
ax5 = fig.add_subplot(gs[1, :2])
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
actual = [2.1, 2.3, 2.5, 2.6, 2.8, 3.0]
target = [2.0, 2.2, 2.4, 2.5, 2.7, 2.9]
ax5.fill_between(months, actual, alpha=0.3, color='green')
ax5.plot(months, actual, 'go-', linewidth=2, label='Actual')
ax5.plot(months, target, 'k--', label='Target')
ax5.set_ylabel('Revenue ($M)')
ax5.set_title('Revenue Trend', fontweight='bold')
ax5.legend(loc='upper left')

# Row 2: Segment breakdown
ax6 = fig.add_subplot(gs[1, 2:])
segments = ['Enterprise', 'Mid-Market', 'SMB', 'Consumer']
values = [3.5, 2.8, 1.5, 0.4]
colors = ['#2ecc71', '#3498db', '#9b59b6', '#e74c3c']
bars = ax6.barh(segments, values, color=colors)
ax6.set_xlabel('Revenue ($M)')
ax6.set_title('Revenue by Segment', fontweight='bold')
ax6.bar_label(bars, fmt='$%.1fM')

# Row 3: Issues & Actions
for i, (title, items) in enumerate([
    ('🔴 KEY ISSUES', ['CAC increased 18%', 'Support backlog at 72hrs', 'Enterprise deal slipped to Q2']),
    ('🟢 KEY ACTIONS', ['Launch loyalty program → +5% retention', 'Deploy mobile app v2 → +10% engagement', 'Close 3 enterprise deals → +$800K'])
]):
    ax = fig.add_subplot(gs[2, i*2:(i+1)*2])
    ax.axis('off')
    text = f"{title}\n" + "\n".join([f"  {j+1}. {item}" for j, item in enumerate(items)])
    ax.text(0.05, 0.95, text, fontsize=10, fontfamily='monospace',
            verticalalignment='top', transform=ax.transAxes)

plt.savefig('executive_dashboard.png', dpi=150, bbox_inches='tight', facecolor='white')
plt.show()
```
Checkpoint
Start with the conclusion, present the evidence in 5 minutes, and always have backup slides for deep-dive questions. Never spend more than 1 minute on methodology. When the CEO asks "What happens if we do nothing?", how should you answer?
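Answering "What happens if we do nothing?" means putting a number on the cost of inaction. The sketch below shows one simple way to do that, assuming the loss can be approximated as a monthly amount that optionally grows each month; the function `cost_of_inaction` and all input figures are hypothetical and only illustrate the arithmetic.

```python
def cost_of_inaction(monthly_loss, months, monthly_growth=0.0):
    """Cumulative loss if the problem persists for `months` months.

    monthly_growth is the rate at which the loss worsens each month
    (0.0 means the loss stays constant).
    """
    total = 0.0
    loss = monthly_loss
    for _ in range(months):
        total += loss
        loss *= 1 + monthly_growth
    return total

# Hypothetical inputs: $100K lost per month, evaluated over two quarters
flat = cost_of_inaction(100_000, months=6)
worsening = cost_of_inaction(100_000, months=6, monthly_growth=0.10)

print(f"If nothing changes:        ${flat:,.0f} over 6 months")
print(f"If the loss grows 10%/mo:  ${worsening:,.0f} over 6 months")
```

Presenting both a flat and a worsening scenario gives the executive a range rather than a single point estimate, in the same spirit as the best case and worst case answer to "How confident are you?".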
🤖 Automated Report Generation
Jinja2 Templates
```python
from jinja2 import Template
from datetime import datetime

report_template = Template("""
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 {{ title }}
 {{ date }}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

KEY METRICS
{% for metric in metrics %}
{{ metric.status }} {{ metric.name }}: {{ metric.value }} (Target: {{ metric.target }})
{% endfor %}

HIGHLIGHTS
{% for highlight in highlights %}
• {{ highlight }}
{% endfor %}

ACTION ITEMS
{% for action in actions %}
{{ loop.index }}. {{ action.task }} (Owner: {{ action.owner }}, Due: {{ action.due }})
{% endfor %}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
""")

data = {
    'title': 'WEEKLY SALES REPORT',
    'date': datetime.now().strftime('%Y-%m-%d'),
    'metrics': [
        {'name': 'Revenue', 'value': '$2.5M', 'target': '$2.3M', 'status': '✅'},
        {'name': 'Orders', 'value': '12,500', 'target': '11,000', 'status': '✅'},
        {'name': 'Conversion', 'value': '3.2%', 'target': '3.5%', 'status': '⚠️'},
    ],
    'highlights': [
        'Record Thursday revenue (+45% vs average)',
        'Mobile app orders grew 25% MoM'
    ],
    'actions': [
        {'task': 'Review mobile checkout UX', 'owner': 'Product', 'due': '3/22'},
        {'task': 'Launch retargeting campaign', 'owner': 'Marketing', 'due': '3/20'},
    ]
}

print(report_template.render(**data))
```
Complete Reporting Pipeline
```python
import logging

import numpy as np
import pandas as pd  # required: extract_data() builds a DataFrame

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

class AutomatedReportingSystem:
    def __init__(self, config):
        self.config = config
        self.data = {}
        self.metrics = {}

    def extract_data(self):
        logger.info("Extracting data...")
        # In production: query database, APIs. Here: random sample data.
        self.data['revenue'] = pd.DataFrame({
            'date': pd.date_range(start='2024-01-01', periods=90, freq='D'),
            'revenue': np.random.uniform(70000, 100000, 90),
            'orders': np.random.randint(300, 500, 90)
        })
        logger.info("Data extraction complete")

    def calculate_metrics(self):
        logger.info("Calculating metrics...")
        rev_df = self.data['revenue']
        self.metrics = {
            'total_revenue': rev_df['revenue'].sum(),
            'avg_daily_revenue': rev_df['revenue'].mean(),
            'total_orders': rev_df['orders'].sum(),
            'avg_order_value': rev_df['revenue'].sum() / rev_df['orders'].sum()
        }
        logger.info("Metrics calculation complete")

    def generate_report(self):
        logger.info("Generating report...")
        m = self.metrics
        return f"""
Total Revenue: ${m['total_revenue']:,.0f}
Average Daily: ${m['avg_daily_revenue']:,.0f}
Total Orders: {m['total_orders']:,}
AOV: ${m['avg_order_value']:.2f}
"""

    def run_pipeline(self):
        logger.info("Starting pipeline...")
        self.extract_data()
        self.calculate_metrics()
        report = self.generate_report()
        logger.info("Pipeline completed!")
        return report

# Run
system = AutomatedReportingSystem({'db': 'example'})
print(system.run_pipeline())
```
Scheduling
```python
# Schedule configuration
schedule_config = {
    'daily_flash': {'time': '07:00', 'recipients': ['ceo@company.com']},
    'weekly_summary': {'day': 'Monday', 'time': '08:00', 'recipients': ['leadership@company.com']},
    'monthly_review': {'day': 1, 'time': '09:00', 'recipients': ['board@company.com']}
}

# APScheduler example
"""
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

scheduler = BackgroundScheduler()
scheduler.add_job(daily_report, CronTrigger(hour=7, minute=0), id='daily')
scheduler.add_job(weekly_report, CronTrigger(day_of_week='mon', hour=8), id='weekly')
scheduler.start()
"""
```
Extract (data sources) → Transform (calculate metrics) → Generate (template rendering) → Deliver (email/Slack) → Monitor (logging/alerts)
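The pipeline class in this section stops at report generation, so the Deliver step is not shown. One possible sketch, assuming delivery by email and using only the Python standard library, is below. The function `build_report_email` and the sender address are hypothetical; the message is built but not sent.

```python
from email.message import EmailMessage

def build_report_email(report_text, recipients, subject,
                       sender="reports@company.com"):  # hypothetical sender
    """Package a rendered report as an email message (does not send it)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(report_text)
    return msg

message = build_report_email(
    report_text="WEEKLY SALES REPORT\nRevenue: $2.5M (Target: $2.3M)",
    recipients=["leadership@company.com"],
    subject="Weekly Sales Report",
)

# To actually send, hand the message to an SMTP server, for example:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com") as server:  # hypothetical host
#       server.send_message(message)
print(message["To"], "|", message["Subject"])
```

Separating message construction from sending keeps the delivery step easy to test without a mail server, and the recipients list can be read straight from an entry in `schedule_config`.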
Checkpoint
Automated reports reduce manual work, ensure consistency, and deliver insights at the right time. Pipeline pattern: Extract → Transform → Generate → Deliver. Why use Jinja2 templates instead of f-strings for reports?
📋 Summary
What You Learned
| Topic | Key Content |
|---|---|
| Audience | Executive, Technical, Operations: tailor accordingly |
| Story Structure | SCQA, Pyramid Principle, Three-Act narrative |
| Visualization | Right chart type, focus attention, highlight insights |
| Writing | Observation + Implication + Action, avoid jargon |
| Executive Reports | KPI framework, traffic lights, flash/monthly/QBR templates |
| Presenting | 5-minute rule, prepare for questions, lead with conclusion |
| Automation | Jinja2 templates, scheduling, Extract→Transform→Generate→Deliver |
Self-Check Questions
- What are the parts of the SCQA framework?
- How does an executive report differ from a technical report?
- What does "Observation + Implication + Action" mean?
- What do Jinja2 templates automate?
You have now covered Data Storytelling & Executive Reporting, the most important skill for turning analysis into business impact. Data does not speak for itself; you have to tell its story.
Next lesson: Automated Reports & Portfolio Capstone
