
Funnel Analysis

Analyze conversion rates and drop-off points, and optimize user journeys

Conversion Funnel Analytics

0

🎯 Learning objectives

After this lesson, you will be able to:
  • Build a conversion funnel from event data
  • Calculate conversion rates and drop-off at each stage
  • Analyze the funnel by segment (device, channel)
  • Measure funnel velocity and time-to-convert
  • Write SQL queries for funnel analysis
  • Create professional funnel visualizations
Lesson information

⏱️ Duration: 2.5 hours | 📊 Level: Advanced | 🛠️ Tools: Python, SQL, Matplotlib

1

📖 Key terms

Term | Vietnamese | Description
Funnel | Phễu chuyển đổi | The sequence of steps users go through to complete a goal
Conversion Rate | Tỷ lệ chuyển đổi | Percentage of users who move from one step to the next
Drop-off | Tỷ lệ rời bỏ | Percentage of users who do not continue to the next step
Sequential Funnel | Phễu tuần tự | A funnel that requires users to complete the steps in the exact order
Segmented Funnel | Phễu phân khúc | Funnel analysis broken down by user group (device, channel, etc.)
Funnel Velocity | Tốc độ phễu | Average time users take to move through the funnel
Time-to-Convert | Thời gian chuyển đổi | Time elapsed from first touch to conversion
Health Score | Điểm sức khỏe | A score rating funnel performance against benchmarks
AIDA | AIDA | Marketing model: Awareness → Interest → Desire → Action

Checkpoint

Funnel analysis tracks users through stages to identify drop-off points and optimize conversion rates; it is the foundation of growth analytics. Can you explain the difference between overall conversion rate and step conversion rate?

2

📊 Data Preparation & Basic Funnel

Event-based Data

Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

# Generate sample event data
np.random.seed(42)
n_users = 10000

users = list(range(1, n_users + 1))
events = []
base_date = datetime(2024, 1, 1)

for user_id in users:
    # int() is required: timedelta rejects NumPy integer types
    session_time = base_date + timedelta(days=int(np.random.randint(0, 90)))

    # Stage 1: Landing (100% of users)
    events.append({
        'user_id': user_id, 'event': 'landing', 'timestamp': session_time
    })

    # Stage 2: View Product (70% of users who landed)
    if np.random.random() < 0.70:
        session_time += timedelta(seconds=int(np.random.randint(10, 120)))
        events.append({
            'user_id': user_id, 'event': 'view_product', 'timestamp': session_time
        })
    else:
        continue

    # Stage 3: Add to Cart (55% of users who viewed a product)
    if np.random.random() < 0.55:
        session_time += timedelta(seconds=int(np.random.randint(30, 300)))
        events.append({
            'user_id': user_id, 'event': 'add_to_cart', 'timestamp': session_time
        })
    else:
        continue

    # Stage 4: Checkout (60% of users who added to cart)
    if np.random.random() < 0.60:
        session_time += timedelta(seconds=int(np.random.randint(60, 600)))
        events.append({
            'user_id': user_id, 'event': 'checkout', 'timestamp': session_time
        })
    else:
        continue

    # Stage 5: Purchase (70% of users who started checkout)
    if np.random.random() < 0.70:
        session_time += timedelta(seconds=int(np.random.randint(30, 180)))
        events.append({
            'user_id': user_id, 'event': 'purchase', 'timestamp': session_time
        })

events_df = pd.DataFrame(events)
print(f"Total events: {len(events_df)}")
print(events_df['event'].value_counts())
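
As a sanity check on the simulated data, the expected number of users at each stage can be derived by multiplying the stage probabilities used in the generator. The sketch below assumes only those probabilities and the 10,000-user population; the names `stage_probs` and `expected` are illustrative. The simulated counts should land close to these values, not exactly on them, because each user is a random draw.

```python
# Stage probabilities taken from the generator above (each is conditional
# on having reached the previous stage).
stage_probs = {
    'landing': 1.00,
    'view_product': 0.70,
    'add_to_cart': 0.55,
    'checkout': 0.60,
    'purchase': 0.70,
}
n_users = 10_000

expected = {}
reach = 1.0  # cumulative probability of reaching the current stage
for stage, p in stage_probs.items():
    reach *= p
    expected[stage] = round(n_users * reach)

for stage, count in expected.items():
    print(f"{stage:<13} expected users: {count:>6,} ({count / n_users:.2%} overall)")
```

The final line shows why a funnel with seemingly healthy step rates can still have a low overall conversion: 0.70 × 0.55 × 0.60 × 0.70 is roughly 16%.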

Define Funnel Steps & Calculate

Python
funnel_steps = ['landing', 'view_product', 'add_to_cart', 'checkout', 'purchase']
step_order = {step: i for i, step in enumerate(funnel_steps)}
events_df['step_order'] = events_df['event'].map(step_order)

def calculate_funnel(events_df, funnel_steps):
    """Calculate funnel metrics (non-sequential: a user counts for a step
    if they fired that event at any time)."""
    # One row per user, holding the set of events that user fired
    user_steps = events_df.groupby('user_id')['event'].apply(set).reset_index()
    # In this dataset every user has a landing event, so total users == landing users
    total_users = len(user_steps)

    results = []
    for i, step in enumerate(funnel_steps):
        count = int(user_steps['event'].apply(lambda x: step in x).sum())

        overall_rate = count / total_users * 100 if total_users else 0.0
        # Guard against an empty previous step to avoid division by zero
        prev_count = results[i - 1]['count'] if i > 0 else count
        step_rate = count / prev_count * 100 if prev_count else 0.0

        results.append({
            'step': step,
            'step_number': i + 1,
            'count': count,
            'overall_conversion': round(overall_rate, 2),
            'step_conversion': round(step_rate, 2),
            'drop_off': round(100 - step_rate, 2) if i > 0 else 0
        })

    return pd.DataFrame(results)

funnel_df = calculate_funnel(events_df, funnel_steps)
print("Funnel Analysis:")
print(funnel_df)
Overall vs Step Conversion
  • Overall conversion: % of users relative to the first step (landing)
  • Step conversion: % of users relative to the immediately preceding step
  • Drop-off: 100% - step_conversion, the percentage lost at each step
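
To make the three definitions concrete, here is a small worked example on hand-picked counts. The numbers and the helper name `funnel_rates` are hypothetical and chosen only for easy arithmetic; they are not output from the lesson's dataset.

```python
# Hypothetical user counts per step (illustrative only)
counts = {
    'landing': 1000,
    'view_product': 700,
    'add_to_cart': 350,
    'checkout': 210,
    'purchase': 147,
}

def funnel_rates(counts):
    """Return (step, overall %, step %, drop-off %) for each funnel step."""
    steps = list(counts)
    first = counts[steps[0]]
    prev = first
    rows = []
    for step in steps:
        n = counts[step]
        overall = n / first * 100    # relative to the first step
        step_rate = n / prev * 100   # relative to the previous step
        rows.append((step, round(overall, 1), round(step_rate, 1),
                     round(100 - step_rate, 1)))
        prev = n
    return rows

rows = funnel_rates(counts)
for step, overall, step_rate, drop in rows:
    print(f"{step:<13} overall={overall:>5}%  step={step_rate:>5}%  drop-off={drop:>5}%")
```

Here checkout has a 60% step conversion but only 21% overall conversion, because the overall figure compounds every loss that came before it.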

Checkpoint

Funnel analysis starts from event data → define the step sequence → count users at each step → calculate conversion rates. Why is overall conversion rate usually more important than step conversion rate when reporting to stakeholders?

3

📊 Funnel Visualization

Bar Chart & Conversion Rate

Python
def plot_funnel(funnel_df):
    """Create funnel visualization"""
    fig, axes = plt.subplots(1, 2, figsize=(14, 6))

    # Left panel: horizontal bars, reversed so the first step is drawn on top
    steps = funnel_df['step'][::-1]
    counts = funnel_df['count'][::-1]
    bars = axes[0].barh(steps, counts)
    axes[0].set_xlabel('Users')
    axes[0].set_title('Funnel: User Counts')
    # Offset labels by 1% of the widest bar so they scale with the data
    label_offset = funnel_df['count'].max() * 0.01
    for bar, count in zip(bars, counts):
        axes[0].text(bar.get_width() + label_offset, bar.get_y() + bar.get_height() / 2,
                     f'{count:,}', va='center')

    # Right panel: overall conversion as a line, step conversion as bars
    x = range(len(funnel_df))
    axes[1].plot(x, funnel_df['overall_conversion'], marker='o', linewidth=2, label='Overall')
    axes[1].bar(x, funnel_df['step_conversion'], alpha=0.3, label='Step')
    axes[1].set_xticks(x)
    axes[1].set_xticklabels(funnel_df['step'], rotation=45, ha='right')
    axes[1].set_ylabel('Conversion Rate (%)')
    axes[1].set_title('Conversion Rates')
    axes[1].legend()
    axes[1].set_ylim(0, 105)

    # Label each point on the overall conversion line
    for i, overall in enumerate(funnel_df['overall_conversion']):
        axes[1].text(i, overall + 2, f'{overall}%', ha='center', fontsize=9)

    plt.tight_layout()
    plt.savefig('funnel_analysis.png', dpi=150)
    plt.show()

plot_funnel(funnel_df)

Pyramid-style Funnel

Python
def plot_funnel_pyramid(funnel_df):
    """Create pyramid-style funnel"""
    fig, ax = plt.subplots(figsize=(10, 8))

    n_steps = len(funnel_df)
    max_width = funnel_df['count'].max()
    colors = plt.cm.Blues(np.linspace(0.3, 0.9, n_steps))

    for i, (_, row) in enumerate(funnel_df.iterrows()):
        # Bar width is proportional to the user count; bars are centered
        width = row['count'] / max_width
        left = (1 - width) / 2

        rect = plt.Rectangle(
            (left, n_steps - i - 1), width, 0.8,
            facecolor=colors[i], edgecolor='white', linewidth=2
        )
        ax.add_patch(rect)

        ax.text(0.5, n_steps - i - 0.6,
                f"{row['step']}\n{row['count']:,} ({row['overall_conversion']}%)",
                ha='center', va='center', fontsize=10, fontweight='bold')

        # Annotate the drop-off between the previous step and this one
        if i > 0:
            ax.annotate(f"↓ {row['drop_off']}% drop",
                        xy=(0.85, n_steps - i + 0.1), fontsize=9, color='red')

    ax.set_xlim(0, 1)
    ax.set_ylim(0, n_steps)
    ax.axis('off')
    ax.set_title('Conversion Funnel', fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

plot_funnel_pyramid(funnel_df)

Checkpoint

The funnel pyramid is usually the most intuitive view for stakeholders: the shrinking width shows clearly where users are lost at each step. When should you use a bar chart versus a pyramid chart to present a funnel?

4

🔍 Segmented Funnel Analysis

Add Segments

Python
np.random.seed(42)
user_segments = pd.DataFrame({
    'user_id': range(1, n_users + 1),
    'device': np.random.choice(['Mobile', 'Desktop', 'Tablet'], n_users, p=[0.55, 0.35, 0.10]),
    'channel': np.random.choice(['Organic', 'Paid', 'Social', 'Email'], n_users, p=[0.35, 0.30, 0.20, 0.15]),
    'is_new': np.random.choice([True, False], n_users, p=[0.7, 0.3])
})

# Attach segment attributes to every event row
events_df = events_df.merge(user_segments, on='user_id')

Funnel by Segment

Python
def funnel_by_segment(events_df, funnel_steps, segment_col):
    """Calculate funnel for each segment"""
    results = []
    for segment in events_df[segment_col].unique():
        segment_data = events_df[events_df[segment_col] == segment]
        funnel = calculate_funnel(segment_data, funnel_steps)
        funnel['segment'] = segment
        results.append(funnel)
    return pd.concat(results, ignore_index=True)

# By device
device_funnel = funnel_by_segment(events_df, funnel_steps, 'device')
print("\nFunnel by Device:")
# reindex keeps rows in funnel order; pivot alone sorts steps alphabetically
device_pivot = device_funnel.pivot(
    index='step', columns='segment', values='overall_conversion'
).reindex(funnel_steps)
print(device_pivot)

# By channel
channel_funnel = funnel_by_segment(events_df, funnel_steps, 'channel')
print("\nFunnel by Channel:")
channel_pivot = channel_funnel.pivot(
    index='step', columns='segment', values='overall_conversion'
).reindex(funnel_steps)
print(channel_pivot)

Visualize Segment Comparison

Python
def plot_segment_comparison(funnel_df, segment_col):
    """Compare funnels across segments"""
    segments = funnel_df['segment'].unique()
    n_segments = len(segments)

    fig, ax = plt.subplots(figsize=(12, 6))
    x = np.arange(len(funnel_steps))
    width = 0.8 / n_segments

    for i, segment in enumerate(segments):
        segment_data = funnel_df[funnel_df['segment'] == segment]
        # Shift each segment's bars so the group is centered on the tick
        offset = (i - n_segments / 2 + 0.5) * width
        ax.bar(x + offset, segment_data['overall_conversion'],
               width, label=segment, alpha=0.8)

    ax.set_ylabel('Conversion Rate (%)')
    ax.set_title(f'Funnel Comparison by {segment_col}')
    ax.set_xticks(x)
    ax.set_xticklabels(funnel_steps, rotation=45, ha='right')
    ax.legend(title=segment_col)
    ax.set_ylim(0, 105)
    plt.tight_layout()
    plt.show()

plot_segment_comparison(device_funnel, 'Device')
plot_segment_comparison(channel_funnel, 'Channel')
Actionable Segmentation

Segment analysis helps identify the user groups with the lowest conversion, so that optimization effort goes where it has the greatest impact.
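
One simple way to size that impact is to ask how many extra purchases each segment would gain if it converted at the best segment's rate. The sketch below is an assumption-laden illustration: the per-segment totals are hypothetical, and `potential_gain` is an upper-bound heuristic introduced here, not a metric defined in this lesson.

```python
import pandas as pd

# Hypothetical per-segment totals (not taken from the lesson's dataset)
segments = pd.DataFrame({
    'segment': ['Mobile', 'Desktop', 'Tablet'],
    'users': [5500, 3500, 1000],
    'purchasers': [770, 630, 150],
})

segments['conversion'] = segments['purchasers'] / segments['users']
best_rate = segments['conversion'].max()

# Extra purchases if the segment matched the best segment's conversion rate
segments['potential_gain'] = (
    (best_rate - segments['conversion']) * segments['users']
).round().astype(int)

print(segments.sort_values('potential_gain', ascending=False))
```

A segment with a slightly lower rate but a large user base can outrank a small segment with a much worse rate, which is why volume belongs in the prioritization alongside the conversion rate.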

Checkpoint

A segmented funnel reveals hidden problems: overall conversion may look fine while Mobile users have a very high drop-off at checkout. If Mobile drop-off is 20% higher than Desktop at the checkout step, what would you recommend?
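
Spotting a gap like this requires comparing step conversion, not overall conversion, between segments. A minimal sketch on hypothetical per-device counts follows; the row labeled `checkout` is read here as the conversion from add_to_cart into checkout, which is an assumption about how the step is defined.

```python
import pandas as pd

steps = ['landing', 'view_product', 'add_to_cart', 'checkout', 'purchase']

# Hypothetical counts of users reaching each step, per device
counts = pd.DataFrame(
    {'Mobile': [5500, 3800, 2000, 900, 540],
     'Desktop': [3500, 2500, 1400, 980, 735]},
    index=steps,
)

# Step conversion = users at this step / users at the previous step
step_conv = (counts / counts.shift(1) * 100).round(1)
step_conv.iloc[0] = 100.0  # the first step has no predecessor

# Positive gap = Desktop converts better than Mobile at that step
gap = (step_conv['Desktop'] - step_conv['Mobile']).round(1)
worst_step = gap.idxmax()

print(step_conv)
print(f"Largest Desktop-vs-Mobile gap: {worst_step} ({gap[worst_step]} points)")
```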

5

⏱️ Time-based Analysis

Time Between Steps

Python
def calculate_step_times(events_df, funnel_steps):
    """Calculate time between funnel steps"""
    # One row per user, one column per event, holding the first timestamp of each event
    user_events = events_df.pivot_table(
        index='user_id', columns='event', values='timestamp', aggfunc='first'
    )

    time_diffs = {}
    for i in range(len(funnel_steps) - 1):
        step1 = funnel_steps[i]
        step2 = funnel_steps[i + 1]
        if step1 in user_events.columns and step2 in user_events.columns:
            diff = (user_events[step2] - user_events[step1]).dt.total_seconds()
            # dropna() removes users who never reached step2
            time_diffs[f'{step1}_to_{step2}'] = diff.dropna()

    return time_diffs

time_diffs = calculate_step_times(events_df, funnel_steps)

print("Time Between Steps (seconds):")
for step, times in time_diffs.items():
    print(f"\n{step}:")
    print(f"  Median: {times.median():.1f}s")
    print(f"  Mean: {times.mean():.1f}s")
    print(f"  95th percentile: {times.quantile(0.95):.1f}s")

Time-to-Convert Analysis

Python
def time_to_convert_analysis(events_df):
    """Analyze time from first touch to conversion"""
    user_journey = events_df.groupby('user_id').agg({
        'timestamp': ['min', 'max'],
        'event': lambda x: list(x)
    }).reset_index()
    user_journey.columns = ['user_id', 'first_event', 'last_event', 'events']

    # .copy() avoids SettingWithCopyWarning when adding a column to the filtered frame
    purchasers = user_journey[user_journey['events'].apply(lambda x: 'purchase' in x)].copy()
    # In this dataset purchase is always the last event, so max - min is the time to convert
    purchasers['time_to_convert'] = (
        purchasers['last_event'] - purchasers['first_event']
    ).dt.total_seconds()

    fig, axes = plt.subplots(1, 2, figsize=(12, 5))

    # Left panel: distribution of time-to-convert in minutes
    median_minutes = purchasers['time_to_convert'].median() / 60
    axes[0].hist(purchasers['time_to_convert'] / 60, bins=50, edgecolor='black')
    axes[0].set_xlabel('Time to Convert (minutes)')
    axes[0].set_ylabel('Users')
    axes[0].set_title('Time-to-Convert Distribution')
    axes[0].axvline(median_minutes, color='r', linestyle='--',
                    label=f"Median: {median_minutes:.1f}m")
    axes[0].legend()

    # Right panel: cumulative share of converters by elapsed time
    sorted_times = np.sort(purchasers['time_to_convert'].values) / 60
    cumulative = np.arange(1, len(sorted_times) + 1) / len(sorted_times) * 100
    axes[1].plot(sorted_times, cumulative)
    axes[1].set_xlabel('Time (minutes)')
    axes[1].set_ylabel('Cumulative % Converted')
    axes[1].set_title('Cumulative Conversion by Time')
    axes[1].axhline(50, color='r', linestyle='--', alpha=0.5)
    axes[1].axhline(80, color='g', linestyle='--', alpha=0.5)

    plt.tight_layout()
    plt.show()
    return purchasers

purchasers_df = time_to_convert_analysis(events_df)

Funnel Trend Over Time

Python
def funnel_trend(events_df, funnel_steps, freq='W'):
    """Calculate funnel metrics over time"""
    # Keep periods in a separate Series so the caller's DataFrame is not modified
    periods = events_df['timestamp'].dt.to_period(freq)

    results = []
    # Sort chronologically; unique() alone returns periods in order of appearance
    for period in sorted(periods.unique()):
        period_data = events_df[periods == period]
        funnel = calculate_funnel(period_data, funnel_steps)
        funnel['period'] = period
        results.append(funnel)

    return pd.concat(results, ignore_index=True)

trend = funnel_trend(events_df, funnel_steps, 'W')

fig, ax = plt.subplots(figsize=(14, 6))
purchase_trend = trend[trend['step'] == 'purchase'].copy()
purchase_trend['period'] = purchase_trend['period'].astype(str)

ax.plot(purchase_trend['period'], purchase_trend['overall_conversion'], marker='o')
ax.set_xlabel('Week')
ax.set_ylabel('Overall Conversion Rate (%)')
ax.set_title('Purchase Conversion Rate Over Time')
ax.tick_params(axis='x', rotation=45)

# Fit a straight line through the weekly rates to show the trend direction
z = np.polyfit(range(len(purchase_trend)), purchase_trend['overall_conversion'], 1)
p = np.poly1d(z)
ax.plot(purchase_trend['period'], p(range(len(purchase_trend))), 'r--', label='Trend')
ax.legend()
plt.tight_layout()
plt.show()

Checkpoint

Time analysis reveals conversion windows: knowing, for example, that 80% of users convert within 5 minutes helps you design urgency cues and interventions at the right moment. What could cause a funnel trend that declines over time?
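
A conversion window can be read in two directions: the share of converters inside a given time window, or the time needed to cover a given share. The sketch below shows both on a hypothetical sample of ten converters; the helper names `share_converted_within` and `window_for_share` are illustrative, not part of the lesson's code.

```python
import numpy as np

# Hypothetical time-to-convert values, in minutes, for ten converters
minutes = np.array([1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 8.0, 12.0])

def share_converted_within(times, window):
    """Share (%) of converters whose time-to-convert is <= window."""
    return float((times <= window).mean() * 100)

def window_for_share(times, share):
    """Smallest observed time by which at least `share` % have converted."""
    ordered = np.sort(times)
    k = int(np.ceil(share * len(ordered) / 100)) - 1
    return float(ordered[max(k, 0)])

print(f"Converted within 5 min: {share_converted_within(minutes, 5):.0f}%")
print(f"80% of converters finish within: {window_for_share(minutes, 80)} min")
```

On real data the same questions can be answered with `Series.quantile` on the `time_to_convert` column; the explicit version above just makes the definition visible.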

6

🔬 Advanced Metrics & Funnel SQL

Funnel Velocity

Python
def calculate_velocity(events_df, funnel_steps):
    """Calculate funnel velocity metrics"""
    time_diffs = calculate_step_times(events_df, funnel_steps)

    metrics = {}
    metrics['median_step_times'] = {k: v.median() for k, v in time_diffs.items()}

    # Total time in the funnel per user: last event minus first event.
    # Vectorized max/min avoids a slow groupby.apply with a Python lambda.
    per_user = events_df.groupby('user_id')['timestamp']
    total_times = (per_user.max() - per_user.min()).dt.total_seconds()
    metrics['total_funnel_time'] = {
        'median': total_times.median(),
        'mean': total_times.mean(),
        'p75': total_times.quantile(0.75)
    }

    # Conversion windows: time by which 50/75/90% of purchasers have converted
    purchasers = events_df[events_df['event'] == 'purchase']['user_id'].unique()
    converter_times = total_times[total_times.index.isin(purchasers)]
    metrics['conversion_windows'] = {
        '50%': converter_times.quantile(0.5),
        '75%': converter_times.quantile(0.75),
        '90%': converter_times.quantile(0.90)
    }

    return metrics

velocity = calculate_velocity(events_df, funnel_steps)
print("Funnel Velocity Metrics:")
for step, time in velocity['median_step_times'].items():
    print(f"  {step}: {time:.1f}s")
print("\nConversion Windows:")
for pct, time in velocity['conversion_windows'].items():
    print(f"  {pct} of converters complete in: {time:.1f}s")

Drop-off & Health Score

Python
def analyze_dropoff(events_df, funnel_steps):
    """Detailed drop-off analysis"""
    # Furthest step each user reached (requires the step_order column added earlier)
    user_max_step = events_df.groupby('user_id')['step_order'].max().reset_index()
    user_max_step = user_max_step.merge(
        events_df[['user_id', 'device', 'channel', 'is_new']].drop_duplicates(),
        on='user_id'
    )

    results = {}
    for i, step in enumerate(funnel_steps[:-1]):
        # Users whose furthest step is i dropped off right after that step
        dropped = user_max_step[user_max_step['step_order'] == i]
        if len(dropped) > 0:
            results[step] = {
                'count': len(dropped),
                'by_device': dropped['device'].value_counts().to_dict(),
                'by_channel': dropped['channel'].value_counts().to_dict()
            }
    return results

dropoff = analyze_dropoff(events_df, funnel_steps)
for step, details in dropoff.items():
    print(f"\n{step.upper()} ({details['count']} users dropped)")
    print(f"  By Device: {details['by_device']}")

def funnel_health_score(funnel_df, benchmarks=None):
    """Calculate funnel health score"""
    if benchmarks is None:
        # Default benchmarks for overall conversion (%) per step; replace with your own
        benchmarks = {'view_product': 75, 'add_to_cart': 35, 'checkout': 20, 'purchase': 15}

    scores = []
    for _, row in funnel_df.iterrows():
        if row['step'] in benchmarks:
            benchmark = benchmarks[row['step']]
            actual = row['overall_conversion']
            # Score is capped at 100 so over-performing steps cannot mask weak ones
            score = min(100, (actual / benchmark) * 100)
            scores.append({
                'step': row['step'], 'actual': actual,
                'benchmark': benchmark, 'score': round(score, 1),
                'status': '✅' if actual >= benchmark else '⚠️' if actual >= benchmark * 0.8 else '❌'
            })

    health_df = pd.DataFrame(scores)
    print("Funnel Health Report:")
    print(health_df.to_string(index=False))
    print(f"\nOverall Health Score: {health_df['score'].mean():.1f}/100")
    return health_df

health_df = funnel_health_score(funnel_df)

Funnel SQL Queries

SQL
-- Basic Funnel Query
WITH funnel_stages AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'landing' THEN 1 ELSE 0 END) AS landing,
        MAX(CASE WHEN event = 'view_product' THEN 1 ELSE 0 END) AS view_product,
        MAX(CASE WHEN event = 'add_to_cart' THEN 1 ELSE 0 END) AS add_to_cart,
        MAX(CASE WHEN event = 'checkout' THEN 1 ELSE 0 END) AS checkout,
        MAX(CASE WHEN event = 'purchase' THEN 1 ELSE 0 END) AS purchase
    FROM events
    GROUP BY user_id
)
SELECT
    SUM(landing) AS landing_users,
    SUM(purchase) AS purchasers,
    ROUND(SUM(purchase) * 100.0 / SUM(landing), 2) AS overall_conversion
FROM funnel_stages;

-- Sequential Funnel (strict order)
-- Note: STRING_AGG(... ORDER BY ...) is PostgreSQL syntax; other engines differ
WITH user_paths AS (
    SELECT user_id,
           STRING_AGG(event, '->' ORDER BY timestamp) AS path
    FROM events
    GROUP BY user_id
),
sequential_check AS (
    SELECT user_id, path,
           CASE WHEN path LIKE '%landing%->%view_product%->%add_to_cart%->%checkout%->%purchase%'
                THEN 1 ELSE 0 END AS completed
    FROM user_paths
)
SELECT SUM(completed) AS sequential_conversions,
       COUNT(*) AS total_users
FROM sequential_check;

-- Funnel by Segment
WITH funnel_by_device AS (
    SELECT u.device, e.user_id,
           MAX(CASE WHEN e.event = 'landing' THEN 1 ELSE 0 END) AS landing,
           MAX(CASE WHEN e.event = 'purchase' THEN 1 ELSE 0 END) AS purchase
    FROM events e
    JOIN users u ON e.user_id = u.user_id
    GROUP BY u.device, e.user_id
)
SELECT device,
       COUNT(DISTINCT user_id) AS total_users,
       SUM(purchase) AS purchasers,
       ROUND(SUM(purchase) * 100.0 / SUM(landing), 2) AS conversion_rate
FROM funnel_by_device
GROUP BY device
ORDER BY conversion_rate DESC;
Sequential vs Non-Sequential

A sequential funnel requires users to complete the steps in the exact order, so its conversion is usually lower. A non-sequential funnel only checks whether the user reached a step at all, regardless of order.
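
The difference shows up as soon as a user fires events out of order. The following pandas sketch contrasts the two counting rules on a hypothetical two-user sample; `steps_reached_in_order` is an illustrative helper, and the three-step funnel is shortened for readability.

```python
import pandas as pd

funnel_steps = ['landing', 'view_product', 'add_to_cart']

# Hypothetical events: user 2 adds to cart BEFORE viewing the product
events = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 2],
    'event': ['landing', 'view_product', 'add_to_cart',
              'landing', 'add_to_cart', 'view_product'],
    'timestamp': pd.to_datetime([
        '2024-01-01 10:00', '2024-01-01 10:01', '2024-01-01 10:05',
        '2024-01-01 11:00', '2024-01-01 11:01', '2024-01-01 11:02',
    ]),
})

def steps_reached_in_order(user_events, steps):
    """Count how many funnel steps the user completed in the required order."""
    reached = 0
    for event in user_events.sort_values('timestamp')['event']:
        # Advance only when the event is the next expected step
        if reached < len(steps) and event == steps[reached]:
            reached += 1
    return reached

sequential = {
    user_id: steps_reached_in_order(group, funnel_steps)
    for user_id, group in events.groupby('user_id')
}
# Non-sequential: number of distinct funnel events, order ignored
non_sequential = (
    events[events['event'].isin(funnel_steps)]
    .groupby('user_id')['event'].nunique()
)

print("Sequential steps reached:", sequential)
print("Non-sequential steps reached:", non_sequential.to_dict())
```

User 2 counts as reaching add_to_cart in the non-sequential funnel but stops at view_product in the sequential one, because the add_to_cart event came too early.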

Checkpoint

The health score compares funnel performance against industry benchmarks, which helps you quickly identify the steps that need optimization and measure progress. When drop-off analysis shows that Mobile users drop off most at checkout, which SQL query would help confirm it?
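
One possible way to check this is a per-device query restricted to the checkout and purchase events, run here against an in-memory SQLite database from Python so it can be tried without a warehouse. The table and column names follow the SQL above (`events`, `users`, `device`); the sample rows are hypothetical, and "drop-off at checkout" is assumed to mean users who started checkout but never purchased.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, device TEXT);
CREATE TABLE events (user_id INTEGER, event TEXT);
-- Hypothetical sample rows
INSERT INTO users VALUES (1, 'Mobile'), (2, 'Mobile'), (3, 'Desktop'), (4, 'Desktop');
INSERT INTO events VALUES
    (1, 'checkout'), (1, 'purchase'),
    (2, 'checkout'),
    (3, 'checkout'), (3, 'purchase'),
    (4, 'checkout'), (4, 'purchase');
""")

query = """
WITH per_user AS (
    SELECT u.device, e.user_id,
           MAX(CASE WHEN e.event = 'checkout' THEN 1 ELSE 0 END) AS checkout,
           MAX(CASE WHEN e.event = 'purchase' THEN 1 ELSE 0 END) AS purchase
    FROM events e
    JOIN users u ON e.user_id = u.user_id
    GROUP BY u.device, e.user_id
)
SELECT device,
       SUM(checkout) AS checkout_users,
       SUM(purchase) AS purchasers,
       ROUND(100.0 - SUM(purchase) * 100.0 / SUM(checkout), 1) AS dropoff_pct
FROM per_user
GROUP BY device
ORDER BY dropoff_pct DESC;
"""

rows = conn.execute(query).fetchall()
for device, checkout_users, purchasers, dropoff_pct in rows:
    print(f"{device}: {checkout_users} at checkout, {purchasers} purchased, "
          f"{dropoff_pct}% drop-off")
conn.close()
```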

7

💻 Comprehensive practice

Exercise: Complete Funnel Analysis

Build a complete FunnelAnalyzer class:

Python
# Build comprehensive funnel analysis:
# 1. Calculate basic funnel metrics
# 2. Segment by device and channel
# 3. Analyze drop-off points
# 4. Calculate time metrics
# 5. Generate recommendations

# YOUR CODE HERE
💡 View solution
Python
class FunnelAnalyzer:
    def __init__(self, events_df, funnel_steps, user_segments=None):
        self.events = events_df.copy()
        self.steps = funnel_steps
        self.step_order = {step: i for i, step in enumerate(funnel_steps)}
        self.events['step_order'] = self.events['event'].map(self.step_order)

        if user_segments is not None:
            # Merge only columns not already present; merging 'device' twice
            # would produce device_x/device_y and break segment_funnel('device')
            new_cols = [c for c in user_segments.columns
                        if c == 'user_id' or c not in self.events.columns]
            if len(new_cols) > 1:
                self.events = self.events.merge(user_segments[new_cols], on='user_id', how='left')

    def basic_funnel(self):
        user_events = self.events.groupby('user_id')['event'].apply(set)
        results = []
        prev_count = len(user_events)

        for i, step in enumerate(self.steps):
            users = user_events[user_events.apply(lambda x: step in x)]
            count = len(users)
            results.append({
                'step': step, 'users': count,
                'overall_rate': round(count / len(user_events) * 100, 2),
                'step_rate': round(count / prev_count * 100, 2) if prev_count > 0 else 0,
                'dropoff': round((1 - count / prev_count) * 100, 2) if prev_count > 0 and i > 0 else 0
            })
            prev_count = count
        return pd.DataFrame(results)

    def segment_funnel(self, segment_col):
        results = []
        for segment in self.events[segment_col].dropna().unique():
            segment_data = self.events[self.events[segment_col] == segment]
            user_events = segment_data.groupby('user_id')['event'].apply(set)
            for step in self.steps:
                users = user_events[user_events.apply(lambda x: step in x)]
                results.append({
                    'segment': segment, 'step': step,
                    'users': len(users),
                    'rate': round(len(users) / len(user_events) * 100, 2)
                })
        return pd.DataFrame(results)

    def dropoff_analysis(self):
        # Furthest step reached by each user
        user_max = self.events.groupby('user_id')['step_order'].max()
        dropoff = []
        for i, step in enumerate(self.steps[:-1]):
            dropped = int((user_max == i).sum())
            dropoff.append({
                'dropped_at': step,
                'count': dropped,
                'pct': round(dropped / len(user_max) * 100, 2)
            })
        return pd.DataFrame(dropoff)

    def generate_report(self):
        funnel = self.basic_funnel()
        dropoff = self.dropoff_analysis()

        print("=" * 60)
        print("FUNNEL ANALYSIS REPORT")
        print("=" * 60)
        print(f"\nTotal Users: {funnel.iloc[0]['users']:,}")
        print(f"Final Conversions: {funnel.iloc[-1]['users']:,}")
        print(f"Overall Conversion Rate: {funnel.iloc[-1]['overall_rate']}%")

        biggest_drop = dropoff.loc[dropoff['count'].idxmax()]
        print(f"\nBiggest Drop: {biggest_drop['dropped_at']} ({biggest_drop['pct']}% of users)")

# Run
analyzer = FunnelAnalyzer(events_df, funnel_steps, user_segments)
analyzer.generate_report()

print("\nFunnel by Device:")
# reindex keeps the steps in funnel order instead of alphabetical order
print(analyzer.segment_funnel('device')
      .pivot(index='step', columns='segment', values='rate')
      .reindex(funnel_steps))
8

📋 Summary

What you learned

Topic | Key content
Basic Funnel | Steps, conversion rates, drop-off calculation
Visualization | Bar charts, pyramid funnel, conversion curves
Segmentation | Device, channel, user type breakdown
Time Analysis | Step duration, velocity, conversion windows
Advanced Metrics | Health scores, drop-off profiling, trend analysis
SQL | Funnel queries, sequential check, segment breakdown

Self-check questions

  1. How are conversion rate and drop-off rate calculated?
  2. What does funnel segmentation help you discover?
  3. What does step duration analysis tell you?
  4. What is the health score used to evaluate?
Completed!

You have covered Funnel Analysis, an essential skill for Product Analytics and Growth teams. A funnel helps you answer "where are we losing users?" and "what should we fix first?"

Next lesson: Predictive Analytics Basics