Hyperparameter Tuning

Optimizing hyperparameters for ML models

Hyperparameter tuning is the process of searching for the optimal set of hyperparameters for a model. This lesson covers techniques ranging from Grid Search to Bayesian Optimization.

Hyperparameters vs Parameters

Key differences (see the sketch below)
  • Parameters: learned from the data during training (weights, biases)
  • Hyperparameters: set before training, for example:
    • Learning rate
    • Number of layers
    • Regularization strength
    • Batch size
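
A minimal sketch of the distinction, using scikit-learn's LogisticRegression on toy data (the dataset here is just a stand-in):

Python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data standing in for a real dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Hyperparameters: chosen BEFORE training
model = LogisticRegression(C=0.5, max_iter=1000)  # C = inverse regularization strength

# Parameters: learned FROM the data by fit()
model.fit(X, y)
print(model.coef_)       # learned weights
print(model.intercept_)  # learned bias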

Common Hyperparameters

1. Model Architecture

Model          | Hyperparameters
---------------|-----------------------------------------
Decision Tree  | max_depth, min_samples_split, criterion
Random Forest  | n_estimators, max_depth, max_features
SVM            | C, kernel, gamma
Neural Network | layers, neurons, activation
XGBoost        | n_estimators, max_depth, learning_rate
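
These are constructor arguments in most libraries; for instance, an SVM in scikit-learn (the values here are illustrative, not recommendations):

Python
from sklearn.svm import SVC

# Architecture-level hyperparameters are fixed when the model is constructed
svm = SVC(
    C=1.0,          # regularization strength (larger C = weaker regularization)
    kernel='rbf',   # kernel function
    gamma='scale'   # RBF kernel coefficient
)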

2. Training Process

  • Learning rate: step size for gradient descent
  • Batch size: samples per gradient update
  • Epochs: number of full passes over the training data
  • Early stopping patience: how many epochs without improvement to tolerate before stopping (all four appear in the sketch below)
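
These typically appear together in a single training call; a minimal sketch with Keras, using random data as a placeholder:

Python
import numpy as np
from tensorflow import keras

# Placeholder data
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

model = keras.Sequential([
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

# Learning rate: step size for gradient descent
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(
    X, y,
    batch_size=32,   # samples per gradient update
    epochs=100,      # upper bound on passes over the data
    validation_split=0.2,
    # patience: stop after 5 epochs without val_loss improvement
    callbacks=[keras.callbacks.EarlyStopping(patience=5)]
)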

Search Strategies

1. Grid Search

Exhaustively evaluates every combination in the grid:

Python
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Define parameter grid
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [5, 10, 15, None],
    'min_samples_split': [2, 5, 10],
    'max_features': ['sqrt', 'log2']
}

# Grid Search
rf = RandomForestClassifier(random_state=42)
grid_search = GridSearchCV(
    rf,
    param_grid,
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    verbose=2
)

grid_search.fit(X_train, y_train)

# Best parameters
print(f"Best params: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_:.4f}")

Pros: simple, thorough.
Cons: exponential complexity, slow.
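
The exponential cost is easy to quantify: sklearn's ParameterGrid enumerates the grid defined above, giving 3 × 4 × 3 × 2 = 72 combinations, each fit 5 times under cv=5:

Python
from sklearn.model_selection import ParameterGrid

# param_grid as defined in the Grid Search example above
n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)      # 72 combinations
print(n_candidates * 5)  # 360 model fits with cv=5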

2. Random Search

Samples a fixed number of random combinations from the given distributions:

Python
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint, uniform
from xgboost import XGBClassifier

# Define distributions
# Note: scipy's uniform(loc, scale) samples from [loc, loc + scale]
param_distributions = {
    'n_estimators': randint(100, 500),
    'max_depth': randint(3, 20),
    'min_child_weight': randint(1, 10),
    'learning_rate': uniform(0.01, 0.3),  # [0.01, 0.31]
    'subsample': uniform(0.6, 0.4)        # [0.6, 1.0]
}

# Random Search
xgb = XGBClassifier(random_state=42)

random_search = RandomizedSearchCV(
    xgb,
    param_distributions,
    n_iter=100,  # number of random samples
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    random_state=42
)

random_search.fit(X_train, y_train)
print(f"Best params: {random_search.best_params_}")

Pros: more efficient than Grid Search, better coverage of continuous ranges.
Cons: may miss the optimal values by chance.

3. Bayesian Optimization

Uses a probabilistic surrogate model to guide the search: each evaluated configuration refines the surrogate, and an acquisition function picks the next candidate by balancing exploration against exploitation:

Python
from skopt import BayesSearchCV
from skopt.space import Real, Integer, Categorical
from xgboost import XGBClassifier

# Define search space
search_spaces = {
    'n_estimators': Integer(100, 500),
    'max_depth': Integer(3, 20),
    'learning_rate': Real(0.01, 0.3, prior='log-uniform'),
    'subsample': Real(0.6, 1.0),
    'colsample_bytree': Real(0.6, 1.0)
}

# Bayesian Optimization
bayes_search = BayesSearchCV(
    XGBClassifier(random_state=42),
    search_spaces,
    n_iter=50,
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    random_state=42
)

bayes_search.fit(X_train, y_train)
print(f"Best params: {bayes_search.best_params_}")

4. Optuna (Modern Approach)

A powerful framework for hyperparameter optimization; by default it searches with the Tree-structured Parzen Estimator (TPE) sampler:

Python
import optuna
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def objective(trial):
    # Suggest hyperparameters
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 100, 500),
        'max_depth': trial.suggest_int('max_depth', 3, 20),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'reg_alpha': trial.suggest_float('reg_alpha', 1e-8, 10.0, log=True),
        'reg_lambda': trial.suggest_float('reg_lambda', 1e-8, 10.0, log=True)
    }

    model = XGBClassifier(**params, random_state=42)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')

    return scores.mean()

# Create study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100, show_progress_bar=True)

# Best parameters
print(f"Best params: {study.best_params}")
print(f"Best score: {study.best_value:.4f}")

# Visualize (the plot_* functions return figures; call .show() in a script)
optuna.visualization.plot_optimization_history(study).show()
optuna.visualization.plot_param_importances(study).show()

Neural Network Hyperparameter Tuning

With Keras Tuner

Python
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    model = keras.Sequential()

    # Tune number of layers
    for i in range(hp.Int('num_layers', 1, 4)):
        model.add(keras.layers.Dense(
            units=hp.Int(f'units_{i}', min_value=32, max_value=512, step=32),
            # 'activation' is shared across layers; use f'activation_{i}' for per-layer choices
            activation=hp.Choice('activation', ['relu', 'tanh', 'selu'])
        ))

    # Tune dropout
    model.add(keras.layers.Dropout(
        hp.Float('dropout', 0.0, 0.5, step=0.1)
    ))

    model.add(keras.layers.Dense(1, activation='sigmoid'))

    # Tune learning rate
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')
        ),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )

    return model

# Create tuner
tuner = kt.Hyperband(
    build_model,
    objective='val_accuracy',
    max_epochs=50,
    factor=3,
    directory='tuning',
    project_name='my_model'
)

# Search
tuner.search(
    X_train, y_train,
    epochs=50,
    validation_data=(X_val, y_val),
    callbacks=[keras.callbacks.EarlyStopping(patience=5)]
)

# Best model
best_model = tuner.get_best_models(1)[0]
best_hp = tuner.get_best_hyperparameters(1)[0]

Cross-Validation Strategies

K-Fold

Python
from sklearn.model_selection import KFold, StratifiedKFold, TimeSeriesSplit

# Standard K-Fold
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

# Stratified K-Fold (for classification; preserves class proportions per fold)
stratified = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Time Series Split (no shuffling; training folds always precede validation folds)
tscv = TimeSeriesSplit(n_splits=5)

Nested Cross-Validation

Gives an unbiased performance estimate by separating tuning from evaluation: the inner loop selects hyperparameters, and the outer loop scores the resulting model on data never used for tuning:

Python
from sklearn.model_selection import cross_val_score, GridSearchCV, StratifiedKFold
from xgboost import XGBClassifier

# A small grid suited to XGBoost (the earlier grid targeted Random Forest)
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [3, 6],
    'learning_rate': [0.05, 0.1]
}

# Outer CV for model evaluation
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Inner CV for hyperparameter tuning
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

# Grid search with inner CV
grid_search = GridSearchCV(
    XGBClassifier(),
    param_grid,
    cv=inner_cv,
    scoring='accuracy'
)

# Nested CV scores
nested_scores = cross_val_score(
    grid_search, X, y,
    cv=outer_cv,
    scoring='accuracy'
)

print(f"Nested CV Score: {nested_scores.mean():.4f} (+/- {nested_scores.std():.4f})")

Best Practices

Tips for Effective Tuning
  1. Start with Random Search before committing to Grid Search
  2. Use a log scale for learning rate and regularization strength
  3. Use early stopping to save compute (see the pruning sketch after this list)
  4. Use nested CV for unbiased evaluation
  5. Monitor for overfitting while you tune
  6. Document experiments for reproducibility
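
On tip 3: beyond Keras-style early stopping, Optuna can prune unpromising trials mid-training. A minimal sketch with a median pruner and scikit-learn's SGDClassifier on toy data (the model and ranges are illustrative):

Python
import optuna
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=42)

def objective(trial):
    alpha = trial.suggest_float('alpha', 1e-5, 1e-1, log=True)
    model = SGDClassifier(alpha=alpha, random_state=42)

    score = 0.0
    for epoch in range(20):
        model.partial_fit(X_tr, y_tr, classes=[0, 1])
        score = model.score(X_val, y_val)

        # Report the intermediate score; the pruner compares it against
        # the median of previous trials at the same step
        trial.report(score, step=epoch)
        if trial.should_prune():
            raise optuna.TrialPruned()  # abandon this trial early

    return score

study = optuna.create_study(direction='maximize',
                            pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=30)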

Practical Example

Complete Tuning Pipeline

Python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
import optuna
from xgboost import XGBClassifier

# Load data
data = pd.read_csv('data.csv')
X = data.drop('target', axis=1)
y = data['target']

# Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scale (not required for tree-based models like XGBoost, but harmless
# and useful if you later swap in a scale-sensitive estimator)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Optuna objective
def objective(trial):
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 100, 500),
        'max_depth': trial.suggest_int('max_depth', 3, 15),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0),
        'gamma': trial.suggest_float('gamma', 0, 5),
        'reg_alpha': trial.suggest_float('reg_alpha', 1e-8, 10, log=True),
        'reg_lambda': trial.suggest_float('reg_lambda', 1e-8, 10, log=True)
    }

    model = XGBClassifier(**params, random_state=42, eval_metric='logloss')

    # Use cross-validation
    scores = cross_val_score(model, X_train_scaled, y_train, cv=5, scoring='accuracy')

    return scores.mean()

# Run optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100, show_progress_bar=True)

# Train final model with best params
best_model = XGBClassifier(**study.best_params, random_state=42)
best_model.fit(X_train_scaled, y_train)

# Evaluate
y_pred = best_model.predict(X_test_scaled)
print(f"Test Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print(classification_report(y_test, y_pred))

Practice

Hands-on Exercise

Hyperparameter Tuning Challenge:

  1. Load the Titanic dataset
  2. Implement tuning with:
    • Grid Search for Random Forest
    • Random Search for XGBoost
    • Optuna for LightGBM
  3. Compare the results and the runtime of each method
  4. Use nested CV for the final evaluation

Goal: find the best hyperparameters and compare the efficiency of each search strategy

