🎯 Lesson Objectives
In this lesson, you will learn:
- ✅ Text Preprocessing: Tokenization, Padding, Embedding
- ✅ Text Classification with RNN
- ✅ Language Modeling: predicting the next word
- ✅ Time Series Prediction with RNN
- ✅ Bidirectional RNN: processing in both directions
RNNs can be applied to any sequential data: text, time series, DNA sequences, audio, video frames, ...
📝 Text Preprocessing Pipeline
Text Preprocessing Pipeline for RNN
Tokenization with Keras
```python
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Sample texts
texts = [
    "I love machine learning",
    "Deep learning is amazing",
    "Neural networks are powerful",
    "AI will change the world"
]

# Create tokenizer
tokenizer = Tokenizer(num_words=10000)  # Vocabulary size
tokenizer.fit_on_texts(texts)

# View word index
print("Word Index:")
for word, idx in list(tokenizer.word_index.items())[:10]:
    print(f" '{word}': {idx}")

# Convert to sequences
sequences = tokenizer.texts_to_sequences(texts)
print("\nSequences:")
for text, seq in zip(texts, sequences):
    print(f" '{text}' → {seq}")
```

Expected Output

```
Word Index:
 'learning': 1
 'i': 2
 'love': 3
 'machine': 4
 'deep': 5
 ...

Sequences:
 'I love machine learning' → [2, 3, 4, 1]
 'Deep learning is amazing' → [5, 1, 6, 7]
 'Neural networks are powerful' → [8, 9, 10, 11]
 'AI will change the world' → [12, 13, 14, 15, 16]
```

Padding
```python
# Pad sequences to the same length
MAX_LEN = 10

padded = pad_sequences(
    sequences,
    maxlen=MAX_LEN,
    padding='pre',      # Pad at the beginning
    truncating='post'   # Truncate at the end
)

print("Padded Sequences:")
print(padded)
print(f"Shape: {padded.shape}")
```

Expected Output

```
Padded Sequences:
[[ 0  0  0  0  0  0  2  3  4  1]
 [ 0  0  0  0  0  0  5  1  6  7]
 [ 0  0  0  0  0  0  8  9 10 11]
 [ 0  0  0  0  0 12 13 14 15 16]]
Shape: (4, 10)
```

Padding options:
- `padding='pre'`: add zeros at the beginning (more common)
- `padding='post'`: add zeros at the end
- `truncating`: cut at the beginning or end when a sequence exceeds `maxlen`
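The padding options above can be illustrated with a dependency-free NumPy sketch. The `pad` helper below is a hypothetical stand-in that mimics the behavior of `pad_sequences`, not the real Keras implementation:

```python
import numpy as np

def pad(sequences, maxlen, padding='pre', truncating='post', value=0):
    """Pad/truncate variable-length integer sequences to a fixed length."""
    out = np.full((len(sequences), maxlen), value, dtype=int)
    for row, seq in enumerate(sequences):
        if len(seq) > maxlen:  # truncate overly long sequences
            seq = seq[:maxlen] if truncating == 'post' else seq[-maxlen:]
        if padding == 'pre':
            out[row, maxlen - len(seq):] = seq   # zeros at the beginning
        else:
            out[row, :len(seq)] = seq            # zeros at the end
    return out

seqs = [[2, 3, 4, 1], [5, 1]]
print(pad(seqs, maxlen=5, padding='pre'))
# [[0 2 3 4 1]
#  [0 0 0 5 1]]
print(pad(seqs, maxlen=5, padding='post'))
# [[2 3 4 1 0]
#  [5 1 0 0 0]]
```

Pre-padding is the common choice for RNNs because the informative tokens then sit closest to the final timestep, where the last hidden state is taken.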
Checkpoint
Do you understand how to preprocess text?
🔤 Embedding Layer
Why do we need an Embedding?
| Representation | Problem |
|---|---|
| One-hot | Sparse; does not capture semantics |
| Integer | Magnitudes carry no meaning (word 5 ≠ 5 × word 1) |
| Embedding | Dense; captures semantic similarity |
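Under the hood, an Embedding layer is just a trainable lookup table: an integer id selects one dense row. A minimal NumPy sketch with a hypothetical 6-word vocabulary and 4-dimensional vectors (the weights here are random, i.e. "untrained"):

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(6, 4))   # (vocab_size, embedding_dim)

def embed(token_ids):
    """Look up one dense vector per integer token id."""
    return embedding_matrix[token_ids]

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

seq = np.array([1, 2, 3])        # a sequence of 3 token ids
vectors = embed(seq)
print(vectors.shape)             # (3, 4): one dense vector per token
# After training, cosine similarity is high for semantically similar
# words — something raw integer ids cannot express.
print(cosine(vectors[0], vectors[1]))
```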
Embedding in Keras
```python
from tensorflow.keras import layers

# Embedding layer
embedding = layers.Embedding(
    input_dim=10000,   # Vocabulary size
    output_dim=128,    # Embedding dimension
    input_length=100   # Sequence length
)

# Input:  (batch, sequence_length) - integers
# Output: (batch, sequence_length, embedding_dim) - vectors

# Example
import tensorflow as tf
sample_input = tf.constant([[1, 2, 3], [4, 5, 6]])  # 2 sequences, 3 words each
sample_output = embedding(sample_input)
print(f"Input shape: {sample_input.shape}")
print(f"Output shape: {sample_output.shape}")
```

Expected Output

```
Input shape: (2, 3)
Output shape: (2, 3, 128)
```

Pretrained Embeddings (GloVe, Word2Vec)
```python
import numpy as np

def load_glove_embeddings(glove_file, word_index, embedding_dim=100):
    """Load pretrained GloVe embeddings"""
    # Load GloVe
    embeddings_index = {}
    with open(glove_file, 'r', encoding='utf-8') as f:
        for line in f:
            values = line.split()
            word = values[0]
            coefs = np.asarray(values[1:], dtype='float32')
            embeddings_index[word] = coefs

    print(f"Loaded {len(embeddings_index)} word vectors")

    # Create embedding matrix
    vocab_size = len(word_index) + 1
    embedding_matrix = np.zeros((vocab_size, embedding_dim))

    for word, idx in word_index.items():
        if idx < vocab_size:
            embedding_vector = embeddings_index.get(word)
            if embedding_vector is not None:
                embedding_matrix[idx] = embedding_vector

    return embedding_matrix

# Usage
# embedding_matrix = load_glove_embeddings(
#     'glove.6B.100d.txt',
#     tokenizer.word_index,
#     embedding_dim=100
# )

# Create embedding layer with pretrained weights
# embedding_layer = layers.Embedding(
#     input_dim=VOCAB_SIZE,
#     output_dim=100,
#     weights=[embedding_matrix],
#     trainable=False  # Freeze pretrained weights
# )
```

Checkpoint
Do you understand the Embedding layer?
🎭 Complete Text Classification
IMDB Sentiment Classification
```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hyperparameters
VOCAB_SIZE = 10000
MAX_LEN = 200
EMBEDDING_DIM = 128
RNN_UNITS = 64

# Load IMDB dataset
print("Loading data...")
(x_train, y_train), (x_test, y_test) = imdb.load_data(
    num_words=VOCAB_SIZE
)

# Pad sequences
x_train = pad_sequences(x_train, maxlen=MAX_LEN)
x_test = pad_sequences(x_test, maxlen=MAX_LEN)

print(f"Train shape: {x_train.shape}")
print(f"Test shape: {x_test.shape}")

# Build model
def create_text_classifier():
    model = keras.Sequential([
        # Embedding
        layers.Embedding(
            input_dim=VOCAB_SIZE,
            output_dim=EMBEDDING_DIM,
            input_length=MAX_LEN
        ),

        # RNN layers
        layers.SimpleRNN(RNN_UNITS, return_sequences=True),
        layers.SimpleRNN(RNN_UNITS // 2),

        # Classification
        layers.Dropout(0.5),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid')
    ])

    return model

model = create_text_classifier()

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()
```

Training
```python
# Callbacks
callbacks = [
    keras.callbacks.EarlyStopping(
        patience=3,
        restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        factor=0.5,
        patience=2
    )
]

# Train
history = model.fit(
    x_train, y_train,
    epochs=10,
    batch_size=128,
    validation_split=0.2,
    callbacks=callbacks
)

# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"\nTest Accuracy: {test_acc:.2%}")
```

Prediction
```python
# Get word index for decoding
word_index = imdb.get_word_index()
reverse_word_index = {v: k for k, v in word_index.items()}

def decode_review(sequence):
    """Convert sequence back to text"""
    return ' '.join([reverse_word_index.get(i - 3, '?')
                     for i in sequence if i > 3])

def predict_sentiment(text, tokenizer=None):
    """Predict sentiment of new text"""
    # If using custom tokenizer
    if tokenizer:
        seq = tokenizer.texts_to_sequences([text])
    else:
        # Simple word to index (demo only)
        words = text.lower().split()
        seq = [[word_index.get(w, 0) + 3 for w in words]]

    padded = pad_sequences(seq, maxlen=MAX_LEN)
    pred = model.predict(padded, verbose=0)[0][0]

    sentiment = "Positive" if pred > 0.5 else "Negative"
    confidence = pred if pred > 0.5 else 1 - pred

    return sentiment, confidence

# Test
sample_review = "This movie was absolutely fantastic and entertaining"
sentiment, conf = predict_sentiment(sample_review)
print(f"'{sample_review}'")
print(f"Sentiment: {sentiment} (confidence: {conf:.2%})")
```

Checkpoint
Can you build a Text Classifier?
↔️ Bidirectional RNN
Why Bidirectional?
A standard RNN only sees information from left → right. But sometimes context from both sides matters:
"The movie was not bad at all"
- Forward: "not" → suggests negative
- Backward: "at all" → reveals the positive emphasis
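The mechanics can be sketched in NumPy: run an RNN over the sequence left → right, run it again over the reversed sequence, then merge the two final states. Note this is a simplification — a real `Bidirectional` layer trains a separate set of weights per direction, while this sketch reuses one set just to show the shapes:

```python
import numpy as np

def last_state(x, W_x, W_h):
    """Final hidden state of a tanh RNN read left → right."""
    h = np.zeros(W_h.shape[0])
    for x_t in x:
        h = np.tanh(x_t @ W_x + h @ W_h)
    return h

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))                   # 5 timesteps, 3 features
W_x, W_h = rng.normal(size=(3, 4)), rng.normal(size=(4, 4))

h_fwd = last_state(x, W_x, W_h)        # forward pass: sees "not" early
h_bwd = last_state(x[::-1], W_x, W_h)  # backward pass: sees "at all" early
h = np.concatenate([h_fwd, h_bwd])     # merge_mode='concat'
print(h.shape)  # (8,): twice the units
```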
Bidirectional RNN Architecture
Keras Code
```python
from tensorflow.keras import layers

# Bidirectional wrapper
model = keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM,
                     input_length=MAX_LEN),

    # Bidirectional RNN
    layers.Bidirectional(
        layers.SimpleRNN(64, return_sequences=True)
    ),
    # Output: (batch, timesteps, 128) - 64*2

    layers.Bidirectional(
        layers.SimpleRNN(32)
    ),
    # Output: (batch, 64) - 32*2

    layers.Dense(1, activation='sigmoid')
])

model.summary()
```

Merge modes
```python
# Ways to merge the forward and backward states
layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='concat'  # Default: [h⃗, h⃖] → 128 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='sum'     # h⃗ + h⃖ → 64 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='mul'     # h⃗ * h⃖ → 64 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='ave'     # (h⃗ + h⃖) / 2 → 64 units
)
```

Checkpoint
Do you understand Bidirectional RNN?
📈 Time Series Prediction
Preparing the Data
```python
import numpy as np
import matplotlib.pyplot as plt

def create_time_series_data(n_samples=1000, noise=0.1):
    """Create synthetic time series"""
    t = np.linspace(0, 100, n_samples)
    # Trend + Seasonality + Noise
    data = 0.05 * t + 2 * np.sin(0.5 * t) + np.random.randn(n_samples) * noise
    return data

def create_sequences(data, seq_length, forecast_horizon=1):
    """
    Create input-output sequences for time series

    Args:
        data: Time series array
        seq_length: Number of past timesteps to use
        forecast_horizon: Number of future steps to predict
    """
    X, y = [], []
    for i in range(len(data) - seq_length - forecast_horizon + 1):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length:i+seq_length+forecast_horizon])

    X = np.array(X).reshape(-1, seq_length, 1)  # (samples, timesteps, features)
    y = np.array(y)

    return X, y

# Create data
data = create_time_series_data(1000)

# Create sequences
SEQ_LENGTH = 20
X, y = create_sequences(data, SEQ_LENGTH)

# Split
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")
```

Expected Output

```
X_train shape: (784, 20, 1)
y_train shape: (784, 1)
```

Model for Time Series
```python
def create_ts_model(seq_length, n_features=1, forecast_horizon=1):
    """RNN model for time series prediction"""
    model = keras.Sequential([
        layers.SimpleRNN(64, return_sequences=True,
                         input_shape=(seq_length, n_features)),
        layers.SimpleRNN(32),
        layers.Dense(32, activation='relu'),
        layers.Dense(forecast_horizon)  # Predict n steps ahead
    ])

    return model

# Create and compile
model = create_ts_model(SEQ_LENGTH)
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae']
)

# Train
history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    callbacks=[
        keras.callbacks.EarlyStopping(patience=5)
    ],
    verbose=0
)

# Evaluate
test_loss, test_mae = model.evaluate(X_test, y_test)
print(f"Test MAE: {test_mae:.4f}")
```

Visualization
```python
# Predict
y_pred = model.predict(X_test)

# Plot
plt.figure(figsize=(14, 5))

# Plot predictions vs actual
n_plot = 100
plt.subplot(1, 2, 1)
plt.plot(y_test[:n_plot], label='Actual', alpha=0.7)
plt.plot(y_pred[:n_plot], label='Predicted', alpha=0.7)
plt.xlabel('Time')
plt.ylabel('Value')
plt.title('Time Series Prediction')
plt.legend()

# Plot training history
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training History')
plt.legend()

plt.tight_layout()
plt.show()
```

Checkpoint
Can you build a Time Series model?
💡 Multi-step Forecasting
Predicting Multiple Steps
```python
# Predict 5 steps ahead
FORECAST_HORIZON = 5
X_multi, y_multi = create_sequences(data, SEQ_LENGTH, FORECAST_HORIZON)

# Split
X_train_m, X_test_m = X_multi[:split], X_multi[split:]
y_train_m, y_test_m = y_multi[:split], y_multi[split:]

# Model with multiple outputs
model_multi = create_ts_model(SEQ_LENGTH, forecast_horizon=FORECAST_HORIZON)
model_multi.compile(optimizer='adam', loss='mse', metrics=['mae'])

model_multi.fit(
    X_train_m, y_train_m,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    callbacks=[keras.callbacks.EarlyStopping(patience=5)],
    verbose=0
)

# Predict
y_pred_m = model_multi.predict(X_test_m[:1])
print(f"Input shape: {X_test_m[:1].shape}")
print(f"Output (5 steps): {y_pred_m}")
```

Sequence-to-Sequence for Time Series
```python
def create_seq2seq_model(seq_length, forecast_horizon):
    """
    Encoder-Decoder architecture for time series
    """
    model = keras.Sequential([
        # Encoder
        layers.SimpleRNN(64, return_sequences=True,
                         input_shape=(seq_length, 1)),
        layers.SimpleRNN(32),

        # Repeat vector for decoder
        layers.RepeatVector(forecast_horizon),

        # Decoder
        layers.SimpleRNN(32, return_sequences=True),
        layers.SimpleRNN(64, return_sequences=True),

        # Output
        layers.TimeDistributed(layers.Dense(1))
    ])

    return model

# Build
seq2seq = create_seq2seq_model(SEQ_LENGTH, FORECAST_HORIZON)
seq2seq.summary()
```

Checkpoint
Do you understand multi-step forecasting?
🎯 Summary
RNN Applications Covered
| Task | Input | Output | Architecture |
|---|---|---|---|
| Sentiment | Text | Label | Many-to-One |
| Time Series | Past values | Future value(s) | Many-to-One/Many |
| Language Model | Words | Next word | Many-to-Many |
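Whether a model is many-to-one or many-to-many comes down to which hidden states you keep: all of them (`return_sequences=True`) or only the last one. A dependency-free NumPy sketch of that distinction:

```python
import numpy as np

def rnn_states(x, W_x, W_h):
    """All hidden states of a tanh RNN over one sequence, shape (T, d_h)."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in x:                       # one step per timestep
        h = np.tanh(x_t @ W_x + h @ W_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))             # T=6 timesteps, 3 features
W_x, W_h = rng.normal(size=(3, 4)), rng.normal(size=(4, 4))

states = rnn_states(x, W_x, W_h)
print(states.shape)      # (6, 4): all states → many-to-many (return_sequences=True)
print(states[-1].shape)  # (4,): last state → many-to-one (return_sequences=False)
```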
Text Processing Pipeline
Example

1. Tokenization: Text → Words → Integers
2. Padding: Sequences → Fixed length
3. Embedding: Integers → Dense vectors
4. RNN: Sequence processing
5. Output: Classification/Regression

Key Components
| Component | Keras Layer | Purpose |
|---|---|---|
| Tokenizer | Tokenizer() | Text → Sequences |
| Padding | pad_sequences() | Fixed length |
| Embedding | Embedding() | Words → Vectors |
| RNN | SimpleRNN() | Sequence processing |
| Bidirectional | Bidirectional() | Both directions |
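The whole pipeline above can be condensed into a dependency-free NumPy sketch. The toy tokenizer and random weights are hypothetical stand-ins for illustration only — in practice the Keras layers in the table handle each step:

```python
import numpy as np

docs = ["i love it", "i hate it"]

# 1. Tokenization: text → integers (toy whitespace tokenizer)
vocab = {w: i + 1 for i, w in enumerate(sorted({w for d in docs for w in d.split()}))}
seqs = [[vocab[w] for w in d.split()] for d in docs]

# 2. Padding: fixed length, zeros at the front
max_len = 4
padded_ids = np.array([[0] * (max_len - len(s)) + s for s in seqs])

# 3. Embedding: integers → dense vectors (random here, i.e. "untrained")
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(vocab) + 1, 8))   # (vocab + pad token, dim)
embedded = emb[padded_ids]                   # (2, 4, 8)

# 4. RNN: process the sequence, keep the final hidden state
W_x, W_h = rng.normal(size=(8, 4)), rng.normal(size=(4, 4))
h = np.zeros((2, 4))
for t in range(max_len):
    h = np.tanh(embedded[:, t] @ W_x + h @ W_h)

# 5. Output: a sigmoid classification head
w_out = rng.normal(size=4)
probs = 1.0 / (1.0 + np.exp(-(h @ w_out)))
print(padded_ids.shape, embedded.shape, h.shape, probs.shape)
# (2, 4) (2, 4, 8) (2, 4) (2,)
```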
Limitations of SimpleRNN
| Problem | Description |
|---|---|
| Vanishing gradient | Hard to learn long-term dependencies |
| Sequential | Cannot be parallelized across timesteps |
| Short memory | Forgets distant information |
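The vanishing-gradient row can be made concrete: the gradient flowing back through T steps is a product of T per-step Jacobians, which typically shrinks exponentially. A minimal NumPy sketch, assuming a tanh RNN with modest recurrent weights and zero inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
d_h = 8
W_h = rng.normal(size=(d_h, d_h)) * 0.2   # modest recurrent weights
h0 = rng.normal(size=d_h) * 0.1

def grad_norm_through_time(T):
    """Frobenius norm of d h_T / d h_0 for a tanh RNN with zero inputs."""
    h, J = h0, np.eye(d_h)
    for _ in range(T):
        h = np.tanh(h @ W_h)
        # One-step Jacobian: diag(1 - h^2) @ W_h.T, chained by the product rule
        J = (np.diag(1.0 - h**2) @ W_h.T) @ J
    return float(np.linalg.norm(J))

for T in [1, 5, 20, 50]:
    print(f"T={T:2d}  ||dh_T/dh_0|| = {grad_norm_through_time(T):.2e}")
# The norm collapses toward 0 as T grows: early timesteps receive
# almost no gradient signal — this is what LSTM gates address.
```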
Next Lesson
LSTM (Long Short-Term Memory):
- Solves the vanishing gradient problem
- Memory cells for long-term dependencies
- Gates to control information flow
🎉 Excellent! You now know how to apply RNNs to real-world problems!
