MinAI

RNN Applications - Text and Time Series

Applying RNNs to practical problems: Text Classification, Language Modeling, Time Series Prediction

1

🎯 Lesson Objectives

5 min

In this lesson, you will learn:

  • Text Preprocessing: Tokenization, Padding, Embedding
  • Text Classification with RNNs
  • Language Modeling: predicting the next word
  • Time Series Prediction with RNNs
  • Bidirectional RNNs: processing a sequence in both directions

RNNs can be applied to any sequential data: text, time series, DNA sequences, audio, video frames, ...

2

📝 Text Preprocessing Pipeline


The text processing pipeline

Text Preprocessing Pipeline for RNNs: 📝 raw text "Tôi yêu AI" → ✂️ Tokenization ["Tôi", "yêu", "AI"] → 🔢 Encoding [45, 12, 89] → 📏 Padding [45, 12, 89, 0, 0] → 🎯 Embedding (128-D vectors) → fed into the RNN

Tokenization with Keras

python.py
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Sample texts
texts = [
    "I love machine learning",
    "Deep learning is amazing",
    "Neural networks are powerful",
    "AI will change the world"
]

# Create tokenizer
tokenizer = Tokenizer(num_words=10000)  # Vocabulary size
tokenizer.fit_on_texts(texts)

# View word index
print("Word Index:")
for word, idx in list(tokenizer.word_index.items())[:10]:
    print(f"  '{word}': {idx}")

# Convert to sequences
sequences = tokenizer.texts_to_sequences(texts)
print("\nSequences:")
for text, seq in zip(texts, sequences):
    print(f"  '{text}' → {seq}")
Expected Output
Word Index:
  'learning': 1
  'i': 2
  'love': 3
  'machine': 4
  'deep': 5
  ...

Sequences:
  'I love machine learning' → [2, 3, 4, 1]
  'Deep learning is amazing' → [5, 1, 6, 7]
  'Neural networks are powerful' → [8, 9, 10, 11]
  'AI will change the world' → [12, 13, 14, 15, 16]

Padding

python.py
# Pad sequences to the same length
MAX_LEN = 10

padded = pad_sequences(
    sequences,
    maxlen=MAX_LEN,
    padding='pre',      # Pad at the beginning
    truncating='post'   # Truncate at the end
)

print("Padded Sequences:")
print(padded)
print(f"Shape: {padded.shape}")
Expected Output
Padded Sequences:
[[ 0  0  0  0  0  0  2  3  4  1]
 [ 0  0  0  0  0  0  5  1  6  7]
 [ 0  0  0  0  0  0  8  9 10 11]
 [ 0  0  0  0  0 12 13 14 15 16]]
Shape: (4, 10)

Padding options:

  • padding='pre': add zeros at the front (the more common choice)
  • padding='post': add zeros at the end
  • truncating: cut from the front or the back when a sequence is too long
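The two modes are easiest to see side by side. Here is a pure-Python sketch of what pad_sequences does to a single sequence (padding only; truncation is omitted for brevity):

```python
def pad(seq, maxlen, padding='pre', value=0):
    """Pure-Python sketch of pad_sequences for one sequence
    (assumes len(seq) <= maxlen; truncation not shown)."""
    fill = [value] * (maxlen - len(seq))
    return fill + seq if padding == 'pre' else seq + fill

print(pad([2, 3, 4, 1], 6, padding='pre'))   # [0, 0, 2, 3, 4, 1]
print(pad([2, 3, 4, 1], 6, padding='post'))  # [2, 3, 4, 1, 0, 0]
```

'pre' is preferred for RNNs because the informative tokens end up closest to the final timestep, which is the state the network reads out.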

Checkpoint

Do you understand how to preprocess text?

3

🔤 Embedding Layer


Why do we need an Embedding?

Representation | Problem
One-hot        | Sparse; does not capture semantics
Integer        | Magnitudes are meaningless (word 5 ≠ 5 × word 1)
Embedding      | Dense; captures semantic similarity
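Conceptually, an Embedding layer is nothing more than a trainable lookup table: each integer token selects one row of a weight matrix. A NumPy sketch with illustrative sizes:

```python
import numpy as np

vocab_size, embedding_dim = 10, 4
rng = np.random.default_rng(0)
E = rng.normal(size=(vocab_size, embedding_dim))  # the trainable weight matrix

token_ids = np.array([1, 5, 7])   # integer-encoded words
vectors = E[token_ids]            # embedding lookup = row selection
print(vectors.shape)              # (3, 4)
```

During training, gradients flow back into exactly the rows that were looked up, which is how semantically similar words drift toward similar vectors.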

Embedding in Keras

python.py
from tensorflow.keras import layers

# Embedding layer
embedding = layers.Embedding(
    input_dim=10000,    # Vocabulary size
    output_dim=128,     # Embedding dimension
    input_length=100    # Sequence length
)

# Input:  (batch, sequence_length) - integers
# Output: (batch, sequence_length, embedding_dim) - vectors

# Example
import tensorflow as tf
sample_input = tf.constant([[1, 2, 3], [4, 5, 6]])  # 2 sequences, 3 words each
sample_output = embedding(sample_input)
print(f"Input shape: {sample_input.shape}")
print(f"Output shape: {sample_output.shape}")
Expected Output
Input shape: (2, 3)
Output shape: (2, 3, 128)

Pretrained Embeddings (GloVe, Word2Vec)

python.py
import numpy as np

def load_glove_embeddings(glove_file, word_index, embedding_dim=100):
    """Load pretrained GloVe embeddings"""
    # Load GloVe
    embeddings_index = {}
    with open(glove_file, 'r', encoding='utf-8') as f:
        for line in f:
            values = line.split()
            word = values[0]
            coefs = np.asarray(values[1:], dtype='float32')
            embeddings_index[word] = coefs

    print(f"Loaded {len(embeddings_index)} word vectors")

    # Create embedding matrix
    vocab_size = len(word_index) + 1
    embedding_matrix = np.zeros((vocab_size, embedding_dim))

    for word, idx in word_index.items():
        if idx < vocab_size:
            embedding_vector = embeddings_index.get(word)
            if embedding_vector is not None:
                embedding_matrix[idx] = embedding_vector

    return embedding_matrix

# Usage
# embedding_matrix = load_glove_embeddings(
#     'glove.6B.100d.txt',
#     tokenizer.word_index,
#     embedding_dim=100
# )

# Create embedding layer with pretrained weights
# embedding_layer = layers.Embedding(
#     input_dim=VOCAB_SIZE,
#     output_dim=100,
#     weights=[embedding_matrix],
#     trainable=False  # Freeze pretrained weights
# )

Checkpoint

Do you understand the Embedding layer?

4

🎭 Complete Text Classification


IMDB Sentiment Classification

python.py
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hyperparameters
VOCAB_SIZE = 10000
MAX_LEN = 200
EMBEDDING_DIM = 128
RNN_UNITS = 64

# Load IMDB dataset
print("Loading data...")
(x_train, y_train), (x_test, y_test) = imdb.load_data(
    num_words=VOCAB_SIZE
)

# Pad sequences
x_train = pad_sequences(x_train, maxlen=MAX_LEN)
x_test = pad_sequences(x_test, maxlen=MAX_LEN)

print(f"Train shape: {x_train.shape}")
print(f"Test shape: {x_test.shape}")

# Build model
def create_text_classifier():
    model = keras.Sequential([
        # Embedding
        layers.Embedding(
            input_dim=VOCAB_SIZE,
            output_dim=EMBEDDING_DIM,
            input_length=MAX_LEN
        ),

        # RNN layers
        layers.SimpleRNN(RNN_UNITS, return_sequences=True),
        layers.SimpleRNN(RNN_UNITS // 2),

        # Classification head
        layers.Dropout(0.5),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid')
    ])

    return model

model = create_text_classifier()

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()

Training

python.py
# Callbacks
callbacks = [
    keras.callbacks.EarlyStopping(
        patience=3,
        restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        factor=0.5,
        patience=2
    )
]

# Train
history = model.fit(
    x_train, y_train,
    epochs=10,
    batch_size=128,
    validation_split=0.2,
    callbacks=callbacks
)

# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"\nTest Accuracy: {test_acc:.2%}")

Prediction

python.py
# Get word index for decoding
word_index = imdb.get_word_index()
reverse_word_index = {v: k for k, v in word_index.items()}

def decode_review(sequence):
    """Convert a sequence back to text (IMDB indices are offset by 3)"""
    return ' '.join([reverse_word_index.get(i - 3, '?')
                     for i in sequence if i > 3])

def predict_sentiment(text, tokenizer=None):
    """Predict the sentiment of new text"""
    # If using a custom tokenizer
    if tokenizer:
        seq = tokenizer.texts_to_sequences([text])
    else:
        # Simple word-to-index mapping (demo only)
        words = text.lower().split()
        # Unknown words map to index 2, IMDB's out-of-vocabulary marker
        seq = [[word_index.get(w, -1) + 3 for w in words]]

    padded = pad_sequences(seq, maxlen=MAX_LEN)
    pred = model.predict(padded, verbose=0)[0][0]

    sentiment = "Positive" if pred > 0.5 else "Negative"
    confidence = pred if pred > 0.5 else 1 - pred

    return sentiment, confidence

# Test
sample_review = "This movie was absolutely fantastic and entertaining"
sentiment, conf = predict_sentiment(sample_review)
print(f"'{sample_review}'")
print(f"Sentiment: {sentiment} (confidence: {conf:.2%})")

Checkpoint

Can you build a text classifier?

5

↔️ Bidirectional RNN


Why Bidirectional?

A standard RNN only sees information flowing left → right. But sometimes context from both sides matters:

"The movie was not bad at all"

  • Forward: "not" → reads as negative
  • Backward: "at all" → emphasizes the positive reading

Bidirectional RNN Architecture

Bidirectional RNN architecture: each input xₜ feeds both a forward state h→ₜ and a backward state h←ₜ; at every timestep the two are concatenated [→, ←] into the output yₜ, combining information from both directions.

Keras code

python.py
from tensorflow.keras import layers

# Bidirectional wrapper
model = keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM,
                     input_length=MAX_LEN),

    # Bidirectional RNN
    layers.Bidirectional(
        layers.SimpleRNN(64, return_sequences=True)
    ),
    # Output: (batch, timesteps, 128) - 64*2

    layers.Bidirectional(
        layers.SimpleRNN(32)
    ),
    # Output: (batch, 64) - 32*2

    layers.Dense(1, activation='sigmoid')
])

model.summary()

Merge modes

python.py
# Ways to combine the forward and backward states
layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='concat'  # Default: [h⃗, h⃖] → 128 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='sum'     # h⃗ + h⃖ → 64 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='mul'     # h⃗ * h⃖ → 64 units
)

layers.Bidirectional(
    layers.SimpleRNN(64),
    merge_mode='ave'     # (h⃗ + h⃖) / 2 → 64 units
)

Checkpoint

Do you understand Bidirectional RNNs?

6

📈 Time Series Prediction


Preparing the data

python.py
import numpy as np
import matplotlib.pyplot as plt

def create_time_series_data(n_samples=1000, noise=0.1):
    """Create a synthetic time series"""
    t = np.linspace(0, 100, n_samples)
    # Trend + Seasonality + Noise
    data = 0.05 * t + 2 * np.sin(0.5 * t) + np.random.randn(n_samples) * noise
    return data

def create_sequences(data, seq_length, forecast_horizon=1):
    """
    Create input-output sequences for time series

    Args:
        data: Time series array
        seq_length: Number of past timesteps to use
        forecast_horizon: Number of future steps to predict
    """
    X, y = [], []
    for i in range(len(data) - seq_length - forecast_horizon + 1):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length:i+seq_length+forecast_horizon])

    X = np.array(X).reshape(-1, seq_length, 1)  # (samples, timesteps, features)
    y = np.array(y)

    return X, y

# Create data
data = create_time_series_data(1000)

# Create sequences
SEQ_LENGTH = 20
X, y = create_sequences(data, SEQ_LENGTH)

# Split
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")
Expected Output
X_train shape: (784, 20, 1)
y_train shape: (784, 1)
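The shapes follow directly from the windowing arithmetic: with n data points, a window of seq_length, and a horizon of h, sliding produces n - seq_length - h + 1 windows, and the 80/20 split is taken over those windows:

```python
# Window-count arithmetic behind the shapes above
n_samples, seq_length, horizon = 1000, 20, 1
n_windows = n_samples - seq_length - horizon + 1  # number of sliding windows
n_train = int(n_windows * 0.8)                    # training share of the windows
print(n_windows, n_train)  # 980 784
```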

A model for time series

python.py
def create_ts_model(seq_length, n_features=1, forecast_horizon=1):
    """RNN model for time series prediction"""
    model = keras.Sequential([
        layers.SimpleRNN(64, return_sequences=True,
                         input_shape=(seq_length, n_features)),
        layers.SimpleRNN(32),
        layers.Dense(32, activation='relu'),
        layers.Dense(forecast_horizon)  # Predict n steps ahead
    ])

    return model

# Create and compile
model = create_ts_model(SEQ_LENGTH)
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae']
)

# Train
history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    callbacks=[
        keras.callbacks.EarlyStopping(patience=5)
    ],
    verbose=0
)

# Evaluate
test_loss, test_mae = model.evaluate(X_test, y_test)
print(f"Test MAE: {test_mae:.4f}")

Visualization

python.py
# Predict
y_pred = model.predict(X_test)

# Plot
plt.figure(figsize=(14, 5))

# Predictions vs actual
n_plot = 100
plt.subplot(1, 2, 1)
plt.plot(y_test[:n_plot], label='Actual', alpha=0.7)
plt.plot(y_pred[:n_plot], label='Predicted', alpha=0.7)
plt.xlabel('Time')
plt.ylabel('Value')
plt.title('Time Series Prediction')
plt.legend()

# Training history
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training History')
plt.legend()

plt.tight_layout()
plt.show()

Checkpoint

Can you build a time series model?

7

💡 Multi-step Forecasting


Predicting multiple steps ahead

python.py
# Predict 5 steps ahead
FORECAST_HORIZON = 5
X_multi, y_multi = create_sequences(data, SEQ_LENGTH, FORECAST_HORIZON)

# Split
X_train_m, X_test_m = X_multi[:split], X_multi[split:]
y_train_m, y_test_m = y_multi[:split], y_multi[split:]

# Model with multiple outputs
model_multi = create_ts_model(SEQ_LENGTH, forecast_horizon=FORECAST_HORIZON)
model_multi.compile(optimizer='adam', loss='mse', metrics=['mae'])

model_multi.fit(
    X_train_m, y_train_m,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    callbacks=[keras.callbacks.EarlyStopping(patience=5)],
    verbose=0
)

# Predict
y_pred_m = model_multi.predict(X_test_m[:1])
print(f"Input shape: {X_test_m[:1].shape}")
print(f"Output (5 steps): {y_pred_m}")

Sequence-to-Sequence for Time Series

python.py
def create_seq2seq_model(seq_length, forecast_horizon):
    """
    Encoder-Decoder architecture for time series
    """
    model = keras.Sequential([
        # Encoder
        layers.SimpleRNN(64, return_sequences=True,
                         input_shape=(seq_length, 1)),
        layers.SimpleRNN(32),

        # Repeat the context vector for the decoder
        layers.RepeatVector(forecast_horizon),

        # Decoder
        layers.SimpleRNN(32, return_sequences=True),
        layers.SimpleRNN(64, return_sequences=True),

        # Output
        layers.TimeDistributed(layers.Dense(1))
    ])

    return model

# Build
seq2seq = create_seq2seq_model(SEQ_LENGTH, FORECAST_HORIZON)
seq2seq.summary()
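The RepeatVector layer is the hinge of this architecture: it tiles the encoder's fixed-size context vector once per forecast step, so the decoder RNN has an input at every timestep. A NumPy sketch of that operation, with illustrative sizes:

```python
import numpy as np

# The encoder compresses the input window into one context vector.
context = np.arange(32, dtype=float)   # encoder output for one sample: (32,)

# RepeatVector(5) copies it 5 times: one identical row per forecast step.
repeated = np.tile(context, (5, 1))    # shape (5, 32)
print(repeated.shape)  # (5, 32)
```

The decoder then unrolls over those 5 identical inputs, and TimeDistributed(Dense(1)) maps each decoder state to one forecast value, giving an output of shape (batch, 5, 1).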

Checkpoint

Do you understand multi-step forecasting?

8

🎯 Summary


RNN applications covered

Task           | Input       | Output          | Architecture
Sentiment      | Text        | Label           | Many-to-One
Time Series    | Past values | Future value(s) | Many-to-One/Many
Language Model | Words       | Next word       | Many-to-Many
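The Language Model row was only outlined in this lesson. Its training data comes from sliding windows over a tokenized corpus: each run of k tokens is an input and the token that follows is the label (a minimal pure-Python sketch; the model itself would be an Embedding → RNN → Dense(vocab_size, softmax) stack, analogous to the classifier above):

```python
# Build (input window, next token) training pairs for a language model.
tokens = [2, 3, 4, 1, 5, 1, 6, 7]  # an integer-encoded corpus (illustrative)
k = 3                              # context window size

pairs = [(tokens[i:i + k], tokens[i + k])
         for i in range(len(tokens) - k)]

print(pairs[0])    # ([2, 3, 4], 1)
print(len(pairs))  # 5
```

At inference time the model predicts a distribution over the vocabulary, samples the next token, appends it to the window, and repeats.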

Text processing pipeline

Example
1. Tokenization: Text → Words → Integers
2. Padding: Sequences → Fixed length
3. Embedding: Integers → Dense vectors
4. RNN: Sequence processing
5. Output: Classification/Regression

Key Components

ComponentKeras LayerPurpose
TokenizerTokenizer()Text → Sequences
Paddingpad_sequences()Fixed length
EmbeddingEmbedding()Words → Vectors
RNNSimpleRNN()Sequence processing
BidirectionalBidirectional()Both directions

Limitations of SimpleRNN

Problem            | Description
Vanishing gradient | Hard to learn long-term dependencies
Sequential         | Cannot be parallelized across timesteps
Short memory       | Forgets information from far back
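The vanishing-gradient row can be made concrete: backpropagation through T timesteps multiplies T per-step Jacobians, so if their typical scale sits below 1 the gradient signal decays geometrically (the 0.9 factor below is illustrative, not measured):

```python
# Geometric decay of the gradient signal through time.
factor = 0.9   # assumed per-step gradient scale (< 1 → vanishing)
grad = 1.0
for _ in range(50):   # backpropagate through 50 timesteps
    grad *= factor
print(f"{grad:.6f}")  # ≈ 0.005154
```

After only 50 steps less than 1% of the signal survives, which is why SimpleRNN struggles with long-term dependencies and why the gated LSTM of the next lesson was introduced.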

Next lesson

LSTM (Long Short-Term Memory):

  • Solves the vanishing gradient problem
  • Memory cells for long-term dependencies
  • Gates to control the information flow

🎉 Excellent! You now know how to apply RNNs to real-world problems!