Lý thuyết
45 phút
Bài 9/12

Indexes - Tối ưu hiệu năng

Hiểu về Indexes trong SQL Server - Clustered, Non-clustered, Covering indexes và cách optimize

Indexes - Tối ưu hiệu năng Database

1. Index là gì?

Index là cấu trúc dữ liệu giúp SQL Server tìm kiếm dữ liệu nhanh hơn, tương tự như mục lục của sách.

Tại sao cần Index?

Text
1Without Index: Table Scan - đọc TOÀN BỘ table
2With Index: Index Seek - nhảy trực tiếp đến data

SQL Server Index Types

Index Types
Clustered
Sorts physical data
1 per table
Usually PK
Non-Clustered
Separate structure
Multiple per table
Pointer to data
Special
Filtered Index
Columnstore
Full-Text
No IndexWith Index
Table ScanIndex Seek
O(n) - LinearO(log n) - Logarithmic
Slow on large tablesFast regardless of size

2. Clustered Index

Đặc điểm Clustered Index

  • Sắp xếp physical data theo index key
  • Chỉ 1 per table (vì data chỉ có thể sort 1 cách)
  • Primary Key tự động tạo clustered index (mặc định)
  • Leaf level = actual data rows
SQL
1-- Clustered index được tạo tự động với PK
2CREATE TABLE Customer (
3 CustomerID INT PRIMARY KEY, -- Clustered Index automatically
4 CustomerName NVARCHAR(100)
5);
6
7-- Hoặc tạo explicit
8CREATE TABLE Customer (
9 CustomerID INT,
10 CustomerName NVARCHAR(100)
11);
12
13CREATE CLUSTERED INDEX CIX_Customer_CustomerID
14ON Customer(CustomerID);

B-Tree Structure

Text
1┌─────────────┐
2 │ Root │
3 │ 1-50, 51-100│
4 └─────────────┘
5
6 ┌─────────────┴─────────────┐
7 ▼ ▼
8 ┌─────────────┐ ┌─────────────┐
9 │ Intermediate│ │ Intermediate│
10 │ 1-25, 26-50│ │ 51-75,76-100│
11 └─────────────┘ └─────────────┘
12 │ │
13 ┌──────┴──────┐ ┌──────┴──────┐
14 ▼ ▼ ▼ ▼
15┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
16│ Leaf │ │ Leaf │ │ Leaf │ │ Leaf │
17│ Data │ │ Data │ │ Data │ │ Data │
18│ 1-25 │ │ 26-50 │ │ 51-75 │ │ 76-100│
19└───────┘ └───────┘ └───────┘ └───────┘
20
21Clustered: Leaf contains actual data rows

Chọn Clustered Index Key

SQL
1-- ✅ Good clustered index keys:
2-- 1. Unique, narrow, ever-increasing (như IDENTITY)
3CREATE TABLE Order (
4 OrderID INT IDENTITY(1,1) PRIMARY KEY, -- Perfect!
5 ...
6);
7
8-- 2. Frequently used in range queries
9CREATE CLUSTERED INDEX CIX_Order_OrderDate
10ON Order(OrderDate); -- Good for date range queries
11
12-- ❌ Bad clustered index keys:
13-- 1. GUID (wide, random)
14CREATE TABLE Session (
15 SessionID UNIQUEIDENTIFIER PRIMARY KEY, -- Bad! Random = page splits
16 ...
17);
18
19-- 2. Wide composite keys
20CREATE CLUSTERED INDEX CIX_Bad
21ON Order(CustomerID, OrderDate, Status); -- Too wide

3. Non-Clustered Index

Đặc điểm Non-Clustered Index

  • Cấu trúc riêng biệt với data
  • Multiple per table (up to 999)
  • Leaf level chứa index key + pointer (RID hoặc clustering key)
SQL
1-- Tạo non-clustered index
2CREATE NONCLUSTERED INDEX IX_Customer_Email
3ON Customer.Customer(Email);
4
5-- Composite non-clustered index
6CREATE NONCLUSTERED INDEX IX_Order_CustomerDate
7ON Sales.[Order](CustomerID, OrderDate DESC);
8
9-- Xem index trong table
10SELECT
11 i.name AS IndexName,
12 i.type_desc AS IndexType,
13 COL_NAME(ic.object_id, ic.column_id) AS ColumnName,
14 ic.is_included_column
15FROM sys.indexes i
16JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
17WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader');

Non-Clustered Structure

Text
1Non-Clustered Index Clustered Index (Data)
2┌─────────────────┐ ┌─────────────────┐
3│ Index Key + │ Pointer │ Actual Data │
4│ Clustering Key │ ──────────────► │ Rows │
5└─────────────────┘ └─────────────────┘
6
7Example: Index on Email
8┌───────────────────────────┐ ┌─────────────────────────────┐
9│ Email │ CustID │ │CustID│Name │Email │
10├───────────────────────────┤ ├─────────────────────────────┤
11│ alice@mail.com │ 5 │ ──► │ 5 │Alice │alice@mail │
12│ bob@mail.com │ 3 │ ──► │ 3 │Bob │bob@mail │
13│ carol@mail.com │ 1 │ ──► │ 1 │Carol │carol@mail │
14└───────────────────────────┘ └─────────────────────────────┘

4. Covering Index với INCLUDE

What is Covering Index?

Covering Index = Index chứa TẤT CẢ columns cần cho query, không cần lookup về data.

SQL
1-- Query này cần CustomerID, OrderDate, TotalAmount
2SELECT CustomerID, OrderDate, TotalAmount
3FROM Sales.[Order]
4WHERE CustomerID = 5;
5
6-- Index chỉ có CustomerID
7CREATE INDEX IX_Order_CustomerID
8ON Sales.[Order](CustomerID);
9-- Cần Key Lookup để lấy OrderDate, TotalAmount → Slow!
10
11-- Covering Index với INCLUDE
12CREATE INDEX IX_Order_CustomerID_Covering
13ON Sales.[Order](CustomerID)
14INCLUDE (OrderDate, TotalAmount);
15-- Không cần lookup → Fast!

INCLUDE vs Key Columns

Key ColumnsINCLUDE Columns
Used for searchingOnly for retrieval
Sorted in indexNot sorted
Limited to 16 columnsNo limit
Trong B-tree all levelsOnly in leaf level
SQL
1-- Key columns: columns in WHERE, JOIN, ORDER BY
2-- INCLUDE columns: columns in SELECT only
3
4-- Example
5SELECT FirstName, LastName, Phone -- Include these
6FROM Customer.Customer
7WHERE Email = 'test@email.com' -- Key column
8ORDER BY LastName; -- Key column
9
10CREATE INDEX IX_Customer_Email_LastName
11ON Customer.Customer(Email, LastName)
12INCLUDE (FirstName, Phone);

5. Filtered Index

Filtered Index = Index chỉ cho một subset của data (với WHERE clause).

SQL
1-- Filtered index cho active customers only
2CREATE INDEX IX_Customer_Email_Active
3ON Customer.Customer(Email)
4WHERE IsActive = 1;
5
6-- Filtered index cho non-null values
7CREATE INDEX IX_Order_ShipDate
8ON Sales.[Order](ShipDate)
9WHERE ShipDate IS NOT NULL;
10
11-- Filtered index cho specific status
12CREATE INDEX IX_Order_Pending
13ON Sales.[Order](CustomerID, OrderDate)
14WHERE Status = 'Pending';

Benefits of Filtered Index

BenefitDescription
Smaller sizeLess storage, faster scans
Better maintenanceFaster to update
Targeted optimizationPerfect for specific queries
SQL
1-- Scenario: 99% orders are 'Delivered', 1% are 'Pending'
2-- Full index wastes space for Delivered orders
3
4-- Filtered index only includes Pending
5CREATE INDEX IX_Order_Pending
6ON Sales.[Order](OrderDate)
7WHERE Status = 'Pending';
8-- Small, fast, perfect for monitoring queries

6. Index Management

6.1 Xem Index Info

SQL
1-- Xem tất cả indexes trong database
2SELECT
3 OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName,
4 OBJECT_NAME(i.object_id) AS TableName,
5 i.name AS IndexName,
6 i.type_desc AS IndexType,
7 i.is_unique,
8 i.is_primary_key,
9 i.fill_factor
10FROM sys.indexes i
11WHERE i.object_id > 100 -- User tables only
12ORDER BY SchemaName, TableName, IndexName;
13
14-- Index columns detail
15SELECT
16 i.name AS IndexName,
17 COL_NAME(ic.object_id, ic.column_id) AS ColumnName,
18 ic.key_ordinal,
19 ic.is_included_column,
20 ic.is_descending_key
21FROM sys.indexes i
22JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
23WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader')
24ORDER BY i.name, ic.key_ordinal;

6.2 Index Usage Statistics

SQL
1-- Xem index được sử dụng như thế nào
2SELECT
3 OBJECT_NAME(s.object_id) AS TableName,
4 i.name AS IndexName,
5 i.type_desc AS IndexType,
6 s.user_seeks, -- Index seek operations
7 s.user_scans, -- Index scan operations
8 s.user_lookups, -- Bookmark lookups
9 s.user_updates, -- Updates to index
10 s.last_user_seek,
11 s.last_user_scan
12FROM sys.dm_db_index_usage_stats s
13JOIN sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id
14WHERE s.database_id = DB_ID()
15ORDER BY s.user_seeks + s.user_scans DESC;
16
17-- Tìm unused indexes (candidates for removal)
18SELECT
19 OBJECT_NAME(i.object_id) AS TableName,
20 i.name AS IndexName,
21 i.type_desc,
22 s.user_seeks,
23 s.user_scans,
24 s.user_lookups,
25 s.user_updates
26FROM sys.indexes i
27LEFT JOIN sys.dm_db_index_usage_stats s
28 ON i.object_id = s.object_id AND i.index_id = s.index_id
29WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
30 AND i.type_desc = 'NONCLUSTERED'
31 AND (s.user_seeks IS NULL OR s.user_seeks = 0)
32 AND (s.user_scans IS NULL OR s.user_scans = 0);

6.3 Missing Index Suggestions

SQL
1-- SQL Server tracks missing indexes
2SELECT
3 OBJECT_NAME(mid.object_id) AS TableName,
4 mid.equality_columns,
5 mid.inequality_columns,
6 mid.included_columns,
7 migs.avg_user_impact,
8 migs.user_seeks
9FROM sys.dm_db_missing_index_details mid
10JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle
11JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle
12WHERE mid.database_id = DB_ID()
13ORDER BY migs.avg_user_impact * migs.user_seeks DESC;

6.4 Index Fragmentation

SQL
1-- Check fragmentation
2SELECT
3 OBJECT_NAME(ps.object_id) AS TableName,
4 i.name AS IndexName,
5 ps.index_type_desc,
6 ps.avg_fragmentation_in_percent,
7 ps.page_count
8FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
9JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id
10WHERE ps.avg_fragmentation_in_percent > 10
11ORDER BY ps.avg_fragmentation_in_percent DESC;
12
13-- Rebuild vs Reorganize
14-- < 10% fragmentation: Do nothing
15-- 10-30% fragmentation: REORGANIZE
16-- > 30% fragmentation: REBUILD
17
18-- Reorganize (online, less resource intensive)
19ALTER INDEX IX_Customer_Email ON Customer.Customer REORGANIZE;
20
21-- Rebuild (more thorough, can be online with Enterprise)
22ALTER INDEX IX_Customer_Email ON Customer.Customer REBUILD;
23ALTER INDEX IX_Customer_Email ON Customer.Customer REBUILD WITH (ONLINE = ON);
24
25-- Rebuild all indexes on table
26ALTER INDEX ALL ON Customer.Customer REBUILD;

7. AdventureWorks Index Analysis

SQL
1-- Xem indexes trong Sales.SalesOrderHeader
2SELECT
3 i.name AS IndexName,
4 i.type_desc,
5 i.is_unique,
6 STRING_AGG(
7 COL_NAME(ic.object_id, ic.column_id) +
8 CASE WHEN ic.is_included_column = 1 THEN ' (INCLUDE)' ELSE '' END,
9 ', '
10 ) WITHIN GROUP (ORDER BY ic.key_ordinal) AS Columns
11FROM sys.indexes i
12JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
13WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader')
14GROUP BY i.name, i.type_desc, i.is_unique
15ORDER BY i.type_desc DESC;
16
17-- Analyze query với Execution Plan
18SET STATISTICS IO ON;
19
20-- Query without proper index
21SELECT * FROM Sales.SalesOrderHeader
22WHERE CustomerID = 29825;
23
24-- Check if CustomerID has index
25-- Yes: IX_SalesOrderHeader_CustomerID
26
27SET STATISTICS IO OFF;

8. Index Design Best Practices

8.1 When to Create Index

SQL
1-- ✅ Create indexes on:
2-- 1. Foreign Keys (for JOINs)
3CREATE INDEX IX_Order_CustomerID ON Sales.[Order](CustomerID);
4
5-- 2. WHERE clause columns (frequently filtered)
6CREATE INDEX IX_Order_Status ON Sales.[Order](Status);
7
8-- 3. ORDER BY columns
9CREATE INDEX IX_Order_OrderDate ON Sales.[Order](OrderDate DESC);
10
11-- 4. Columns in aggregate functions with GROUP BY
12CREATE INDEX IX_OrderDetail_ProductID ON Sales.OrderDetail(ProductID);

8.2 When NOT to Create Index

SQL
1-- ❌ Don't index:
2-- 1. Small tables (< 1000 rows)
3-- 2. Highly updated columns (write-heavy)
4-- 3. Low selectivity columns (Gender, Status với few values)
5-- 4. Wide columns (NVARCHAR(MAX), TEXT)
6
7-- Bad: Index on column with low selectivity
8CREATE INDEX IX_Customer_Gender ON Customer.Customer(Gender);
9-- Only 2-3 distinct values, not selective
10
11-- Bad: Index on frequently updated column
12CREATE INDEX IX_Order_LastModified ON Sales.[Order](ModifiedDate);
13-- Updated on every change

8.3 Index Design Guidelines

markdown
1## Index Guidelines
2
3### Column Order in Composite Index
41. Equality columns first (= )
52. Inequality columns last (>, <, BETWEEN)
63. Most selective columns first
7
8### Example
9Query: WHERE Status = 'Active' AND OrderDate > '2024-01-01'
10
11Good: INDEX(Status, OrderDate) -- Equality first
12Bad: INDEX(OrderDate, Status) -- Inequality first
13
14### Key Size
15- Keep index key narrow (< 16 bytes ideal)
16- Use INCLUDE for wide columns
17
18### Number of Indexes
19- OLTP: Fewer indexes (3-5 per table)
20- OLAP: More indexes acceptable
21- Monitor unused indexes regularly

9. Hands-on Exercise

Exercise: Optimize Queries with Indexes

SQL
1-- Scenario: E-commerce database với queries chậm
2-- Analyze và tạo indexes phù hợp
3
4-- Query 1: Tìm orders của customer
5SELECT o.*, c.FirstName, c.LastName
6FROM Sales.[Order] o
7JOIN Customer.Customer c ON o.CustomerID = c.CustomerID
8WHERE c.Email = 'customer@email.com';
9
10-- Cần indexes:
11CREATE INDEX IX_Customer_Email ON Customer.Customer(Email);
12CREATE INDEX IX_Order_CustomerID ON Sales.[Order](CustomerID);
13
14-- Query 2: Orders theo date range và status
15SELECT OrderID, OrderNumber, OrderDate, TotalAmount
16FROM Sales.[Order]
17WHERE OrderDate BETWEEN '2024-01-01' AND '2024-03-31'
18 AND Status = 'Delivered'
19ORDER BY OrderDate DESC;
20
21-- Cần covering index:
22CREATE INDEX IX_Order_Status_Date
23ON Sales.[Order](Status, OrderDate DESC)
24INCLUDE (OrderNumber, TotalAmount);
25
26-- Query 3: Top products by sales
27SELECT p.ProductID, p.ProductName, SUM(od.Quantity) AS TotalSold
28FROM Inventory.Product p
29JOIN Sales.OrderDetail od ON p.ProductID = od.ProductID
30GROUP BY p.ProductID, p.ProductName
31ORDER BY TotalSold DESC;
32
33-- Cần indexes:
34CREATE INDEX IX_OrderDetail_ProductID
35ON Sales.OrderDetail(ProductID)
36INCLUDE (Quantity);
37
38-- Query 4: Pending orders (small subset)
39SELECT * FROM Sales.[Order]
40WHERE Status = 'Pending'
41ORDER BY OrderDate;
42
43-- Filtered index cho subset:
44CREATE INDEX IX_Order_Pending_Date
45ON Sales.[Order](OrderDate)
46INCLUDE (OrderNumber, CustomerID, TotalAmount)
47WHERE Status = 'Pending';

10. Summary

markdown
1## Index Checklist
2
3### Must Have
4- ✅ Clustered index on every table (usually PK)
5- ✅ Indexes on all Foreign Key columns
6- ✅ Indexes for frequently used WHERE columns
7
8### Consider Adding
9- Covering indexes for critical queries
10- Filtered indexes for subset queries
11- Composite indexes for multi-column conditions
12
13### Maintenance
14- Monitor fragmentation monthly
15- Review unused indexes quarterly
16- Check missing index suggestions
17- Test before production deployment
18
19### Performance Impact
20| Operation | Impact of More Indexes |
21|-----------|----------------------|
22| SELECT | Faster |
23| INSERT | Slower |
24| UPDATE | Slower |
25| DELETE | Slower |

Tiếp theo

Bài tiếp theo: Performance Tuning - học cách phân tích và tối ưu hóa performance của SQL Server!