Indexes - Tối ưu hiệu năng Database
1. Index là gì?
Index là cấu trúc dữ liệu giúp SQL Server tìm kiếm dữ liệu nhanh hơn, tương tự như mục lục của sách.
Tại sao cần Index?
Text
1Without Index: Table Scan - đọc TOÀN BỘ table2With Index: Index Seek - nhảy trực tiếp đến dataSQL Server Index Types
Index Types
Clustered
Sorts physical data
1 per table
Usually PK
Non-Clustered
Separate structure
Multiple per table
Pointer to data
Special
Filtered Index
Columnstore
Full-Text
| No Index | With Index |
|---|---|
| Table Scan | Index Seek |
| O(n) - Linear | O(log n) - Logarithmic |
| Slow on large tables | Fast regardless of size |
2. Clustered Index
Đặc điểm Clustered Index
- Sắp xếp physical data theo index key
- Chỉ 1 per table (vì data chỉ có thể sort 1 cách)
- Primary Key tự động tạo clustered index (mặc định)
- Leaf level = actual data rows
SQL
1-- Clustered index được tạo tự động với PK2CREATE TABLE Customer (3 CustomerID INT PRIMARY KEY, -- Clustered Index automatically4 CustomerName NVARCHAR(100)5);67-- Hoặc tạo explicit8CREATE TABLE Customer (9 CustomerID INT,10 CustomerName NVARCHAR(100)11);1213CREATE CLUSTERED INDEX CIX_Customer_CustomerID14ON Customer(CustomerID);B-Tree Structure
Text
1┌─────────────┐2 │ Root │3 │ 1-50, 51-100│4 └─────────────┘5 │6 ┌─────────────┴─────────────┐7 ▼ ▼8 ┌─────────────┐ ┌─────────────┐9 │ Intermediate│ │ Intermediate│10 │ 1-25, 26-50│ │ 51-75,76-100│11 └─────────────┘ └─────────────┘12 │ │13 ┌──────┴──────┐ ┌──────┴──────┐14 ▼ ▼ ▼ ▼15┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐16│ Leaf │ │ Leaf │ │ Leaf │ │ Leaf │17│ Data │ │ Data │ │ Data │ │ Data │18│ 1-25 │ │ 26-50 │ │ 51-75 │ │ 76-100│19└───────┘ └───────┘ └───────┘ └───────┘20 21Clustered: Leaf contains actual data rowsChọn Clustered Index Key
SQL
1-- ✅ Good clustered index keys:2-- 1. Unique, narrow, ever-increasing (như IDENTITY)3CREATE TABLE Order (4 OrderID INT IDENTITY(1,1) PRIMARY KEY, -- Perfect!5 ...6);78-- 2. Frequently used in range queries9CREATE CLUSTERED INDEX CIX_Order_OrderDate10ON Order(OrderDate); -- Good for date range queries1112-- ❌ Bad clustered index keys:13-- 1. GUID (wide, random)14CREATE TABLE Session (15 SessionID UNIQUEIDENTIFIER PRIMARY KEY, -- Bad! Random = page splits16 ...17);1819-- 2. Wide composite keys20CREATE CLUSTERED INDEX CIX_Bad21ON Order(CustomerID, OrderDate, Status); -- Too wide3. Non-Clustered Index
Đặc điểm Non-Clustered Index
- Cấu trúc riêng biệt với data
- Multiple per table (up to 999)
- Leaf level chứa index key + pointer (RID hoặc clustering key)
SQL
1-- Tạo non-clustered index2CREATE NONCLUSTERED INDEX IX_Customer_Email3ON Customer.Customer(Email);45-- Composite non-clustered index6CREATE NONCLUSTERED INDEX IX_Order_CustomerDate7ON Sales.[Order](CustomerID, OrderDate DESC);89-- Xem index trong table10SELECT 11 i.name AS IndexName,12 i.type_desc AS IndexType,13 COL_NAME(ic.object_id, ic.column_id) AS ColumnName,14 ic.is_included_column15FROM sys.indexes i16JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id17WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader');Non-Clustered Structure
Text
1Non-Clustered Index Clustered Index (Data)2┌─────────────────┐ ┌─────────────────┐3│ Index Key + │ Pointer │ Actual Data │4│ Clustering Key │ ──────────────► │ Rows │5└─────────────────┘ └─────────────────┘6 7Example: Index on Email8┌───────────────────────────┐ ┌─────────────────────────────┐9│ Email │ CustID │ │CustID│Name │Email │10├───────────────────────────┤ ├─────────────────────────────┤11│ alice@mail.com │ 5 │ ──► │ 5 │Alice │alice@mail │12│ bob@mail.com │ 3 │ ──► │ 3 │Bob │bob@mail │13│ carol@mail.com │ 1 │ ──► │ 1 │Carol │carol@mail │14└───────────────────────────┘ └─────────────────────────────┘4. Covering Index với INCLUDE
What is Covering Index?
Covering Index = Index chứa TẤT CẢ columns cần cho query, không cần lookup về data.
SQL
1-- Query này cần CustomerID, OrderDate, TotalAmount2SELECT CustomerID, OrderDate, TotalAmount3FROM Sales.[Order]4WHERE CustomerID = 5;56-- Index chỉ có CustomerID7CREATE INDEX IX_Order_CustomerID 8ON Sales.[Order](CustomerID);9-- Cần Key Lookup để lấy OrderDate, TotalAmount → Slow!1011-- Covering Index với INCLUDE12CREATE INDEX IX_Order_CustomerID_Covering13ON Sales.[Order](CustomerID)14INCLUDE (OrderDate, TotalAmount);15-- Không cần lookup → Fast!INCLUDE vs Key Columns
| Key Columns | INCLUDE Columns |
|---|---|
| Used for searching | Only for retrieval |
| Sorted in index | Not sorted |
| Limited to 16 columns | No limit |
| Trong B-tree all levels | Only in leaf level |
SQL
1-- Key columns: columns in WHERE, JOIN, ORDER BY2-- INCLUDE columns: columns in SELECT only34-- Example5SELECT FirstName, LastName, Phone -- Include these6FROM Customer.Customer7WHERE Email = 'test@email.com' -- Key column8ORDER BY LastName; -- Key column910CREATE INDEX IX_Customer_Email_LastName11ON Customer.Customer(Email, LastName)12INCLUDE (FirstName, Phone);5. Filtered Index
Filtered Index = Index chỉ cho một subset của data (với WHERE clause).
SQL
1-- Filtered index cho active customers only2CREATE INDEX IX_Customer_Email_Active3ON Customer.Customer(Email)4WHERE IsActive = 1;56-- Filtered index cho non-null values7CREATE INDEX IX_Order_ShipDate8ON Sales.[Order](ShipDate)9WHERE ShipDate IS NOT NULL;1011-- Filtered index cho specific status12CREATE INDEX IX_Order_Pending13ON Sales.[Order](CustomerID, OrderDate)14WHERE Status = 'Pending';Benefits of Filtered Index
| Benefit | Description |
|---|---|
| Smaller size | Less storage, faster scans |
| Better maintenance | Faster to update |
| Targeted optimization | Perfect for specific queries |
SQL
1-- Scenario: 99% orders are 'Delivered', 1% are 'Pending'2-- Full index wastes space for Delivered orders34-- Filtered index only includes Pending5CREATE INDEX IX_Order_Pending6ON Sales.[Order](OrderDate)7WHERE Status = 'Pending';8-- Small, fast, perfect for monitoring queries6. Index Management
6.1 Xem Index Info
SQL
1-- Xem tất cả indexes trong database2SELECT 3 OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName,4 OBJECT_NAME(i.object_id) AS TableName,5 i.name AS IndexName,6 i.type_desc AS IndexType,7 i.is_unique,8 i.is_primary_key,9 i.fill_factor10FROM sys.indexes i11WHERE i.object_id > 100 -- User tables only12ORDER BY SchemaName, TableName, IndexName;1314-- Index columns detail15SELECT 16 i.name AS IndexName,17 COL_NAME(ic.object_id, ic.column_id) AS ColumnName,18 ic.key_ordinal,19 ic.is_included_column,20 ic.is_descending_key21FROM sys.indexes i22JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id23WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader')24ORDER BY i.name, ic.key_ordinal;6.2 Index Usage Statistics
SQL
1-- Xem index được sử dụng như thế nào2SELECT 3 OBJECT_NAME(s.object_id) AS TableName,4 i.name AS IndexName,5 i.type_desc AS IndexType,6 s.user_seeks, -- Index seek operations7 s.user_scans, -- Index scan operations8 s.user_lookups, -- Bookmark lookups9 s.user_updates, -- Updates to index10 s.last_user_seek,11 s.last_user_scan12FROM sys.dm_db_index_usage_stats s13JOIN sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id14WHERE s.database_id = DB_ID()15ORDER BY s.user_seeks + s.user_scans DESC;1617-- Tìm unused indexes (candidates for removal)18SELECT 19 OBJECT_NAME(i.object_id) AS TableName,20 i.name AS IndexName,21 i.type_desc,22 s.user_seeks,23 s.user_scans,24 s.user_lookups,25 s.user_updates26FROM sys.indexes i27LEFT JOIN sys.dm_db_index_usage_stats s 28 ON i.object_id = s.object_id AND i.index_id = s.index_id29WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 130 AND i.type_desc = 'NONCLUSTERED'31 AND (s.user_seeks IS NULL OR s.user_seeks = 0)32 AND (s.user_scans IS NULL OR s.user_scans = 0);6.3 Missing Index Suggestions
SQL
1-- SQL Server tracks missing indexes2SELECT 3 OBJECT_NAME(mid.object_id) AS TableName,4 mid.equality_columns,5 mid.inequality_columns,6 mid.included_columns,7 migs.avg_user_impact,8 migs.user_seeks9FROM sys.dm_db_missing_index_details mid10JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle11JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle12WHERE mid.database_id = DB_ID()13ORDER BY migs.avg_user_impact * migs.user_seeks DESC;6.4 Index Fragmentation
SQL
1-- Check fragmentation2SELECT 3 OBJECT_NAME(ps.object_id) AS TableName,4 i.name AS IndexName,5 ps.index_type_desc,6 ps.avg_fragmentation_in_percent,7 ps.page_count8FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps9JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id10WHERE ps.avg_fragmentation_in_percent > 1011ORDER BY ps.avg_fragmentation_in_percent DESC;1213-- Rebuild vs Reorganize14-- < 10% fragmentation: Do nothing15-- 10-30% fragmentation: REORGANIZE16-- > 30% fragmentation: REBUILD1718-- Reorganize (online, less resource intensive)19ALTER INDEX IX_Customer_Email ON Customer.Customer REORGANIZE;2021-- Rebuild (more thorough, can be online with Enterprise)22ALTER INDEX IX_Customer_Email ON Customer.Customer REBUILD;23ALTER INDEX IX_Customer_Email ON Customer.Customer REBUILD WITH (ONLINE = ON);2425-- Rebuild all indexes on table26ALTER INDEX ALL ON Customer.Customer REBUILD;7. AdventureWorks Index Analysis
SQL
1-- Xem indexes trong Sales.SalesOrderHeader2SELECT 3 i.name AS IndexName,4 i.type_desc,5 i.is_unique,6 STRING_AGG(7 COL_NAME(ic.object_id, ic.column_id) + 8 CASE WHEN ic.is_included_column = 1 THEN ' (INCLUDE)' ELSE '' END,9 ', '10 ) WITHIN GROUP (ORDER BY ic.key_ordinal) AS Columns11FROM sys.indexes i12JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id13WHERE i.object_id = OBJECT_ID('Sales.SalesOrderHeader')14GROUP BY i.name, i.type_desc, i.is_unique15ORDER BY i.type_desc DESC;1617-- Analyze query với Execution Plan18SET STATISTICS IO ON;1920-- Query without proper index21SELECT * FROM Sales.SalesOrderHeader22WHERE CustomerID = 29825;2324-- Check if CustomerID has index25-- Yes: IX_SalesOrderHeader_CustomerID2627SET STATISTICS IO OFF;8. Index Design Best Practices
8.1 When to Create Index
SQL
1-- ✅ Create indexes on:2-- 1. Foreign Keys (for JOINs)3CREATE INDEX IX_Order_CustomerID ON Sales.[Order](CustomerID);45-- 2. WHERE clause columns (frequently filtered)6CREATE INDEX IX_Order_Status ON Sales.[Order](Status);78-- 3. ORDER BY columns9CREATE INDEX IX_Order_OrderDate ON Sales.[Order](OrderDate DESC);1011-- 4. Columns in aggregate functions with GROUP BY12CREATE INDEX IX_OrderDetail_ProductID ON Sales.OrderDetail(ProductID);8.2 When NOT to Create Index
SQL
1-- ❌ Don't index:2-- 1. Small tables (< 1000 rows)3-- 2. Highly updated columns (write-heavy)4-- 3. Low selectivity columns (Gender, Status với few values)5-- 4. Wide columns (NVARCHAR(MAX), TEXT)67-- Bad: Index on column with low selectivity8CREATE INDEX IX_Customer_Gender ON Customer.Customer(Gender);9-- Only 2-3 distinct values, not selective1011-- Bad: Index on frequently updated column12CREATE INDEX IX_Order_LastModified ON Sales.[Order](ModifiedDate);13-- Updated on every change8.3 Index Design Guidelines
markdown
1## Index Guidelines2 3### Column Order in Composite Index41. Equality columns first (= )52. Inequality columns last (>, <, BETWEEN)63. Most selective columns first7 8### Example9Query: WHERE Status = 'Active' AND OrderDate > '2024-01-01'10 11Good: INDEX(Status, OrderDate) -- Equality first12Bad: INDEX(OrderDate, Status) -- Inequality first13 14### Key Size15- Keep index key narrow (< 16 bytes ideal)16- Use INCLUDE for wide columns17 18### Number of Indexes19- OLTP: Fewer indexes (3-5 per table)20- OLAP: More indexes acceptable21- Monitor unused indexes regularly9. Hands-on Exercise
Exercise: Optimize Queries with Indexes
SQL
1-- Scenario: E-commerce database với queries chậm2-- Analyze và tạo indexes phù hợp34-- Query 1: Tìm orders của customer5SELECT o.*, c.FirstName, c.LastName6FROM Sales.[Order] o7JOIN Customer.Customer c ON o.CustomerID = c.CustomerID8WHERE c.Email = 'customer@email.com';910-- Cần indexes:11CREATE INDEX IX_Customer_Email ON Customer.Customer(Email);12CREATE INDEX IX_Order_CustomerID ON Sales.[Order](CustomerID);1314-- Query 2: Orders theo date range và status15SELECT OrderID, OrderNumber, OrderDate, TotalAmount16FROM Sales.[Order]17WHERE OrderDate BETWEEN '2024-01-01' AND '2024-03-31'18 AND Status = 'Delivered'19ORDER BY OrderDate DESC;2021-- Cần covering index:22CREATE INDEX IX_Order_Status_Date23ON Sales.[Order](Status, OrderDate DESC)24INCLUDE (OrderNumber, TotalAmount);2526-- Query 3: Top products by sales27SELECT p.ProductID, p.ProductName, SUM(od.Quantity) AS TotalSold28FROM Inventory.Product p29JOIN Sales.OrderDetail od ON p.ProductID = od.ProductID30GROUP BY p.ProductID, p.ProductName31ORDER BY TotalSold DESC;3233-- Cần indexes:34CREATE INDEX IX_OrderDetail_ProductID 35ON Sales.OrderDetail(ProductID)36INCLUDE (Quantity);3738-- Query 4: Pending orders (small subset)39SELECT * FROM Sales.[Order]40WHERE Status = 'Pending'41ORDER BY OrderDate;4243-- Filtered index cho subset:44CREATE INDEX IX_Order_Pending_Date45ON Sales.[Order](OrderDate)46INCLUDE (OrderNumber, CustomerID, TotalAmount)47WHERE Status = 'Pending';10. Summary
markdown
1## Index Checklist2 3### Must Have4- ✅ Clustered index on every table (usually PK)5- ✅ Indexes on all Foreign Key columns6- ✅ Indexes for frequently used WHERE columns7 8### Consider Adding9- Covering indexes for critical queries10- Filtered indexes for subset queries11- Composite indexes for multi-column conditions12 13### Maintenance14- Monitor fragmentation monthly15- Review unused indexes quarterly16- Check missing index suggestions17- Test before production deployment18 19### Performance Impact20| Operation | Impact of More Indexes |21|-----------|----------------------|22| SELECT | Faster |23| INSERT | Slower |24| UPDATE | Slower |25| DELETE | Slower |Tiếp theo
Bài tiếp theo: Performance Tuning - học cách phân tích và tối ưu hóa performance của SQL Server!
