🎯 Mục tiêu bài học
Sau bài học này, bạn sẽ:
- ✅ Đọc và phân tích Execution Plans
- ✅ Sử dụng SET STATISTICS IO/TIME để đo performance
- ✅ Áp dụng các kỹ thuật tối ưu query
- ✅ Hiểu và xử lý Parameter Sniffing, Implicit Conversion
- ✅ Sử dụng Query Store để monitor performance
🔍 Performance Analysis Overview
Performance Tuning Areas
Checkpoint
Performance Tuning bao gồm những lĩnh vực chính nào? Mỗi lĩnh vực giải quyết vấn đề gì?
📊 Execution Plans
Đọc Execution Plan
1-- Hiển thị Estimated Execution Plan2SET SHOWPLAN_ALL ON;3GO4SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID = 29825;5GO6SET SHOWPLAN_ALL OFF;7GO89-- Hiển thị Actual Execution Plan (in SSMS: Ctrl + M)10SET STATISTICS PROFILE ON;11GO12SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID = 29825;13GO14SET STATISTICS PROFILE OFF;1516-- XML format17SET STATISTICS XML ON;18GO19SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID = 29825;20GO21SET STATISTICS XML OFF;Common Operators
| Operator | Description | Good/Bad |
|---|---|---|
| Index Seek | Direct lookup | ✅ Very Good |
| Index Scan | Scan entire index | ⚠️ Check |
| Table Scan | Scan entire table | ❌ Bad |
| Clustered Index Seek | Direct to data | ✅ Best |
| Key Lookup | Extra lookup | ⚠️ Consider covering index |
| Hash Match | JOIN/Aggregate | ⚠️ Memory-intensive |
| Nested Loops | JOIN small sets | ✅ Good for small |
| Sort | ORDER BY | ⚠️ Can be expensive |
Phát hiện vấn đề từ Execution Plan
1-- Thick arrows = nhiều rows đi qua2-- Yellow bang icon = warnings3-- High % cost = bottleneck45-- Ví dụ: Query với Table Scan6SELECT * FROM Sales.SalesOrderHeader7WHERE YEAR(OrderDate) = 2014; -- Function on column prevents index use!89-- Fix: Rewrite để sử dụng index10SELECT * FROM Sales.SalesOrderHeader11WHERE OrderDate >= '2014-01-01' AND OrderDate < '2015-01-01';Checkpoint
Execution Plan là gì? Liệt kê các operators phổ biến và đánh giá tốt/xấu.
📈 Query Statistics
SET STATISTICS
1-- I/O Statistics2SET STATISTICS IO ON;3GO45SELECT c.CustomerID, c.PersonID, COUNT(*) AS OrderCount6FROM Sales.Customer c7JOIN Sales.SalesOrderHeader soh ON c.CustomerID = soh.CustomerID8GROUP BY c.CustomerID, c.PersonID;910GO11SET STATISTICS IO OFF;1213/*14Output:15Table 'SalesOrderHeader'. Scan count 1, logical reads 68916Table 'Customer'. Scan count 1, logical reads 1231718Logical reads = pages read from cache19Physical reads = pages read from disk (slow!)20*/2122-- Time Statistics23SET STATISTICS TIME ON;24GO25SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID = 29825;26GO27SET STATISTICS TIME OFF;2829/*30Output:31CPU time = 0 ms, elapsed time = 1 ms32*/Monitor Long Running Queries
1-- Tìm queries đang chạy lâu2SELECT 3 r.session_id,4 r.status,5 r.command,6 r.cpu_time,7 r.total_elapsed_time / 1000.0 AS elapsed_seconds,8 r.logical_reads,9 r.writes,10 t.text AS query_text,11 qp.query_plan12FROM sys.dm_exec_requests r13CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t14CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) qp15WHERE r.session_id > 50 -- User sessions16ORDER BY r.total_elapsed_time DESC;1718-- Top resource-consuming queries (từ cache)19SELECT TOP 2020 total_worker_time / execution_count AS avg_cpu,21 total_logical_reads / execution_count AS avg_reads,22 total_elapsed_time / execution_count AS avg_duration,23 execution_count,24 SUBSTRING(st.text, (qs.statement_start_offset/2)+1,25 ((CASE qs.statement_end_offset26 WHEN -1 THEN DATALENGTH(st.text)27 ELSE qs.statement_end_offset28 END - qs.statement_start_offset)/2) + 1) AS query_text29FROM sys.dm_exec_query_stats qs30CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st31ORDER BY avg_cpu DESC;Checkpoint
SET STATISTICS IO cho biết thông tin gì? Logical reads và Physical reads khác nhau thế nào?
⚡ Query Optimization Techniques
Avoid Function on Columns
1-- ❌ Bad: Function prevents index usage2SELECT * FROM Customer.Customer3WHERE UPPER(Email) = 'TEST@EMAIL.COM';45-- ✅ Good: Store normalized or use computed column6SELECT * FROM Customer.Customer7WHERE Email = 'test@email.com'; -- Case-insensitive by collation89-- Or add computed column with index10ALTER TABLE Customer.Customer11ADD EmailLower AS LOWER(Email) PERSISTED;1213CREATE INDEX IX_Customer_EmailLower ON Customer.Customer(EmailLower);1415-- ❌ Bad: Function on date16SELECT * FROM Sales.[Order]17WHERE YEAR(OrderDate) = 2024 AND MONTH(OrderDate) = 1;1819-- ✅ Good: Use range20SELECT * FROM Sales.[Order]21WHERE OrderDate >= '2024-01-01' AND OrderDate < '2024-02-01';Avoid SELECT *
1-- ❌ Bad: Returns unnecessary columns2SELECT * FROM Sales.SalesOrderHeader3WHERE CustomerID = 29825;45-- ✅ Good: Only needed columns6SELECT SalesOrderID, OrderDate, Status, TotalDue7FROM Sales.SalesOrderHeader8WHERE CustomerID = 29825;910-- Benefits:11-- 1. Less data to transfer12-- 2. Can use covering index13-- 3. Clear documentation of what's neededUse EXISTS instead of IN (for large lists)
1-- ❌ Potentially slow với large subquery2SELECT * FROM Production.Product3WHERE ProductID IN (4 SELECT ProductID FROM Sales.SalesOrderDetail5);67-- ✅ Better: EXISTS (stops at first match)8SELECT * FROM Production.Product p9WHERE EXISTS (10 SELECT 1 FROM Sales.SalesOrderDetail sod11 WHERE sod.ProductID = p.ProductID12);1314-- Note: Modern optimizer often handles this, but EXISTS is clearer intentProper JOIN Order and Hints
1-- Let optimizer choose join order (usually best)2SELECT c.CustomerID, COUNT(*) AS OrderCount3FROM Sales.Customer c4JOIN Sales.SalesOrderHeader soh ON c.CustomerID = soh.CustomerID5GROUP BY c.CustomerID;67-- Force join order (rare, only when optimizer fails)8SELECT c.CustomerID, COUNT(*) AS OrderCount9FROM Sales.Customer c10INNER JOIN Sales.SalesOrderHeader soh ON c.CustomerID = soh.CustomerID11OPTION (FORCE ORDER);1213-- Join hints (use sparingly)14SELECT *15FROM Sales.Customer c16INNER HASH JOIN Sales.SalesOrderHeader soh ON c.CustomerID = soh.CustomerID;17-- LOOP JOIN, MERGE JOIN, HASH JOINOptimize Pagination
1-- ❌ Old method: OFFSET can be slow for large pages2SELECT *3FROM Sales.SalesOrderHeader4ORDER BY SalesOrderID5OFFSET 100000 ROWS FETCH NEXT 20 ROWS ONLY;6-- Must scan 100,000 rows first!78-- ✅ Better: Keyset pagination9SELECT TOP 20 *10FROM Sales.SalesOrderHeader11WHERE SalesOrderID > @LastSalesOrderID -- From previous page12ORDER BY SalesOrderID;13-- Direct seek to starting pointCheckpoint
Liệt kê ít nhất 4 kỹ thuật tối ưu query. Tại sao không nên dùng function trên indexed column?
🔧 Index Tuning
Identify Missing Indexes
1-- SQL Server's missing index suggestions2SELECT 3 ROUND(migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans), 0) AS improvement_measure,4 'CREATE INDEX IX_' + OBJECT_NAME(mid.object_id) + '_' + 5 REPLACE(REPLACE(REPLACE(ISNULL(mid.equality_columns,''), ', ', '_'), '[', ''), ']', '') AS create_statement,6 mid.equality_columns,7 mid.inequality_columns,8 mid.included_columns,9 migs.user_seeks,10 migs.user_scans,11 migs.avg_user_impact12FROM sys.dm_db_missing_index_group_stats migs13JOIN sys.dm_db_missing_index_groups mig ON migs.group_handle = mig.index_group_handle14JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle15WHERE mid.database_id = DB_ID()16ORDER BY improvement_measure DESC;Identify Unused Indexes
1-- Find indexes with low usage2SELECT 3 OBJECT_NAME(i.object_id) AS TableName,4 i.name AS IndexName,5 i.type_desc,6 us.user_seeks,7 us.user_scans,8 us.user_lookups,9 us.user_updates,10 -- Ratio of reads vs writes11 CASE WHEN (us.user_seeks + us.user_scans + us.user_lookups) = 0 12 THEN 0 13 ELSE ROUND(us.user_updates * 1.0 / (us.user_seeks + us.user_scans + us.user_lookups), 2)14 END AS WriteToReadRatio15FROM sys.indexes i16LEFT JOIN sys.dm_db_index_usage_stats us 17 ON i.object_id = us.object_id AND i.index_id = us.index_id18WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 119 AND i.index_id > 1 -- Non-clustered only20ORDER BY (us.user_seeks + us.user_scans + us.user_lookups) ASC;Index Maintenance Schedule
1-- Reorganize or Rebuild based on fragmentation2DECLARE @TableName NVARCHAR(256), @IndexName NVARCHAR(256), @Fragmentation FLOAT;34DECLARE index_cursor CURSOR FOR5SELECT 6 QUOTENAME(OBJECT_SCHEMA_NAME(ps.object_id)) + '.' + QUOTENAME(OBJECT_NAME(ps.object_id)),7 i.name,8 ps.avg_fragmentation_in_percent9FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps10JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id11WHERE ps.avg_fragmentation_in_percent > 1012 AND ps.page_count > 100013 AND i.name IS NOT NULL;1415OPEN index_cursor;16FETCH NEXT FROM index_cursor INTO @TableName, @IndexName, @Fragmentation;1718WHILE @@FETCH_STATUS = 019BEGIN20 IF @Fragmentation < 3021 EXEC('ALTER INDEX ' + @IndexName + ' ON ' + @TableName + ' REORGANIZE');22 ELSE23 EXEC('ALTER INDEX ' + @IndexName + ' ON ' + @TableName + ' REBUILD');24 25 FETCH NEXT FROM index_cursor INTO @TableName, @IndexName, @Fragmentation;26END2728CLOSE index_cursor;29DEALLOCATE index_cursor;Checkpoint
Làm sao tìm missing indexes và unused indexes? Tại sao cần xóa unused indexes?
📉 Statistics
Understanding Statistics
1-- Statistics giúp Query Optimizer estimate row counts2-- Outdated statistics = bad execution plans34-- Xem statistics của table5SELECT 6 s.name AS StatisticsName,7 STATS_DATE(s.object_id, s.stats_id) AS LastUpdated,8 s.auto_created,9 s.user_created10FROM sys.stats s11WHERE s.object_id = OBJECT_ID('Sales.SalesOrderHeader');1213-- Xem chi tiết statistics14DBCC SHOW_STATISTICS('Sales.SalesOrderHeader', 'IX_SalesOrderHeader_CustomerID');Update Statistics
1-- Update statistics cho 1 table2UPDATE STATISTICS Sales.SalesOrderHeader;34-- Update với FULLSCAN (more accurate, slower)5UPDATE STATISTICS Sales.SalesOrderHeader WITH FULLSCAN;67-- Update all statistics in database8EXEC sp_updatestats;910-- Auto update settings11ALTER DATABASE AdventureWorks2019 12SET AUTO_UPDATE_STATISTICS ON;1314ALTER DATABASE AdventureWorks2019 15SET AUTO_CREATE_STATISTICS ON;Checkpoint
Statistics ảnh hưởng thế nào đến Query Optimizer? Khi nào cần update statistics?
⚠️ Common Performance Problems
Parameter Sniffing
1-- Problem: First execution caches plan for specific parameter2-- Subsequent executions may use suboptimal plan34CREATE PROCEDURE Sales.usp_GetOrdersByCustomer5 @CustomerID INT6AS7BEGIN8 SELECT * FROM Sales.SalesOrderHeader9 WHERE CustomerID = @CustomerID;10END;1112-- Customer 1 has 1 order, Customer 29825 has 10,000 orders13-- Plan cached for first customer may be wrong for second1415-- Solutions:1617-- 1. RECOMPILE (for every execution)18CREATE PROCEDURE Sales.usp_GetOrdersByCustomer19 @CustomerID INT20AS21BEGIN22 SELECT * FROM Sales.SalesOrderHeader23 WHERE CustomerID = @CustomerID24 OPTION (RECOMPILE);25END;2627-- 2. OPTIMIZE FOR28CREATE PROCEDURE Sales.usp_GetOrdersByCustomer29 @CustomerID INT30AS31BEGIN32 SELECT * FROM Sales.SalesOrderHeader33 WHERE CustomerID = @CustomerID34 OPTION (OPTIMIZE FOR (@CustomerID = 29825)); -- Typical value35END;3637-- 3. Local variable (loses parameter sniffing)38CREATE PROCEDURE Sales.usp_GetOrdersByCustomer39 @CustomerID INT40AS41BEGIN42 DECLARE @LocalCustID INT = @CustomerID;43 44 SELECT * FROM Sales.SalesOrderHeader45 WHERE CustomerID = @LocalCustID;46END;Implicit Conversion
1-- ❌ Problem: Data type mismatch causes conversion2-- CustomerID is INT, but passed as VARCHAR34DECLARE @CustID VARCHAR(10) = '29825';5SELECT * FROM Sales.Customer WHERE CustomerID = @CustID;6-- Implicit conversion! Index may not be used efficiently78-- ✅ Solution: Use correct data type9DECLARE @CustID INT = 29825;10SELECT * FROM Sales.Customer WHERE CustomerID = @CustID;1112-- Check for implicit conversions in execution plan13-- Look for CONVERT_IMPLICIT warningTable Variables vs Temp Tables
1-- Table Variables: No statistics, always estimated 1 row2DECLARE @Orders TABLE (OrderID INT);3INSERT INTO @Orders SELECT SalesOrderID FROM Sales.SalesOrderHeader;45SELECT * FROM @Orders o6JOIN Sales.SalesOrderDetail sod ON o.OrderID = sod.SalesOrderID;7-- May use wrong join strategy!89-- Temp Tables: Have statistics, better for large data10CREATE TABLE #Orders (OrderID INT);11INSERT INTO #Orders SELECT SalesOrderID FROM Sales.SalesOrderHeader;1213-- Update statistics14UPDATE STATISTICS #Orders;1516SELECT * FROM #Orders o17JOIN Sales.SalesOrderDetail sod ON o.OrderID = sod.SalesOrderID;18-- Better execution plan1920DROP TABLE #Orders;2122-- Rule: < 100 rows → Table Variable, > 100 rows → Temp TableCheckpoint
Parameter Sniffing là gì và có những cách giải quyết nào? Khi nào dùng Table Variable vs Temp Table?
🗄️ Query Store (SQL Server 2016+)
1-- Enable Query Store2ALTER DATABASE AdventureWorks20193SET QUERY_STORE = ON;45-- Configure Query Store6ALTER DATABASE AdventureWorks20197SET QUERY_STORE (8 OPERATION_MODE = READ_WRITE,9 CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),10 DATA_FLUSH_INTERVAL_SECONDS = 900,11 INTERVAL_LENGTH_MINUTES = 60,12 MAX_STORAGE_SIZE_MB = 1000,13 QUERY_CAPTURE_MODE = AUTO14);1516-- View Query Store data17SELECT 18 q.query_id,19 qt.query_sql_text,20 rs.count_executions,21 rs.avg_duration / 1000.0 AS avg_duration_ms,22 rs.avg_logical_io_reads,23 rs.avg_cpu_time / 1000.0 AS avg_cpu_ms24FROM sys.query_store_query q25JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id26JOIN sys.query_store_plan p ON q.query_id = p.query_id27JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id28ORDER BY rs.avg_duration DESC;2930-- Force a specific plan31EXEC sp_query_store_force_plan @query_id = 1, @plan_id = 1;Checkpoint
Query Store dùng để làm gì? Cách force một execution plan cụ thể?
📋 Performance Monitoring Checklist
Daily Monitoring
- Check for blocking sessions
- Review long-running queries
- Monitor TempDB usage
- Check error logs
Weekly Monitoring
- Review missing index suggestions
- Check index fragmentation
- Analyze top resource-consuming queries
- Review Query Store reports
Monthly Monitoring
- Update statistics (if not auto)
- Rebuild/reorganize fragmented indexes
- Review unused indexes for removal
- Capacity planning review
Queries to Run
1-- Blocking sessions2SELECT 3 blocking_session_id,4 session_id,5 wait_type,6 wait_time,7 wait_resource8FROM sys.dm_exec_requests9WHERE blocking_session_id <> 0;1011-- TempDB usage12SELECT 13 SUM(user_object_reserved_page_count) * 8 / 1024 AS user_objects_mb,14 SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,15 SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb16FROM sys.dm_db_file_space_usage;1718-- Wait statistics19SELECT TOP 1020 wait_type,21 waiting_tasks_count,22 wait_time_ms,23 signal_wait_time_ms24FROM sys.dm_os_wait_stats25WHERE wait_type NOT LIKE '%SLEEP%'26ORDER BY wait_time_ms DESC;Checkpoint
Liệt kê những gì cần monitor hàng ngày, hàng tuần, hàng tháng cho database performance.
💡 Summary Best Practices
Query Level
- ✅ Review execution plans regularly
- ✅ Avoid SELECT *
- ✅ Avoid functions on indexed columns
- ✅ Use appropriate data types
- ✅ Parameterize queries
Index Level
- ✅ Create indexes for foreign keys
- ✅ Use covering indexes for key queries
- ✅ Remove unused indexes
- ✅ Maintain index fragmentation
Database Level
- ✅ Enable Query Store
- ✅ Keep statistics updated
- ✅ Monitor wait statistics
- ✅ Regular maintenance plans
Design Level
- ✅ Proper normalization
- ✅ Choose right data types
- ✅ Partition large tables
- ✅ Archive old data
Checkpoint
Tóm tắt best practices cho Performance Tuning ở từng level: Query, Index, Database, Design.
🚀 Bài tiếp theo
Bài tiếp theo: Database Security
Học cách bảo mật database với SQL Server authentication, permissions, và encryption!
