Bài 4: Subqueries
1. Subquery là gì?
Subquery (nested query) là query nằm bên trong query khác:
1SELECT column12FROM table13WHERE column2 = (SELECT column2 FROM table2 WHERE ...);4 └─────────────────────────────────────┘5 SubqueryKhi nào dùng Subquery?
| Scenario | Ví dụ |
|---|---|
| So sánh với aggregate | Tìm products giá cao hơn trung bình |
| Lọc dựa trên list | Tìm customers đã mua product X |
| Tạo derived table | Complex aggregations |
| Kiểm tra existence | Tìm products chưa bao giờ được bán |
2. Scalar Subquery
2.1 Concept
Subquery trả về một giá trị duy nhất (single row, single column).
1-- Scalar subquery trong WHERE2SELECT *3FROM Production.Product4WHERE ListPrice > (SELECT AVG(ListPrice) FROM Production.Product);5 └──────────────────────────────────────────────┘6 Returns: 438.66 (single value)2.2 Examples
Products đắt hơn giá trung bình:
1SELECT 2 Name,3 ListPrice,4 (SELECT AVG(ListPrice) FROM Production.Product WHERE ListPrice > 0) AS AvgPrice5FROM Production.Product6WHERE ListPrice > (SELECT AVG(ListPrice) FROM Production.Product WHERE ListPrice > 0)7ORDER BY ListPrice DESC;Đơn hàng lớn nhất:
1SELECT 2 SalesOrderID,3 CustomerID,4 TotalDue5FROM Sales.SalesOrderHeader6WHERE TotalDue = (SELECT MAX(TotalDue) FROM Sales.SalesOrderHeader);So sánh với tổng:
1-- Tỷ lệ % doanh thu của mỗi Territory2SELECT 3 st.Name AS Territory,4 SUM(soh.TotalDue) AS TerritoryRevenue,5 CAST(SUM(soh.TotalDue) * 100.0 / 6 (SELECT SUM(TotalDue) FROM Sales.SalesOrderHeader) AS DECIMAL(5,2)) AS RevenuePercent7FROM Sales.SalesOrderHeader soh8JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID9GROUP BY st.TerritoryID, st.Name10ORDER BY TerritoryRevenue DESC;3. Column Subquery (IN, NOT IN)
3.1 Concept
Subquery trả về một column (nhiều rows, một column).
Thường dùng với IN, NOT IN, ANY, ALL.
3.2 IN Subquery
1-- Customers đã mua Bikes2SELECT DISTINCT3 c.CustomerID,4 p.FirstName + ' ' + p.LastName AS CustomerName5FROM Sales.Customer c6JOIN Person.Person p ON c.PersonID = p.BusinessEntityID7WHERE c.CustomerID IN (8 SELECT DISTINCT soh.CustomerID9 FROM Sales.SalesOrderHeader soh10 JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID11 JOIN Production.Product prod ON sod.ProductID = prod.ProductID12 JOIN Production.ProductSubcategory ps ON prod.ProductSubcategoryID = ps.ProductSubcategoryID13 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID14 WHERE pc.Name = 'Bikes'15);3.3 NOT IN Subquery
1-- Products chưa bao giờ được đặt hàng2SELECT 3 ProductID,4 Name,5 ListPrice6FROM Production.Product7WHERE ProductID NOT IN (8 SELECT DISTINCT ProductID 9 FROM Sales.SalesOrderDetail10)11AND ListPrice > 012ORDER BY ListPrice DESC;Nếu subquery trả về NULL, NOT IN sẽ không hoạt động như mong đợi. Luôn exclude NULL hoặc dùng NOT EXISTS.
1-- ✅ An toàn hơn: exclude NULL trong subquery2WHERE ProductID NOT IN (3 SELECT DISTINCT ProductID 4 FROM Sales.SalesOrderDetail5 WHERE ProductID IS NOT NULL6)3.4 ANY / ALL
1-- Products đắt hơn BẤT KỲ product nào trong category Bikes2SELECT Name, ListPrice3FROM Production.Product4WHERE ListPrice > ANY (5 SELECT p.ListPrice6 FROM Production.Product p7 JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID8 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID9 WHERE pc.Name = 'Bikes'10);1112-- Products đắt hơn TẤT CẢ products trong category Clothing13SELECT Name, ListPrice14FROM Production.Product15WHERE ListPrice > ALL (16 SELECT p.ListPrice17 FROM Production.Product p18 JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID19 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID20 WHERE pc.Name = 'Clothing'21 AND p.ListPrice > 022);4. Table Subquery (Derived Table)
4.1 Concept
Subquery trong FROM clause, tạo thành temporary table.
1SELECT *2FROM (3 SELECT column1, column24 FROM table15 WHERE condition6) AS DerivedTable;4.2 Examples
Top customers mỗi territory:
1SELECT 2 Territory,3 CustomerName,4 TotalSpent5FROM (6 SELECT 7 st.Name AS Territory,8 p.FirstName + ' ' + p.LastName AS CustomerName,9 SUM(soh.TotalDue) AS TotalSpent,10 ROW_NUMBER() OVER (PARTITION BY st.TerritoryID ORDER BY SUM(soh.TotalDue) DESC) AS Rank11 FROM Sales.SalesOrderHeader soh12 JOIN Sales.Customer c ON soh.CustomerID = c.CustomerID13 JOIN Person.Person p ON c.PersonID = p.BusinessEntityID14 JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID15 GROUP BY st.TerritoryID, st.Name, c.CustomerID, p.FirstName, p.LastName16) AS RankedCustomers17WHERE Rank <= 318ORDER BY Territory, Rank;Aggregation trên aggregation:
1-- Average của monthly sales2SELECT 3 AVG(MonthlySales) AS AvgMonthlySales,4 MIN(MonthlySales) AS MinMonthlySales,5 MAX(MonthlySales) AS MaxMonthlySales6FROM (7 SELECT 8 YEAR(OrderDate) AS Year,9 MONTH(OrderDate) AS Month,10 SUM(TotalDue) AS MonthlySales11 FROM Sales.SalesOrderHeader12 GROUP BY YEAR(OrderDate), MONTH(OrderDate)13) AS MonthlySalesData;5. Correlated Subquery
5.1 Concept
Subquery tham chiếu đến outer query. Chạy một lần cho mỗi row của outer query.
1SELECT *2FROM table1 t13WHERE column1 = (4 SELECT MAX(column1)5 FROM table1 t26 WHERE t2.category = t1.category -- Reference to outer query7);5.2 Examples
Product đắt nhất mỗi Category:
1SELECT 2 p.Name AS ProductName,3 p.ListPrice,4 pc.Name AS Category5FROM Production.Product p6JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID7JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID8WHERE p.ListPrice = (9 SELECT MAX(p2.ListPrice)10 FROM Production.Product p211 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID12 WHERE ps2.ProductCategoryID = ps.ProductCategoryID13)14ORDER BY pc.Name;Employees earning above department average:
1SELECT 2 e.BusinessEntityID,3 p.FirstName + ' ' + p.LastName AS EmployeeName,4 d.Name AS Department,5 eph.Rate AS HourlyRate6FROM HumanResources.Employee e7JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID8JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID9JOIN HumanResources.Department d ON edh.DepartmentID = d.DepartmentID10JOIN HumanResources.EmployeePayHistory eph ON e.BusinessEntityID = eph.BusinessEntityID11WHERE edh.EndDate IS NULL12AND eph.Rate > (13 SELECT AVG(eph2.Rate)14 FROM HumanResources.Employee e215 JOIN HumanResources.EmployeeDepartmentHistory edh2 ON e2.BusinessEntityID = edh2.BusinessEntityID16 JOIN HumanResources.EmployeePayHistory eph2 ON e2.BusinessEntityID = eph2.BusinessEntityID17 WHERE edh2.DepartmentID = edh.DepartmentID -- Correlated!18 AND edh2.EndDate IS NULL19)20ORDER BY d.Name, eph.Rate DESC;6. EXISTS / NOT EXISTS
6.1 Concept
Kiểm tra sự tồn tại của rows trong subquery. Thường nhanh hơn IN với large datasets.
1SELECT *2FROM table1 t13WHERE EXISTS (4 SELECT 15 FROM table2 t26 WHERE t2.key = t1.key7);6.2 Examples
Customers đã đặt hàng (EXISTS):
1SELECT 2 c.CustomerID,3 p.FirstName + ' ' + p.LastName AS CustomerName4FROM Sales.Customer c5JOIN Person.Person p ON c.PersonID = p.BusinessEntityID6WHERE EXISTS (7 SELECT 18 FROM Sales.SalesOrderHeader soh9 WHERE soh.CustomerID = c.CustomerID10);Products chưa được bán (NOT EXISTS):
1SELECT 2 p.ProductID,3 p.Name,4 p.ListPrice5FROM Production.Product p6WHERE NOT EXISTS (7 SELECT 18 FROM Sales.SalesOrderDetail sod9 WHERE sod.ProductID = p.ProductID10)11AND p.ListPrice > 012ORDER BY p.ListPrice DESC;6.3 EXISTS vs IN
1-- EXISTS: thường nhanh hơn với large subquery results2-- (dừng ngay khi tìm thấy match)34-- IN: tốt cho small, distinct lists5-- (phải evaluate toàn bộ list)7. Subquery trong SELECT
7.1 Scalar Subquery in SELECT
1-- Mỗi Product với Category average price2SELECT 3 p.Name AS ProductName,4 p.ListPrice,5 ps.Name AS Subcategory,6 (7 SELECT AVG(p2.ListPrice)8 FROM Production.Product p29 WHERE p2.ProductSubcategoryID = p.ProductSubcategoryID10 AND p2.ListPrice > 011 ) AS SubcategoryAvgPrice12FROM Production.Product p13JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID14WHERE p.ListPrice > 015ORDER BY ps.Name, p.ListPrice DESC;8. Subquery vs JOIN
Nhiều subqueries có thể viết lại bằng JOIN và ngược lại.
8.1 Comparison
1-- Subquery approach2SELECT ProductID, Name3FROM Production.Product4WHERE ProductSubcategoryID IN (5 SELECT ProductSubcategoryID6 FROM Production.ProductSubcategory7 WHERE Name LIKE '%Bikes%'8);910-- JOIN approach (thường preferred)11SELECT p.ProductID, p.Name12FROM Production.Product p13JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID14WHERE ps.Name LIKE '%Bikes%';8.2 Guidelines
| Use Subquery when | Use JOIN when |
|---|---|
| Checking existence | Cần columns từ nhiều tables |
| Comparing with aggregate | Better readability |
| NOT IN scenarios | Performance critical |
| Single value comparison |
9. Thực hành
Exercise 1: Scalar Subquery
1-- Tìm Products có giá cao hơn giá trung bình của Bikes category2-- Hiển thị: Name, ListPrice, BikeAvgPrice34-- YOUR CODE HERE💡 Xem đáp án
1SELECT 2 Name,3 ListPrice,4 (5 SELECT AVG(p2.ListPrice)6 FROM Production.Product p27 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID8 JOIN Production.ProductCategory pc2 ON ps2.ProductCategoryID = pc2.ProductCategoryID9 WHERE pc2.Name = 'Bikes'10 ) AS BikeAvgPrice11FROM Production.Product12WHERE ListPrice > (13 SELECT AVG(p2.ListPrice)14 FROM Production.Product p215 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID16 JOIN Production.ProductCategory pc2 ON ps2.ProductCategoryID = pc2.ProductCategoryID17 WHERE pc2.Name = 'Bikes'18)19ORDER BY ListPrice DESC;Exercise 2: IN Subquery
1-- Tìm Employees làm việc ở departments thuộc 'Research and Development'2-- Hiển thị: EmployeeName, JobTitle, Department34-- YOUR CODE HERE💡 Xem đáp án
1SELECT 2 p.FirstName + ' ' + p.LastName AS EmployeeName,3 e.JobTitle,4 d.Name AS Department5FROM HumanResources.Employee e6JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID7JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID8JOIN HumanResources.Department d ON edh.DepartmentID = d.DepartmentID9WHERE edh.EndDate IS NULL10AND edh.DepartmentID IN (11 SELECT DepartmentID12 FROM HumanResources.Department13 WHERE GroupName = 'Research and Development'14)15ORDER BY d.Name, p.LastName;Exercise 3: EXISTS
1-- Tìm Categories có ít nhất 1 product được bán2-- Hiển thị: CategoryName34-- YOUR CODE HERE💡 Xem đáp án
1SELECT pc.Name AS CategoryName2FROM Production.ProductCategory pc3WHERE EXISTS (4 SELECT 15 FROM Production.ProductSubcategory ps6 JOIN Production.Product p ON ps.ProductSubcategoryID = p.ProductSubcategoryID7 JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID8 WHERE ps.ProductCategoryID = pc.ProductCategoryID9)10ORDER BY pc.Name;Exercise 4: Correlated Subquery
1-- Đơn hàng đầu tiên của mỗi Customer2-- Hiển thị: CustomerID, FirstOrderDate, FirstOrderTotal34-- YOUR CODE HERE💡 Xem đáp án
1SELECT 2 soh.CustomerID,3 soh.OrderDate AS FirstOrderDate,4 soh.TotalDue AS FirstOrderTotal5FROM Sales.SalesOrderHeader soh6WHERE soh.OrderDate = (7 SELECT MIN(soh2.OrderDate)8 FROM Sales.SalesOrderHeader soh29 WHERE soh2.CustomerID = soh.CustomerID10)11ORDER BY soh.CustomerID;10. Tổng kết
| Subquery Type | Vị trí | Returns | Use Case |
|---|---|---|---|
| Scalar | WHERE, SELECT | Single value | So sánh với aggregate |
| Column (IN) | WHERE | List values | Filter by list |
| Table | FROM | Result set | Derived tables |
| Correlated | WHERE | Varies | Row-by-row comparison |
| EXISTS | WHERE | Boolean | Check existence |
Best Practices:
- Tránh NOT IN với subquery có thể trả về NULL → dùng NOT EXISTS
- Correlated subqueries có thể chậm → consider JOIN
- Dùng EXISTS khi chỉ cần check existence
- Derived tables tốt cho complex aggregations
Bài tiếp theo: CTEs và Window Functions
