Lý thuyết
Bài 4/10

Subqueries

Scalar, Column, Table Subqueries và Correlated Subqueries

Bài 4: Subqueries

Nested Queries Concept

1. Subquery là gì?

Subquery (nested query) là query nằm bên trong query khác:

SQL
1SELECT column1
2FROM table1
3WHERE column2 = (SELECT column2 FROM table2 WHERE ...);
4
5 Subquery

Khi nào dùng Subquery?

ScenarioVí dụ
So sánh với aggregateTìm products giá cao hơn trung bình
Lọc dựa trên listTìm customers đã mua product X
Tạo derived tableComplex aggregations
Kiểm tra existenceTìm products chưa bao giờ được bán

2. Scalar Subquery

2.1 Concept

Subquery trả về một giá trị duy nhất (single row, single column).

SQL
1-- Scalar subquery trong WHERE
2SELECT *
3FROM Production.Product
4WHERE ListPrice > (SELECT AVG(ListPrice) FROM Production.Product);
5
6 Returns: 438.66 (single value)

2.2 Examples

Products đắt hơn giá trung bình:

SQL
1SELECT
2 Name,
3 ListPrice,
4 (SELECT AVG(ListPrice) FROM Production.Product WHERE ListPrice > 0) AS AvgPrice
5FROM Production.Product
6WHERE ListPrice > (SELECT AVG(ListPrice) FROM Production.Product WHERE ListPrice > 0)
7ORDER BY ListPrice DESC;

Đơn hàng lớn nhất:

SQL
1SELECT
2 SalesOrderID,
3 CustomerID,
4 TotalDue
5FROM Sales.SalesOrderHeader
6WHERE TotalDue = (SELECT MAX(TotalDue) FROM Sales.SalesOrderHeader);

So sánh với tổng:

SQL
1-- Tỷ lệ % doanh thu của mỗi Territory
2SELECT
3 st.Name AS Territory,
4 SUM(soh.TotalDue) AS TerritoryRevenue,
5 CAST(SUM(soh.TotalDue) * 100.0 /
6 (SELECT SUM(TotalDue) FROM Sales.SalesOrderHeader) AS DECIMAL(5,2)) AS RevenuePercent
7FROM Sales.SalesOrderHeader soh
8JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID
9GROUP BY st.TerritoryID, st.Name
10ORDER BY TerritoryRevenue DESC;

3. Column Subquery (IN, NOT IN)

3.1 Concept

Subquery trả về một column (nhiều rows, một column). Thường dùng với IN, NOT IN, ANY, ALL.

3.2 IN Subquery

SQL
1-- Customers đã mua Bikes
2SELECT DISTINCT
3 c.CustomerID,
4 p.FirstName + ' ' + p.LastName AS CustomerName
5FROM Sales.Customer c
6JOIN Person.Person p ON c.PersonID = p.BusinessEntityID
7WHERE c.CustomerID IN (
8 SELECT DISTINCT soh.CustomerID
9 FROM Sales.SalesOrderHeader soh
10 JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
11 JOIN Production.Product prod ON sod.ProductID = prod.ProductID
12 JOIN Production.ProductSubcategory ps ON prod.ProductSubcategoryID = ps.ProductSubcategoryID
13 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
14 WHERE pc.Name = 'Bikes'
15);

3.3 NOT IN Subquery

SQL
1-- Products chưa bao giờ được đặt hàng
2SELECT
3 ProductID,
4 Name,
5 ListPrice
6FROM Production.Product
7WHERE ProductID NOT IN (
8 SELECT DISTINCT ProductID
9 FROM Sales.SalesOrderDetail
10)
11AND ListPrice > 0
12ORDER BY ListPrice DESC;
Cẩn thận với NOT IN và NULL

Nếu subquery trả về NULL, NOT IN sẽ không hoạt động như mong đợi. Luôn exclude NULL hoặc dùng NOT EXISTS.

SQL
1-- ✅ An toàn hơn: exclude NULL trong subquery
2WHERE ProductID NOT IN (
3 SELECT DISTINCT ProductID
4 FROM Sales.SalesOrderDetail
5 WHERE ProductID IS NOT NULL
6)

3.4 ANY / ALL

SQL
1-- Products đắt hơn BẤT KỲ product nào trong category Bikes
2SELECT Name, ListPrice
3FROM Production.Product
4WHERE ListPrice > ANY (
5 SELECT p.ListPrice
6 FROM Production.Product p
7 JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
8 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
9 WHERE pc.Name = 'Bikes'
10);
11
12-- Products đắt hơn TẤT CẢ products trong category Clothing
13SELECT Name, ListPrice
14FROM Production.Product
15WHERE ListPrice > ALL (
16 SELECT p.ListPrice
17 FROM Production.Product p
18 JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
19 JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
20 WHERE pc.Name = 'Clothing'
21 AND p.ListPrice > 0
22);

4. Table Subquery (Derived Table)

4.1 Concept

Subquery trong FROM clause, tạo thành temporary table.

SQL
1SELECT *
2FROM (
3 SELECT column1, column2
4 FROM table1
5 WHERE condition
6) AS DerivedTable;

4.2 Examples

Top customers mỗi territory:

SQL
1SELECT
2 Territory,
3 CustomerName,
4 TotalSpent
5FROM (
6 SELECT
7 st.Name AS Territory,
8 p.FirstName + ' ' + p.LastName AS CustomerName,
9 SUM(soh.TotalDue) AS TotalSpent,
10 ROW_NUMBER() OVER (PARTITION BY st.TerritoryID ORDER BY SUM(soh.TotalDue) DESC) AS Rank
11 FROM Sales.SalesOrderHeader soh
12 JOIN Sales.Customer c ON soh.CustomerID = c.CustomerID
13 JOIN Person.Person p ON c.PersonID = p.BusinessEntityID
14 JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID
15 GROUP BY st.TerritoryID, st.Name, c.CustomerID, p.FirstName, p.LastName
16) AS RankedCustomers
17WHERE Rank <= 3
18ORDER BY Territory, Rank;

Aggregation trên aggregation:

SQL
1-- Average của monthly sales
2SELECT
3 AVG(MonthlySales) AS AvgMonthlySales,
4 MIN(MonthlySales) AS MinMonthlySales,
5 MAX(MonthlySales) AS MaxMonthlySales
6FROM (
7 SELECT
8 YEAR(OrderDate) AS Year,
9 MONTH(OrderDate) AS Month,
10 SUM(TotalDue) AS MonthlySales
11 FROM Sales.SalesOrderHeader
12 GROUP BY YEAR(OrderDate), MONTH(OrderDate)
13) AS MonthlySalesData;

5. Correlated Subquery

5.1 Concept

Subquery tham chiếu đến outer query. Chạy một lần cho mỗi row của outer query.

SQL
1SELECT *
2FROM table1 t1
3WHERE column1 = (
4 SELECT MAX(column1)
5 FROM table1 t2
6 WHERE t2.category = t1.category -- Reference to outer query
7);

5.2 Examples

Product đắt nhất mỗi Category:

SQL
1SELECT
2 p.Name AS ProductName,
3 p.ListPrice,
4 pc.Name AS Category
5FROM Production.Product p
6JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
7JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
8WHERE p.ListPrice = (
9 SELECT MAX(p2.ListPrice)
10 FROM Production.Product p2
11 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID
12 WHERE ps2.ProductCategoryID = ps.ProductCategoryID
13)
14ORDER BY pc.Name;

Employees earning above department average:

SQL
1SELECT
2 e.BusinessEntityID,
3 p.FirstName + ' ' + p.LastName AS EmployeeName,
4 d.Name AS Department,
5 eph.Rate AS HourlyRate
6FROM HumanResources.Employee e
7JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID
8JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
9JOIN HumanResources.Department d ON edh.DepartmentID = d.DepartmentID
10JOIN HumanResources.EmployeePayHistory eph ON e.BusinessEntityID = eph.BusinessEntityID
11WHERE edh.EndDate IS NULL
12AND eph.Rate > (
13 SELECT AVG(eph2.Rate)
14 FROM HumanResources.Employee e2
15 JOIN HumanResources.EmployeeDepartmentHistory edh2 ON e2.BusinessEntityID = edh2.BusinessEntityID
16 JOIN HumanResources.EmployeePayHistory eph2 ON e2.BusinessEntityID = eph2.BusinessEntityID
17 WHERE edh2.DepartmentID = edh.DepartmentID -- Correlated!
18 AND edh2.EndDate IS NULL
19)
20ORDER BY d.Name, eph.Rate DESC;

6. EXISTS / NOT EXISTS

6.1 Concept

Kiểm tra sự tồn tại của rows trong subquery. Thường nhanh hơn IN với large datasets.

SQL
1SELECT *
2FROM table1 t1
3WHERE EXISTS (
4 SELECT 1
5 FROM table2 t2
6 WHERE t2.key = t1.key
7);

6.2 Examples

Customers đã đặt hàng (EXISTS):

SQL
1SELECT
2 c.CustomerID,
3 p.FirstName + ' ' + p.LastName AS CustomerName
4FROM Sales.Customer c
5JOIN Person.Person p ON c.PersonID = p.BusinessEntityID
6WHERE EXISTS (
7 SELECT 1
8 FROM Sales.SalesOrderHeader soh
9 WHERE soh.CustomerID = c.CustomerID
10);

Products chưa được bán (NOT EXISTS):

SQL
1SELECT
2 p.ProductID,
3 p.Name,
4 p.ListPrice
5FROM Production.Product p
6WHERE NOT EXISTS (
7 SELECT 1
8 FROM Sales.SalesOrderDetail sod
9 WHERE sod.ProductID = p.ProductID
10)
11AND p.ListPrice > 0
12ORDER BY p.ListPrice DESC;

6.3 EXISTS vs IN

SQL
1-- EXISTS: thường nhanh hơn với large subquery results
2-- (dừng ngay khi tìm thấy match)
3
4-- IN: tốt cho small, distinct lists
5-- (phải evaluate toàn bộ list)

7. Subquery trong SELECT

7.1 Scalar Subquery in SELECT

SQL
1-- Mỗi Product với Category average price
2SELECT
3 p.Name AS ProductName,
4 p.ListPrice,
5 ps.Name AS Subcategory,
6 (
7 SELECT AVG(p2.ListPrice)
8 FROM Production.Product p2
9 WHERE p2.ProductSubcategoryID = p.ProductSubcategoryID
10 AND p2.ListPrice > 0
11 ) AS SubcategoryAvgPrice
12FROM Production.Product p
13JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
14WHERE p.ListPrice > 0
15ORDER BY ps.Name, p.ListPrice DESC;

8. Subquery vs JOIN

Khi nào dùng gì?

Nhiều subqueries có thể viết lại bằng JOIN và ngược lại.

8.1 Comparison

SQL
1-- Subquery approach
2SELECT ProductID, Name
3FROM Production.Product
4WHERE ProductSubcategoryID IN (
5 SELECT ProductSubcategoryID
6 FROM Production.ProductSubcategory
7 WHERE Name LIKE '%Bikes%'
8);
9
10-- JOIN approach (thường preferred)
11SELECT p.ProductID, p.Name
12FROM Production.Product p
13JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
14WHERE ps.Name LIKE '%Bikes%';

8.2 Guidelines

Use Subquery whenUse JOIN when
Checking existenceCần columns từ nhiều tables
Comparing with aggregateBetter readability
NOT IN scenariosPerformance critical
Single value comparison

9. Thực hành

Exercises

Exercise 1: Scalar Subquery

SQL
1-- Tìm Products có giá cao hơn giá trung bình của Bikes category
2-- Hiển thị: Name, ListPrice, BikeAvgPrice
3
4-- YOUR CODE HERE
💡 Xem đáp án
SQL
1SELECT
2 Name,
3 ListPrice,
4 (
5 SELECT AVG(p2.ListPrice)
6 FROM Production.Product p2
7 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID
8 JOIN Production.ProductCategory pc2 ON ps2.ProductCategoryID = pc2.ProductCategoryID
9 WHERE pc2.Name = 'Bikes'
10 ) AS BikeAvgPrice
11FROM Production.Product
12WHERE ListPrice > (
13 SELECT AVG(p2.ListPrice)
14 FROM Production.Product p2
15 JOIN Production.ProductSubcategory ps2 ON p2.ProductSubcategoryID = ps2.ProductSubcategoryID
16 JOIN Production.ProductCategory pc2 ON ps2.ProductCategoryID = pc2.ProductCategoryID
17 WHERE pc2.Name = 'Bikes'
18)
19ORDER BY ListPrice DESC;

Exercise 2: IN Subquery

SQL
1-- Tìm Employees làm việc ở departments thuộc 'Research and Development'
2-- Hiển thị: EmployeeName, JobTitle, Department
3
4-- YOUR CODE HERE
💡 Xem đáp án
SQL
1SELECT
2 p.FirstName + ' ' + p.LastName AS EmployeeName,
3 e.JobTitle,
4 d.Name AS Department
5FROM HumanResources.Employee e
6JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID
7JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
8JOIN HumanResources.Department d ON edh.DepartmentID = d.DepartmentID
9WHERE edh.EndDate IS NULL
10AND edh.DepartmentID IN (
11 SELECT DepartmentID
12 FROM HumanResources.Department
13 WHERE GroupName = 'Research and Development'
14)
15ORDER BY d.Name, p.LastName;

Exercise 3: EXISTS

SQL
1-- Tìm Categories có ít nhất 1 product được bán
2-- Hiển thị: CategoryName
3
4-- YOUR CODE HERE
💡 Xem đáp án
SQL
1SELECT pc.Name AS CategoryName
2FROM Production.ProductCategory pc
3WHERE EXISTS (
4 SELECT 1
5 FROM Production.ProductSubcategory ps
6 JOIN Production.Product p ON ps.ProductSubcategoryID = p.ProductSubcategoryID
7 JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID
8 WHERE ps.ProductCategoryID = pc.ProductCategoryID
9)
10ORDER BY pc.Name;

Exercise 4: Correlated Subquery

SQL
1-- Đơn hàng đầu tiên của mỗi Customer
2-- Hiển thị: CustomerID, FirstOrderDate, FirstOrderTotal
3
4-- YOUR CODE HERE
💡 Xem đáp án
SQL
1SELECT
2 soh.CustomerID,
3 soh.OrderDate AS FirstOrderDate,
4 soh.TotalDue AS FirstOrderTotal
5FROM Sales.SalesOrderHeader soh
6WHERE soh.OrderDate = (
7 SELECT MIN(soh2.OrderDate)
8 FROM Sales.SalesOrderHeader soh2
9 WHERE soh2.CustomerID = soh.CustomerID
10)
11ORDER BY soh.CustomerID;

10. Tổng kết

Subquery TypeVị tríReturnsUse Case
ScalarWHERE, SELECTSingle valueSo sánh với aggregate
Column (IN)WHEREList valuesFilter by list
TableFROMResult setDerived tables
CorrelatedWHEREVariesRow-by-row comparison
EXISTSWHEREBooleanCheck existence

Best Practices:

  • Tránh NOT IN với subquery có thể trả về NULL → dùng NOT EXISTS
  • Correlated subqueries có thể chậm → consider JOIN
  • Dùng EXISTS khi chỉ cần check existence
  • Derived tables tốt cho complex aggregations

Bài tiếp theo: CTEs và Window Functions