- Published on
SQL Window Functions: RANK and DENSE_RANK
- Authors
- Name
- David Jimenez
Imagine, as others have before1, that we have the table2 below containing employee's salary data:
Id | Name | Salary | DepartmentId |
---|---|---|---|
1 | Joe | 70000.0000 | 1 |
2 | Jim | 90000.0000 | 1 |
3 | Henry | 80000.0000 | 2 |
4 | Sam | 60000.0000 | 2 |
5 | Max | 90000.0000 | 1 |
Finding the highest salary would be simple:
SELECT MAX(Salary) HighestSalary
FROM dbo.Employee;
HighestSalary |
---|
90000.0000 |
What if we wanted to know who is earning those high salaries? We could leverage our previous query, and then use something like this:
SELECT [Name], Salary
FROM dbo.Employee
WHERE Salary IN (SELECT MAX(Salary) FROM dbo.Employee);
Name | Salary |
---|---|
Jim | 90000.0000 |
Max | 90000.0000 |
Different departments command different salary ranges. To find the highest salaries per department, we can run this query:
SELECT DepartmentId, MAX(Salary) MaxSalary
FROM dbo.Employee
GROUP BY DepartmentId
DepartmentId | MaxSalary |
---|---|
1 | 90000.0000 |
2 | 80000.0000 |
To find the top earners per department, we can run this query:
SELECT E.*
FROM dbo.Employee E
INNER JOIN
(SELECT DepartmentId, MAX(Salary) MaxSalary
FROM dbo.Employee
GROUP BY DepartmentId) DepartmentMaxSalary
ON
E.DepartmentId = DepartmentMaxSalary.DepartmentId
AND E.Salary = DepartmentMaxSalary.MaxSalary
Or this one, though our data set is so small it hardly matters:
SELECT E.*
FROM dbo.Employee E
WHERE EXISTS
(
SELECT 1
FROM
(SELECT DepartmentId, MAX(Salary) MaxSalary
FROM dbo.Employee
GROUP BY DepartmentId) DepartmentMaxSalary
WHERE
E.DepartmentId = DepartmentMaxSalary.DepartmentId
AND E.Salary = DepartmentMaxSalary.MaxSalary
)
Id | Name | Salary | DepartmentId |
---|---|---|---|
2 | Jim | 90000.0000 | 1 |
3 | Henry | 80000.0000 | 2 |
5 | Max | 90000.0000 | 1 |
RANK and DENSE_RANK
Let's now use the RANK
and DENSE_RANK
functions.
RANK
SELECT
E.*,
RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
Id | Name | Salary | DepartmentId | SalaryRank |
---|---|---|---|---|
2 | Jim | 90000.0000 | 1 | 1 |
5 | Max | 90000.0000 | 1 | 1 |
1 | Joe | 70000.0000 | 1 | 3 |
3 | Henry | 80000.0000 | 2 | 1 |
4 | Sam | 60000.0000 | 2 | 2 |
DENSE_RANK
SELECT
E.*,
DENSE_RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
Id | Name | Salary | DepartmentId | SalaryRank |
---|---|---|---|---|
2 | Jim | 90000.0000 | 1 | 1 |
5 | Max | 90000.0000 | 1 | 1 |
1 | Joe | 70000.0000 | 1 | 2 |
3 | Henry | 80000.0000 | 2 | 1 |
4 | Sam | 60000.0000 | 2 | 2 |
The difference between RANK and DENSE_RANK
When we used RANK
, Joe came in third because there were two people in the number one spot. When using DENSE_RANK
Joe is second. We'll see the impact this has at the end of the post.
Another approach to finding the highest earners
We can answer the question of who are the top earners per department using the RANK
function:
WITH RankedData AS
(
SELECT
E.*,
RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
)
SELECT Id, [Name], Salary, DepartmentId
FROM RankedData
WHERE SalaryRank = 1;
At first pass, the difference between the first approach and the window function does not seem that significant. The number of lines of SQL, for instance, is very similar. So, why bother with window functions?
One reason is readibility. It is faster for me to understand what the solution with RANK
is doing. The second and more important reason, is that the window function solution can be easily modified to address additional questions. For example, who were the second highest earners?
WITH RankedData AS
(
SELECT
E.*,
DENSE_RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
)
SELECT Id, [Name], Salary, DepartmentId
FROM RankedData
WHERE SalaryRank = 2;
Id | Name | Salary | DepartmentId |
---|---|---|---|
1 | Joe | 70000.0000 | 1 |
4 | Sam | 60000.0000 | 2 |
Note that we switched RANK
for DENSE_RANK
. If had used RANK
, only Sam would have appeared in the result set.
Footnotes
-
See for example: https://www.sqltutorial.org/sql-window-functions/sql-rank/ ↩
-
Data comes from this LeetCode problem: https://leetcode.com/problems/department-highest-salary/description ↩