Published on

SQL Window Functions: RANK and DENSE_RANK

image
Authors
  • avatar
    Name
    David Jimenez
    Twitter

Imagine, as others have before1, that we have the table2 below containing employee's salary data:

Id Name Salary DepartmentId
1 Joe 70000.0000 1
2 Jim 90000.0000 1
3 Henry 80000.0000 2
4 Sam 60000.0000 2
5 Max 90000.0000 1

Finding the highest salary would be simple:

SELECT MAX(Salary) HighestSalary
FROM dbo.Employee;
HighestSalary
90000.0000

What if we wanted to know who is earning those high salaries? We could leverage our previous query, and then use something like this:

SELECT [Name], Salary
FROM dbo.Employee
WHERE Salary IN (SELECT MAX(Salary) FROM dbo.Employee);
Name Salary
Jim 90000.0000
Max 90000.0000

Different departments command different salary ranges. To find the highest salaries per department, we can run this query:

SELECT DepartmentId, MAX(Salary) MaxSalary
FROM dbo.Employee
GROUP BY DepartmentId
DepartmentId MaxSalary
1 90000.0000
2 80000.0000

To find the top earners per department, we can run this query:

SELECT E.*
FROM dbo.Employee E
INNER JOIN
    (SELECT DepartmentId, MAX(Salary) MaxSalary
    FROM dbo.Employee
    GROUP BY DepartmentId) DepartmentMaxSalary
    ON 
        E.DepartmentId = DepartmentMaxSalary.DepartmentId
        AND E.Salary = DepartmentMaxSalary.MaxSalary

Or this one, though our data set is so small it hardly matters:

SELECT E.*
FROM dbo.Employee E
WHERE EXISTS
    (
        SELECT 1
        FROM
            (SELECT DepartmentId, MAX(Salary) MaxSalary
            FROM dbo.Employee
            GROUP BY DepartmentId) DepartmentMaxSalary
        WHERE 
            E.DepartmentId = DepartmentMaxSalary.DepartmentId
            AND E.Salary = DepartmentMaxSalary.MaxSalary
    )
Id Name Salary DepartmentId
2 Jim 90000.0000 1
3 Henry 80000.0000 2
5 Max 90000.0000 1

RANK and DENSE_RANK

Let's now use the RANK and DENSE_RANK functions.

RANK

SELECT 
    E.*,
    RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
Id Name Salary DepartmentId SalaryRank
2 Jim 90000.0000 1 1
5 Max 90000.0000 1 1
1 Joe 70000.0000 1 3
3 Henry 80000.0000 2 1
4 Sam 60000.0000 2 2

DENSE_RANK

SELECT 
    E.*,
    DENSE_RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
FROM dbo.Employee E
Id Name Salary DepartmentId SalaryRank
2 Jim 90000.0000 1 1
5 Max 90000.0000 1 1
1 Joe 70000.0000 1 2
3 Henry 80000.0000 2 1
4 Sam 60000.0000 2 2

The difference between RANK and DENSE_RANK

When we used RANK, Joe came in third because there were two people in the number one spot. When using DENSE_RANK Joe is second. We'll see the impact this has at the end of the post.

Another approach to finding the highest earners

We can answer the question of who are the top earners per department using the RANK function:

WITH RankedData AS
(
    SELECT 
        E.*,
        RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
    FROM dbo.Employee E
)
SELECT Id, [Name], Salary, DepartmentId
FROM RankedData 
WHERE SalaryRank = 1;

At first pass, the difference between the first approach and the window function does not seem that significant. The number of lines of SQL, for instance, is very similar. So, why bother with window functions?

One reason is readibility. It is faster for me to understand what the solution with RANK is doing. The second and more important reason, is that the window function solution can be easily modified to address additional questions. For example, who were the second highest earners?

WITH RankedData AS
(
    SELECT 
        E.*,
        DENSE_RANK() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) SalaryRank
    FROM dbo.Employee E
)
SELECT Id, [Name], Salary, DepartmentId
FROM RankedData 
WHERE SalaryRank = 2;
Id Name Salary DepartmentId
1 Joe 70000.0000 1
4 Sam 60000.0000 2

Note that we switched RANK for DENSE_RANK. If had used RANK, only Sam would have appeared in the result set.

Footnotes

  1. See for example: https://www.sqltutorial.org/sql-window-functions/sql-rank/

  2. Data comes from this LeetCode problem: https://leetcode.com/problems/department-highest-salary/description