Using FIRST_VALUE to prioritize records in a dataset

20 Oct 2023

When working with datasets, it is often necessary to prioritize or sort records based on certain criteria. In SQL, the FIRST_VALUE function can be used to achieve this. This function allows us to select the first value in an ordered set of records according to a specified sorting order.

Syntax

The syntax for using FIRST_VALUE is as follows:

FIRST_VALUE(expression) OVER (ORDER BY column [ASC|DESC])

expression: The column or expression to retrieve the first value from.
ORDER BY column: The column used for ordering the dataset.
ASC|DESC: Optional keyword to specify the sorting order. ASC (ascending) is the default.

Example

Let’s consider a table called employees with the following columns: id, name, and salary. We want to prioritize the employees based on their salary and retrieve the highest-paid employee from each department.

Here’s how we can achieve this using the FIRST_VALUE function:

SELECT id, name, salary,
  FIRST_VALUE(name) OVER (PARTITION BY department_id ORDER BY salary DESC) AS highest_paid_employee
FROM employees;

In the above example, we first partition the dataset by the department_id column. This ensures that the highest paid employee is selected for each unique department. Then, we use ORDER BY salary DESC to sort the records in descending order of salary.

The FIRST_VALUE(name) retrieves the name of the highest-paid employee for each partition.

Use Cases

The FIRST_VALUE function can be quite handy in various scenarios, such as:

Identifying the oldest or youngest person in a group.
Determining the first or last transaction made by each customer.
Finding the highest or lowest value within a specific group.

Conclusion

The FIRST_VALUE function in SQL provides a convenient way to prioritize and select records based on specified criteria. It can be a powerful tool for data analysis and decision-making in various scenarios.