Window functions
Window functions are functions that perform calculations for a set of rows that are related to the current row in some way. Unlike aggregate functions, values calculated by window functions can be inserted into each row of the source table.
A window function call always contains an OVER
expression following the window function name and arguments. The OVER
expression specifies exactly how the rows of the data set are to be divided for processing by the window function.
The OVER
expression may be supplemented by the PARTITION BY
expression, which divides the rows of the data set into groups, combining the rows into groups by matching the values of the specified columns. The value of the window function is calculated by the rows that fall into the same group as the current row.
cume_dist()
Description |
Cumulative allocation. |
Usage |
|
Cumulative distribution: the number of rows in the group preceding the current row or with the same value as the current row, divided by the total number of rows in the group.
If the ORDER BY
condition is specified, the value is calculated using the specified order.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 2),
('A', 3),
('B', 4);
SELECT gr, num,
cume_dist(ORDER BY num) OVER (PARTITION BY gr) AS cume_dist
FROM t ORDER BY num;
+-------+--------+-----------+
| gr | num | cume_dist |
+-------+--------+-----------+
| A | 1 | 0.25 |
+-------+--------+-----------+
| A | 2 | 0.75 |
+-------+--------+-----------+
| A | 2 | 0.75 |
+-------+--------+-----------+
| A | 3 | 1 |
+-------+--------+-----------+
| B | 4 | 1 |
+-------+--------+-----------+
dense_rank()
Description |
The rank of the current string within the group without skips. |
Usage |
|
Aliases |
|
Returns the rank of the current string within a group without skips. Essentially this function counts groups of rows with matching values on a given column and assigns them numbers within the group.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 2),
('A', 5),
('B', 4);
SELECT gr, num,
dense_rank() OVER (PARTITION BY gr ORDER BY num) AS dense_rank
FROM t ORDER BY gr, num;
+----+--------+------------+
| gr | num | dense_rank |
+----+--------+------------+
| A | 1 | 1 |
+----+--------+------------+
| A | 2 | 2 |
+----+--------+------------+
| A | 2 | 2 |
+----+--------+------------+
| A | 5 | 3 |
+----+--------+------------+
| B | 4 | 1 |
+----+--------+------------+
first_value()
Description |
Returns the value calculated using the specified expression for the first row of the group. |
Usage |
|
If the ORDER BY
condition is specified, the value is calculated using the specified order.
Parameters
-
IGNORE NULLS
— the value is calculated excluding rows withNULL
value.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('A', NULL),
('B', 4);
SELECT gr, num,
first_value(num) OVER (PARTITION BY gr) AS first_value
FROM t ORDER BY gr, num;
+----+------+-------------+
| gr | num | first_value |
+----+------+-------------+
| A | 1 | 1 |
+----+------+-------------+
| A | 2 | 1 |
+----+------+-------------+
| A | 3 | 1 |
+----+------+-------------+
| A | null | 1 |
+----+------+-------------+
| B | 4 | 4 |
+----+------+-------------+
lag()
Description |
Returns the value calculated for the string shifted by |
Usage |
|
If no such row exists, the value default
is returned (it must be of a compatible type).
Both arguments, offset
and default
, are optional. If they are not specified, the following default values are used:
-
offset: 1
-
default: NULL
If the ORDER BY
condition is specified, the value is calculated using the specified order.
Parameters
-
IGNORE NULLS
— the value is calculated excluding rows withNULL
value.
See examples
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
lag(num) OVER (PARTITION BY gr) AS lag
FROM t ORDER BY gr, num;
+----+-----+------+
| gr | num | lag |
+----+-----+------+
| A | 1 | null |
+----+-----+------+
| A | 2 | 1 |
+----+-----+------+
| A | 3 | 2 |
+----+-----+------+
| B | 4 | null |
+----+-----+------+
| B | 5 | 4 |
+----+-----+------+
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
lag(num, 1, -1) OVER (PARTITION BY gr) AS lag
FROM t ORDER BY gr, num;
+----+-----+-----+
| gr | num | lag |
+----+-----+-----+
| A | 1 | -1 |
+----+-----+-----+
| A | 2 | 1 |
+----+-----+-----+
| A | 3 | 2 |
+----+-----+-----+
| B | 4 | -1 |
+----+-----+-----+
| B | 5 | 4 |
+----+-----+-----+
last_value()
Description |
Returns the value calculated by the specified expression for the last row of the group. |
Usage |
|
If the ORDER BY
condition is specified, the value is calculated using the specified order.
Parameters
-
IGNORE NULLS
— the value is calculated excluding rows withNULL
value.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('A', NULL),
('B', 4),
('B', 5);
SELECT gr, num,
last_value(num IGNORE NULLS) OVER (PARTITION BY gr) AS last_value
FROM t ORDER BY gr, num;
+----+------+------------+
| gr | num | last_value |
+----+------+------------+
| A | 1 | 3 |
+----+------+------------+
| A | 2 | 3 |
+----+------+------------+
| A | 3 | 3 |
+----+------+------------+
| A | null | 3 |
+----+------+------------+
| B | 4 | 5 |
+----+------+------------+
| B | 5 | 5 |
+----+------+------------+
lead()
Description |
Returns the value calculated for the row shifted by |
Usage |
|
If no such row exists, the value of default
is returned (it must be type-compatible).
Both arguments, offset
and default
, are optional. If they are not specified, the following default values are used:
-
offset: 1
-
default: NULL
If the ORDER BY
condition is specified, the value is calculated using the specified order.
Parameters
-
IGNORE NULLS
— the value is calculated excluding rows withNULL
value.
See examples
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
lead(num) OVER (PARTITION BY gr) AS lead
FROM t ORDER BY gr, num;
+----+-----+------+
| gr | num | lead |
+----+-----+------+
| A | 1 | 2 |
+----+-----+------+
| A | 2 | 3 |
+----+-----+------+
| A | 3 | null |
+----+-----+------+
| B | 4 | 5 |
+----+-----+------+
| B | 5 | null |
+----+-----+------+
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
lead(num, 1, -1) OVER (PARTITION BY gr) AS lead
FROM t ORDER BY gr, num;
+----+-----+------+
| gr | num | lead |
+----+-----+------+
| A | 1 | 2 |
+----+-----+------+
| A | 2 | 3 |
+----+-----+------+
| A | 3 | -1 |
+----+-----+------+
| B | 4 | 5 |
+----+-----+------+
| B | 5 | -1 |
+----+-----+------+
nth_value()
Description |
Returns the value calculated for the nth row within a group (counting from |
Usage |
|
If there is no such row, the value NULL
is returned.
If the ORDER BY
condition is specified, the value is calculated using the specified order.
Parameters
-
IGNORE NULLS
— the value is calculated excluding rows withNULL
value.
See examples
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
nth_value(num, 3) OVER (PARTITION BY gr) AS third_value
FROM t ORDER BY gr, num;
+----+-----+-------------+
| gr | num | third_value |
+----+-----+-------------+
| A | 1 | 3 |
+----+-----+-------------+
| A | 2 | 3 |
+----+-----+-------------+
| A | 3 | 3 |
+----+-----+-------------+
| B | 4 | null |
+----+-----+-------------+
| B | 5 | null |
+----+-----+-------------+
ntile()
Description |
Splits each group into |
Usage |
|
The partitioning is done so that the sizes of the subgroups are as close as possible.
If the ORDER BY
condition is specified, the value is calculated using the specified order.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
ntile(2) OVER (PARTITION BY gr) AS group_number
FROM t ORDER BY gr, num;
+----+-----+--------------+
| gr | num | group_number |
+----+-----+--------------+
| A | 1 | 1 |
+----+-----+--------------+
| A | 2 | 1 |
+----+-----+--------------+
| A | 3 | 2 |
+----+-----+--------------+
| B | 4 | 1 |
+----+-----+--------------+
| B | 5 | 2 |
+----+-----+--------------+
percent_rank()
Description |
Calculates the relative rank of the current row within a group. |
Usage |
|
Returns a value of type DOUBLE
from 0
to 1
inclusive. The relative rank is calculated using the formula (rank - 1) / (total number of rows in the group - 1)
.
If the ORDER BY
condition is specified, the value is calculated using the specified order.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
percent_rank(ORDER BY num) OVER (PARTITION BY gr) AS percent_rank,
FROM t ORDER BY gr, num;
+----+-----+--------------+
| gr | num | percent_rank |
+----+-----+--------------+
| A | 1 | 0 |
+----+-----+--------------+
| A | 2 | 0.5 |
+----+-----+--------------+
| A | 3 | 1 |
+----+-----+--------------+
| B | 4 | 0 |
+----+-----+--------------+
| B | 5 | 1 |
+----+-----+--------------+
rank()
Description |
Returns the rank (with skips) of the current row within a group. |
Usage |
|
If the ORDER BY
condition is specified, the value is calculated using the specified order.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
rank(ORDER BY num) OVER (PARTITION BY gr) AS rank,
FROM t ORDER BY gr, num;
+----+-----+------+
| gr | num | rank |
+----+-----+------+
| A | 1 | 1 |
+----+-----+------+
| A | 2 | 2 |
+----+-----+------+
| A | 2 | 2 |
+----+-----+------+
| A | 3 | 4 |
+----+-----+------+
| B | 4 | 1 |
+----+-----+------+
| B | 5 | 2 |
+----+-----+------+
row_number()
Description |
Returns the number of the current row in its group (counting from |
Usage |
|
If the ORDER BY
condition is specified, the value is calculated using the specified order.
See example
CREATE TABLE t(gr VARCHAR, num BIGINT);
INSERT INTO t VALUES
('A', 1),
('A', 2),
('A', 2),
('A', 3),
('B', 4),
('B', 5);
SELECT gr, num,
row_number() OVER (PARTITION BY gr) AS row_number,
FROM t ORDER BY gr, num;
+----+-----+------------+
| gr | num | row_number |
+----+-----+------------+
| A | 1 | 1 |
+----+-----+------------+
| A | 2 | 2 |
+----+-----+------------+
| A | 2 | 3 |
+----+-----+------------+
| A | 3 | 4 |
+----+-----+------------+
| B | 4 | 1 |
+----+-----+------------+
| B | 5 | 2 |
+----+-----+------------+