Window functions
Window functions in DataLens are calculated as aggregate functions of measures. At the same time, grouping by dimensions differs from the grouping set in the chart. Grouping parameters are specified in the function call as a list of dimensions to be included in the grouping (WITHIN ...
) or excluded from it (AMONG ...
).
Given that measures are aggregated values, one may consider window functions in DataLens as aggregations of aggregations. One should note that not all aggregate functions would have the result of aggregating particular results that is identical to the general aggregation of values.
For example, when calculating the sum of a measure for each group and then adding these values together, you will get the total sum of the measure in all groups:
SUM(SUM(a) UNION SUM(b)) = SUM(a UNION b)
When counting the amount of measure values for each group and then trying to calculate the amount from the sum of these values, you will get the amount of amounts (summands in the sum) rather than the total amount of values in all groups:
COUNT(COUNT(a) UNION COUNT(b)) = 2
Example
Category |
Sales |
OrderCount |
Sales_Furniture+Technology |
OrderCount_Furniture+Technology |
---|---|---|---|---|
Furniture | 100000 | 350 | 300000 | 2 |
Technology | 200000 | 650 | 300000 | 2 |
Where:
- Sales: Sum of the sales measure in the categories
- Order Count: Amount of the order measure values in the categories
- Sales_Furniture+Technology: Total sum of the sales measure in all categories:
SUM_IF([Sales], [Category] = 'Furniture' TOTAL) + SUM_IF([Sales], [Category] = 'Technology' TOTAL)
- OrderCount_Furniture+Technology: Amount of amounts of the order measure values:
COUNT(COUNT_IF([Order Count], [Category] = 'Furniture' TOTAL) + COUNT_IF([Order Count], [Category] = 'Technology' TOTAL) TOTAL)
Usage Restrictions
-
The first argument in window functions can only be measures.
For the
AVG_IF
,COUNT_IF
,SUM_IF
window functions, the first argument (expression
in the function description) must always be a measure.Example:
AVG_IF([Profit], [Profit] > 5)
For other window functions, the first (and only) argument (
value
in the function description) must always be a measure, too.Examples:
- Correct formula:
SUM(SUM([Profit]) TOTAL)
. - Incorrect formula:
RANK([Profit] TOTAL)
, where[Profit]
is a non-aggregated expression.
- Correct formula:
-
For grouping window functions, only the dimensions used to build the chart can be applied. Only the dimensions used to build the chart set the grouping when calculating a measure. These dimensions define how values are split into groups and therefore have fixed values in each group.
If you specify a dimension that was not used to build the chart, it won't have a fixed value and the value can be different in each group row. As a result, it will be impossible to determine which value of this dimension must be used to calculate the measure. This limitation applies to the
WITHIN
andAMONG
grouping types.Examples:
- Correct formula:
RANK(SUM([Profit]) WITHIN [Category])
in the chart with grouping by the[Order Date]
and[Category]
dimensions.[Category]
is used when building the chart. - Correct formula:
RANK(SUM([Profit]) WITHIN [Category])
in the chart with grouping by the[Order Date]
and[City]
dimensions.[Category]
is not part of the grouping, so it will not be used in the calculation. The result will be the same as when using theRANK(SUM([Profit]) TOTAL)
formula. - Incorrect formula:
RANK(SUM([Profit]) AMONG [City])
in the chart with grouping by the[Order Date]
and[Category]
dimensions. Running the function will return the Unknown dimension for window error.
- Correct formula:
-
Window functions do not support managing the level of detail. This is only available for aggregate functions.
-
For functions that depend on sorting order, the fields listed in
ORDER BY
must be used when building the chart. -
If a window function is used to build a Table chart, we don't recommend enabling the display of Total in the settings. This may cause an error.
Syntax
The general syntax for window functions is as follows:
<WINDOW_FUNCTION_NAME>(
arg1, arg2, ...
[ TOTAL
| WITHIN dim1, dim2, ...
| AMONG dim1, dim2, ... ]
[ ORDER BY field1, field2, ... ]
[ BEFORE FILTER BY filtered_field1, ... ]
)
It starts off, just like a regular function call, with its name and arguments (arg1, arg2, ...
in this case).
Grouping
The arguments are followed by a window grouping, which can be one of three types:
TOTAL
(equivalent toWITHIN
without dimensions): all query entries fall into a single window.WITHIN dim1, dim2, ...
: records are grouped by the dimensionsdim1, dim2, ...
.AMONG dim1, dim2, ...
: records are grouped by all dimensions from the query, except those listed. For example, if we use formulaRSUM(SUM([Sales]) AMONG dim1, dim2)
with dimensionsdim1
,dim2
,dim3
,dim4
in the data query, then the entries will be grouped bydim3
anddim4
, so it will be equivalent toRSUM([Sales] WITHIN dim3, dim4)
.
The grouping clause is optional. TOTAL
is used by default.
Ordering
After the grouping comes the ordering clause. It is only supported for order-dependent functions:
M* |
R* |
Positional functions |
---|---|---|
MAVG | RAVG | LAG |
MCOUNT | RCOUNT | FIRST |
MMAX | RMAX | LAST |
MMIN | RMIN | |
MSUM | RSUM |
The ordering clause is optional for these functions.
See the descriptions of these functions for more information on how this order affects the result value.
The ORDER BY
clause accepts dimensions as well as measures. It also supports the standard ASC
/DESC
syntax (ASC
is assumed by default) to specify ascending or descending order respectively:
... ORDER BY [Date] ASC, SUM([Sales]) DESC, [Category] ...
Fields listed in ORDER BY
are combined with fields listed in the chart's sorting section.
Example:
- Function —
... ORDER BY [Date] DESC, [City]
. - Chart — Sorted by
Date
andCategory
. - Result —
Date
(descending),City
,Category
.
BEFORE FILTER BY
If any fields are listed in BEFORE FILTER BY
, then this window function is calculated before data is filtered using these fields.
BEFORE FILTER BY
applies to all nested window functions too.
Example:
- Formula —
MAVG(RSUM([Sales]), 10 BEFORE FILTER BY [Date])
. - Equivalent —
MAVG(RSUM([Sales] BEFORE FILTER BY [Date]), 10 BEFORE FILTER BY [Date])
.
Do not use conflicting BEFORE FILTER BY
clauses:
- Correct formula:
MAVG(RSUM([Sales] BEFORE FILTER BY [Date], [Category]), 10 BEFORE FILTER BY [Date])
— functions are nested and ([Date]
) is a subset of ([Date], [Category]
). - Correct formula:
MAVG(RSUM([Sales] BEFORE FILTER BY [Category]), 10 BEFORE FILTER BY [Date])
— functions are nested, so field lists are combined in the second of the two functions. - Correct formula:
RSUM([Sales] BEFORE FILTER BY [Date], [Category]) - RSUM([Sales] BEFORE FILTER BY [Date])
— ([Date]
) is a subset of ([Date], [Category]
). - Incorrect formula:
RSUM([Sales] BEFORE FILTER BY [Category]) - RSUM([Sales] BEFORE FILTER BY [Date])
— functions are not nested and neither of ([Category]
) and ([Date]
) is a subset of the other.
Aggregate Functions as Window Functions
The following aggregations can also be used as window functions:
Aggregations | Conditional Aggregations |
---|---|
SUM | SUM_IF |
COUNT | COUNT_IF |
AVG | AVG_IF |
MAX | |
MIN |
To use the window version of the aggregate functions, you must explicitly specify the grouping (unlike other window functions, where it is optional).
For example, the SUM([Sales]) / SUM(SUM([Sales]) TOTAL)
expression can be used to calculate the ratio of a group's sum of [Sales]
to the sum of [Sales]
among all entries.
AVG
Syntax:AVG( value TOTAL | WITHIN ... | AMONG ... )
orAVG( value TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the average of all values. Applicable to numeric data types.
AVG_IF
Syntax:AVG_IF( expression, condition TOTAL | WITHIN ... | AMONG ... )
orAVG_IF( expression, condition TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the average of all values that meet the condition
condition. If the values don't exist, it returns NULL
. Applicable to numeric data types only.
COUNT
Syntax:COUNT( [ value ] TOTAL | WITHIN ... | AMONG ... )
orCOUNT( [ value ] TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the number of items in the specified window.
COUNT_IF
Syntax:COUNT_IF( expression, condition TOTAL | WITHIN ... | AMONG ... )
orCOUNT_IF( expression, condition TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the number of items in the specified window meeting the expression
condition.
FIRST
Syntax:FIRST( value )
orFIRST( value [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the value of value
from the first row in the window. See also LAST.
LAG
Syntax:LAG( value [ , offset [ , default ] ] )
orLAG( value [ , offset [ , default ] ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns value
re-evaluated against the row that is offset from the current row by offset
within the specified window:
- Positive
offset
seeks among preceding rows. - Negative
offset
seeks among following rows.
By default offset
is 1
.
If there is no available value (offset
reaches before the first row or after the last one), then default
is returned. If default
is not specified, then NULL
is used.
See also AGO for a non-window function alternative.
LAST
Syntax:LAST( value )
orLAST( value [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the value of value
from the last row in the window. See also FIRST.
MAVG
Syntax:MAVG( value, rows_1 [ , rows_2 ] )
orMAVG( value, rows_1 [ , rows_2 ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the moving average of values in a fixed-size window defined by the sort order and arguments:
rows_1 |
rows_2 |
Window |
---|---|---|
positive | - | The current row and rows_1 preceding rows. |
negative | - | The current row and -rows_1 following rows. |
any sign | any sign | rows_1 preceding rows, the current row and rows_2 following rows. |
Window functions with a similar behavior: MSUM, MCOUNT, MMIN, MMAX.
MAX
Syntax:MAX( value TOTAL | WITHIN ... | AMONG ... )
orMAX( value TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the maximum value.
If value
:
- number — Returns the largest number.
- date — Returns the latest date.
- string — Returns the last value in the alphabetic order.
MCOUNT
Syntax:MCOUNT( value, rows_1 [ , rows_2 ] )
orMCOUNT( value, rows_1 [ , rows_2 ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the moving count of (non-NULL
) values in a fixed-size window defined by the sort order and arguments:
rows_1 |
rows_2 |
Window |
---|---|---|
positive | - | The current row and rows_1 preceding rows. |
negative | - | The current row and -rows_1 following rows. |
any sign | any sign | rows_1 preceding rows, the current row and rows_2 following rows. |
Window functions with a similar behavior: MSUM, MMIN, MMAX, MAVG.
MIN
Syntax:MIN( value TOTAL | WITHIN ... | AMONG ... )
orMIN( value TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the minimum value.
If value
:
- number — Returns the smallest number.
- date — Returns the earliest date.
- string — Returns the first value in the alphabetic order.
MMAX
Syntax:MMAX( value, rows_1 [ , rows_2 ] )
orMMAX( value, rows_1 [ , rows_2 ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the moving maximum of values in a fixed-size window defined by the sort order and arguments:
rows_1 |
rows_2 |
Window |
---|---|---|
positive | - | The current row and rows_1 preceding rows. |
negative | - | The current row and -rows_1 following rows. |
any sign | any sign | rows_1 preceding rows, the current row and rows_2 following rows. |
Window functions with a similar behavior: MSUM, MCOUNT, MMIN, MAVG.
MMIN
Syntax:MMIN( value, rows_1 [ , rows_2 ] )
orMMIN( value, rows_1 [ , rows_2 ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the moving minimum of values in a fixed-size window defined by the sort order and arguments:
rows_1 |
rows_2 |
Window |
---|---|---|
positive | - | The current row and rows_1 preceding rows. |
negative | - | The current row and -rows_1 following rows. |
any sign | any sign | rows_1 preceding rows, the current row and rows_2 following rows. |
Window functions with a similar behavior: MSUM, MCOUNT, MMAX, MAVG.
MSUM
Syntax:MSUM( value, rows_1 [ , rows_2 ] )
orMSUM( value, rows_1 [ , rows_2 ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the moving sum of values in a fixed-size window defined by the sort order and arguments:
rows_1 |
rows_2 |
Window |
---|---|---|
positive | - | The current row and rows_1 preceding rows. |
negative | - | The current row and -rows_1 following rows. |
any sign | any sign | rows_1 preceding rows, the current row and rows_2 following rows. |
Window functions with a similar behavior: MCOUNT, MMIN, MMAX, MAVG.
RANK
Syntax:RANK( value [ , direction ] )
orRANK( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ BEFORE FILTER BY ... ] )
Returns the rank of the current row if ordered by the given argument. Rows corresponding to the same value used for sorting have the same rank. If the first two rows both have rank of 1
, then the next row (if it features a different value) will have rank 3
, so, in effect, it is rank with gaps.
If direction
is "desc"
or omitted, then ranking is done from greatest to least, if "asc"
, then from least to greatest.
See also RANK_DENSE, RANK_UNIQUE, RANK_PERCENTILE.
RANK_DENSE
Syntax:RANK_DENSE( value [ , direction ] )
orRANK_DENSE( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ BEFORE FILTER BY ... ] )
Returns the rank of the current row if ordered by the given argument. Rows corresponding to the same value used for sorting have the same rank. If the first two rows both have rank of 1
, then the next row (if it features a different value) will have rank 2
, (rank without gaps).
If direction
is "desc"
or omitted, then ranking is done from greatest to least, if "asc"
, then from least to greatest.
See also RANK, RANK_UNIQUE, RANK_PERCENTILE.
RANK_PERCENTILE
Syntax:RANK_PERCENTILE( value [ , direction ] )
orRANK_PERCENTILE( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ BEFORE FILTER BY ... ] )
Returns the relative rank (from 0
to 1
) of the current row if ordered by the given argument. Calculated as (RANK(...) - 1) / (row count)
.
If direction
is "desc"
or omitted, then ranking is done from greatest to least, if "asc"
, then from least to greatest.
See also RANK, RANK_DENSE, RANK_UNIQUE.
RANK_UNIQUE
Syntax:RANK_UNIQUE( value [ , direction ] )
orRANK_UNIQUE( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ BEFORE FILTER BY ... ] )
Returns the rank of the current row if ordered by the given argument. Rows corresponding to the same value have different rank values. This means that rank values are sequential and different for all rows, always increasing by 1
for the next row.
If direction
is "desc"
or omitted, then ranking is done from greatest to least, if "asc"
, then from least to greatest.
See also RANK, RANK_DENSE, RANK_PERCENTILE.
RAVG
Syntax:RAVG( value [ , direction ] )
orRAVG( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the average of all values in a growing (or shrinking) window defined by the sort order and the value of direction
:
direction |
Window |
---|---|
"asc" |
Starts from the first row and ends at the current row. |
"desc" |
Starts from the current row and ends at the last row. |
By default "asc"
is used.
Window functions with a similar behavior: RSUM, RCOUNT, RMIN, RMAX.
RCOUNT
Syntax:RCOUNT( value [ , direction ] )
orRCOUNT( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the count of all values in a growing (or shrinking) window defined by the sort order and the value of direction
:
direction |
Window |
---|---|
"asc" |
Starts from the first row and ends at the current row. |
"desc" |
Starts from the current row and ends at the last row. |
By default "asc"
is used.
Window functions with a similar behavior: RSUM, RMIN, RMAX, RAVG.
RMAX
Syntax:RMAX( value [ , direction ] )
orRMAX( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the maximum of all values in a growing (or shrinking) window defined by the sort order and the value of direction
:
direction |
Window |
---|---|
"asc" |
Starts from the first row and ends at the current row. |
"desc" |
Starts from the current row and ends at the last row. |
By default "asc"
is used.
Window functions with a similar behavior: RSUM, RCOUNT, RMIN, RAVG.
RMIN
Syntax:RMIN( value [ , direction ] )
orRMIN( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the minimum of all values in a growing (or shrinking) window defined by the sort order and the value of direction
:
direction |
Window |
---|---|
"asc" |
Starts from the first row and ends at the current row. |
"desc" |
Starts from the current row and ends at the last row. |
By default "asc"
is used.
Window functions with a similar behavior: RSUM, RCOUNT, RMAX, RAVG.
RSUM
Syntax:RSUM( value [ , direction ] )
orRSUM( value [ , direction ] [ TOTAL | WITHIN ... | AMONG ... ] [ ORDER BY ... ] [ BEFORE FILTER BY ... ] )
Returns the sum of all values in a growing (or shrinking) window defined by the sort order and the value of direction
:
direction |
Window |
---|---|
"asc" |
Starts from the first row and ends at the current row. |
"desc" |
Starts from the current row and ends at the last row. |
By default "asc"
is used.
Window functions with a similar behavior: RCOUNT, RMIN, RMAX, RAVG.
SUM
Syntax:SUM( value TOTAL | WITHIN ... | AMONG ... )
orSUM( value TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the sum of all expression values. Applicable to numeric data types only.
SUM_IF
Syntax:SUM_IF( expression, condition TOTAL | WITHIN ... | AMONG ... )
orSUM_IF( expression, condition TOTAL | WITHIN ... | AMONG ... [ BEFORE FILTER BY ... ] )
Returns the sum of all the expression values that meet the condition
condition. Applicable to numeric data types only.