Skip to content

Aggregation Operations#

Aggregate functions operate on values across rows to perform mathematical calculations such as sum, average, counting, minimum/maximum values, standard deviation, and estimation, as well as some non-mathematical operations. Find more information and examples here.

ANY_VALUE#

Syntax

ANY_VALUE(expr)

Description

Returns the first value of a column.

APPROXIMATE_COUNT_DISTINCT#

Syntax

APPROXIMATE_COUNT_DISTINCT(<expr>)

Description

Return an approximate distinct row count.

APPROXIMATE_JACCARD_INDEX#

Syntax

APPROXIMATE_JACCARD_INDEX(<variant_expr>)

Description

Similarity approximation of inputs.

APPROXIMATE_SIMILARITY#

Syntax

APPROXIMATE_SIMILARITY(<variant_expr>)

Description

Similarity approximation of inputs.

APPROX_COUNT_DISTINCT#

Syntax

APPROX_COUNT_DISTINCT(<expr>)

Description

Returns an approximate distinct row count.

APPROX_PERCENTILE#

Syntax

APPROX_PERCENTILE(<float_expr>, <percentile>)

Description

Return a percentile approximation of inputs.

APPROX_PERCENTILE_ACCUMULATE#

Syntax

APPROX_PERCENTILE_ACCUMULATE(<expr>)

Description

A tdigest state in JSON format.

APPROX_PERCENTILE_COMBINE#

Syntax

APPROX_PERCENTILE_COMBINE(<state>)

Description

A tdigest state in JSON format.

APPROX_TOP_K#

Syntax

APPROX_TOP_K(<expr>)

Description

Array containing top k items and their frequencies.

APPROX_TOP_K_ACCUMULATE#

Syntax

APPROX_TOP_K_ACCUMULATE(<expr>)

Description

Returns the Space-Saving summary at the end of aggregation.

APPROX_TOP_K_COMBINE#

Syntax

APPROX_TOP_K_COMBINE(<state>)

Description

Combines (merges) input states into a single output state.

ARRAYAGG#

Syntax

ARRAYAGG(<expr>)

Description

Returns the input values, pivoted into an ARRAY.

ARRAY_UNION_AGG#

Syntax

ARRAY_UNION_AGG(<array_expr>)

Description

Returns an ARRAY that contains the union of the distinct values from the input ARRAYs in a column.

ARRAY_UNIQUE_AGG#

Syntax

ARRAY_UNIQUE_AGG(<expr>)

Description

Returns an ARRAY that contains all of the distinct values from the specified column.

AVG#

Syntax

AVG(<expr>)

Description

Returns the average of the values in a group. Null values are ignored.

BITANDAGG#

Syntax

BITANDAGG(<expr>)

Description

Returns the bitwise AND of a column. Null values are ignored.

BITORAGG#

Syntax

BITORAGG(<expr>)

Description

Returns the bitwise OR of a column. Null values are ignored.

BITXORAGG#

Syntax

BITXORAGG(<expr>)

Description

Returns the bitwise XOR of a column. Null values are ignored.

BIT_AND_AGG#

Syntax

BIT_AND_AGG(<expr>)

Description

Returns the bitwise AND of a column. Null values are ignored.

BIT_OR_AGG#

Syntax

BIT_OR_AGG(<expr>)

Description

Returns the bitwise OR of a column. Null values are ignored.

BOOLAND_AGG#

Syntax

BOOLAND_AGG(<expr>)

Description

Returns the logical (boolean) AND of a column. Null values are ignored.

BOOLOR_AGG#

Syntax

BOOLOR_AGG(<expr>)

Description

Returns the logical (boolean) OR of a column. Null values are ignored.

BOOLORX_AGG#

Syntax

BOOLORX_AGG(<expr>)

Description

Returns the logical (boolean) XOR of a column. Null values are ignored.

COUNT#

Syntax

COUNT(<expr>)

Description

Returns the number of non-null values of a given expression.

COUNT(*)#

Syntax

COUNT(*)

Description

Returns the total number of rows.

COUNT_IF#

Syntax

COUNT_IF(<bool_expr>)

Description

Returns a conditional count.

COVAR_POP#

Syntax

COVAR_POP(<y>, <x>)

Description

Returns the population covariance for non-null pairs in a group.

COVAR_SAMP#

Syntax

COVAR_SAMP(<y>, <x>)

Description

Returns the sample covariance for non-null pairs in a group.

FIRST_VALUE#

Syntax

FIRST_VALUE(<expr>)

Description

Returns the first value of a column.

HASH_AGG#

Syntax

HASH_AGG(<expr>)

Description

Returns an aggregate signed 64-bit hash value over the (unordered) set of input rows.

HLL_ACCUMULATE#

Syntax

HLL_ACCUMULATE(<expr>)

Description

Returns the accumulated HyperLogLog state.

HLL_COMBINE()#

Syntax

HLL_COMBINE(<state>)

Description

Returns the combined HyperLogLog state.

KURTOSIS#

Syntax

KURTOSIS(<expr>)

Description

Returns the sample kurtosis of the values in a group. Null values are ignored.

LAST_VALUE#

Syntax

LAST_VALUE(<expr>)

Description

Returns the last value of a column.

LISTAGG#

Syntax

LISTAGG(<expr>, [, <delimiter>])

Description

Returns the concatenated input values, separated by the delimiter string.

MAX#

Syntax

MAX(<expr>)

Description

Returns the maximum of a column.

MEDIAN#

Syntax

MEDIAN(<expr>)

Description

Returns the median of a set of values.

MIN#

Syntax

MIN(<expr>)

Description

Returns the minimum of a column.

MINHASH#

Syntax

MINHASH(<k>, <expr>)

Description

Minhash state including an array of k (first input arg) minimum hash values.

MINHASH_COMBINE#

Syntax

MINHASH_COMBINE(<state>)

Description

Combined result of two (or more) minhash states.

MODE#

Syntax

MODE(<expr>)

Description

Returns the most frequent item of a column.

OBJECT_AGG#

Syntax

OBJECT_AGG(<key>, <value>)

Description

Aggregate keys and values into an output object.

PERCENTILE_CONT#

Syntax

PERCENTILE_CONT(<percentile>)

Description

Returns the value of sort key at specified percentile.

PERCENTILE_DISK#

Syntax

PERCENTILE_DISK(<percentile>)

Description

Returns the value of sort key at specified percentile.

REGR_AVGX#

Syntax

REGR_AVGX(<y>, <x>)

Description

Returns the average of the independent variable (2nd parameter) for non-null pairs.

REGR_AVGY#

Syntax

REGR_AVGY(<y>, <x>)

Description

Returns the average of the dependent variable (1st parameter) for non-null pairs.

REGR_COUNT#

Syntax

REGR_COUNT(<y>, <x>)

Description

Returns the number of non-null number pairs.

REGR_INTERCEPT#

Syntax

REGR_INTERCEPT(<y>, <x>)

Description

Returns the intercept of the univariate linear regression line for non-null pairs.

REGR_R2#

Syntax

REGR_R2(<y>, <x>)

Description

Returns the coefficient of determination for non-null pairs.

REGR_SLOPE#

Syntax

REGR_SLOPE(<y>, <x>)

Description

Returns the slope of the linear regression line for non-null pairs.

REGR_SXX#

Syntax

REGR_SXX(<y>, <x>)

Description

Returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.

REGR_SYY#

Syntax

REGR_SYY(<y>, <x>)

Description

Returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs.

SKEW#

Syntax

SKEW(<expr>)

Description

Returns the sample skew of the values in a group. Null values are ignored.

STDDEV#

Syntax

STDDEV(<expr>)

Description

Returns the sample standard deviation of the values in a group. Null values are ignored.

STDDEV_POP#

Syntax

STDDEV_POP(<expr>)

Description

Returns the population standard deviation of the values in a group. Null values are ignored.

STDDEV_SAMP#

Syntax

STDDEV_SAMP(<expr>)

Description

Returns the sample standard deviation of the values in a group. Null values are ignored.

SUM#

Syntax

SUM(<expr>)

Description

Returns a sum of a column.

VARIANCE#

Syntax

VARIANCE(<expr>)

Description

Returns the sample variance of the values in a group. Null values are ignored.

VARIANCE_POP#

Syntax

VARIANCE_POP(<expr>)

Description

Returns the population variance of the values in a group. Null values are ignored.

VARIANCE_SAMP#

Syntax

VARIANCE_SAMP(<expr>)

Description

Returns the sample variance of the values in a group. Null values are ignored.

VAR_SAMP#

Syntax

VAR_SAMP(<expr>)

Description

Returns the sample variance of the values in a group. Null values are ignored.