ANSI SQL Aggregate Functions

Aggregate functions return a single value based upon a set of other values. If used among other expressions in the item list of a SELECT statement, the SELECT must have a GROUP BY or HAVING clause. No GROUP BY or HAVING clause is required if the aggregate function is the only value retrieved by the SELECT statement. The aggregate functions supported by the ANSI SQL standard and their syntax are listed in Table 4-1.

Table 4-1. ANSI SQL aggregate functions

Function

Usage

AVG(expression)

Computes the average value of a column given by expression

CORR(dependent, independent)

Computes a correlation coefficient

COUNT(expression)

Counts the rows defined by the expression

COUNT(*)

Counts all rows in the specified table or view

COVAR_POP(dependent, independent)

Computes population covariance

COVAR_SAMP(dependent, independent)

Computes sample covariance

CUME_DIST(value_list) WITHIN GROUP (ORDER BY sort_list)

Computes the relative rank of a hypothetical row within a group of rows, where the rank is equal to the number of rows less than or equal to the hypothetical row divided by the number of rows in the group

DENSE_RANK(value_list) WITHIN GROUP (ORDER BY sort_list)

Generates a dense rank (no ranks are skipped) for a hypothetical row (value_list) in a group of rows generated by GROUP BY

MIN(expression)

Finds the minimum value in a column given by expression

MAX(expression)

Finds the maximum value in a column given by expression

PERCENT_RANK(value_list) WITHIN GROUP (ORDER BY sort_list)

Generates a relative rank for a hypothetical row by dividing that row’s rank less 1 by the number of rows in the group

PERCENTILE_CONT(percentile) WITHIN GROUP (ORDER BY sort_list)

Generates an interpolated value that, if added to the group, would correspond to the percentile given

PERCENTILE_DISC(percentile) WITHIN GROUP (ORDER BY sort_list)

Returns the value with the smallest cumulative distribution value greater than or equal to percentile

RANK(value_list) WITHIN GROUP (ORDER BY sort_list)

Generates a rank for a hypothetical row (value_list) in a group of rows generated by GROUP BY

REGR_AVGX(dependent, independent)

Computes the average of the independent variable

REGR_AVGY(dependent, independent)

Computes the average of the dependent variable

REGR_COUNT(dependent, independent)

Counts the number of pairs remaining in the group after any pair with one or more NULL values has been eliminated

REGR_INTERCEPT(dependent, independent)

Computes the y-intercept of the least-squares-fit linear equation

REGR_R2(dependent, independent)

Squares the correlation coefficient

REGR_SLOPE(dependent, independent)

Determines the slope of the least-squares-fit linear equation

REGR_SXX(dependent, independent)

Sums the squares of the independent variables

REGR_SXY(dependent, independent)

Sums the products of each pair of variables

REGR_SYY(dependent, independent)

Sums the squares of the dependent variables

STDDEV_POP(expression)

Computes the population standard deviation of all expression values in a group

STDDEV_SAMP(expression)

Computes the sample standard deviation of all expression values in a group

SUM(expression)

Computes the sum of the column values given by expression

VAR_POP(expression)

Computes the population variance of all expression values in a group

VAR_SAMP(expression)

Computes the sample standard deviation of all expression values in a group

Technically speaking, ALL, ANY, and SOME are considered aggregate functions. However, they have been discussed as range search criteria since they are most often used that way. Refer to Chapter 3 for more information on these functions.

The number of values processed by an aggregate function varies depending on the number of rows queried from the table. This behavior differentiates aggregate functions from scalar functions, which can only operate on the values of a single row per invocation.

The general syntax of an aggregate function is:

aggregate_function_name( [ALL | DISTINCT] expression )

The aggregate_function_name may be AVG, COUNT, MAX, MIN, or SUM, as listed in Table 4-1. The ALL keyword, which specifies the default behavior, evaluates all rows when aggregating the value of the function. The DISTINCT keyword uses only distinct values when evaluating the function.

Warning

All aggregate functions except COUNT(*) will ignore NULL values when computing their results.

Get SQL in a Nutshell, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.