23.4. Variance and Standard Deviation

The standard deviation is a measure of how far the values in a normally distributed population spread around the average. It is hard to calculate in SQL because it involves a square root, and standard SQL has only the four basic arithmetic operators.

Many vendors will let you use other math functions, but in all fairness, most SQL databases serve commercial applications that have little or no need for engineering or statistical calculations.

The usual trick is to load the raw data into an appropriate host language, such as FORTRAN, and do the work there. The formula for the standard deviation is:
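The excerpt breaks off before the formula itself. Assuming the sample (n − 1) form is the one intended, the standard deviation, together with the computational shortcut that needs only the count, the sum, and the sum of squares, is:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}
  = \sqrt{\frac{n\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2}{n(n - 1)}}
```

The second form is the one a host program would normally use, since it can be computed in a single pass over the raw data.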

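A minimal sketch of the host-language trick described above, using Python in place of FORTRAN purely for illustration: pull the raw values out of the table, keep running totals of the sum and the sum of squares, and apply the single-pass form of the sample standard deviation formula.

```python
import math

def sample_std_dev(values):
    """Sample standard deviation (n - 1 form), computed in one pass
    from the running sum and sum of squares, as a host program
    reading raw data out of SQL might do."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum(values)
    total_sq = sum(x * x for x in values)
    # s = sqrt((n * sum(x^2) - (sum x)^2) / (n * (n - 1)))
    return math.sqrt((n * total_sq - total * total) / (n * (n - 1)))

print(sample_std_dev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

Today most products expose this directly as an aggregate function (commonly spelled STDDEV or STDDEV_SAMP), so the round trip through a host language is rarely necessary in practice.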