You want to characterize a dataset by computing general descriptive or summary statistics.

Many common descriptive statistics, such as mean and standard deviation, can be obtained by applying aggregate functions to your data. Others, such as median or mode, can be calculated based on counting queries.

Suppose you have a table `testscore`

containing
observations representing subject ID, age, sex, and test score:

mysql>+---------+-----+-----+-------+ | subject | age | sex | score | +---------+-----+-----+-------+ | 1 | 5 | M | 5 | | 2 | 5 | M | 4 | | 3 | 5 | F | 6 | | 4 | 5 | F | 7 | | 5 | 6 | M | 8 | | 6 | 6 | M | 9 | | 7 | 6 | F | 4 | | 8 | 6 | F | 6 | | 9 | 7 | M | 8 | | 10 | 7 | M | 6 | | 11 | 7 | F | 9 | | 12 | 7 | F | 7 | | 13 | 8 | M | 9 | | 14 | 8 | M | 6 | | 15 | 8 | F | 7 | | 16 | 8 | F | 10 | | 17 | 9 | M | 9 | | 18 | 9 | M | 7 | | 19 | 9 | F | 10 | | 20 | 9 | F | 9 | +---------+-----+-----+-------+`SELECT subject, age, sex, score FROM testscore ORDER BY subject;`

A good first step in analyzing a set of observations is to generate some descriptive statistics that summarize their general characteristics as a whole. Common statistical values of this kind include:

The number of observations, their sum, and their range (minimum and maximum)

Measures of central tendency, such as mean, median, and mode

Measures of variation, such as standard deviation or variance

Aside from the median and mode, all of these can be calculated ...

Start Free Trial

No credit card required