When we talk about an average, most of us mean what’s called a measure of central tendency: the centre or “typical value” of a probability distribution. An example is the time between a user signing into an e-commerce site and purchasing an item (i.e. a waiting time in an event process), which can be theoretically modelled using the gamma distribution.

Use cases

Because averages summarise a distribution in a single central value, they are well suited to helping non-technical stakeholders make strategic decisions: they can draw attention to sticking points in a customer flow (to carry on the event process anecdote) in a simple-to-understand manner.

Mean vs median vs mode

Three popular averages used to describe central tendency are the mean, median and mode.

For symmetrical continuous distributions (e.g. normal), mean $\simeq$ median. Therefore, they’re interchangeable averages (see below).

Distribution of adult heights
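The near-equality of mean and median for a symmetrical distribution can be checked with a small simulation. This is a minimal sketch using only Python's standard library; the height parameters (170 cm mean, 10 cm standard deviation) and the sample size are made-up assumptions for illustration, not measured data.

```python
import random
from statistics import mean, median

# Assumed, illustrative parameters: not real survey data.
ASSUMED_MEAN_CM = 170
ASSUMED_STD_CM = 10
SAMPLE_SIZE = 10_000

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw a sample from a normal (symmetrical) distribution.
heights = [rng.gauss(ASSUMED_MEAN_CM, ASSUMED_STD_CM) for _ in range(SAMPLE_SIZE)]

mean_height = mean(heights)
median_height = median(heights)

# For symmetrical data the two averages land very close together.
print(f"mean   = {mean_height:.2f} cm")
print(f"median = {median_height:.2f} cm")
print(f"gap    = {abs(mean_height - median_height):.2f} cm")
```

The gap between the two values is small relative to the spread of the data, which is why either average works as a summary here.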

When a continuous distribution is asymmetrical (e.g. gamma), mean $\neq$ median: the mean is pulled towards the ‘tail’ of the data and is greatly affected by extreme values (see below). The median is the more representative measure in these situations.

Effect of an outlier on positively skewed distribution

Effect of an outlier on negatively skewed distribution

This means that in situations where the assumptions of a distribution can’t be easily validated, or the sample size is too small to check them, the median is the safer default.
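The sensitivity of the mean to extreme values can be sketched with a few numbers. The purchase times below are hypothetical values chosen for illustration (a positively skewed sample, in minutes); the single outlier stands in for a user who left the tab open for hours.

```python
from statistics import mean, median

# Hypothetical sign-in-to-purchase times in minutes (positively skewed).
times = [5, 7, 8, 10, 12, 15, 40]

mean_before = mean(times)      # 97 / 7, roughly 13.86
median_before = median(times)  # 10

# Add one extreme value and recompute both averages.
times_with_outlier = times + [600]

mean_after = mean(times_with_outlier)      # 697 / 8 = 87.125
median_after = median(times_with_outlier)  # (10 + 12) / 2 = 11.0

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

One extra observation moves the mean by more than 70 minutes while the median moves by one, so the median still describes what a typical user experiences.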

For discrete numerical distributions, the mean can be misrepresentative, especially when the variable can’t take the mean value, e.g. 2.5 doors on a car. When the variable is an unordered label, the mode is used because the labels have no meaningful order to exploit. When the variable is ordered, the median can be used.
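The three discrete cases above can be sketched as follows. All data are made up for illustration, and the rating scale `order` is a hypothetical ordering introduced only for this example.

```python
from statistics import mean, median_low, mode

# Discrete numerical variable: the mean is not a value a car can have.
doors = [2, 2, 3, 3]
mean_doors = mean(doors)  # 2.5

# Unordered labels: only the mode is meaningful.
colours = ["red", "blue", "red", "green"]
most_common_colour = mode(colours)  # "red"

# Ordered labels: map each label to its rank, then take the median.
# median_low always returns an observed value, so it maps back to a label.
order = ["low", "medium", "high"]  # hypothetical rating scale
ratings = ["low", "medium", "medium", "high", "high"]
ranks = [order.index(r) for r in ratings]
median_rating = order[median_low(ranks)]  # "medium"

print(mean_doors, most_common_colour, median_rating)
```

Using `median_low` rather than `median` is a design choice for ordered labels: with an even number of observations, `median` would average two ranks and could return a value that corresponds to no label.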

Where measuring central tendency fails

Central tendency measures aren’t always useful, however:

  1. When releasing a feature that impacts only a subset of your distribution, measures of central tendency can hide the true effect because most of the distribution is unchanged. It may be more interesting to look at percentiles in this situation.

  2. When the underlying distribution is uniform or multi-modal (2+ peaks), a central tendency measure does not convey the complexity, or lack of information, in the distribution. It may be more interesting to look at the range or interquartile range.
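Both failure cases can be sketched with Python's `statistics.quantiles`. The data are synthetic and chosen for illustration: load times of 1 to 100 seconds where a hypothetical feature caps the slowest users at 90 seconds, and a small two-peaked sample.

```python
from statistics import mean, median, quantiles

def p95(data):
    """Return the 95th percentile (the 95th of 99 cut points)."""
    return quantiles(data, n=100)[94]

# Case 1: a feature that only affects the slowest users.
before = list(range(1, 101))          # synthetic load times, 1..100 seconds
after = [min(t, 90) for t in before]  # hypothetical feature caps times at 90

median_before, median_after = median(before), median(after)
p95_before, p95_after = p95(before), p95(after)

print(f"median: {median_before} -> {median_after}")  # unchanged
print(f"p95:    {p95_before} -> {p95_after}")        # improves

# Case 2: a bimodal sample whose mean sits where there is no data.
bimodal = [1, 2, 2, 3, 3, 3, 17, 17, 17, 18, 18, 19]
bimodal_mean = mean(bimodal)  # 120 / 12 = 10, between the two peaks

q1, _, q3 = quantiles(bimodal, n=4)
iqr = q3 - q1
value_range = max(bimodal) - min(bimodal)

print(f"mean = {bimodal_mean}, IQR = {iqr}, range = {value_range}")
```

In the first case the median reports no change even though the slowest users are better off, and in the second the mean describes a value nobody observed, while the wide interquartile range signals that the data are spread across separate clusters.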