Variance
In statistics, variance measures how widely the values of a series are spread around their mean. It is equal to both:
- the square of the standard deviation
- the arithmetic mean of the squares of deviations from the mean.
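The equivalence of these two definitions can be checked numerically. The following is a minimal sketch that assumes the data is a whole population. The list `values` is hypothetical and chosen only for illustration, and `statistics.pstdev` is the population standard deviation from Python's standard library.

```python
import math
import statistics

# Hypothetical data, treated here as an entire population.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Definition 2: arithmetic mean of the squared deviations from the mean.
mean = sum(values) / len(values)
mean_sq_dev = sum((v - mean) ** 2 for v in values) / len(values)

# Definition 1: square of the (population) standard deviation.
std_dev_squared = statistics.pstdev(values) ** 2

# The two results should agree up to floating-point rounding.
print(mean_sq_dev, std_dev_squared)
print(math.isclose(mean_sq_dev, std_dev_squared))
```

The comparison uses `math.isclose` rather than `==` because the two computations may differ in the last floating-point digits.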
Variance is calculated (or estimated) differently depending on whether the data covers the entire population or only a sample.
Calculation of variance from population dataset
In this case, values are available for the entire population, so the variance is calculated directly from the above definition:
Let X be a dataset for the entire population,
`X = {x_1, x_2, ..., x_n}`
We denote by `bar x` the mean of X, `bar x = 1/n.sum_{i=1}^{i=n}x_i`
The variance is then given by,
`\text{Var(X)} = 1/n.sum_{i=1}^{i=n}(x_i-barx)^2`
Example: `X = {1, 2, 5, 3, 8}`
First, we compute the mean of X,
`bar x = 1/5.(1+2+5+3+8) = 3.8`
The variance follows,
`\text{Var(X)} = 1/5( (1-3.8)^2+(2-3.8)^2+(5-3.8)^2+(3-3.8)^2+(8-3.8)^2) = 6.16`
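The same calculation can be sketched in Python. This is only an illustration of the population formula above, not a reference implementation. The function name `population_variance` is an assumption introduced for the example.

```python
def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


X = [1, 2, 5, 3, 8]
# Rounded to two decimals to hide floating-point noise.
print(round(population_variance(X), 2))  # 6.16
```

Python's standard library provides `statistics.pvariance` for the same quantity. The hand-written version is shown only to mirror the formula step by step.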
Estimation of variance from sample dataset
In this situation, data is available only for a subset of the population (a sample). The population variance cannot be calculated directly from the above definition, so it is estimated using an estimator.
The most commonly used estimator of the variance from a sample is as follows:
Let X be a sample series,
`X = {x_1, x_2, ..., x_n}`
An estimator of the mean is `bar x = 1/n.sum_{i=1}^{i=n}x_i`
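The variance is estimated as follows,
`\text{Var(X)} = 1/(n-1).sum_{i=1}^{i=n}(x_i-barx)^2`
Dividing by n-1 instead of n (Bessel's correction) compensates for the fact that the deviations are measured from the sample mean rather than from the true population mean.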
Example: `X = {1, 2, 5, 3, 8}`
We suppose that X is a sample of values drawn at random from the total population.
The sample mean is calculated first,
`bar x = 1/5.(1+2+5+3+8) = 3.8`
The estimate of the variance follows,
`\text{Var(X)} = 1/4( (1-3.8)^2+(2-3.8)^2+(5-3.8)^2+(3-3.8)^2+(8-3.8)^2) = 7.7`
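As a sketch of the sample estimator above, the only change from the population formula is the divisor n-1. The function name `sample_variance` is an assumption introduced for the example. `statistics.variance` from Python's standard library is used as a cross-check, since it also divides by n-1.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


X = [1, 2, 5, 3, 8]
# Rounded to two decimals to hide floating-point noise.
print(round(sample_variance(X), 2))  # 7.7

# Cross-check against the standard library's sample variance.
print(math.isclose(sample_variance(X), statistics.variance(X)))
```

For the same data, the sample estimate (7.7) is larger than the population variance (6.16) because the sum of squared deviations is divided by 4 rather than 5.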
See also
Standard deviation
Arithmetic mean