Statistics Fundamentals: Mean, Variance & Standard Deviation

Introduction

Understanding mean, variance, and standard deviation is crucial for analyzing data distributions. This post explains:

  • Key definitions and formulas

  • Differences between calculation (population) and estimation (sample)

  • Why adjustments like Bessel's correction are needed

  • Practical implications for data analysis

1. Mean: The Center of Your Data

Population Mean (μ)

Definition: The average of all values in a population.
Formula:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

(Where N = population size, x_i = individual values)

Sample Mean (x̄)

Definition: An estimate of μ using sample data.
Formula:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

(Where n = sample size)

Key Difference:

  • Calculation (μ) uses the entire population.

  • Estimation (x̄) infers μ from a sample.
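As a minimal sketch of that difference, the snippet below computes μ from a complete (hypothetical) population of ten exam scores and then estimates it from a random sample of four. The scores, the sample size, and the seed are all made up for illustration.

```python
import random
import statistics

# Hypothetical population: exam scores for every student in a class of 10.
population = [62, 71, 75, 78, 80, 83, 85, 88, 91, 97]

# Population mean (mu): computed from every value, so it is exact.
mu = sum(population) / len(population)

# Sample mean (x-bar): computed from a random subset, so it only estimates mu.
rng = random.Random(0)  # fixed seed so the example is repeatable
sample = rng.sample(population, k=4)
x_bar = statistics.mean(sample)

print(f"mu    = {mu}")
print(f"x_bar = {x_bar} (from sample {sample})")
```

Changing the seed changes x̄ but never μ, which is the point: the parameter is fixed, while the estimate varies from sample to sample.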

2. Variance: Measuring Spread

Population Variance (σ²)

Definition: The average squared distance from the mean.
Formula:

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Sample Variance (s²)

Definition: An estimate of σ², with Bessel’s correction.
Formula:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Why Squaring?

  1. Eliminates negative differences (avoiding cancellation).

  2. Emphasizes larger deviations (outliers).

Why n−1?
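Using the sample mean (x̄) in place of μ introduces bias: deviations measured from x̄ are, on average, smaller than deviations measured from μ, so dividing by n underestimates σ². Bessel's correction divides by n−1 instead, which slightly enlarges the estimate and makes s² an unbiased estimator of σ².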
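To make the two divisors concrete, here is a small sketch that computes both variances by hand and checks them against Python's standard `statistics` module. The data values are illustrative, not from a real dataset.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in data)

pop_var = ss / n           # sigma^2: treat data as the whole population
sample_var = ss / (n - 1)  # s^2: treat data as a sample (Bessel's correction)

print(pop_var)     # 4.0
print(sample_var)  # about 4.571

# The standard library exposes the same two definitions.
assert abs(statistics.pvariance(data) - pop_var) < 1e-12
assert abs(statistics.variance(data) - sample_var) < 1e-12
```

The only difference between the two results is the divisor, so the gap shrinks as n grows.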

3. Standard Deviation: Interpretable Spread

Population SD (σ)

Definition: Square root of variance (same units as data).
Formula:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

Sample SD (s)

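Definition: Square root of the sample variance s², used as an estimate of σ.
Formula:

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$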
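A quick sketch with the standard `statistics` module shows both versions side by side. The values are illustrative; if they were measured in minutes, the variance would be in minutes² while both standard deviations would be back in minutes.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values

sigma = statistics.pstdev(values)  # population SD: divides by N
s = statistics.stdev(values)       # sample SD: divides by n - 1

print(sigma)  # 2.0
print(s)      # about 2.138

# Each SD is the square root of the matching variance.
assert abs(sigma ** 2 - statistics.pvariance(values)) < 1e-12
assert abs(s ** 2 - statistics.variance(values)) < 1e-12
```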

Why Not Absolute Values?
Mean Absolute Deviation (MAD) is a valid alternative, but squared deviations:

  1. Are differentiable everywhere, unlike the absolute value at zero (useful in optimization).

  2. Are mathematically tractable for inference.
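One practical consequence of squaring is that a single large deviation moves the standard deviation more than it moves MAD. The sketch below compares the two on a small made-up dataset, with and without an outlier. The helper `mean_abs_dev` is a hypothetical name defined here, not a standard-library function.

```python
import statistics

def mean_abs_dev(values):
    """Mean absolute deviation around the mean."""
    m = statistics.fmean(values)
    return sum(abs(x - m) for x in values) / len(values)

base = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 20]  # same data, last value replaced by an outlier

for label, values in [("base", base), ("with outlier", with_outlier)]:
    mad = mean_abs_dev(values)
    sd = statistics.pstdev(values)
    print(f"{label}: MAD = {mad:.3f}, SD = {sd:.3f}, SD/MAD = {sd / mad:.3f}")
```

Both measures grow when the outlier is added, but the SD/MAD ratio also rises, which reflects the extra weight that squaring gives to large deviations.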

Key Insights

  1. Population vs. Sample:

    • Use population formulas (μ, σ²) if you have all data (rare!).

    • Use sample formulas (x̄, s²) for inferences.

  2. Bessel’s Correction:

    • Dividing by n underestimates σ² because x̄ is closer to the sample data than μ is.

    • Dividing by n−1 accounts for the one degree of freedom lost by estimating μ with x̄.

  3. When to Use Each:

    | Scenario | Mean | Variance/SD |
    | --- | --- | --- |
    | Complete data | μ | σ², σ |
    | Sample data | x̄ | s², s |

Summary

  • Mean: Center of your data.

  • Variance: Squared spread (avoids negative differences).

  • Standard Deviation: Spread in original units.

  • Sample Adjustments: Critical for accurate inference.

Final Note: Always clarify whether you’re reporting population parameters or sample statistics! Miscommunication here can invalidate conclusions.

Shameless Plug: Why Dividing by n Underestimates Variance

The sample mean x̄ is the value that minimizes the sum of squared deviations, so ∑(x_i − x̄)² is never larger than ∑(x_i − μ)². Dividing by n ignores this, producing a variance estimate that is too small on average.
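For readers who want the algebra, the gap between the two sums can be written out exactly. The derivation below is a standard sketch and assumes the x_i are independent draws from a population with mean μ and variance σ²:

$$\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$$

Taking expectations, the left side is nσ² and the last term is n · Var(x̄) = σ², so

$$E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = n\sigma^2 - \sigma^2 = (n-1)\sigma^2$$

Dividing the sum by n−1 rather than n therefore gives an estimator whose expected value is exactly σ².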

Analogy: If you’re guessing the average height of a class by measuring only your friends, their mean height will likely be closer to their own heights than the true class mean. Bessel’s correction (n−1) compensates for this self-referential bias!
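The bias can also be checked exactly on a toy example, without any randomness. The sketch below enumerates every possible sample of size 2, drawn with replacement, from a tiny hypothetical population and averages both variance estimates over all of them. Sampling with replacement is an assumption made here because it matches the independent-draws setting in which Bessel's correction is exactly unbiased.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # tiny hypothetical population
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true variance

n = 2  # sample size
biased, unbiased = [], []
for sample in product(population, repeat=n):  # all 16 samples, with replacement
    x_bar = Fraction(sum(sample), n)
    ss = sum((x - x_bar) ** 2 for x in sample)
    biased.append(ss / n)          # divide by n
    unbiased.append(ss / (n - 1))  # divide by n - 1

avg_biased = sum(biased) / len(biased)
avg_unbiased = sum(unbiased) / len(unbiased)

print(f"true variance        = {sigma2}")        # 5
print(f"average of n version = {avg_biased}")    # 5/2
print(f"average of n-1 version = {avg_unbiased}")  # 5
```

Averaged over every possible sample, the n−1 version lands exactly on σ², while the n version comes out at (n−1)/n of it.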

Got questions? Drop them in the comments! Next up: Covariance vs. Correlation. 🚀

#Statistics #DataScience #MachineLearning #Analytics

Written by

Ashutosh Kurwade