Statistics Fundamentals: Mean, Variance & Standard Deviation

Introduction
Understanding mean, variance, and standard deviation is crucial for analyzing data distributions. This post explains:
Key definitions and formulas
Differences between calculation (population) and estimation (sample)
Why adjustments like Bessel's correction are needed
Practical implications for data analysis
1. Mean: The Center of Your Data
Population Mean (μ)
Definition: The average of all values in a population.
Formula:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$
(Where N = population size, x_i = individual values)
Sample Mean (x̄)
Definition: An estimate of μ using sample data.
Formula:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
(Where n = sample size)
Key Difference:
Calculation (μ) uses the entire population.
Estimation (x̄) infers μ from a sample.
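As a minimal sketch of this distinction, the snippet below treats a small made-up list as the entire population and draws a random sample from it. The data values, the sample size of 4, and the seed are assumptions chosen purely for illustration.

```python
import random
from statistics import mean

# Hypothetical "population": pretend these eight values are ALL the data that exist.
population = [4, 8, 15, 16, 23, 42, 7, 9]

# Population mean: computed from every value, so it is exact.
mu = mean(population)

# Sample mean: computed from a subset, so it only estimates mu.
rng = random.Random(0)              # fixed seed so the run is repeatable
sample = rng.sample(population, 4)  # 4 values drawn without replacement
x_bar = mean(sample)

print(f"population mean (mu)  = {mu}")
print(f"sample                = {sample}")
print(f"sample mean (x-bar)   = {x_bar}")
```

The population mean here is 15.5. The sample mean depends on which values happen to be drawn, so it will generally differ from μ, and a different seed gives a different estimate.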
2. Variance: Measuring Spread
Population Variance (σ²)
Definition: The average squared distance from the mean.
Formula:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$
Sample Variance (s²)
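Definition: An estimate of σ² computed from a sample, using Bessel’s correction (dividing by n−1 instead of n).
Formula:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$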
Why Squaring?
Eliminates negative differences (avoiding cancellation).
Emphasizes larger deviations (outliers).
Why n−1?
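Deviations are measured from the sample mean (x̄), which is computed from the same data, so the squared deviations come out smaller on average than they would around μ. Bessel’s correction compensates by dividing by n−1 instead of n, which scales the estimate up and makes it unbiased for σ².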
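The two divisors can be compared on a tiny dataset. This is only a sketch: the eight values and the helper name `sum_sq_dev` are made up for illustration, and the results are cross-checked against `pvariance` and `variance` from Python's standard `statistics` module.

```python
import math
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

def sum_sq_dev(values, center):
    """Sum of squared deviations of `values` from `center`."""
    return sum((x - center) ** 2 for x in values)

m = mean(data)             # 5
ss = sum_sq_dev(data, m)   # 32

pop_var = ss / len(data)          # divide by N   (population formula)
samp_var = ss / (len(data) - 1)   # divide by n-1 (Bessel's correction)

print(f"divide by N   : {pop_var:.3f}")   # 4.000
print(f"divide by n-1 : {samp_var:.3f}")  # 4.571

# The standard library implements the same two formulas.
assert math.isclose(pop_var, pvariance(data))
assert math.isclose(samp_var, variance(data))
```

The only difference between the two results is the divisor, so the gap between them shrinks as n grows.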
3. Standard Deviation: Interpretable Spread
Population SD (σ)
Definition: Square root of variance (same units as data).
Formula:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
Sample SD (s)
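Definition: The square root of the sample variance s², used as an estimate of σ.
Formula:
$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$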
Why Not Absolute Values?
While Mean Absolute Deviation (MAD) exists, squaring:
Is differentiable (useful in optimization).
Is mathematically tractable for inference.
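To make the comparison concrete, both measures can be computed side by side. The sketch below uses two made-up datasets that differ only in their last value, and a hypothetical helper `mean_abs_dev`; it shows that squaring makes the standard deviation react more strongly to a large deviation than the mean absolute deviation does.

```python
from statistics import mean, pstdev

def mean_abs_dev(values):
    """Mean absolute deviation: average of |x - mean|."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)

base = [10, 12, 14, 16, 18]          # illustrative values
with_outlier = [10, 12, 14, 16, 48]  # same data, last value replaced by an outlier

for name, values in [("base", base), ("with outlier", with_outlier)]:
    mad = mean_abs_dev(values)
    sd = pstdev(values)  # population SD (divide by N)
    print(f"{name:13s} MAD = {mad:6.2f}  SD = {sd:6.2f}  SD/MAD = {sd / mad:.2f}")
```

On these values the SD/MAD ratio rises from about 1.18 to about 1.26 once the outlier is introduced, because the squared term gives the largest deviation extra weight.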
Key Insights
Population vs. Sample:
Use population formulas (μ, σ²) if you have all data (rare!).
Use sample formulas (x̄, s²) when inferring population values from a sample.
Bessel’s Correction:
Dividing by n underestimates σ² on average because x̄ is closer to the sample data than μ is.
Dividing by n−1 accounts for the one degree of freedom lost by estimating μ with x̄.
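The underestimate can be checked with a small simulation. This is a sketch under stated assumptions: samples of size 5 are drawn from a normal distribution with a known variance of 4, and the function names, seed, and number of trials are arbitrary choices made for illustration.

```python
import random

def var_divide_by_n(xs):
    """Variance estimate using divisor n (no correction)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def var_divide_by_n_minus_1(xs):
    """Variance estimate using divisor n-1 (Bessel's correction)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(42)  # fixed seed for repeatability
true_sigma2 = 4.0        # variance of a normal distribution with sigma = 2
n = 5                    # small samples make the bias easy to see
trials = 20_000

samples = [[rng.gauss(0.0, 2.0) for _ in range(n)] for _ in range(trials)]

# Average each estimator over many independent samples.
avg_n = sum(var_divide_by_n(s) for s in samples) / trials
avg_n_minus_1 = sum(var_divide_by_n_minus_1(s) for s in samples) / trials

print(f"true variance          : {true_sigma2}")
print(f"average with divisor n : {avg_n:.3f}")
print(f"average with n-1       : {avg_n_minus_1:.3f}")
```

With n = 5, the divisor-n average should land near (n−1)/n × 4 = 3.2, while the n−1 version should land near 4. Exact figures vary with the seed and the number of trials.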
When to Use Each:
| Scenario | Mean | Variance/SD |
| --- | --- | --- |
| Complete data | μ | σ², σ |
| Sample data | x̄ | s², s |
Summary
Mean: Center of your data.
Variance: Squared spread (avoids negative differences).
Standard Deviation: Spread in original units.
Sample Adjustments: Critical for accurate inference.
Final Note: Always clarify whether you’re reporting population parameters or sample statistics! Miscommunication here can invalidate conclusions.
Shameless Plug: Why Dividing by n Underestimates Variance
The sample mean (x̄) is the value that makes the sum of squared deviations as small as possible, so ∑(x_i − x̄)² is never larger than ∑(x_i − μ)². Dividing by n ignores this, producing a variance estimate that is too small on average.
Analogy: If you’re guessing the average height of a class by measuring only your friends, their mean height will likely be closer to their own heights than the true class mean. Bessel’s correction (n−1) compensates for this self-referential bias!
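For readers who want the algebra behind this, here is a short derivation sketch. It assumes the n observations are independent draws from a population with mean μ and finite variance σ², an assumption the post does not state explicitly.

```latex
% Split each deviation from \mu into a deviation from \bar{x} plus (\bar{x} - \mu).
% The cross term vanishes because \sum_{i=1}^{n} (x_i - \bar{x}) = 0.
\sum_{i=1}^{n} (x_i - \mu)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2

% Take expectations, using E[(x_i - \mu)^2] = \sigma^2
% and E[(\bar{x} - \mu)^2] = \sigma^2 / n.
n\sigma^2 = E\!\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] + \sigma^2
\quad\Longrightarrow\quad
E\!\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = (n-1)\,\sigma^2

% Dividing by n gives an expected value of \frac{n-1}{n}\sigma^2 (too small);
% dividing by n-1 gives exactly \sigma^2.
```

The first line also shows why the sum around x̄ can never exceed the sum around μ: the two differ by the non-negative term n(x̄ − μ)².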
Got questions? Drop them in the comments! Next up: Covariance vs. Correlation. 🚀
#Statistics #DataScience #MachineLearning #Analytics
Written by Ashutosh Kurwade