📊 P-Hacking: What It Is and How to Avoid It!

"p-hacking...
... don't do it.
If you do it...
it's a shame!" 😅
📌 Introduction
Before we dive deep into p-hacking, make sure you're already familiar with p-values and their interpretation. If you're not, check out my earlier technical blog post on P-Values Explained and the corresponding LinkedIn post here.
Now, let’s uncover one of the dark corners of data analysis — P-Hacking — how it tricks us, why it's dangerous, and what we can do to avoid it.
🕵️ What Is P-Hacking?
P-hacking refers to the practice of manipulating data analysis or experimenting with statistical tests until you get a p-value below the magic threshold (typically 0.05) — which makes a result appear statistically significant even when it's not.
This isn’t necessarily done with bad intentions. Sometimes, it's just a consequence of trying too many different analyses or slicing the data in different ways — but the outcome is the same: false positives.
⚠️ P-hacking = Misuse of statistics + Misleading conclusions.
🔁 The Multiple Testing Problem
Here’s the big issue:
The more tests you run, the more likely you are to get a significant result just by chance — even if there’s no real effect.
This is known as the Multiple Testing Problem.
Imagine flipping a coin 100 times. Even if it’s a fair coin, you might get a long streak of heads — by pure randomness. Similarly, running 100 different statistical tests increases the odds of finding a “significant” result that’s just noise.
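To see this in action, here's a small simulation sketch (not from the original post; the group sizes and seed are arbitrary choices). It runs 100 t-tests where the null hypothesis is true in every single case, so any "significant" result is a false positive by construction:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Run 100 t-tests where the null hypothesis is TRUE in every case:
# both groups come from the exact same normal distribution.
n_tests = 100
p_values = []
for _ in range(n_tests):
    group_a = rng.normal(loc=0.0, scale=1.0, size=30)
    group_b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_ind(group_a, group_b)
    p_values.append(p)

false_positives = sum(p < 0.05 for p in p_values)
print(f"{false_positives} of {n_tests} tests were 'significant' by chance alone")
```

With a 0.05 threshold you should expect roughly 5 out of 100 tests to come up "significant" even though there is no real effect anywhere.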
🚨 Terminology Alert!
Doing a lot of tests and ending up with a bunch of false positives?
That’s called the Multiple Testing Problem.
But don’t worry — there are statistical techniques to handle this!
✅ Using FDR to Compensate for Multiple Testing
One of the most popular techniques is the False Discovery Rate (FDR), commonly implemented via the Benjamini-Hochberg Procedure.
Here’s how it works:
You feed in the p-values from all your tests.
The method adjusts these values based on the number of comparisons.
The result? Adjusted p-values, typically larger than the originals.
Some tests that seemed significant before (p < 0.05) might now have adjusted p > 0.05, revealing them as likely false positives.
✅ Key Tip: For FDR (or any correction method) to work properly, you must include all p-values from all tests — not just the ones that look promising!
This means: Don’t cherry-pick your results. Report everything — even the “boring” stuff.
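As a sketch of what this looks like in practice, here's the Benjamini-Hochberg procedure via `statsmodels` (the p-values below are made-up illustrative numbers, not real results):

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from 10 tests (illustrative numbers only)
p_values = np.array([0.001, 0.008, 0.02, 0.03, 0.04, 0.045,
                     0.2, 0.35, 0.6, 0.9])

# Benjamini-Hochberg FDR correction at the 5% level
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05,
                                         method="fdr_bh")

for raw, adj, sig in zip(p_values, p_adjusted, reject):
    verdict = "significant" if sig else "not significant"
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.3f}  ({verdict})")
```

Notice that several tests with raw p-values below 0.05 (like 0.02 or 0.045) end up with adjusted p-values above 0.05 after correction, exactly the "false positives revealed" effect described above.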
🧠 A Subtle Form of P-Hacking
There’s also a sneakier version of p-hacking:
Adding more data after seeing your initial results.
Let’s say your original p-value is close to 0.05. If you add just a few more observations, you might push that p-value below 0.05 — purely by chance.
This breaks the assumptions of classical hypothesis testing.
Remember: the 0.05 threshold assumes you’re testing once, not re-testing after peeking at your data!
So, even though your test seems valid, it’s not — you’ve manipulated the process.
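A quick simulation sketch makes the damage concrete (the batch sizes and number of "peeks" here are arbitrary assumptions, purely for illustration). Each simulated experiment has no real effect, but we re-test after every new batch of data and stop as soon as p dips below 0.05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def peeking_experiment(n_initial=30, n_extra_batches=5, batch=10):
    """One experiment under the null: peek after each new batch of data
    and declare victory the moment p < 0.05. Returns True on a
    (false) positive at any look."""
    a = list(rng.normal(size=n_initial))
    b = list(rng.normal(size=n_initial))
    for _ in range(n_extra_batches + 1):
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            return True            # stop as soon as it looks "significant"
        a.extend(rng.normal(size=batch))
        b.extend(rng.normal(size=batch))
    return False

n_sims = 2000
hits = sum(peeking_experiment() for _ in range(n_sims))
rate = hits / n_sims
print(f"False positive rate with peeking: {rate:.1%} (nominal level: 5%)")
```

Even with only a handful of peeks, the realized false positive rate climbs well above the nominal 5%, which is exactly why re-testing after adding data invalidates the original threshold.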
🔬 Power Analysis: Your Anti P-Hack Shield
One of the best ways to avoid p-hacking is to plan your experiment in advance.
That’s where Power Analysis comes in.
It tells you, before you collect any data, how many samples or replicates you need to reliably detect an effect of a given size.
Power analysis ensures:
You don’t collect too little data (risking false negatives).
You don’t collect data until you get “lucky” (risking false positives).
Think of it as setting the rules before playing the game.
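As a minimal sketch, here's a power analysis for a two-sample t-test using `statsmodels` (the effect size, power, and alpha below are conventional example choices, not values from the original post):

```python
from statsmodels.stats.power import TTestIndPower

# How many samples per group are needed to detect a medium effect
# (Cohen's d = 0.5) with 80% power at a significance level of 0.05?
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.0f}")
```

For these conventional settings the answer comes out to roughly 64 participants per group. Commit to that number up front, and there's no room left to "collect until significant."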
📚 Summary of Concepts
Let’s recap:
P-hacking is manipulating your analysis to get a “significant” p-value.
The Multiple Testing Problem leads to more false positives as you run more tests.
Use False Discovery Rate (FDR) or similar corrections to handle multiple comparisons.
Don’t cherry-pick tests with small p-values — include all tests in your correction.
Avoid peeking at your data and adding more samples just to get significance.
Plan ahead using Power Analysis to determine the right sample size.
🎯 The End… But Not Really!
Science and data are powerful tools — but only when used with integrity. Avoiding p-hacking isn’t just good practice — it’s essential to doing honest, reproducible research.
Want to learn more about p-values, hypothesis testing, or statistical pitfalls? Follow along for more technical blogs and connect with me on LinkedIn.