When a P-Value Is Misleading: What to Look for Instead
- Femi Balogun

The p-value is one of the most familiar numbers in clinical research and one of the most misunderstood.
A result with p < 0.05 is often treated as “real,” while anything above that threshold is dismissed as “negative.” But many misleading or even incorrect conclusions in the medical literature arise not from bad statistics, but from overreliance on the p-value.
This article explains when a p-value can mislead you and, more importantly, what to look at instead. No advanced statistics are required; a few short Python sketches along the way illustrate each pitfall with simulated numbers.
What a P-Value Actually Tells You (and What It Doesn’t)
A p-value answers a very narrow question:
If there were truly no effect, how surprising would these data be?
It does NOT tell you:
How large the effect is
Whether the effect is clinically meaningful
Whether the study design is valid
Whether bias or confounding explains the result
Whether the finding will replicate
A small p-value can coexist with a meaningless or biased result.
A large p-value can occur even when a clinically important effect exists.
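To make that narrow question concrete, here is a minimal permutation-test sketch in Python. All data are simulated and the group sizes and values are arbitrary; the point is only the mechanism by which a p-value is produced.

```python
# A minimal sketch of what a p-value measures, via a permutation test.
# All numbers here are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurements in two groups (e.g., a biomarker).
treated = rng.normal(loc=5.5, scale=2.0, size=30)
control = rng.normal(loc=5.0, scale=2.0, size=30)
observed_diff = treated.mean() - control.mean()

# Pretend the null hypothesis is true: group labels are meaningless,
# so shuffle them and see how often a difference this large appears.
pooled = np.concatenate([treated, control])
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[:30].mean() - pooled[30:].mean()
    if abs(diff) >= abs(observed_diff):
        count += 1

p_value = count / n_perm
print(f"observed difference: {observed_diff:.2f}, p = {p_value:.3f}")
# The p-value answers only: "If there were no effect, how often would
# shuffled data look at least this extreme?" Nothing about size or validity.
```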
Situations Where P-Values Commonly Mislead
1. Large Sample Size, Tiny Effect
In large datasets, almost any difference can become “statistically significant.”
Example:
Blood pressure reduced by 0.8 mmHg
p < 0.001
The p-value looks impressive, but the effect is trivial.
What to look for instead
Absolute differences
Effect size (risk difference, mean difference)
Clinical relevance
Ask: Would this change patient care if it were true?
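A minimal simulation sketch of this pitfall, assuming Python with NumPy and SciPy; the sample size, blood-pressure means, and standard deviation are invented for illustration:

```python
# Sketch: with a large enough sample, a trivial 0.8 mmHg difference
# becomes "highly significant". Simulated data, illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50_000  # patients per arm
sd = 15.0   # a plausible SD for systolic blood pressure

control = rng.normal(130.0, sd, n)
treated = rng.normal(129.2, sd, n)  # true effect: 0.8 mmHg

t, p = stats.ttest_ind(treated, control)
print(f"mean difference: {treated.mean() - control.mean():.2f} mmHg, p = {p:.2e}")
# p comes out tiny, yet 0.8 mmHg would rarely change patient care.
```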
2. Small Sample Size, Important Effect
In small studies, meaningful effects often fail to reach p < 0.05.
Example:
Mortality reduced from 20% to 12%
p = 0.08
Many readers label this “negative,” but the effect may be clinically substantial.
What to look for instead
Direction and magnitude of effect
Confidence interval width
Whether the study was underpowered
Ask: Is this result inconclusive, or is there truly no effect?
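A rough sketch of this example, assuming 125 patients per arm (a number chosen so the arithmetic lands near p = 0.08) and a simple Wald interval for the risk difference:

```python
# Sketch: mortality 20% vs 12% in a small trial. The p-value misses 0.05,
# but the effect size and confidence interval tell a different story.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

deaths = np.array([25, 15])   # control, treatment
n = np.array([125, 125])

z, p = proportions_ztest(deaths, n)
print(f"p = {p:.3f}")          # about 0.08: "not significant"

# Risk difference with a simple Wald 95% CI
p1, p2 = deaths / n
rd = p1 - p2
se = np.sqrt(p1 * (1 - p1) / n[0] + p2 * (1 - p2) / n[1])
lo, hi = rd - 1.96 * se, rd + 1.96 * se
print(f"risk difference: {rd:.1%} (95% CI {lo:.1%} to {hi:.1%})")
# The interval spans a plausible range that includes large benefits:
# inconclusive, not "no effect".
```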
3. Statistical Significance Without Balance
A p-value cannot rescue a poorly constructed comparison.
If groups differ meaningfully at baseline:
Age
Disease severity
Comorbidities
…then a small p-value may simply reflect confounding, not causation.
What to look for instead
Baseline characteristics (Table 1)
Whether differences favor one group systematically
Clinical plausibility
Ask: Would I expect this outcome even without the exposure?
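Here is a small simulated illustration, with age as a hypothetical confounder. The exposure has no true effect, yet the naive comparison typically looks significant:

```python
# Sketch: a confounded comparison. Older patients both receive the
# "exposure" more often and have worse outcomes; the exposure itself
# does nothing. Pure simulation, illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
age = rng.normal(65, 10, n)

# Exposure probability rises with age; the outcome depends on age only.
exposed = (rng.random(n) < 1 / (1 + np.exp(-(age - 65) / 5))).astype(float)
outcome = 0.05 * age + rng.normal(0, 2, n)   # no true exposure effect

# Naive model: exposure alone
naive = sm.OLS(outcome, sm.add_constant(exposed)).fit()
# Adjusted model: include the confounder
X = sm.add_constant(np.column_stack([exposed, age]))
adjusted = sm.OLS(outcome, X).fit()

print(f"naive p for exposure:    {naive.pvalues[1]:.4f}")
print(f"adjusted p for exposure: {adjusted.pvalues[1]:.4f}")
# The naive p-value reflects baseline imbalance (age), not causation.
```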
4. Multiple Testing Without Context
When many outcomes or subgroups are tested, some will be “significant” by chance alone.
This is common in:
Retrospective studies
Registry analyses
Exploratory research
What to look for instead
Whether the outcome was pre-specified
Consistency across related outcomes
Whether the result fits a broader pattern
Ask: Was this hypothesis-driven or discovered after the fact?
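A quick simulation makes the point, assuming 20 independent outcomes with no true effect anywhere:

```python
# Sketch: test 20 outcomes where nothing is truly going on.
# On average, about one will come out "significant" at 0.05 by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_outcomes, n_per_group = 20, 50

false_positives = 0
for _ in range(n_outcomes):
    a = rng.normal(0, 1, n_per_group)   # no true difference
    b = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"'significant' outcomes out of {n_outcomes}: {false_positives}")
# Expected count under the null: 20 * 0.05 = 1, with no real effect anywhere.
```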
5. Confidence Intervals That Tell a Different Story
A p-value can be small while the confidence interval remains wide or unstable.
Example:
Odds ratio: 1.3
95% CI: 1.01–1.68
p = 0.04
The p-value barely crosses the threshold, but uncertainty remains high.
What to look for instead
The full confidence interval
Whether clinically important effects are excluded
How sensitive the estimate is to small changes in the data or analysis
Ask: How confident am I about the size of this effect?
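As a sketch, you can back-calculate the standard error and p-value from a reported OR and 95% CI, assuming the interval was built on the log-odds scale (the usual convention):

```python
# Sketch: read the whole interval, not just the p-value. Given the
# reported OR and 95% CI from the example above, recover the standard
# error and p-value, assuming CI = exp(log OR ± 1.96 * SE).
import numpy as np
from scipy import stats

or_hat, ci_lo, ci_hi = 1.30, 1.01, 1.68

log_or = np.log(or_hat)
se = (np.log(ci_hi) - np.log(ci_lo)) / (2 * 1.96)
z = log_or / se
p = 2 * stats.norm.sf(abs(z))

print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
# p is about 0.04, but the interval spans "essentially no effect" (1.01)
# to a moderate effect (1.68): the size of the effect remains very uncertain.
```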
Better Questions Than “Is It Significant?”
Instead of asking:
“Is the p-value below 0.05?”
Ask:
How big is the effect?
Is the direction consistent with biology or clinical logic?
Could bias or confounding explain the result?
Is the estimate precise or unstable?
Would this change practice if true?
These questions often tell you more than the p-value alone.
A Simple Mental Reframe
Think of the p-value as:
A screening tool, not a verdict
It can suggest whether a finding is unlikely to be pure noise, but it cannot tell you whether the finding is true, important, or trustworthy.
Why This Matters in Clinical Research
Overemphasis on p-values leads to:
Overstated conclusions
Underappreciated uncertainty
Confusion between statistical and clinical significance
Better interpretation leads to:
More honest science
More reproducible findings
Better clinical judgment
Final Takeaway
A p-value is one piece of evidence, and often the weakest one.
When reading a paper, start with:
Study design
Baseline balance
Effect size
Confidence intervals
Clinical plausibility
Only then should the p-value enter the conversation.
That shift alone will improve how you interpret clinical research far more than any advanced statistical method.