
When a P-Value Is Misleading: What to Look for Instead


The p-value is one of the most familiar numbers in clinical research and one of the most misunderstood.

A result with p < 0.05 is often treated as “real,” while anything above that threshold is dismissed as “negative.” But many misleading or even incorrect conclusions in the medical literature arise not from bad statistics, but from overreliance on the p-value.

This article explains when a p-value can mislead you and, more importantly, what to look at instead, with no advanced statistics required.


What a P-Value Actually Tells You (and What It Doesn’t)

A p-value answers a very narrow question:

If there were truly no effect, how surprising would these data be?

It does NOT tell you:

  • How large the effect is

  • Whether the effect is clinically meaningful

  • Whether the study design is valid

  • Whether bias or confounding explains the result

  • Whether the finding will replicate


A small p-value can coexist with a meaningless or biased result. A large p-value can occur even when a clinically important effect exists.
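
To make that narrow question concrete, here is a minimal sketch in Python (all numbers invented): simulate many studies in a world where no true effect exists, and ask how often a difference at least as extreme as the observed one appears. That frequency is, in essence, the p-value.

```python
# A minimal sketch (hypothetical numbers): how surprising is an observed
# difference of 4.0 if the true effect is zero?
import numpy as np

rng = np.random.default_rng(42)
observed_diff = 4.0   # the (made-up) difference seen in "our" study

# 10,000 simulated two-arm studies (n=50 per arm, SD=10) with NO true effect
null_diffs = np.array([
    rng.normal(0, 10, 50).mean() - rng.normal(0, 10, 50).mean()
    for _ in range(10_000)
])

p = np.mean(np.abs(null_diffs) >= observed_diff)
print(f"simulated p-value: {p:.3f}")   # ~0.05: the data would be unusual
                                       # under "no effect", nothing more
```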


Situations Where P-Values Commonly Mislead

1. Large Sample Size, Tiny Effect

In large datasets, almost any difference can become “statistically significant.”

Example:

  • Blood pressure reduced by 0.8 mmHg

  • p < 0.001

The p-value looks impressive, but the effect is trivial.
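
A quick simulation makes the point. The numbers below are hypothetical (true difference 0.8 mmHg, SD 15, 20,000 patients per arm), but the pattern is general: with enough patients, a trivial difference produces a tiny p-value.

```python
# Large sample, trivial effect: "highly significant" yet clinically empty
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20_000                                 # patients per arm
control   = rng.normal(140.0, 15.0, n)     # systolic BP, mmHg
treatment = rng.normal(139.2, 15.0, n)     # true effect: -0.8 mmHg

t, p = stats.ttest_ind(treatment, control)
diff = treatment.mean() - control.mean()
d = diff / np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)

print(f"mean difference: {diff:.2f} mmHg")   # ~ -0.8
print(f"p-value: {p:.2g}")                   # typically < 0.001
print(f"Cohen's d: {d:.3f}")                 # ~ -0.05, a trivial effect size
```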

What to look for instead

  • Absolute differences

  • Effect size (risk difference, mean difference)

  • Clinical relevance

Ask: Would this change patient care if it were true?


2. Small Sample Size, Important Effect

In small studies, meaningful effects often fail to reach p < 0.05.

Example:

  • Mortality reduced from 20% to 12%

  • p = 0.08

Many readers label this “negative,” but the effect may be clinically substantial.
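
Here is that scenario as a sketch, assuming a hypothetical trial of 140 patients per arm (28 vs 17 deaths). The absolute risk reduction is large, but the test cannot rule out chance, and the confidence interval shows why:

```python
# Small sample, large effect: clinically important, yet p > 0.05
import numpy as np
from scipy import stats

deaths = np.array([28, 17])     # control, treatment (hypothetical counts)
n = np.array([140, 140])        # patients per arm

table = np.array([deaths, n - deaths]).T    # 2x2: [died, survived] per arm
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)

p1, p2 = deaths / n             # 20.0% vs ~12.1% mortality
diff = p1 - p2
se = np.sqrt(p1*(1 - p1)/n[0] + p2*(1 - p2)/n[1])

print(f"risk difference: {diff:.1%}")                           # ~7.9 points
print(f"p-value: {p:.3f}")                                      # ~0.07
print(f"95% CI: {diff - 1.96*se:.1%} to {diff + 1.96*se:.1%}")  # crosses 0
```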

What to look for instead

  • Direction and magnitude of effect

  • Confidence interval width

  • Whether the study was underpowered


Ask: Is this result inconclusive, or is there truly no effect?


3. Statistical Significance Without Balance

A p-value cannot rescue a poorly constructed comparison.

If groups differ meaningfully at baseline:

  • Age

  • Disease severity

  • Comorbidities

…then a small p-value may simply reflect confounding, not causation.
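
A short simulation shows how this happens. In the sketch below (all numbers invented), baseline severity drives both who receives the exposure and who dies; the exposure itself does nothing, yet the crude comparison is strongly "significant":

```python
# Confounding by severity: a "significant" p-value with zero true effect
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 2_000
severe  = rng.random(n) < 0.4                          # 40% severe at baseline
exposed = rng.random(n) < np.where(severe, 0.7, 0.3)   # sicker -> more exposure
death   = rng.random(n) < np.where(severe, 0.30, 0.10) # outcome ignores exposure

print(f"exposed mortality:   {death[exposed].mean():.1%}")    # ~22%
print(f"unexposed mortality: {death[~exposed].mean():.1%}")   # ~14%

table = [[death[exposed].sum(),  (~death[exposed]).sum()],
         [death[~exposed].sum(), (~death[~exposed]).sum()]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"crude p-value: {p:.4f}")                              # very small

# Within each severity stratum, the "effect" disappears
for s, label in [(True, "severe"), (False, "mild")]:
    m = severe == s
    print(f"{label}: exposed {death[m & exposed].mean():.1%} "
          f"vs unexposed {death[m & ~exposed].mean():.1%}")
```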

What to look for instead

  • Baseline characteristics (Table 1)

  • Whether differences favor one group systematically

  • Clinical plausibility


Ask: Would I expect this outcome even without the exposure?


4. Multiple Testing Without Context

When many outcomes or subgroups are tested, some will be “significant” by chance alone.

This is common in:

  • Retrospective studies

  • Registry analyses

  • Exploratory research
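
The arithmetic behind this is easy to demonstrate. The sketch below (a hypothetical analysis) tests 20 outcomes that are all truly null; at a 0.05 threshold, about one "hit" is expected by chance:

```python
# Multiple testing: with 20 null outcomes, expect ~1 false positive
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_outcomes, n_per_arm = 20, 100

hits = 0
for _ in range(n_outcomes):
    a = rng.normal(0, 1, n_per_arm)   # arm A: no true effect anywhere
    b = rng.normal(0, 1, n_per_arm)   # arm B: identical distribution
    if stats.ttest_ind(a, b).pvalue < 0.05:
        hits += 1

print(f"'significant' outcomes: {hits} of {n_outcomes}")
# Expected ~1 (5% of 20), even though every null hypothesis is true
```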

What to look for instead

  • Whether the outcome was pre-specified

  • Consistency across related outcomes

  • Whether the result fits a broader pattern


Ask: Was this hypothesis-driven or discovered after the fact?


5. Confidence Intervals That Tell a Different Story

A p-value can be small while the confidence interval remains wide or unstable.

Example:

  • Odds ratio: 1.3

  • 95% CI: 1.01–1.68

  • p = 0.04

The p-value barely crosses the threshold, and the interval spans everything from a near-null effect (OR 1.01) to a substantial one (OR 1.68): uncertainty about the size of the effect remains high.
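
The same numbers can be reconstructed from a 2x2 table. The counts below are invented to roughly match the estimates above, using the standard log odds ratio and its Wald confidence interval:

```python
# A barely "significant" odds ratio with a wide, near-null interval
import numpy as np
from scipy import stats

a, b = 265, 220    # exposed:   outcome yes / no  (hypothetical counts)
c, d = 220, 237    # unexposed: outcome yes / no

log_or = np.log((a * d) / (b * c))
se = np.sqrt(1/a + 1/b + 1/c + 1/d)        # SE of the log odds ratio
lo, hi = np.exp([log_or - 1.96*se, log_or + 1.96*se])
p = 2 * stats.norm.sf(abs(log_or / se))    # two-sided Wald p-value

print(f"OR {np.exp(log_or):.2f}, 95% CI {lo:.2f}-{hi:.2f}, p = {p:.3f}")
# ~ OR 1.30, CI 1.00-1.68, p ~ 0.046: "significant", but consistent with
# anything from a negligible effect to a substantial one
```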

What to look for instead

  • The full confidence interval

  • Whether clinically important effects are excluded

  • How sensitive the estimate is to small changes in the data or analysis


Ask: How confident am I about the size of this effect?


Better Questions Than “Is It Significant?”

Instead of asking:

“Is the p-value below 0.05?”

Ask:

  • How big is the effect?

  • Is the direction consistent with biology or clinical logic?

  • Could bias or confounding explain the result?

  • Is the estimate precise or unstable?

  • Would this change practice if true?

These questions often tell you more than the p-value alone.


A Simple Mental Reframe

Think of the p-value as:

A screening tool, not a verdict

It can suggest whether a finding is unlikely to be pure noise, but it cannot tell you whether the finding is true, important, or trustworthy.


Why This Matters in Clinical Research

Overemphasis on p-values leads to:

  • Overstated conclusions

  • Underappreciated uncertainty

  • Confusion between statistical and clinical significance

Better interpretation leads to:

  • More honest science

  • More reproducible findings

  • Better clinical judgment


Final Takeaway

A p-value is one piece of evidence, and often the least informative one.

When reading a paper, start with:

  • Study design

  • Baseline balance

  • Effect size

  • Confidence intervals

  • Clinical plausibility

Only then should the p-value enter the conversation.


That shift alone will improve how you interpret clinical research far more than any advanced statistical method.
