t-test vs. Mann-Whitney U: when to use each

Updated March 2026

You have two independent groups and a continuous outcome. Should you use an unpaired t-test or a Mann-Whitney U test? This is one of the most frequent decisions in applied statistics, and the answer depends on whether your data meets the assumptions of the parametric test.

The short answer

Condition | Use this test
Data is approximately normal in each group, similar variances | Unpaired t-test
Unequal variances but approximately normal | Welch's t-test (the default in some software, such as R's t.test)
Data is not normal (skewed, outliers, ordinal) | Mann-Whitney U test
Large samples (n > 30 per group), moderate non-normality | Either; the t-test is robust here

What each test does

Unpaired t-test

The unpaired (independent samples) t-test compares the means of two groups. It assumes:

  1. Independence — observations in the two groups are independent
  2. Normality — the outcome variable is approximately normally distributed in each group
  3. Equal variances — the variances in the two groups are similar (relaxed by Welch's correction)

When these assumptions hold, the t-test is the most powerful test for detecting a difference in means — meaning it has the best chance of finding a real effect.
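To make the difference between the pooled and Welch versions concrete, here is a minimal Python sketch using only the standard library. The function names student_t and welch_t are illustrative and do not come from any package. The sketch returns only the test statistic and degrees of freedom; looking up the p-value is left to statistical software (SciPy's scipy.stats.ttest_ind, for example, exposes an equal_var option).

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (equal variances assumed) t statistic and df."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df (no equal-variance assumption)."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean.
    se2_a, se2_b = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Made-up illustrative data: same group size, very different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal group sizes the two versions give the same t statistic; Welch's correction shows up as a smaller, non-integer df when the variances differ.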

Mann-Whitney U test

The Mann-Whitney U test compares the distributions (more precisely, the ranks) of two independent groups. It only assumes:

  1. Independence — observations are independent
  2. Ordinal data — the outcome can be ranked (which all continuous data can)

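It does not assume normality, which makes it the safer choice when the t-test's distributional assumptions are in doubt. It is not indifferent to spread, however: its p-value is derived under the null hypothesis that the two distributions are identical, so unequal variances or shapes can affect the result.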

Key distinction: The t-test compares means. The Mann-Whitney U tests whether one group tends to have larger values than the other. If both distributions have the same shape, these answer the same question. If the distributions differ in shape, they can give different answers — and the Mann-Whitney may be more meaningful.
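The rank-based logic can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a replacement for a statistics package: the function name mann_whitney_u is hypothetical, U is computed by counting pairs (equivalent to the rank-sum definition), and the p-value uses a plain normal approximation with no tie or continuity correction, which is an assumption that is only reasonable for moderately large samples.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using a normal approximation.

    Assumption: no tie correction and no continuity correction,
    so treat the p-value as approximate, especially for small samples.
    """
    na, nb = len(a), len(b)
    # U for group a: number of (x, y) pairs with x > y, ties counted as half.
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    u_b = na * nb - u_a
    u = min(u_a, u_b)
    # Mean and standard deviation of U under the null hypothesis.
    mu = na * nb / 2
    sigma = sqrt(na * nb * (na + nb + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller of the two U values
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p


# Made-up data: replacing 6 with an extreme value leaves U unchanged,
# because only the ordering of the observations matters.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
print(mann_whitney_u([1, 2, 3], [4, 5, 600]))
```

Because only the ordering enters the calculation, a single extreme observation cannot dominate the result the way it can shift a mean.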

When to choose each test

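Use the t-test when:

  1. The outcome is approximately normal in each group (use Welch's t-test if the variances differ)
  2. Samples are large (n > 30 per group) and the departure from normality is moderate

Use Mann-Whitney U when:

  1. The data are strongly skewed or contain outliers, especially with small samples
  2. The outcome is ordinal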

A common misconception

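Many researchers believe that if the Shapiro-Wilk test is significant, they must use Mann-Whitney. This is too rigid. The test's sensitivity depends on sample size: with large samples it flags trivial departures from normality that the t-test tolerates well, and with small samples it can miss substantial ones.

The decision should be based on the severity of the violation, not just whether Shapiro-Wilk's p-value is below .05.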

Effect sizes

Both tests have appropriate effect size measures:

Test | Effect size | Interpretation
t-test | Cohen's d | Small: 0.20, Medium: 0.50, Large: 0.80
Mann-Whitney U | Rank-biserial r | Small: 0.10, Medium: 0.30, Large: 0.50

Always report an effect size alongside the p-value. A statistically significant result with a tiny effect size may not be practically meaningful. See our effect sizes guide for more detail.
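Both effect sizes are simple to compute by hand. The sketch below uses only the Python standard library; the function names cohens_d and rank_biserial are illustrative. It assumes Cohen's d with a pooled standard deviation and the pairs-based definition of the rank-biserial correlation (the proportion of pairs favoring the first group minus the proportion favoring the second). Some software reports a different r for Mann-Whitney, computed as Z divided by the square root of N, so check which definition your tool uses.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def rank_biserial(a, b):
    """Rank-biserial correlation from pairwise comparisons.

    Ranges from -1 (every value in a is below every value in b)
    to +1 (every value in a is above every value in b); ties count for neither.
    """
    wins = sum(x > y for x in a for y in b)
    losses = sum(x < y for x in a for y in b)
    return (wins - losses) / (len(a) * len(b))


# Made-up illustrative data.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(round(cohens_d(group_a, group_b), 2))       # -1.2
print(round(rank_biserial(group_a, group_b), 2))  # -0.6
```

The sign of each value depends only on which group is passed first, so state the direction of the comparison when reporting it.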

How to report each test in APA format

Unpaired t-test

APA format

An independent-samples t-test indicated that the treatment group (M = 23.4, SD = 5.1) scored significantly higher than the control group (M = 18.7, SD = 4.8), t(48) = 3.45, p = .001, d = 0.97, 95% CI [0.38, 1.56].

Mann-Whitney U test

APA format

A Mann-Whitney U test indicated that pain ratings were significantly lower in the treatment group (Mdn = 3) than in the control group (Mdn = 5), U = 156, p = .003, r = .45.

Note: for Mann-Whitney, report medians (not means), since the test is based on ranks.
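If you generate these sentences programmatically, the statistic strings can be assembled with a small helper. The following Python sketch is one possible approach, with hypothetical function names (apa_p, apa_t, apa_u). It assumes the conventions used in the examples above: no leading zero for values that cannot exceed 1 (p and r), two decimals for t and d, and three decimals for p with a floor of "p < .001".

```python
def _no_leading_zero(value: float, digits: int) -> str:
    """Format a number bounded by 1 in absolute value without its leading zero."""
    text = f"{value:.{digits}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def apa_p(p: float) -> str:
    """Format a p-value, e.g. 'p = .003' or 'p < .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + _no_leading_zero(p, 3)


def apa_t(t: float, df: float, p: float, d: float) -> str:
    """Statistic string for a t-test; non-integer df (Welch) gets two decimals."""
    df_text = str(int(df)) if float(df).is_integer() else f"{df:.2f}"
    return f"t({df_text}) = {t:.2f}, {apa_p(p)}, d = {d:.2f}"


def apa_u(u: float, p: float, r: float) -> str:
    """Statistic string for a Mann-Whitney U test."""
    return f"U = {u:g}, {apa_p(p)}, r = {_no_leading_zero(r, 2)}"


print(apa_t(3.45, 48, 0.001, 0.97))  # t(48) = 3.45, p = .001, d = 0.97
print(apa_u(156, 0.003, 0.45))       # U = 156, p = .003, r = .45
```

The descriptive part of the sentence (means and SDs, or medians) still has to be written around these strings.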

Decision checklist

  1. Check normality in each group (Shapiro-Wilk + Q-Q plot)
  2. If normal in both groups: unpaired t-test
  3. If normality fails but n > 30 per group and violation is moderate: t-test is likely fine — report the violation in your methods
  4. If normality fails with small samples, strong skew, or ordinal data: Mann-Whitney U
  5. If in doubt: report both tests. If they agree, it strengthens your conclusion
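The checklist can also be written down as a small decision function. This is only a sketch of the steps above, with a hypothetical name (choose_test) and hypothetical parameters. The inputs are judgments you make after inspecting the data (for example from Shapiro-Wilk and a Q-Q plot); the function does not run any test itself.

```python
def choose_test(
    normal_in_both: bool,
    n_per_group: int,
    severe_violation: bool = False,
    ordinal: bool = False,
) -> str:
    """Map the checklist to a recommendation.

    normal_in_both:   normality looks acceptable in each group
    n_per_group:      size of the smaller group
    severe_violation: strong skew or marked outliers
    ordinal:          outcome is ordinal rather than continuous
    """
    if ordinal:
        return "Mann-Whitney U"
    if normal_in_both:
        return "unpaired t-test"
    if n_per_group > 30 and not severe_violation:
        # Moderate violation with large samples: the t-test is likely fine.
        return "unpaired t-test (report the normality violation)"
    return "Mann-Whitney U"


print(choose_test(normal_in_both=True, n_per_group=12))
print(choose_test(normal_in_both=False, n_per_group=50))
print(choose_test(normal_in_both=False, n_per_group=12, severe_violation=True))
```

Step 5 is deliberately left out: when the call is genuinely borderline, running and reporting both tests is a judgment the function cannot make for you.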

Join the beta to try this in GraphHelix — the AI checks normality and equal variances automatically, and suggests switching to Mann-Whitney U when assumptions are violated.
