Interpreting Cox Proportional Hazards Models: A Complete Worked Example
A reviewer asks: “What does your hazard ratio of 0.68 actually mean? And did you check the proportional hazards assumption?” These two questions decide whether your Cox model analysis stays in the paper. Cox proportional hazards regression is the workhorse of survival analysis — but most practitioners can run one in R without being able to interpret or defend the output.
This post walks a complete worked example: from dataset structure to fitted coefficients to hazard ratio interpretation to the proportional hazards assumption check. Every number is computed from the same 80-patient clinical dataset so you can map each concept to a concrete value.
What a Cox Model Estimates
A Cox proportional hazards model relates the instantaneous risk of an event (the hazard) to one or more covariates. The core equation is:
$h(t | X) = h_0(t) \cdot \exp(\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)$where h(t | X) is the hazard at time t given covariates X, h0(t) is the baseline hazard (unspecified — the “semi-parametric” part), and each β is a coefficient. The model does not estimate the baseline hazard; it estimates how covariates multiply it.
The practical consequence: we cannot report an absolute risk from Cox alone. What we report are hazard ratios — the ratio of hazards between two levels of a covariate.
A coefficient of β = −0.386 means HR = exp(−0.386) = 0.68. A coefficient of β = 0.693 means HR = exp(0.693) = 2.0. Positive coefficients = higher hazard; negative = lower hazard.
The Worked Example: An 80-Patient Trial
We are analyzing a hypothetical oncology trial with 80 patients randomized to treatment (n = 40) or control (n = 40). Follow-up time is months from randomization to death (the event) or last contact (censored). We also have age at randomization and ECOG performance status.
Each patient contributes three variables: time (numeric, months), status (1 = died, 0 = censored), and covariates (treatment, age, ecog). One row per patient.
Summary statistics:
| Group | n | Events | Median follow-up | Median age |
|---|---|---|---|---|
| Treatment | 40 | 18 | 22 months | 62 |
| Control | 40 | 27 | 19 months | 64 |
Fitting the Model in R
Using the survival package:
library(survival)
fit <- coxph(Surv(time, status) ~ treatment + age + ecog, data = trial)
summary(fit)
Simplified output:
| Covariate | β (coef) | HR = exp(β) | 95% CI | z | p |
|---|---|---|---|---|---|
| treatment (1 = drug) | −0.386 | 0.680 | 0.47 – 0.98 | −2.08 | .037 |
| age (per year) | 0.029 | 1.029 | 1.00 – 1.06 | 1.93 | .054 |
| ecog (per unit, 0–3) | 0.511 | 1.667 | 1.15 – 2.42 | 2.69 | .007 |
Concordance index: C = 0.71 (SE = 0.04). Likelihood ratio test: χ²(3) = 18.3, p < .001.
Interpreting Each Hazard Ratio
Treatment effect: HR = 0.68
What it means: At any given time, patients on the experimental drug have 68% the hazard of death compared to controls — a 32% reduction in the instantaneous risk of death. The 95% CI [0.47, 0.98] excludes 1.0, so this is statistically significant.
What it does NOT mean: It does not mean “32% fewer patients died” — that is a risk ratio, which compares cumulative proportions over a fixed interval. It does not mean “patients lived 32% longer” — that is a median survival ratio. The HR describes the rate of events occurring, not the count of events or the survival duration.
Reporting “treatment reduced death by 32%” is the #1 interpretation error with Cox output. A 32% hazard reduction translates to very different absolute survival differences depending on baseline event rate and follow-up time. Say “32% lower hazard” or “hazard ratio 0.68” — these are precise and defensible.
Age effect: HR = 1.029 per year
Each additional year of age at randomization multiplies the hazard by 1.029 — a 2.9% higher hazard per year. Over a 10-year age difference, hazards multiply by 1.029¹° = 1.33, a 33% relative increase. The CI [1.00, 1.06] borderline contains 1.0, so the age effect is marginal (p = .054).
ECOG effect: HR = 1.667 per unit
Each 1-unit worsening of ECOG performance status (0 = fully active to 3 = severely limited) multiplies hazard by 1.667. A patient with ECOG 2 has hazard 1.667² = 2.78 times that of an ECOG 0 patient. This is the strongest effect in the model.
Checking the Proportional Hazards Assumption
The Cox model assumes that hazards are proportional over time — that is, the ratio of hazards between any two patients stays constant. If this assumption is violated (for example, a treatment that helps early but harms later), the reported HR is a misleading time-average.
The standard check is Schoenfeld residuals using cox.zph():
zph <- cox.zph(fit)
print(zph)
| Covariate | χ² | df | p |
|---|---|---|---|
| treatment | 1.12 | 1 | .29 |
| age | 0.54 | 1 | .46 |
| ecog | 4.81 | 1 | .028 |
| GLOBAL | 6.34 | 3 | .096 |
The global test is not significant (p = .096), but ECOG is (p = .028). Visual inspection of the Schoenfeld residual plot for ECOG would show whether there is a clear time trend. Reviewers increasingly want both the formal test and the residual plot.
The Schoenfeld test is sample-size dependent. With n > 200 you will detect trivial violations; with n < 50 you will miss large ones. Always complement the p-value with a residual plot. A violation that is practically meaningful shows a clear slope in the plot, not just a small p-value.
What to do if the PH assumption fails
Three options in order of increasing complexity:
- Stratify on the violating variable.
coxph(Surv(time, status) ~ treatment + age + strata(ecog)). You lose the HR for ECOG but the other HRs remain valid. - Fit a time-varying coefficient. Use
tt()or split the time axis and fit different HRs before and after a cutpoint. - Switch to a different model. Accelerated failure time (AFT) models, flexible parametric models (Royston-Parmar), or restricted mean survival time (RMST) do not require proportional hazards.
Reporting the Results
Here is a complete, publishable results paragraph for the worked example:
“A multivariable Cox proportional hazards model was fit with treatment arm, age, and ECOG performance status as covariates. After adjustment, experimental treatment was associated with a lower hazard of death (HR 0.68, 95% CI 0.47–0.98, p = .037). ECOG performance status showed the strongest effect (HR 1.67 per unit, 95% CI 1.15–2.42, p = .007). Age showed a borderline association (HR 1.03 per year, 95% CI 1.00–1.06, p = .054). Model discrimination was moderate (Harrell’s C = 0.71). The proportional hazards assumption was tested using Schoenfeld residuals; the global test was not violated (p = .096), though a time-dependent effect was detected for ECOG (p = .028), which was addressed by stratification in a sensitivity analysis.”
Assumption Checks Beyond Proportional Hazards
Three other assumptions deserve a check in any serious Cox analysis:
1. Linearity of continuous covariates
Cox assumes that the log-hazard is linear in each continuous covariate. Test by plotting Martingale residuals against each covariate. If the residual plot shows a clear curve, use splines or transform the covariate (for example, age², log(age)).
2. Influential observations
Use dfbetas (one per covariate and observation). A value > 2/√n signals disproportionate influence. One patient should not drive the result.
3. Overfitting
The standard rule is at least 10–20 events per covariate. With 45 events in our example and 3 covariates (45/3 = 15 events per covariate), we are at the lower end of acceptable. With fewer events, consider penalized regression or reduce the covariate count.
When Cox Is Not the Right Model
Cox regression is the default for time-to-event analysis, but it is not always appropriate. Use different models when:
- Competing risks are present. If patients can die from causes unrelated to the disease of interest, treating those as censored biases Cox estimates. Use Fine-Gray subdistribution hazards instead.
- Cure fraction exists. If a portion of patients are effectively cured, standard Cox underestimates long-term survival. Mixture cure models are appropriate.
- You want absolute risk, not relative. Cox gives hazard ratios, not absolute probabilities. For clinical prediction tools showing 5-year survival probability, use a flexible parametric or pseudo-observation approach.
- Non-proportional hazards are severe. RMST or time-dependent models handle crossing curves more honestly.
For a visual introduction before the regression, start with the Kaplan-Meier curve and log-rank test. The KM curve tells the cohort’s story over time; Cox regression adds the adjustment machinery needed for publication.
References for Reporting Standards
For survival analysis in clinical or preclinical publications, these are the authoritative references reviewers expect:
- CONSORT guidelines — randomized trial reporting, including survival endpoints
- Survival analysis—part 2: Cox proportional hazards model (PMC) — peer-reviewed methodology reference
- STROBE guidelines — observational studies, including time-to-event analyses
When you present Cox results, the expected package is: HR with 95% CI, p-value, concordance index, a PH assumption test, and a plot showing either the Kaplan-Meier curves by group or the model-adjusted survival. Missing any of these in a modern biomedical journal is grounds for revision.