Kaplan-Meier Survival Curves in R: ggsurvplot, Log-Rank Tests, and Publication Formatting

April 9, 2026

Why Survival Curves Trip Up Even Experienced R Users

If you have time-to-event data and need a Kaplan-Meier survival curve for a manuscript, R’s survminer package is the standard tool. But getting from a raw dataset to a publication-ready figure with the correct log-rank annotation, risk table, and journal-compliant formatting involves more steps than the documentation suggests. This guide walks through how to create survival curves in R using Kaplan-Meier estimation and ggsurvplot — including the formatting details that reviewers actually check.

Step 1: Fit the Kaplan-Meier Model

The survival package provides the core infrastructure. The Surv() function creates a survival object from your time and event columns, and survfit() fits the Kaplan-Meier estimator.

library(survival)
library(survminer)

# Fit Kaplan-Meier for two treatment arms
fit <- survfit(Surv(time, status) ~ treatment, data = mydata)
summary(fit)

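The status column should be coded as 1 = event (e.g., death, relapse) and 0 = censored. Surv() also accepts 1/2 coding (2 = event) and logical values (TRUE = event), so check which convention your data uses before fitting. Getting this wrong swaps events and censored observations, and the resulting curve no longer describes your data.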

Common Mistake

Coding censored observations as 1 and events as 0. The Surv() function treats 1 as the event by default. If your dataset uses a different convention, explicitly set Surv(time, status == "Dead") or recode before fitting.
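A quick check of the event coding before fitting can catch this early. The sketch below uses a small made-up data frame (the toy data and its column names are hypothetical, not from any real study) to show an explicit recode to 1 = event, 0 = censored.

```r
library(survival)

# Hypothetical toy data: status stored as text instead of 0/1
toy <- data.frame(
  time   = c(5, 8, 12, 14, 20, 21),
  status = c("Dead", "Alive", "Dead", "Dead", "Alive", "Dead")
)

# Look at the raw coding before fitting anything
table(toy$status)

# Recode explicitly so that 1 = event and 0 = censored
toy$event <- as.integer(toy$status == "Dead")

# Printing the Surv object marks censored times with a trailing "+"
s_toy <- Surv(toy$time, toy$event)
print(s_toy)

# Fit on the recoded column; the event count should match table() above
fit_toy <- survfit(Surv(time, event) ~ 1, data = toy)
sum(fit_toy$n.event)
```

If the number of events reported by the fit does not match the number of deaths you expect from the raw data, the coding is the first thing to recheck.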

Step 2: Create the Basic Kaplan-Meier Plot with ggsurvplot

The ggsurvplot() function from survminer wraps ggplot2 to produce a survival curve with minimal code:

ggsurvplot(fit,
           data = mydata,
           pval = TRUE,
           conf.int = TRUE,
           risk.table = TRUE,
           xlab = "Time (months)",
           ylab = "Survival probability")

This single call produces the curve, shaded confidence intervals, a log-rank p-value annotation, and the number-at-risk table. Many clinical journals expect all four in a Kaplan-Meier figure.
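The object returned by ggsurvplot() is a list whose plot and table elements are ordinary ggplot objects, so they can be adjusted after the call. A minimal sketch, assuming the lung dataset shipped with the survival package (status coded 1 = censored, 2 = dead, which Surv() also accepts) as a stand-in for mydata:

```r
library(survival)
library(survminer)

# lung ships with survival; time is in days, status is 1 = censored, 2 = dead
fit_lung <- survfit(Surv(time, status) ~ sex, data = lung)

km <- ggsurvplot(fit_lung,
                 data = lung,
                 pval = TRUE,
                 conf.int = TRUE,
                 risk.table = TRUE,
                 xlab = "Time (days)",
                 ylab = "Survival probability")

# The curve and the risk table are separate ggplot objects
class(km$plot)
class(km$table)

# Standard ggplot2 layers can be added to either component
km$plot <- km$plot + ggplot2::labs(title = "Overall survival by sex")
print(km)
```

Working on km$plot and km$table separately is handy when a journal asks for a change that ggsurvplot() does not expose as an argument.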

Step 3: Add the Log-Rank Test Annotation

Setting pval = TRUE overlays the log-rank test p-value on the plot. For multi-arm studies comparing more than two groups, this reports the omnibus log-rank test. If you need pairwise comparisons, use pairwise_survdiff():

# Pairwise log-rank with Bonferroni correction
pairwise_survdiff(Surv(time, status) ~ treatment,
                  data = mydata,
                  p.adjust.method = "bonferroni")
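The omnibus test itself can be run with survdiff() from the survival package, which is useful when you want to quote the chi-squared statistic and degrees of freedom in the text. The sketch below assumes the veteran dataset bundled with survival (four cell types) in place of mydata, and derives the p-value from the chi-squared statistic by hand.

```r
library(survival)

# veteran ships with survival; celltype has four levels, status is 0/1
lr <- survdiff(Surv(time, status) ~ celltype, data = veteran)
print(lr)

# Degrees of freedom = number of groups - 1
df_lr <- length(lr$n) - 1
p_lr  <- pchisq(lr$chisq, df = df_lr, lower.tail = FALSE)

cat(sprintf("Log-rank chi-squared = %.2f, df = %d, p = %.3g\n",
            lr$chisq, df_lr, p_lr))
```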

Report the omnibus test in the figure and pairwise results in the text. Reporting guidelines such as CONSORT and STROBE ask for an effect estimate with its precision, which for time-to-event outcomes usually means a hazard ratio with a 95% CI from a Cox model reported alongside the Kaplan-Meier curves: the survival curve shows the pattern, the hazard ratio quantifies the effect.
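A hazard ratio with its confidence interval can be obtained from coxph() in the survival package. This is a minimal sketch on the bundled lung dataset (an assumption standing in for mydata), with sex as the only covariate:

```r
library(survival)

# lung ships with survival; sex is coded 1 = male, 2 = female
cox_fit <- coxph(Surv(time, status) ~ sex, data = lung)

# Hazard ratio and 95% CI on the ratio scale
hr    <- exp(coef(cox_fit))
hr_ci <- exp(confint(cox_fit))

cat(sprintf("HR = %.2f (95%% CI: %.2f-%.2f)\n",
            hr, hr_ci[1, 1], hr_ci[1, 2]))

# Scaled Schoenfeld residual test of the proportional hazards assumption
cox.zph(cox_fit)
```

The cox.zph() call is included because the hazard ratio is only a fair summary when the proportional hazards assumption is reasonable for your data.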

Step 4: Format the Risk Table

The number-at-risk table below the curve tells readers how many subjects remain at each time point. Without it, reviewers cannot assess whether survival estimates late in follow-up are reliable.

ggsurvplot(fit,
           data = mydata,
           risk.table = TRUE,
           risk.table.col = "strata",
           risk.table.y.text = FALSE,
           break.time.by = 6,
           xlab = "Time (months)",
           tables.theme = theme_cleantable())

Key formatting decisions:

  • break.time.by = 6: sets x-axis tick marks at 6-month intervals (adjust to match your study timeline)
  • risk.table.col = "strata": colors the risk table text to match the curve colors
  • tables.theme = theme_cleantable(): removes gridlines for a cleaner look

Tip

Most oncology journals expect number-at-risk tables at regular intervals (6 or 12 months). If your study spans 36 months, use break.time.by = 12. For shorter studies (under 12 months), monthly intervals are standard.
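To preview the counts that the risk table will display at a given spacing, the numbers at risk can be pulled from the fitted object with summary(). A sketch assuming the lung dataset from survival, with days converted to months using an assumed 30.44 days per month:

```r
library(survival)

# lung records time in days; convert to months (30.44 days/month is an assumption)
lung_m <- lung
lung_m$time_months <- lung_m$time / 30.44

fit_m <- survfit(Surv(time_months, status) ~ sex, data = lung_m)

# Numbers at risk every 6 months; extend = TRUE keeps time points
# beyond a group's last follow-up instead of dropping them
s <- summary(fit_m, times = seq(0, 30, by = 6), extend = TRUE)

risk <- data.frame(strata = s$strata,
                   time   = s$time,
                   n_risk = s$n.risk)
print(risk)
```

If the counts fall to a handful well before the last break, a wider interval or a shorter axis may give a more honest table.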

Step 5: Customize Colors and Themes for Publication

Default ggsurvplot colors work for presentations but are not always suitable for journal figures. Some journals still print in grayscale, and colorblind-safe palettes are increasingly requested.

# Assign the result so it can be passed to ggsave() afterwards
p <- ggsurvplot(fit,
                data = mydata,
                palette = c("#0073C2", "#EFC000"),
                linetype = c("solid", "dashed"),
                ggtheme = theme_bw(),
                font.main = c(14, "bold"),
                font.x = c(12, "plain"),
                font.y = c(12, "plain"),
                font.tickslab = c(10, "plain"))

Use both color and line type to distinguish groups — this ensures the curves remain distinguishable in grayscale reproduction. Export at 300 DPI minimum for print journals:

ggsave("km_curve.tiff", plot = print(p),
       width = 7, height = 5, dpi = 300)
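Because a ggsurvplot object is a list of plots rather than a single ggplot, some users prefer to write the figure through a graphics device instead. A sketch of that alternative, assuming the lung dataset from survival and a temporary output path (both are stand-ins for your own data and file name):

```r
library(survival)
library(survminer)

fit_lung <- survfit(Surv(time, status) ~ sex, data = lung)
km <- ggsurvplot(fit_lung, data = lung, risk.table = TRUE)

# Hypothetical output location; replace with your own figure path
out_file <- file.path(tempdir(), "km_curve.tiff")

# Open a 7 x 5 inch TIFF device at 300 DPI, draw the figure, close the device
tiff(out_file, width = 7, height = 5, units = "in",
     res = 300, compression = "lzw")
print(km)
dev.off()

file.exists(out_file)
```

This route draws the curve and the risk table together exactly as print() lays them out, so it is worth trying if a saved file comes out blank or without the table.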

Step 6: Add Median Survival and Censoring Marks

Median survival time is a standard summary statistic for survival data. Add it to the plot with a horizontal/vertical reference line:

ggsurvplot(fit,
           data = mydata,
           surv.median.line = "hv",
           censor = TRUE,
           censor.shape = "|",
           censor.size = 3)

The small tick marks (censor.shape = "|") on the curve mark censored observations, that is, subjects whose follow-up ended (through loss to follow-up, withdrawal, or the end of the study) before an event was observed. Reviewers expect to see them because they show how much of each curve rests on complete follow-up. Omitting censoring marks is a common oversight that can trigger revision requests.
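Counting events and censored observations per group is a simple sanity check on what the tick marks should show. A minimal sketch on the lung dataset from survival (an assumption in place of mydata):

```r
library(survival)

# lung: status 1 = censored, 2 = dead; sex 1 = male, 2 = female
counts <- table(sex = lung$sex,
                outcome = ifelse(lung$status == 2, "event", "censored"))
print(counts)

# Proportion censored in each group
round(prop.table(counts, margin = 1)[, "censored"], 2)
```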

Step 7: Report Results in the Manuscript

The figure alone is not sufficient. Your methods and results sections need complementary text. A complete survival analysis report includes:

  • Methods: “Survival was estimated using the Kaplan-Meier method. Groups were compared using the log-rank test. Hazard ratios were estimated using Cox proportional hazards regression.”
  • Results: “Median overall survival was 14.2 months (95% CI: 11.8–17.4) in the treatment group versus 9.6 months (95% CI: 7.3–12.1) in the control group (log-rank p = 0.003; HR = 0.62, 95% CI: 0.45–0.85).”

Note the components: median survival with confidence intervals for each group, the log-rank p-value, and the Cox hazard ratio with its own CI. CONSORT sets out reporting expectations for randomized trials, including effect estimates with their precision, and the STROBE statement does the same for observational cohort studies.
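The medians and their confidence limits can be read from the summary table of the fitted object rather than copied by hand. A sketch on the lung dataset from survival (an assumption in place of mydata; times are in days):

```r
library(survival)

fit_lung <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per group: median survival with its 95% confidence limits
med <- summary(fit_lung)$table[, c("median", "0.95LCL", "0.95UCL")]
print(med)

# Format each group as a phrase for the results section
phrases <- sprintf("%s: median %.0f days (95%% CI: %.0f-%.0f)",
                   rownames(med),
                   med[, "median"], med[, "0.95LCL"], med[, "0.95UCL"])
cat(phrases, sep = "\n")
```

An NA upper limit means the upper confidence band never drops below 0.5 for that group and is usually reported as "not reached".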

Worked Example

From survfit() output: median survival = 310 days (treatment) vs. 270 days (control). The log-rank test yields chi-squared = 8.41, df = 1, p = 0.0037. The Cox model gives HR = 0.68 (95% CI: 0.52–0.89). Report: “Treatment was associated with significantly longer survival (median 310 vs. 270 days; log-rank p = 0.004; HR = 0.68, 95% CI: 0.52–0.89).”
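The p-value in this example follows directly from the chi-squared statistic and its degrees of freedom, which is an easy consistency check before submission:

```r
# Figures quoted in the worked example
chisq <- 8.41
df    <- 1

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)
p_value            # approximately 0.0037

# Rounded to three decimals for the manuscript
round(p_value, 3)  # 0.004
```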

Common Pitfalls and Reviewer Feedback

These are the issues that generate revision requests most frequently:

  1. Missing number-at-risk table: expected by most oncology and clinical journals
  2. No censoring marks: reviewers cannot assess data completeness without them
  3. Reporting only p-values: current guidelines ask for hazard ratios and confidence intervals alongside p-values, as discussed in our guide on sample size calculation for clinical trials
  4. Poorly chosen x-axis range: cutting the axis early can hide late events, while extending it to times where few subjects remain at risk displays an unstable tail; choose the range with the number-at-risk table in view
  5. Wrong censoring code: with 0/1 coding, Surv() treats 1 as the event; reversed coding swaps events and censored observations and invalidates the curve

If your study involves pre-study power analysis for sample size justification, make sure the survival analysis assumptions (expected median survival, accrual rate, dropout rate) match what was specified in the protocol.

Putting It All Together

Here is a complete, publication-ready code block combining all the elements above:

library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ treatment, data = mydata)

p <- ggsurvplot(fit,
                data = mydata,
                pval = TRUE,
                conf.int = TRUE,
                risk.table = TRUE,
                risk.table.col = "strata",
                surv.median.line = "hv",
                censor = TRUE,
                censor.shape = "|",
                censor.size = 3,
                palette = c("#0073C2", "#EFC000"),
                linetype = c("solid", "dashed"),
                break.time.by = 6,
                xlab = "Time (months)",
                ylab = "Survival probability",
                ggtheme = theme_bw(),
                tables.theme = theme_cleantable(),
                font.main = c(14, "bold"),
                font.x = c(12, "plain"),
                font.y = c(12, "plain"))

ggsave("figure1_km.tiff", plot = print(p),
       width = 7, height = 5, dpi = 300)

This produces a figure with confidence bands, censoring marks, median survival lines, a log-rank p-value, a clean risk table, and colorblind-safe colors with distinguishable line types for grayscale printing.
