Forest Plots in R with metafor: From Basic to Publication-Ready

This is Part 3 of a series on meta-analysis in clinical evidence synthesis. Part 1 covers the standard meta-analytic toolkit, and Part 2 extends it to multilevel and multivariate models.

Introduction

Forest plots are the core visual output of any meta-analysis. They communicate more in a single figure than most narrative summaries manage in several pages: individual study estimates, their precision, the pooled effect, and the degree of agreement (or disagreement) across the evidence base. A poorly constructed forest plot obscures the data. A well-constructed one makes the evidence speak for itself.

This post walks through four progressively more detailed forest plot variants using the metafor package in R (Viechtbauer, 2010). All examples use the same synthetic dataset of randomised controlled trials reporting odds ratios for a hypothetical medical device intervention. The code is fully reproducible — you can paste it into R, RStudio, or Positron and get the same output.

Setting Up the Environment

IDE Options

RStudio remains the standard IDE for R-based statistical work. It is mature, well-documented, and has excellent integration with R Markdown and Quarto for reproducible reporting.

Positron, developed by Posit (formerly RStudio, Inc.), is the newer alternative — built on VS Code's architecture with native R and Python support. If you are already working in VS Code for other tasks, Positron offers a unified environment without switching between editors. For pure R work, either IDE is fine; the code below runs identically in both.

Installing metafor

# Install metafor (only needed once)
install.packages("metafor")

# Load the package
library(metafor)

metafor is the most comprehensive R package for meta-analysis. It handles effect size calculation, model fitting (fixed-effect, random-effects, multilevel, multivariate), and produces publication-quality forest plots out of the box. For the plots in this post, no additional packages are required.

The Example Dataset

We use a synthetic dataset of 10 RCTs comparing an intervention against control, with binary outcomes reported as odds ratios. This is representative of a typical medical device or pharmaceutical meta-analysis.

# -----------------------------------------------------------
# Synthetic dataset: 10 RCTs with binary outcome data
# ai/n1i = events/total in intervention group
# ci/n2i = events/total in control group
# -----------------------------------------------------------

dat <- data.frame(
  study = c("Andersen 2018", "Becker 2019", "Cho 2020",
            "Delgado 2020", "Evans 2021", "Fernandez 2021",
            "Gruber 2022", "Hoffmann 2022", "Ibrahim 2023",
            "Jensen 2023"),
  ai = c(12, 23, 8, 31, 15, 19, 27, 10, 22, 14),   # events intervention
  n1i = c(80, 120, 55, 150, 90, 100, 140, 65, 130, 85),  # total intervention
  ci = c(22, 30, 15, 40, 25, 28, 35, 18, 33, 23),   # events control
  n2i = c(78, 118, 60, 148, 88, 102, 138, 70, 125, 82)   # total control
)

# Calculate log odds ratios and sampling variances
dat <- escalc(measure = "OR", ai = ai, n1i = n1i,
              ci = ci, n2i = n2i, data = dat,
              slab = study)

The escalc() function computes log odds ratios (yi) and their sampling variances (vi) from the 2×2 table data. The slab argument assigns study labels used in all subsequent plots. The measure = "OR" argument specifies that we want odds ratios — metafor internally works with log odds ratios and back-transforms for display.

Variant 1: The Simple Forest Plot

The most basic forest plot shows each study's effect estimate with its confidence interval, plus the pooled estimate from a random-effects model.

# -----------------------------------------------------------
# Variant 1: Simple forest plot (random-effects model)
# -----------------------------------------------------------

# Fit random-effects model (REML estimation, default)
res_re <- rma(yi, vi, data = dat, method = "REML")

# Basic forest plot — odds ratio scale
forest(res_re,
       atransf = exp,          # back-transform to OR scale
       xlab = "Odds Ratio",
       header = TRUE)          # adds "Author(s) and Year" / "Odds Ratio [95% CI]"

What this shows: Each study as a square (sized by weight) with a horizontal 95% CI line. The diamond at the bottom represents the pooled random-effects estimate. The vertical dashed line at OR = 1 marks the null effect.

This is the minimum viable forest plot. It is sufficient for an internal working document or a quick check, but it omits information that reviewers and regulators expect: the fixed-effect comparison, heterogeneity statistics, and prediction intervals.

Variant 2: Fixed-Effect and Random-Effects Models with Heterogeneity

The second variant adds the fixed-effect (common-effect) model alongside the random-effects model, and displays heterogeneity statistics directly in the plot.

# -----------------------------------------------------------
# Variant 2: Fixed-effect + random-effects with heterogeneity
# -----------------------------------------------------------

# Fit fixed-effect model
res_fe <- rma(yi, vi, data = dat, method = "FE")

# Fit random-effects model
res_re <- rma(yi, vi, data = dat, method = "REML")

# Forest plot with both models
forest(res_re,
       atransf = exp,
       xlab = "Odds Ratio",
       header = TRUE,
       mlab = "Random-Effects Model",   # label for RE diamond
       order = "obs",                    # order studies by effect size
       addfit = FALSE)                   # suppress default diamond (we add manually)

# Add fixed-effect model as first row
addpoly(res_fe, row = -1, atransf = exp,
        mlab = "Fixed-Effect Model",
        col = "blue", border = "blue")

# Add random-effects model as second row
addpoly(res_re, row = -2, atransf = exp,
        mlab = "Random-Effects Model",
        col = "red", border = "red")

# Add heterogeneity stats as text
text(-6, -3.5, pos = 4, cex = 0.75, font = 2,
     bquote(paste("Heterogeneity: ",
                  tau^2, " = ", .(formatC(res_re$tau2, digits = 3, format = "f")),
                  "; ", I^2, " = ", .(formatC(res_re$I2, digits = 1, format = "f")), "%",
                  "; ", H^2, " = ", .(formatC(res_re$H2, digits = 2, format = "f")),
                  "; Q(", .(res_re$k - 1), ") = ",
                  .(formatC(res_re$QE, digits = 2, format = "f")),
                  ", p = ", .(formatC(res_re$QEp, digits = 4, format = "f")))))

What this shows: Both pooled estimates side by side. When heterogeneity is substantial (I² > 50%), the random-effects CI will be wider than the fixed-effect CI — this is expected and correct. The heterogeneity statistics (τ², I², H², Q-test) are printed below the plot.

Why both models? The fixed-effect model answers "what is the common effect assuming all studies share the same true effect?" The random-effects model answers "what is the average effect across a distribution of true effects?" For regulatory submissions under EU MDR or FDA guidance, showing both models demonstrates that your conclusions are (or are not) robust to model choice. If they disagree substantially, that itself is an important finding.

Variant 3: Adding Prediction Intervals

The prediction interval is arguably the most underused and most informative element in a meta-analysis forest plot. While the confidence interval around the pooled estimate tells you about the average true effect, the prediction interval tells you what range of true effects you might expect in a new study (Higgins et al., 2009; IntHout et al., 2016).

# -----------------------------------------------------------
# Variant 3: Forest plot with prediction interval
# -----------------------------------------------------------

res_re <- rma(yi, vi, data = dat, method = "REML")

# Forest plot with prediction interval
forest(res_re,
       atransf = exp,
       xlab = "Odds Ratio",
       header = TRUE,
       mlab = "RE Model",
       addpred = TRUE,            # <-- adds the prediction interval
       order = "obs",
       shade = TRUE)              # subtle shading for readability

# If you prefer to add it manually for more control:
# pred <- predict(res_re)
# addpoly(res_re, row = -1, atransf = exp,
#         mlab = "RE Model (95% PI)",
#         pi.type = "line")

What this shows: The diamond is the pooled estimate with its 95% CI as before. The extended dashed lines (or thinner lines, depending on metafor version) show the 95% prediction interval — the range within which the true effect of 95% of future studies is expected to fall.

Why this matters: In clinical evidence synthesis, the prediction interval answers the question that decision-makers actually care about: "If we conduct another trial, what effect size should we expect?" A pooled OR of 0.60 [95% CI: 0.45–0.80] looks convincingly protective. But if the prediction interval is [0.25–1.45], it means that some future studies could plausibly show harm. This fundamentally changes the clinical interpretation, and it is exactly the kind of nuance that a simple CI diamond hides.

For EU MDR clinical evaluations, where you need to demonstrate that the clinical evidence supports a favorable benefit-risk profile in the intended patient population, the prediction interval is directly relevant to the generalizability argument.

Variant 4: Publication-Ready Plot with Full Study Details

The final variant is a fully annotated forest plot suitable for journal submission or a clinical evaluation report. It includes study-level event counts, sample sizes, odds ratios with CIs as text, and model summaries.

# -----------------------------------------------------------
# Variant 4: Publication-ready forest plot with study details
# -----------------------------------------------------------

res_re <- rma(yi, vi, data = dat, method = "REML")

# Calculate study-level ORs and CIs for annotation
dat$OR  <- exp(dat$yi)
dat$ci_lb <- exp(dat$yi - 1.96 * sqrt(dat$vi))
dat$ci_ub <- exp(dat$yi + 1.96 * sqrt(dat$vi))
dat$OR_text <- paste0(formatC(dat$OR, format = "f", digits = 2),
                      " [", formatC(dat$ci_lb, format = "f", digits = 2),
                      ", ", formatC(dat$ci_ub, format = "f", digits = 2), "]")

# Weights (as percentage of total weight)
wi <- weights(res_re)
dat$weight_pct <- formatC(wi, format = "f", digits = 1)

# Set up plot with extra columns
# par() settings control the overall layout
par(mar = c(4, 0, 1, 0), font = 1)

forest(res_re,
       atransf = exp,
       xlab = "Odds Ratio (95% CI)",
       header = TRUE,
       mlab = "Random-Effects Model (REML)",
       addpred = TRUE,
       order = "obs",
       ilab = cbind(
         paste0(dat$ai, "/", dat$n1i),     # events/n intervention
         paste0(dat$ci, "/", dat$n2i),      # events/n control
         dat$weight_pct                      # weight %
       ),
       ilab.xpos = c(-5.5, -4.0, -2.5),    # positions for extra columns
       ilab.pos = 2,                         # left-align
       xlim = c(-9, 5),                      # widen plot area
       cex = 0.80)

# Column headers for the extra annotations
text(c(-5.5, -4.0, -2.5), res_re$k + 2, pos = 2, cex = 0.75, font = 2,
     c("Intervention\n(Events/N)",
       "Control\n(Events/N)",
       "Weight\n(%)"))

# Add heterogeneity and test statistics below the plot
text(-9, -1.5, pos = 4, cex = 0.70,
     bquote(paste("RE Model: ",
                  hat(mu), " = ", .(formatC(exp(coef(res_re)), digits = 2, format = "f")),
                  " [", .(formatC(exp(res_re$ci.lb), digits = 2, format = "f")),
                  ", ", .(formatC(exp(res_re$ci.ub), digits = 2, format = "f")), "]",
                  ";  ", tau^2, " = ", .(formatC(res_re$tau2, digits = 4, format = "f")),
                  ";  ", I^2, " = ", .(formatC(res_re$I2, digits = 1, format = "f")), "%",
                  ";  Q(", .(res_re$k - 1), ") = ",
                  .(formatC(res_re$QE, digits = 2, format = "f")),
                  ", p = ", .(formatC(res_re$QEp, digits = 4, format = "f")))))

What this shows: The complete picture. Each study row now displays the event data (events/total for both groups), the individual study weight in the pooled analysis, and the point estimate with CI is shown both graphically and as text on the right side (the default behavior of forest()). The prediction interval extends beyond the diamond. Below the plot, the model summary includes the back-transformed pooled OR, τ², I², and the Q-test.

Column-by-column, a reviewer can immediately see: which studies are large vs. small (event counts and sample sizes), which studies contribute most to the pooled estimate (weight %), whether the treatment effect is consistent across studies (visual inspection of overlap), whether the pooled effect is driven by a few large trials or is broadly supported, and what the heterogeneity picture looks like.

Exporting Plots for Reports and Publications

For inclusion in Word documents, clinical evaluation reports, or journal submissions, export as high-resolution PNG or PDF:

# -----------------------------------------------------------
# Export as high-resolution PNG (300 DPI)
# -----------------------------------------------------------

png("forest_plot_variant4.png", width = 10, height = 7,
    units = "in", res = 300)

# ... insert the forest() call from Variant 4 here ...

dev.off()


# -----------------------------------------------------------
# Export as PDF (vector graphics, preferred for journals)
# -----------------------------------------------------------

pdf("forest_plot_variant4.pdf", width = 10, height = 7)

# ... insert the forest() call from Variant 4 here ...

dev.off()

For PDF output, the fonts remain crisp at any zoom level — this is generally preferred for regulatory documents and journal submissions. PNG at 300 DPI is acceptable for most contexts and easier to embed in Word or PowerPoint.

A Note on Reproducibility

All code in this post was developed and tested with R 4.4.x and metafor 4.6-0. The metafor package is actively maintained by Wolfgang Viechtbauer and receives regular updates. If you are using an older version, some arguments (e.g., shade, addpred) may not be available — update with update.packages("metafor").

For full reproducibility, consider using renv to lock package versions, or include sessionInfo() at the end of your analysis script. In a regulatory context, documenting the exact software versions used for statistical analyses is not optional — it is a requirement under ISO 14155 and expected by notified bodies reviewing clinical evidence under EU MDR.

Key Takeaways

The simple forest plot (Variant 1) is a starting point, not a deliverable. At minimum, a meta-analysis figure for any serious document should include both fixed-effect and random-effects models (Variant 2) so the reader can assess sensitivity to model choice.

Prediction intervals (Variant 3) answer the question that matters most for clinical decision-making: what effect would we expect in a future study or patient population? Always include them.

For regulatory submissions and journal publications, the fully annotated plot (Variant 4) with study-level data, weights, and heterogeneity statistics is the standard to aim for. It allows the reader to critically appraise the analysis without flipping between the figure and a supplementary table.

metafor handles all of this in base R with no additional dependencies. Learn the forest(), addpoly(), and predict() functions well — they cover 90% of what you will need.

References

Higgins, J. P., Thompson, S. G., & Spiegelhalter, D. J. (2009). A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society: Series A, 172(1), 137–159.
IntHout, J., Ioannidis, J. P. A., Rovers, M. M., & Goeman, J. J. (2016). Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open, 6(7), e010247.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48.

For questions about this topic or support with your clinical evidence synthesis, feel free to contact us.