---
title: "Bayesian Model Averaging"
author: "František Bartoš"
date: "4th of May 2026"
output:
  rmarkdown::html_vignette:
    self_contained: yes
bibliography: ../inst/REFERENCES.bib
csl: ../inst/apa.csl
link-citations: true
vignette: >
  %\VignetteIndexEntry{Bayesian Model Averaging}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown_notangle}
---

```{r child = "_vignette-nowrap.md", echo = FALSE, eval = TRUE}
```

```{r setup, include = FALSE}
source("vignette-cache.R", local = knitr::knit_global())

cached_fits <- vignette_cache(
  name     = "v20-bayesian-model-averaging",
  objects  = c("fit_RE", "fit_FE", "fit_BMA_he", "fit_BMA", "fit_BMA_mod"),
  packages = c("metafor", "metadat")
)

knitr::opts_chunk$set(
  collapse    = TRUE,
  comment     = "#>",
  eval        = vignette_cache_eval(cached_fits),
  message     = FALSE,
  warning     = FALSE,
  fig.width   = 7,
  fig.height  = 3.5,
  dev         = "png",
  fig.retina  = 3
)
if (.Platform$OS.type == "windows") {
  knitr::opts_chunk$set(dev.args = list(type = "cairo"))
}
```

```{r load-models, include = FALSE}
library("RoBMA")

data("dat.baskerville2012", package = "metadat")
dat <- dat.baskerville2012

vignette_cache_load(cached_fits)
```

```{r precompute-models, include = FALSE, eval = FALSE}
library("RoBMA")

data("dat.baskerville2012", package = "metadat")
dat <- dat.baskerville2012
dat$tailor <- as.factor(dat$tailor)

fit_RE <- brma(
  yi = smd, sei = se, measure = "SMD",
  data = dat, seed = 1
)
fit_RE <- add_marglik(fit_RE)

fit_FE <- brma(
  yi = smd, sei = se, measure = "SMD",
  prior_heterogeneity = prior("spike", list(location = 0)),
  data = dat, seed = 1
)
fit_FE <- add_marglik(fit_FE)

fit_BMA_he <- BMA(
  yi = smd, sei = se, measure = "SMD",
  prior_effect_null = NULL,
  data = dat, seed = 1
)

fit_BMA <- BMA(
  yi = smd, sei = se, measure = "SMD",
  data = dat, seed = 1
)

fit_BMA_mod <- BMA(
  yi = smd, sei = se, measure = "SMD",
  mods = ~ tailor,
  data = dat, seed = 1
)

vignette_cache_save(cached_fits)
```

In meta-analysis, analysts often have to balance different assumptions about the data-generating process and decide what kind of model to fit, which covariates to include, and what publication-bias adjustment method to use.
Bayesian model averaging (BMA) tries to alleviate this problem by combining competing meta-analytic models based on their prior predictive performance.
Models that predicted the data better receive higher posterior model probability and more weight in the overall inference.
The *model-averaged estimates* allow us to propagate uncertainty about the selected model and *inclusion Bayes factors* quantify evidence across sets of models [@hoeting1999bayesian].

In this vignette we illustrate Bayesian model-averaged meta-analysis using `BMA()` on the `dat.baskerville2012` dataset to highlight the basics of Bayesian model-averaging using the `RoBMA` R package [@RoBMA].
In a later vignette [*Robust Bayesian Meta-Analysis*](v21-robust-bayesian-meta-analysis.html), we illustrate robust Bayesian meta-analysis that extends this concept to model-averaging across publication-bias adjustment models.
All models use the default prior distributions; see the [*Prior Distributions*](v01-prior-distributions.html) vignette for the prior definitions and customization options.

We start by estimating simple fixed-effect and random-effects Bayesian meta-analytic models, turn the model-comparison into a model-averaged fit, and finally also average over the presence vs absence of the effect.
Finally, we extend the model into a Bayesian model-averaged meta-regression.
Note that we do not demonstrate all of the diagnostics and summary features available to all models fitted via the `RoBMA` R package (see the [*Bayesian Meta-Analysis*](v02-bayesian-meta-analysis.html) vignette for more details).

## Dataset

The example dataset is distributed with the `metadat` package as `dat.baskerville2012` and contains 23 randomized and controlled trials of practice-facilitation interventions in primary care.
The effect size `smd` is the standardized mean difference for adherence to evidence-based primary-care guidelines and `se` is its standard error.
The binary `tailor` indicator records whether the intervention was tailored to the participating practice (1) or applied as a uniform protocol (0); we treat it as a factor and use it as a moderator later in the vignette.

```{r set-up, include = TRUE}
library("RoBMA")

data("dat.baskerville2012", package = "metadat")
dat <- dat.baskerville2012
dat$tailor <- as.factor(dat$tailor)
head(dat[, c("year", "smd", "se", "tailor")])
```

## Comparing Fixed and Random Effects

`metafor::rma.uni()` fits the conventional random-effects model on the standardized mean differences.

```{r metafor-RE}
res <- metafor::rma.uni(
  yi = smd, sei = se,
  data = dat, method = "REML"
)
res
```

The Bayesian counterpart uses `brma()` with the default prior distributions: a `Normal(0, 0.71)` unit-information prior distribution on the effect and a `Normal(0, 0.35)[0, Inf]` half-normal prior distribution on the heterogeneity.

```{r fit-RE, eval = FALSE}
fit_RE <- brma(
  yi = smd, sei = se, measure = "SMD",
  data = dat, seed = 1
)
fit_RE <- add_marglik(fit_RE)
```
```{r fit-RE-summary}
summary(fit_RE, include_mcmc_diagnostics = FALSE)
```

Setting the heterogeneity prior distribution to a spike at zero turns the random-effects model into a fixed-effect model.

```{r fit-FE, eval = FALSE}
fit_FE <- brma(
  yi = smd, sei = se, measure = "SMD",
  prior_heterogeneity = prior("spike", list(location = 0)),
  data = dat, seed = 1
)
fit_FE <- add_marglik(fit_FE)
```
```{r fit-FE-summary}
summary(fit_FE, include_mcmc_diagnostics = FALSE)
```

With marginal likelihoods attached to the models via `add_marglik()`, `bf()` returns the Bayes factor between the two fits.

```{r bf-FE-RE}
bf(fit_RE, fit_FE)
```

The Bayes factor is close to one and neither model is decisively preferred.
Rather than picking one of the two models, we can model-average over them and weight the inference by the posterior model probabilities.

## From Two Fits to Model Averaging

`BMA()` fits a mixture of the two models in one call.
Each top-level prior-distribution argument has a paired `_null` version; when we supply both, the model averages over that component.
Setting `prior_effect_null = NULL` drops the null component on the effect (spike at zero) and leaves a 2-model average over presence vs absence of heterogeneity, which is the BMA analogue of the comparison above.

```{r fit-BMA-he, eval = FALSE}
fit_BMA_he <- BMA(
  yi = smd, sei = se, measure = "SMD",
  prior_effect_null = NULL,
  data = dat, seed = 1
)
```
```{r fit-BMA-he-summary}
summary(fit_BMA_he)
```

The *Components* table reports the posterior probability and inclusion Bayes factor for heterogeneity.
This BF closely matches `bf(fit_RE, fit_FE)` above up to MCMC noise.
The two routes compute the same quantity differently: `bf()` bridge-samples marginal likelihoods from two separate fits, while `BMA()` runs a single MCMC over the joint model space with a model indicator as a latent variable and recovers the posterior model probabilities from the indicator frequencies [@lodewyckx2011tutorial].
Adding a third or fourth component to this product-space sampler costs one more indicator instead of a set of full model fits and bridge-sampling runs, which is what makes the larger ensembles in `RoBMA()` tractable.

## Adding Effect Averaging

The default `BMA()` call also averages over the presence vs absence of the effect, giving a 4-model ensemble (effect $\times$ heterogeneity, each present or absent).

```{r fit-BMA, eval = FALSE}
fit_BMA <- BMA(
  yi = smd, sei = se, measure = "SMD",
  data = dat, seed = 1
)
```
```{r fit-BMA-summary}
summary(fit_BMA)
```

Both inclusion Bayes factors come from the same MCMC.
The effect BF is overwhelming on this dataset; the heterogeneity BF is the same as before.
The model-averaged estimates are spike-and-slab mixtures: a point mass at the null value with weight equal to the posterior probability that the parameter is zero, and a slab carrying the remaining weight.

`plot()` draws both components together.
The arrow at zero is the spike's posterior probability (right axis), the curve is the slab's density, and the grey shapes show the corresponding visualization for the prior distribution.

```{r plot-BMA-mu, fig.width = 6, fig.height = 4, out.width = "75%", fig.align = "center"}
par(mar = c(4, 4, 1, 4))
plot(fit_BMA, parameter = "mu", prior = TRUE, xlim = c(-1, 1.5), lwd = 2)
```
```{r plot-BMA-tau, fig.width = 6, fig.height = 4, out.width = "75%", fig.align = "center"}
par(mar = c(4, 4, 1, 4))
plot(fit_BMA, parameter = "tau", prior = TRUE, xlim = c(0, 1), lwd = 2)
```

The *Model-averaged estimates* summarize the marginal posterior (the full mixture).
For estimates conditional on the parameter being non-zero (slab only) use the `conditional = TRUE` argument.
The marginal and conditional summaries diverge for $\tau$, which keeps substantial posterior weight on the spike, and coincide for $\mu$, where the spike's posterior weight is essentially zero.

```{r pooled-conditional}
summary(fit_BMA, conditional = TRUE)
```

## Meta-Regression with Model Averaging

Adding a moderator extends the averaging to its inclusion.
With `mods = ~ tailor`, `BMA()` averages over 8 sub-models (effect $\times$ heterogeneity $\times$ `tailor` coefficient, each present or absent).

```{r metafor-mod}
res_mod <- metafor::rma.uni(
  yi = smd, sei = se, mods = ~ tailor,
  data = dat, method = "REML"
)
res_mod
```

```{r fit-BMA-mod, eval = FALSE}
fit_BMA_mod <- BMA(
  yi = smd, sei = se, measure = "SMD",
  mods = ~ tailor,
  data = dat, seed = 1
)
```
```{r fit-BMA-mod-summary}
summary(fit_BMA_mod)
```

The *Components* table now also lists `tailor` and its inclusion BF.
Because we average the moderation across models in which the slope is zero, `regplot()` shrinks the difference towards zero when the inclusion BF is modest.

```{r regplot-BMA-mod, fig.width = 5, fig.height = 3.5, fig.align = "center"}
par(mar = c(4, 4, 1, 1))
regplot(fit_BMA_mod, mod = "tailor")
```

Alternatively, we can directly compute the estimated marginal means via the `marginal_means()` function and visualize the prior and posterior distributions at each moderator level.

```{r regplot-BMA-emm, fig.width = 5, fig.height = 3.5, fig.align = "center"}
par(mar = c(4, 4, 1, 1))
emm <- marginal_means(fit_BMA_mod)
emm

plot(emm, parameter = "tailor", prior = TRUE)
```


## Customizing Prior Components

In Bayesian model-averaged meta-analysis, every component has a paired `_null` argument (i.e. `prior_effect` and `prior_effect_null`).
Either side can be `NULL` (remove the component), a single prior distribution, or a list of prior distributions (a sub-mixture).
Common customizations are tightening or widening the slab, replacing the point-zero null with a region of practical equivalence, or forcing a moderator into every model.
The examples below use `only_priors = TRUE` to print the prior structure without fitting the models.

Tighten the effect slab with `rescale_priors = 0.5`:

```{r priors-rescale}
fit_priors <- BMA(
  yi = smd, sei = se, measure = "SMD",
  rescale_priors = 0.5,
  data = dat, only_priors = TRUE
)
print_prior(fit_priors, parameter = "mu")
```

Replace the point-zero null on the effect with a small spread around zero (a perinull) to test against a region of practical equivalence:

```{r priors-perinull}
fit_priors <- BMA(
  yi = smd, sei = se, measure = "SMD",
  prior_effect_null = prior("normal", list(mean = 0, sd = 0.05)),
  data = dat, only_priors = TRUE
)
print_prior(fit_priors, parameter = "mu")
```

Force `tailor` into every model by dropping its null:

```{r priors-force-mod}
fit_priors <- BMA(
  yi = smd, sei = se, measure = "SMD",
  mods = ~ tailor,
  prior_mods_null = list(tailor = NULL),
  data = dat, only_priors = TRUE
)
print_prior(fit_priors, parameter_mods = "tailor")
```

A tighter half-normal slab on $\tau$:

```{r priors-tau}
fit_priors <- BMA(
  yi = smd, sei = se, measure = "SMD",
  prior_heterogeneity = prior(
    distribution = "normal",
    parameters   = list(mean = 0, sd = 0.25),
    truncation   = list(lower = 0)
  ),
  data = dat, only_priors = TRUE
)
print_prior(fit_priors, parameter = "tau")
```

Empirical informed prior distributions based on the Cochrane database are available via `prior_informed_field` and `prior_informed_subfield`; see [*Informed Bayesian Model-Averaged Meta-Analysis in Medicine*](v34-bma-norm-medicine.html).

## Other `brma` Features

A `BMA()` fit is also a `brma` object: `summary()`, `plot()`, `print_prior()`, `predict()`, `marginal_means()`, `funnel()`, `rstudent()`, `qqnorm()`, and `influence()` work the same way as on a single-model fit; see [*Bayesian Meta-Analysis*](v02-bayesian-meta-analysis.html).
Prior-distribution families and informed prior distributions are covered in [*Prior Distributions*](v01-prior-distributions.html).
For binary or count outcomes, `BMA.glmm()` does the same product-space averaging on a binomial-normal or Poisson-normal likelihood.
For model averaging that also adjusts for publication bias, see [*Robust Bayesian Meta-Analysis*](v21-robust-bayesian-meta-analysis.html).

## References