---
title: "Prior Distributions"
author: "František Bartoš"
date: "28th of April 2026"
output:
  rmarkdown::html_vignette:
    self_contained: yes
bibliography: ../inst/REFERENCES.bib
csl: ../inst/apa.csl
link-citations: true
vignette: >
  %\VignetteIndexEntry{Prior Distributions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r child = "_vignette-nowrap.md", echo = FALSE, eval = TRUE}
```

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  message    = FALSE,
  warning    = FALSE,
  fig.width  = 6,
  fig.height = 4
)

library(RoBMA)

has_metafor <- requireNamespace("metafor", quietly = TRUE)
has_bcg     <- has_metafor && requireNamespace("metadat", quietly = TRUE)
```

Prior distributions describe which parameter values are plausible before seeing the data.
They are an essential part of any Bayesian analysis, and they matter especially in meta-analysis: many datasets contain only a small or moderate number of studies, and reasonable prior distributions stabilize estimates that the data alone would identify weakly.
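This stabilizing effect can be seen in a toy conjugate normal-normal sketch (hand-computed, not a `RoBMA` call): a zero-centred prior pulls a noisy estimate toward zero but barely moves a precise one.

```r
# Toy conjugate normal-normal sketch (not a RoBMA call):
# posterior mean of an effect with known standard error `se`
# under a Normal(0, prior_sd) prior.
posterior_mean <- function(ybar, se, prior_sd) {
  w <- (1 / se^2) / (1 / se^2 + 1 / prior_sd^2)  # weight on the data
  w * ybar
}

posterior_mean(ybar = 0.4, se = 0.30, prior_sd = 0.5)  # noisy data: clear shrinkage
posterior_mean(ybar = 0.4, se = 0.05, prior_sd = 0.5)  # precise data: almost no shrinkage
```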

The role of prior distributions differs substantially between *estimation* and *hypothesis testing* (or Bayesian model averaging).
For estimation, the goal is to recover a parameter value, and weakly informative prior distributions regularize the posterior without dominating it; the data quickly pull the posterior toward the likelihood as the number of studies grows.
For hypothesis testing and model averaging the prior distribution under the alternative is part of the hypothesis: the Bayes factor compares how well competing prior predictions match the data.
A very wide prior distribution on the alternative spreads prior mass across implausible values; ordinary data are then predicted better by the null, so the Bayes factor leans toward the null even when the effect is real.
Bayesian model averaging amplifies this: the alternative model receives little posterior probability and the model-averaged estimate is shrunk toward zero.
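The effect of prior width on testing can be illustrated with a toy normal-mean example (again a hand-computed conjugate sketch, not a `RoBMA` call): the marginal likelihood under the alternative is `Normal(0, sqrt(se^2 + prior_sd^2))`, so widening the prior deflates it and pushes the Bayes factor toward the null.

```r
# Toy Bayes factor BF01 for H0: mu = 0 vs H1: mu ~ Normal(0, prior_sd),
# for an observed mean `ybar` with known standard error `se`.
bf01 <- function(ybar, se, prior_sd) {
  dnorm(ybar, mean = 0, sd = se) /
    dnorm(ybar, mean = 0, sd = sqrt(se^2 + prior_sd^2))
}

bf01(ybar = 0.3, se = 0.1, prior_sd = 0.5)  # moderate prior: evidence for H1 (BF01 < 1)
bf01(ybar = 0.3, se = 0.1, prior_sd = 50)   # very wide prior: evidence flips to H0 (BF01 > 1)
```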

The default prior distributions in the `RoBMA` R package are therefore deliberately not "flat".
They are weakly informative: wide enough to regularize estimates, but informative enough to keep hypothesis tests and Bayesian model averaging well-calibrated.
This vignette walks through the practical paths to specifying prior distributions in the `RoBMA` R package, following the discussion of prior distributions in @bartos2025bayesian.
Also see @mulder2024bayesian for further discussion and resources.

## Specifying Prior Distributions

Prior distributions in the `RoBMA` R package come in two parts.
Prior distributions for the *meta-analytic parameters* (the average effect, between-study heterogeneity, and any meta-regression coefficients) can be specified using package defaults, empirical (informed) prior distributions, or fully custom prior distributions.
Prior distributions for the *publication-bias adjustment models*, used by `RoBMA()` and the related single-model functions, are treated separately and select among preset ensembles or a custom ensemble.

| Approach | Description | Main arguments |
|:--|:--|:--|
| Default prior distributions | Standardized prior distributions automatically scaled to the effect-size measure | `measure`, `rescale_priors`, `prior_unit_information_sd` |
| Empirical/informed prior distributions | Empirical prior distributions derived from previous meta-analyses | `prior_informed_field`, `prior_informed_subfield`, `prior_informed()` |
| Custom prior distributions | Fully user-specified prior distributions via `prior()` and helpers | `prior_effect`, `prior_heterogeneity`, `prior_mods`, `prior_scale` |
| Publication-bias prior distributions | Bias-adjustment models and ensembles | `model_type`, `prior_bias`, `prior_bias_null` |

Throughout this vignette we pass `only_priors = TRUE` to ask the package to assemble the prior distributions and stop before fitting the model.
This makes prior specifications visible without running MCMC.
Drop `only_priors = TRUE` to fit the corresponding model.

## Default Prior Distributions

### Basic example

The simplest way to use the defaults is to tell the `RoBMA` R package what effect-size measure is being analyzed; the package then specifies prior distributions on the matching scale.
We start with `brma()` because it fits one meta-analytic model and makes the prior distributions easiest to inspect.

The BCG vaccine dataset from the `metadat` package contains two-by-two tables.
We compute log risk ratios and their standard errors, then use `brma()` to show the default prior distributions for a simple random-effects meta-analysis.

```{r bcg-data, eval = has_bcg}
data("dat.bcg", package = "metadat")

bcg <- metafor::escalc(
  ai = tpos, bi = tneg, ci = cpos, di = cneg,
  measure = "RR", data = dat.bcg
)
```

```{r bcg-default-priors, eval = has_bcg}
bcg_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  data = bcg, only_priors = TRUE
)

plot_prior(bcg_priors, parameter = "mu")
plot_prior(bcg_priors, parameter = "tau")
print_prior(bcg_priors)
```

The first plot shows the prior distribution for the average effect, $\mu$.
The second plot shows the prior distribution for the between-study heterogeneity, $\tau$.
For `measure = "RR"` the package places a `Normal(0, 1)` prior distribution on the log risk ratio and a positive `Normal(0, 0.5)` prior distribution on heterogeneity.
The next subsection explains where these scales come from.

### Effect-size measures and their scales

Different effect-size measures live on different scales: while a standardized mean difference of 1 is large, a Fisher's $z$ of 1 is enormous.
As such, a single fixed prior such as `Normal(0, 1)` would not be a sensible default across different effect sizes.

The `RoBMA` R package deals with this problem by relying on the *unit-information standard deviation* (UISD) concept [@rover2021weakly].
Standardized effect sizes have known UISD values, and for non-standardized inputs (`measure = "GEN"`) the UISD can be estimated from the per-study standard errors and sample sizes (or specified manually):

| Effect-size measure | `measure` | UISD used by default |
|:--|:--|:--|
| Standardized mean difference (Cohen's $d$, Hedges' $g$) | `"SMD"` | `sqrt(2)` |
| Fisher's $z$-transformed correlation | `"ZCOR"` | `1` |
| Log risk ratio | `"RR"` | `2` |
| Log odds ratio | `"OR"` | `2` |
| Log hazard ratio | `"HR"` | `2` |
| Log incidence rate ratio | `"IRR"` | `2` |
| Generic / non-standardized | `"GEN"` | estimated from `sei` and `ni`, or set manually |

The default prior distributions then use fractions of the UISD to specify the scale of each prior distribution.

| Parameter | Meaning | Default prior distribution |
|:--|:--|:--|
| `mu`    | average effect                                 | `Normal(0, 0.5 * UISD)` |
| `tau`   | heterogeneity                                  | positive `Normal(0, 0.25 * UISD)` |
| `mods`  | meta-regression coefficients                   | `Normal(0, 0.25 * UISD)` |
| `scale` | heterogeneity-regression coefficients          | `Normal(0, 0.5)` |
| `rho`   | heterogeneity allocation in multilevel models  | `Beta(1, 1)` |

The `scale` regression coefficients are not on the effect-size scale: they describe changes in log heterogeneity, a multiplicative effect, and are therefore scale-free.
The notation for the meta-regression and multilevel parameters follows the `RoBMA` R package meta-regression and multilevel papers [@bartos2023robust; @bartos2025robust].
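For log risk ratios (UISD of 2), for example, the default scales work out as plain arithmetic, matching the `Normal(0, 1)` and positive `Normal(0, 0.5)` prior distributions shown earlier:

```r
# Default prior scales for measure = "RR" as fractions of its UISD.
uisd <- 2                        # UISD for log risk ratios
c(mu_sd   = 0.50 * uisd,         # scale of the effect prior
  tau_sd  = 0.25 * uisd,         # scale of the heterogeneity prior
  mods_sd = 0.25 * uisd)         # scale of the moderator-coefficient priors
```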

The default prior distributions can be made wider or narrower with `rescale_priors`.
Values larger than one make the prior distributions wider; values smaller than one make them narrower.

```{r bcg-rescale, eval = has_bcg}
bcg_wider_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  rescale_priors = 2,
  data = bcg, only_priors = TRUE
)

print_prior(bcg_wider_priors)
```

The effect prior distribution changed from `Normal(0, 1)` to `Normal(0, 2)`, and the heterogeneity prior distribution from positive `Normal(0, 0.5)` to positive `Normal(0, 1)`.

### Standardized effect-size inputs

Several common effect-size measures live on familiar but inconvenient scales.
Correlations are bounded in $(-1, 1)$ with non-linear behaviour near the boundaries, and their sampling variance depends on the effect size itself.
Odds ratios and risk ratios are positive and asymmetric around the no-effect value of 1.
For each of these, the standard practice is to transform the inputs to a related scale on which the sampling distribution is approximately normal: Fisher's $z$ for correlations, log odds ratio for odds ratios, log risk ratio for risk ratios.
`metafor::escalc()` performs all these transformations, and the `RoBMA` R package is set up to operate on the transformed scale.
Use `measure = "ZCOR"` for Fisher's $z$, `measure = "OR"` for log odds ratios, `measure = "RR"` for log risk ratios, and so on.
The default prior distributions then use the UISD that matches the transformed scale.
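For intuition, the Fisher's $z$ transformation itself is available in base R as `atanh()` (with `tanh()` as its inverse); `escalc()` applies it together with the matching sampling variance $1/(n - 3)$:

```r
# Fisher's z transformation (base R); escalc() also supplies the
# sampling variance 1 / (n - 3) for each study.
r <- 0.30
z <- atanh(r)  # equals 0.5 * log((1 + r) / (1 - r))
z
tanh(z)        # back-transforms to the original correlation
```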

The `Kroupova2021` dataset of correlations between student employment and educational outcomes makes the workflow concrete.
We use `escalc()` to convert correlations and sample sizes into Fisher's $z$ effect sizes and their sampling variances:

```{r Kroupova2021-data, eval = has_metafor}
data("Kroupova2021", package = "RoBMA")

Kroupova2021 <- metafor::escalc( 
  ri = r, ni = sample_size, measure = "ZCOR",
  data = Kroupova2021
)
```

`brma()` recognises `measure = "ZCOR"` and uses the default Fisher's-$z$ prior distributions:

```{r Kroupova2021-priors, eval = has_metafor}
Kroupova2021_priors <- brma(
  yi = yi, vi = vi, measure = "ZCOR",
  data = Kroupova2021, only_priors = TRUE
)

print_prior(Kroupova2021_priors)
```

The same pattern applies to other standardized measures: compute the appropriate transformation in `escalc()` and pass the matching `measure` to `brma()`.

The model and its prior distributions then operate on the transformed scale, but posterior summaries can be reported back on the familiar original scale via the `output_measure` and `transform` arguments of the relevant summary functions (e.g., `pooled_effect()`, `marginal_means()`, and `plot()` functions).
For correlations fit with `measure = "ZCOR"`, set `output_measure = "COR"` to obtain a pooled correlation.
For log-scale measures (`"OR"`, `"RR"`, `"HR"`, `"IRR"`), set `transform = "EXP"` to obtain the corresponding ratio.
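The back-transformations themselves are simple monotone maps; in base R:

```r
# Back-transformations used when reporting on the original scale.
tanh(0.25)  # Fisher's z -> correlation, ~0.245
exp(0.41)   # log risk ratio -> risk ratio, ~1.507
```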

### Non-standardized inputs

Some studies report effects on a scale specific to the analysis: a regression coefficient on raw units, an unstandardized mean difference, or a measure outside the standardized catalogue.
Use `measure = "GEN"` for these.
Because the package cannot know what counts as a "moderate" effect on an arbitrary scale, the default prior scale must be supplied: either directly via `prior_unit_information_sd`, or implicitly by passing the per-study sample sizes in `ni`, from which the package estimates the UISD via `estimate_unit_information_sd()`.

The `Havrankova2025` dataset reports regression coefficients estimating the beauty premium across studies on a non-standardized scale.

```{r Havrankova-data}
data("Havrankova2025", package = "RoBMA")
head(Havrankova2025)
```

The simplest specification passes the sample sizes in `ni` so that the package estimates the UISD:

```{r Havrankova-priors}
Havrankova_priors <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_priors)
```

The UISD can also be set directly when sample sizes are unavailable, when an alternative scale is more appropriate, or when performing subgroup analyses (in subgroup analyses, it is sensible to compute the UISD from the full dataset so that the prior distributions match across the subgroup models):

```{r Havrankova-manual-uisd-calc}
Havrankova_UISD <- estimate_unit_information_sd(
  sei = Havrankova2025$se,
  ni  = Havrankova2025$N
)
Havrankova_UISD
```

```{r Havrankova-manual-uisd}
Havrankova_manual_priors <- brma(
  yi = y, sei = se, measure = "GEN",
  prior_unit_information_sd = Havrankova_UISD,
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_manual_priors)
```

The manually specified value takes precedence over the automatic estimate.

### Meta-regression

Meta-regression introduces moderators to explain part of the between-study variation.
The default prior distributions for moderator coefficients reuse the same UISD machinery as the prior distributions on the average effect and heterogeneity, but two specification choices change how those prior distributions are interpreted:
the contrast scheme used for categorical moderators, and the standardization applied to continuous moderators.

We illustrate both with `Havrankova2025`, fitting two moderators at once: the categorical `facing_customer` (No / Some / Direct) and the continuous `N` (per-study sample size).
Each moderator's prior distribution can be inspected separately by passing its name to `parameter_mods` in `print_prior()` or `plot_prior()`.

**Categorical moderators.**
The `set_contrast_factor_predictors` argument controls how categorical predictors are coded, applying a single contrast scheme jointly to every categorical moderator in the model (per-moderator overrides are described in *Per-moderator overrides* below).
The choice affects both how the intercept is interpreted and what the inclusion Bayes factor for the moderator tests.

With *treatment contrasts* (`"treatment"`, the default in `brma()` and matching base R), one level is chosen as a baseline.
The intercept is the average effect for the baseline level, and each remaining coefficient is the difference between its level and the baseline.
In parameter estimation, the regression coefficients are regularized towards the baseline.
In hypothesis testing, removing the moderator coefficient tests whether each level's effect differs from the baseline effect.
The choice of reference level matters for both the prior distribution on the intercept and the resulting inclusion Bayes factor.
Treatment contrasts are most natural when one level has a clear baseline interpretation (a control condition, an absence of treatment, a default policy).

With *mean-difference contrasts* (`"meandif"`, the default in `BMA()` and `RoBMA()`), the levels are treated as exchangeable.
The intercept is the adjusted average effect (averaged with equal weight across levels), and each coefficient describes how a given level deviates from the grand mean.
In parameter estimation, the regression coefficients are regularized towards the average effect.
In hypothesis testing, removing the moderator tests whether any level differs from the average effect, irrespective of which level was chosen as a label.
This is the recommended default for model-averaged inference because it gives a well-defined "is there moderation by this factor?" interpretation that does not depend on relabelling the levels.
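For intuition about the coding itself, base R's built-in contrast matrices for a three-level factor look as follows (shown for analogy only; the `"meandif"` contrasts are a centred coding constructed internally by the package and are not identical to base R's `contr.sum`):

```r
# Base-R contrast codings for a three-level factor, for intuition only.
contr.treatment(3)  # columns: deviations from the baseline level
contr.sum(3)        # columns: deviations from the grand mean
```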

The two contrast schemes can be compared on the categorical moderator while the continuous moderator is included in the same model:

```{r Havrankova-treatment}
Havrankova_treatment <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  set_contrast_factor_predictors = "treatment",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_treatment, parameter_mods = "facing_customer")
```

```{r Havrankova-meandif}
Havrankova_meandif <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  set_contrast_factor_predictors = "meandif",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_meandif, parameter_mods = "facing_customer")
```

The treatment-contrast version places a `Normal(0, 12.9)` prior distribution on each non-reference level's deviation from the reference level.
The mean-difference version places a multivariate `mNormal(0, 12.9)` prior distribution on the deviations of all levels from the grand mean.
The continuous moderator `N` is unaffected by the contrast choice and receives the same default prior distribution in both fits.

**Continuous moderators.**
Continuous predictors are standardized internally so that the default prior scale applies regardless of the predictor's units.
The default prior distribution on a meta-regression coefficient therefore describes a plausible change per one *standard deviation* of the predictor, not per one original unit.
By default, `plot_prior()` displays the prior distribution on this standardized scale; pass `standardized_coefficients = FALSE` to obtain the prior distribution per one original unit of the moderator.

We inspect the prior distribution on the continuous moderator `N` from the same `Havrankova_meandif` fit:

```{r Havrankova-continuous}
plot_prior(Havrankova_meandif, parameter_mods = "N")
plot_prior(Havrankova_meandif, parameter_mods = "N", 
  standardized_coefficients = FALSE)
```

The first plot shows the prior distribution on the standardized scale used internally (a plausible change per one standard deviation of `N`).
The second shows the same prior distribution on the original scale of `N` (a plausible change per one unit of `N`); because `N` ranges over orders of magnitude, the per-unit prior distribution is correspondingly tight.
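The relation between the two scales is a simple rescaling: a prior standard deviation `s` on the standardized coefficient corresponds to `s / sd(x)` per original unit of the moderator `x`, illustrated here with hypothetical sample sizes:

```r
# Rescaling a per-standard-deviation prior scale to a per-unit scale.
x <- c(120, 480, 1500, 9000)  # hypothetical per-study sample sizes
s <- 1                        # prior sd on the standardized coefficient
s / sd(x)                     # implied per-unit prior sd: very tight
```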

### Per-moderator overrides

The `set_contrast_factor_predictors` argument is a joint setting, and `rescale_priors` scales every default prior distribution by the same factor.
Finer control is available through `prior_mods`, which accepts a named list keyed by moderator name.
Each entry can be a regular `prior()` (for continuous moderators) or a `prior_factor()` with its own `contrast` setting (for categorical moderators), so different moderators can use different contrast schemes and prior scales within the same model.

For example, override only the prior distribution on `facing_customer`, switching its contrast and tightening its scale, while leaving `N` on the default per-standard-deviation prior distribution:

```{r Havrankova-per-moderator}
Havrankova_per_mod <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  prior_mods = list(
    facing_customer = prior_factor(
      distribution = "mnormal",
      parameters = list(mean = 0, sd = 5),
      contrast   = "meandif"
    )
  ),
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_per_mod)
```

The `facing_customer` prior distribution is now the user-specified `mNormal(0, 5)` mean-difference contrast, while `N` retains the default per-standard-deviation prior distribution.
The named list extends naturally: with multiple moderators, each entry can specify its own prior family, scale, and (for categorical moderators) `contrast` value.
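A two-entry version of such a list might look as follows (the scales here are hypothetical and only illustrate mixing a `prior_factor()` and a `prior()` in one specification):

```r
library(RoBMA)

# Hypothetical per-moderator settings: a categorical moderator with its
# own contrast and scale, and a continuous moderator with a custom prior.
custom_mods <- list(
  facing_customer = prior_factor(
    distribution = "mnormal",
    parameters   = list(mean = 0, sd = 5),
    contrast     = "meandif"
  ),
  N = prior(
    distribution = "normal",
    parameters   = list(mean = 0, sd = 1)
  )
)
```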

## Informed Empirical Prior Distributions

Default prior distributions are designed to work broadly.
*Informed empirical prior distributions* use previous meta-analyses to define what effects and heterogeneity values have looked like in a specific research area.

For medical meta-analyses, the `RoBMA` R package can use empirical prior distributions based on the Cochrane Database of Systematic Reviews.
Continuous outcomes are described in @bartos2021bayesian, and binary or time-to-event outcomes in @bartos2023empirical.
The shortcut is `prior_informed_field = "medicine"`.
If no subfield is given, the package uses the full Cochrane database.

```{r bcg-informed, eval = has_bcg}
bcg_informed_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  prior_informed_field    = "medicine",
  prior_informed_subfield = "Cochrane",
  data = bcg, only_priors = TRUE
)

print_prior(bcg_informed_priors)
```

The package also re-exports `prior_informed()` from the `BayesTools` package.
This can be used for other informed prior distributions from psychology, such as the informed prior distribution for the effect size from @gronau2017bayesian or the heterogeneity prior distribution from @erp2017estimates.
Pass them directly to the relevant arguments:

```{r psychology-informed, eval = has_metafor}
psychology_priors <- brma(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect        = prior_informed("Oosterwijk"),
  prior_heterogeneity = prior_informed("van Erp", parameter = "heterogeneity"),
  data = Kroupova2021, only_priors = TRUE
)

print_prior(psychology_priors)
```


## Custom Prior Distributions

Custom prior distributions are useful when there is a specific substantive reason for the scale of the effect, heterogeneity, or moderator coefficients, or when specifying highly informed hypothesis tests.
They are created with `prior()` and related helper functions.

```{r custom-effect-priors, eval = has_bcg}
bcg_custom_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  prior_effect        = prior(
    distribution = "normal",
    parameters = list(mean = 0, sd = 0.5)
  ),
  prior_heterogeneity = prior(
    distribution = "normal",
    parameters = list(mean = 0, sd = 0.25),
    truncation = list(lower = 0, upper = Inf)
  ),
  data = bcg, only_priors = TRUE
)

print_prior(bcg_custom_priors)
```


## Prior Distributions and Bayesian Model Averaging

The examples above use `brma()` because the prior distributions are easiest to understand in a single model.
`BMA()` and `RoBMA()` use these prior distributions inside Bayesian model averaging: they average across models that include or exclude the effect, heterogeneity, and any moderators.
Each component is then represented by a *spike-and-slab* prior distribution: a point prior distribution at zero (the spike) for the model that excludes the parameter, and a continuous prior distribution (the slab) for the model that includes it.

```{r bma-priors, eval = has_bcg}
bcg_bma_priors <- BMA(
  yi = yi, vi = vi, measure = "RR",
  data = bcg, only_priors = TRUE
)

print_prior(bcg_bma_priors)
```
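A toy simulation makes the spike-and-slab idea concrete (this is the prior being described, not how the package samples internally):

```r
# Spike-and-slab prior draws: with probability p the effect is exactly
# zero (the spike); otherwise it is drawn from the slab distribution.
set.seed(1)
p     <- 0.5
n     <- 1e4
draws <- ifelse(runif(n) < p, 0, rnorm(n, mean = 0, sd = 1))
mean(draws == 0)  # close to the prior exclusion probability p
```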

By default, `BMA()` and `RoBMA()` include the null hypothesis (a spike at zero) for every component.
Removing it (for example by passing `prior_effect_null = NULL`) drops the no-effect model and yields estimation under an assumed effect rather than a test of effect versus no effect.

```{r bma-no-null, eval = has_bcg}
bcg_estimation_priors <- BMA(
  yi = yi, vi = vi, measure = "RR",
  prior_effect_null = NULL,
  data = bcg, only_priors = TRUE
)

print_prior(bcg_estimation_priors, parameter = "mu")
```

## Publication-Bias Prior Distributions

Publication-bias prior distributions describe the bias-adjustment component (`prior_bias`) and are independent of the prior distributions on the effect, heterogeneity, and moderators.
Because they typically attach to specialised single-model fits or to model-averaged ensembles, the argument-level details are documented in `?publication_bias_prior_specification`, with worked examples in the corresponding vignettes.

For *single-model* bias adjustment with `bPET()`, `bPEESE()`, or `bselmodel()`, see the [*Publication-Bias Adjustment*](v11-metafor-parity-publication-bias.html) vignette: it lists the default `prior_bias` for each function, the rationale for the one-sided defaults, and the flexible alternatives that drop the truncation or the monotonicity constraint.
For *model-averaged* bias adjustment with `RoBMA()`, see the [*Robust Bayesian Meta-Analysis*](v21-robust-bayesian-meta-analysis.html) vignette: it describes the `model_type` presets (`"PSMA"`, `"PP"`, `"6w"`, `"2w"`), the weight functions and PET/PEESE components they include, and how to build a custom ensemble via `prior_bias` and `prior_bias_null`.

## General Considerations and Reporting

Specifying prior distributions is a substantive modelling decision rather than a technical detail, and a few principles deserve to be stated explicitly even at the cost of repeating points made above.

**Wide prior distributions are not "uninformative" in testing or model averaging.**
A very wide prior distribution on the alternative can look like a conservative choice for estimation, but it strongly biases hypothesis tests in favour of the null and shrinks model-averaged estimates toward zero, because the alternative model receives little posterior probability.
Bayes factors and model-averaged inference depend on the prior distribution under the alternative just as much as on the data.

**Report everything.**
Fully report the prior distributions.
Run a *prior sensitivity analysis* by re-fitting under at least one alternative reasonable prior distribution; if the conclusions are robust, the analysis stands on firmer ground.

**Decide prior distributions before looking at the data.**
Preregistering the analysis plan, including the prior distributions, avoids the temptation to tune prior distributions after seeing partial results and keeps the inference honestly Bayesian.

## References
