Prior Distributions

František Bartoš

28th of April 2026

Prior distributions describe which parameter values are plausible before seeing the data. They are an essential part of any Bayesian analysis, and they matter especially in meta-analysis: many datasets contain only a small or moderate number of studies, and reasonable prior distributions stabilize estimates that the data alone would identify weakly.

The role of prior distributions differs substantially between estimation and hypothesis testing (or Bayesian model averaging). For estimation, the goal is to recover a parameter value, and weakly informative prior distributions regularize the posterior without dominating it; the data quickly pull the posterior toward the likelihood as the number of studies grows. For hypothesis testing and model averaging the prior distribution under the alternative is part of the hypothesis: the Bayes factor compares how well competing prior predictions match the data. A very wide prior distribution on the alternative spreads probability across implausible values, so ordinary data look surprisingly consistent with the null, and the Bayes factor leans toward the null even when the effect is real. Bayesian model averaging amplifies this: the alternative model receives little posterior probability and the model-averaged estimate is shrunk toward zero.
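The effect of prior width on the Bayes factor can be sketched in a few lines of base R. For a single observed effect y with standard error se, the marginal likelihood under the null is a Normal(0, se) density evaluated at y, and under a Normal(0, sd_prior) alternative it is a Normal(0, sqrt(se^2 + sd_prior^2)) density. The function name below is illustrative, not part of any package:

```r
# Bayes factor in favour of the null (BF01) for one observed effect,
# as a function of the width of the prior under the alternative.
bf01 <- function(y, se, prior_sd) {
  marginal_null <- dnorm(y, mean = 0, sd = se)
  marginal_alt  <- dnorm(y, mean = 0, sd = sqrt(se^2 + prior_sd^2))
  marginal_null / marginal_alt
}

# A clearly non-null effect: y = 0.3 with se = 0.1 (i.e., z = 3).
sapply(c(0.5, 5, 50), function(s) bf01(y = 0.3, se = 0.1, prior_sd = s))
```

A moderately wide prior distribution (sd = 0.5) correctly favours the alternative, but an extremely wide one (sd = 50) spreads prior mass over implausible values and BF01 climbs above 1, favouring the null despite the large observed effect.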

The default prior distributions in the RoBMA R package are therefore deliberately not “flat”. They are weakly informative: wide enough to regularize estimates, but informative enough to keep hypothesis tests and Bayesian model averaging well-calibrated. This vignette walks through the practical paths to specifying prior distributions in the RoBMA R package, following the discussion of prior distributions in Bartoš & Wagenmakers (2025). See also Mulder & van Aert (2025) for further discussion and resources.

Specifying Prior Distributions

Prior distributions in the RoBMA R package come in two parts. Prior distributions for the meta-analytic parameters (the average effect, between-study heterogeneity, and any meta-regression coefficients) can be specified using package defaults, empirical (informed) prior distributions, or fully custom prior distributions. Prior distributions for the publication-bias adjustment models, used by RoBMA() and the related single-model functions, are treated separately and select among preset ensembles or a custom ensemble.

Approach, description, and main arguments:

Default prior distributions: standardized prior distributions automatically scaled to the effect-size measure (measure, rescale_priors, prior_unit_information_sd).
Empirical/informed prior distributions: empirical prior distributions derived from previous meta-analyses (prior_informed_field, prior_informed_subfield, prior_informed()).
Custom prior distributions: fully user-specified prior distributions via prior() and helpers (prior_effect, prior_heterogeneity, prior_mods, prior_scale).
Publication-bias prior distributions: bias-adjustment models and ensembles (model_type, prior_bias, prior_bias_null).

Throughout this vignette we pass only_priors = TRUE to ask the package to assemble the prior distributions and stop before fitting the model. This makes prior specifications visible without running MCMC. Drop only_priors = TRUE to fit the corresponding model.

Default Prior Distributions

Basic example

The simplest way to use the defaults is to tell the RoBMA R package what effect-size measure is being analyzed; the package then specifies prior distributions on the matching scale. We start with brma() because it fits one meta-analytic model and makes the prior distributions easiest to inspect.

The BCG vaccine dataset from the metadat package contains two-by-two tables. We compute log risk ratios and their standard errors, then use brma() to show the default prior distributions for a simple random-effects meta-analysis.

data("dat.bcg", package = "metadat")

bcg <- metafor::escalc(
  ai = tpos, bi = tneg, ci = cpos, di = cneg,
  measure = "RR", data = dat.bcg
)
bcg_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  data = bcg, only_priors = TRUE
)

plot_prior(bcg_priors, parameter = "mu")

plot_prior(bcg_priors, parameter = "tau")

print_prior(bcg_priors)
#> mu:
#>   Normal(0, 1)
#> tau:
#>   Normal(0, 0.5)[0, Inf]

The first plot shows the prior distribution for the average effect, \(\mu\). The second plot shows the prior distribution for the between-study heterogeneity, \(\tau\). For measure = "RR" the package places a Normal(0, 1) prior distribution on the log risk ratio and a positive Normal(0, 0.5) prior distribution on heterogeneity. The next subsection explains where these scales come from.

Effect-size measures and their scales

Different effect-size measures live on different scales: while a standardized mean difference of 1 is large, a Fisher’s \(z\) of 1 is enormous. A single fixed prior distribution such as Normal(0, 1) would therefore not be a sensible default across effect-size measures.

The RoBMA R package deals with this problem by relying on the unit-information standard deviation (UISD) concept (Röver et al., 2021). Standardized effect sizes have known UISD values, and for non-standardized inputs (measure = "GEN") the UISD can be estimated from the per-study standard errors and sample sizes (or specified manually):

Effect-size measure, measure code: UISD used by default

Standardized mean difference (Cohen’s \(d\), Hedges’ \(g\)), "SMD": sqrt(2)
Fisher’s \(z\)-transformed correlation, "ZCOR": 1
Log risk ratio, "RR": 2
Log odds ratio, "OR": 2
Log hazard ratio, "HR": 2
Log incidence rate ratio, "IRR": 2
Generic / non-standardized, "GEN": estimated from sei and ni, or set manually

The default prior distributions then use fractions of the UISD to specify the scale of each prior distribution.

mu (average effect): Normal(0, 0.5 * UISD)
tau (heterogeneity): positive Normal(0, 0.25 * UISD)
mods (meta-regression coefficients): Normal(0, 0.25 * UISD)
scale (heterogeneity-regression coefficients): Normal(0, 0.5)
rho (heterogeneity allocation in multilevel models): Beta(1, 1)

The scale regression coefficients are not on the effect-size scale: they describe changes in log heterogeneity, a multiplicative effect, and are therefore scale-free. The notation for the meta-regression and multilevel parameters follows the RoBMA meta-regression and multilevel papers (Bartoš et al., 2025, 2026).
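The arithmetic behind the defaults shown earlier for the BCG example is easy to verify by hand: the log risk ratio has UISD = 2, and the defaults use fixed fractions of the UISD.

```r
# Default prior scales for measure = "RR" as fractions of the UISD.
uisd_rr   <- 2                  # UISD of the log risk ratio
mu_scale  <- 0.5  * uisd_rr     # scale of the prior on the average effect
tau_scale <- 0.25 * uisd_rr     # scale of the prior on heterogeneity
c(mu = mu_scale, tau = tau_scale)
# mu = 1 and tau = 0.5, matching Normal(0, 1) and Normal(0, 0.5)[0, Inf]
# in the print_prior() output above.
```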

The default prior distributions can be made wider or narrower with rescale_priors. Values larger than one make the prior distributions wider; values smaller than one make them narrower.

bcg_wider_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  rescale_priors = 2,
  data = bcg, only_priors = TRUE
)

print_prior(bcg_wider_priors)
#> mu:
#>   Normal(0, 2)
#> tau:
#>   Normal(0, 1)[0, Inf]

The effect prior distribution changed from Normal(0, 1) to Normal(0, 2), and the heterogeneity prior distribution from positive Normal(0, 0.5) to positive Normal(0, 1).

Standardized effect-size inputs

Several common effect-size measures live on familiar but inconvenient scales. Correlations are bounded in \((-1, 1)\) with non-linear behaviour near the boundaries, and their sampling variance depends on the effect size itself. Odds ratios and risk ratios are positive and asymmetric around the no-effect value of 1. For each of these, the standard practice is to transform the inputs to a related scale on which the sampling distribution is approximately normal: Fisher’s \(z\) for correlations, log odds ratio for odds ratios, log risk ratio for risk ratios. metafor::escalc() performs all these transformations, and the RoBMA R package is set up to operate on the transformed scale. Use measure = "ZCOR" for Fisher’s \(z\), measure = "OR" for log odds ratios, measure = "RR" for log risk ratios, and so on. The default prior distributions then use the UISD that matches the transformed scale.
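For intuition, the Fisher’s \(z\) transformation that escalc() applies to correlations is just atanh(r), with approximate sampling variance 1/(n - 3); a hand-rolled sketch (the helper name is illustrative):

```r
# Fisher's z transformation for a correlation r from n observations:
# yi = atanh(r), with approximate sampling variance vi = 1 / (n - 3).
fisher_z <- function(r, n) {
  data.frame(yi = atanh(r), vi = 1 / (n - 3))
}

fisher_z(r = 0.5, n = 103)
# yi = atanh(0.5), roughly 0.549; vi = 1/100 = 0.01
```

On this transformed scale the sampling distribution is approximately normal and the variance no longer depends on the effect size, which is what makes the standard meta-analytic model applicable.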

The Kroupova2021 dataset of correlations between student employment and educational outcomes makes the workflow concrete. We use escalc() to convert correlations and sample sizes into Fisher’s \(z\) effect sizes and their sampling variances:

data("Kroupova2021", package = "RoBMA")

Kroupova2021 <- metafor::escalc( 
  ri = r, ni = sample_size, measure = "ZCOR",
  data = Kroupova2021
)

brma() recognises measure = "ZCOR" and uses the default Fisher’s-\(z\) prior distributions:

Kroupova2021_priors <- brma(
  yi = yi, vi = vi, measure = "ZCOR",
  data = Kroupova2021, only_priors = TRUE
)

print_prior(Kroupova2021_priors)
#> mu:
#>   Normal(0, 0.5)
#> tau:
#>   Normal(0, 0.25)[0, Inf]

The same pattern applies to other standardized measures: compute the appropriate transformation in escalc() and pass the matching measure to brma().

The model and its prior distributions then operate on the transformed scale, but posterior summaries can be reported back on the familiar original scale via the output_measure and transform arguments of the relevant summary functions (e.g., pooled_effect(), marginal_means(), and plot() functions). For correlations fit with measure = "ZCOR", set output_measure = "COR" to obtain a pooled correlation. For log-scale measures ("OR", "RR", "HR", "IRR"), set transform = "EXP" to obtain the corresponding ratio.
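The back-transformations themselves are simply tanh() for Fisher’s \(z\) and exp() for the log-scale measures; the summary functions apply them for you, but they are easy to check by hand (the numbers below are illustrative, not fitted values):

```r
# Back-transforming a pooled estimate and its interval bounds.
pooled_z <- c(estimate = 0.25, lower = 0.10, upper = 0.40)
tanh(pooled_z)      # Fisher's z -> correlation

pooled_logrr <- c(estimate = -0.45, lower = -0.75, upper = -0.15)
exp(pooled_logrr)   # log risk ratio -> risk ratio
```

Because both transformations are monotone, interval bounds can be transformed endpoint by endpoint.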

Non-standardized inputs

Some studies report effects on a scale specific to the analysis: a regression coefficient on raw units, an unstandardized mean difference, or a measure outside the standardized catalogue. Use measure = "GEN" for these. Because the package cannot know what counts as a “moderate” effect on an arbitrary scale, the default prior scale must be supplied: either directly via prior_unit_information_sd, or implicitly by passing the per-study sample sizes in ni, from which the package estimates the UISD via estimate_unit_information_sd().
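The idea behind the estimate is that the standard error of a mean-like quantity scales as UISD/sqrt(n), so each study implies UISD ≈ se * sqrt(n). How estimate_unit_information_sd() pools these per-study values is documented in its help page; the per-study logic itself is simply:

```r
# Per-study unit-information SD implied by a standard error and a
# sample size, assuming se is approximately UISD / sqrt(n).
per_study_uisd <- function(sei, ni) sei * sqrt(ni)

per_study_uisd(sei = 5, ni = 100)   # implies UISD = 50
```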

The Havrankova2025 dataset reports regression coefficients estimating the beauty premium across studies on a non-standardized scale.

data("Havrankova2025", package = "RoBMA")
head(Havrankova2025)
#>        y        se facing_customer study_id    N
#> 1  8.910  2.960133              No        1 1806
#> 2  8.964  5.272941              No        1 1806
#> 3  2.484  3.029268              No        1 1806
#> 4  9.180  4.831579              No        1 1806
#> 5 21.600  6.467066              No        1 1806
#> 6 15.012 10.075168              No        1 1806

The simplest specification passes the sample sizes in ni so that the package estimates the UISD:

Havrankova_priors <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_priors)
#> mu:
#>   Normal(0, 25.81)
#> tau:
#>   Normal(0, 12.9)[0, Inf]

The UISD can also be set directly when sample sizes are unavailable, when an alternative scale is more appropriate, or when performing subgroup analyses (it is sensible to compute UISD based on the full dataset so the prior distributions match across models):

Havrankova_UISD <- estimate_unit_information_sd(
  sei = Havrankova2025$se,
  ni  = Havrankova2025$N
)
Havrankova_UISD
#> [1] 51.61241
Havrankova_manual_priors <- brma(
  yi = y, sei = se, measure = "GEN",
  prior_unit_information_sd = Havrankova_UISD,
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_manual_priors)
#> mu:
#>   Normal(0, 25.81)
#> tau:
#>   Normal(0, 12.9)[0, Inf]

The manually specified value takes precedence over the automatic estimate.

Meta-regression

Meta-regression introduces moderators to explain part of the between-study variation. The default prior distributions for moderator coefficients reuse the same UISD machinery as the prior distributions on the average effect and heterogeneity, but two specification choices change how those prior distributions are interpreted: the contrast scheme used for categorical moderators, and the standardization applied to continuous moderators.

We illustrate both with Havrankova2025, fitting two moderators at once: the categorical facing_customer (No / Some / Direct) and the continuous N (per-study sample size). Each moderator’s prior can be inspected separately by passing its name to parameter_mods in print_prior() or plot_prior().

Categorical moderators. The set_contrast_factor_predictors argument controls how categorical predictors are coded, applying a single contrast scheme jointly to every categorical moderator in the model (per-moderator overrides are described in Per-moderator overrides below). The choice affects both how the intercept is interpreted and what the inclusion Bayes factor for the moderator tests.

With treatment contrasts ("treatment", the default in brma() and matching base R), one level is chosen as a baseline. The intercept is the average effect for the baseline level, and each remaining coefficient is the difference between its level and the baseline. In parameter estimation, the regression coefficients are regularized towards the baseline. In hypothesis testing, removing the moderator coefficient tests whether each level’s effect differs from the baseline effect. The choice of reference level matters for both the prior distribution on the intercept and the resulting inclusion Bayes factor. Treatment contrasts are most natural when one level has a clear baseline interpretation (a control condition, an absence of treatment, a default policy).

With mean-difference contrasts ("meandif", the default in BMA() and RoBMA()), the levels are treated as exchangeable. The intercept is the adjusted average effect (averaged with equal weight across levels), and each coefficient describes how a given level deviates from the grand mean. In parameter estimation, the regression coefficients are regularized towards the average effect. In hypothesis testing, removing the moderator tests whether any level differs from the average effect, irrespective of which level was chosen as a label. This is the recommended default for model-averaged inference because it gives a well-defined “is there moderation by this factor?” interpretation that does not depend on relabelling the levels.
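Base R’s contrast matrices illustrate the two coding logics. RoBMA’s "meandif" contrast is its own (BayesTools) construction, but it shares the deviation-from-the-grand-mean structure of base R’s contr.sum shown here:

```r
levels3 <- c("No", "Some", "Direct")

# Treatment coding: the first level is the baseline; each column is
# an indicator for one of the remaining levels.
contr.treatment(levels3)

# Sum coding: columns sum to zero, so coefficients are deviations
# from the (equally weighted) grand mean rather than from a baseline.
contr.sum(3)
colSums(contr.sum(3))   # each column sums to exactly zero
```

The zero column sums are what make the intercept the grand mean: relabelling which level is listed first changes nothing substantive, unlike the choice of baseline under treatment coding.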

The two contrast schemes can be compared on the categorical moderator while the continuous moderator is included in the same model:

Havrankova_treatment <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  set_contrast_factor_predictors = "treatment",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_treatment, parameter_mods = "facing_customer")
#> mu_facing_customer:
#>   treatment contrast: Normal(0, 12.9)
Havrankova_meandif <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  set_contrast_factor_predictors = "meandif",
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_meandif, parameter_mods = "facing_customer")
#> mu_facing_customer:
#>   mean difference contrast: mNormal(0, 12.9)

The treatment-contrast version places a Normal(0, 12.9) prior distribution on each non-reference level’s deviation from the reference level. The mean-difference version places a multivariate mNormal(0, 12.9) prior distribution on the deviations of all levels from the grand mean. The continuous moderator N is unaffected by the contrast choice and receives the same default prior distribution in both fits.

Continuous moderators. Continuous predictors are standardized internally so that the default prior scale applies regardless of the predictor’s units. The default prior distribution on a meta-regression coefficient therefore describes a plausible change per one standard deviation of the predictor, not per one original unit. By default, plot_prior() displays the prior distribution on this standardized scale; pass standardized_coefficients = FALSE to obtain the prior distribution per one original unit of the moderator.

We inspect the prior distribution on the continuous moderator N from the same Havrankova_meandif fit:

plot_prior(Havrankova_meandif, parameter_mods = "N")

plot_prior(Havrankova_meandif, parameter_mods = "N",
  standardized_coefficients = FALSE)

The first plot shows the prior distribution on the standardized scale used internally (a plausible change per one standard deviation of N). The second shows the same prior distribution on the original scale of N (a plausible change per one unit of N); because N ranges over orders of magnitude, the per-unit prior distribution is correspondingly tight.
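The relation between the two scales is a simple rescaling: a coefficient per standard deviation of the moderator, divided by sd(moderator), gives the coefficient per original unit. An illustrative computation (made-up values, not package internals):

```r
# Converting a per-SD meta-regression coefficient to a per-unit one.
N_values      <- c(150, 800, 1806, 5000, 12000)  # illustrative sample sizes
beta_per_sd   <- 0.5                  # plausible change per 1 SD of N
beta_per_unit <- beta_per_sd / sd(N_values)
beta_per_unit   # tiny, because N spans thousands of units
```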

Per-moderator overrides

The set_contrast_factor_predictors argument is a joint setting, and rescale_priors scales every default prior distribution by the same factor. Finer control is available through prior_mods, which accepts a named list keyed by moderator name. Each entry can be a regular prior() (for continuous moderators) or a prior_factor() with its own contrast setting (for categorical moderators), so different moderators can use different contrast schemes and prior scales within the same model.

For example, override only the prior distribution on facing_customer, switching its contrast and tightening its scale, while leaving N on the default per-standard-deviation prior distribution:

Havrankova_per_mod <- brma(
  yi = y, sei = se, ni = N, measure = "GEN",
  mods = ~ facing_customer + N,
  prior_mods = list(
    facing_customer = prior_factor(
      distribution = "mnormal",
      parameters = list(mean = 0, sd = 5),
      contrast   = "meandif"
    )
  ),
  data = Havrankova2025, only_priors = TRUE
)

print_prior(Havrankova_per_mod)
#> mu_intercept:
#>   Normal(0, 25.81)
#> mu_facing_customer:
#>   mean difference contrast: mNormal(0, 5)
#> mu_N:
#>   Normal(0, 12.9)
#> tau:
#>   Normal(0, 12.9)[0, Inf]

The facing_customer prior distribution is now the user-specified mNormal(0, 5) mean-difference contrast, while N retains the default per-standard-deviation prior distribution. The named list extends naturally: with multiple moderators, each entry can specify its own prior family, scale, and (for categorical moderators) contrast value.

Informed Empirical Prior Distributions

Default prior distributions are designed to work broadly. Informed empirical prior distributions use previous meta-analyses to define what effects and heterogeneity values have looked like in a specific research area.

For medical meta-analyses, the RoBMA R package can use empirical prior distributions based on the Cochrane Database of Systematic Reviews. Continuous outcomes are described in Bartoš et al. (2021), and binary or time-to-event outcomes in Bartoš et al. (2023). The shortcut is prior_informed_field = "medicine". If no subfield is given, the package uses the full Cochrane database.

bcg_informed_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  prior_informed_field    = "medicine",
  prior_informed_subfield = "Cochrane",
  data = bcg, only_priors = TRUE
)

print_prior(bcg_informed_priors)
#> mu:
#>   Student-t(0, 0.32, 3)
#> tau:
#>   InvGamma(1.51, 0.23)

The package also re-exports prior_informed() from the BayesTools package. This can be used for other informed prior distributions from psychology, such as the informed prior distribution for the effect size from Gronau et al. (2017) or the heterogeneity prior distribution from van Erp et al. (2017). Pass them directly to the relevant arguments:

psychology_priors <- brma(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect        = prior_informed("Oosterwijk"),
  prior_heterogeneity = prior_informed("van Erp", parameter = "heterogeneity"),
  data = Kroupova2021, only_priors = TRUE
)

print_prior(psychology_priors)
#> mu:
#>   Student-t(0.35, 0.1, 3)[0, Inf]
#> tau:
#>   InvGamma(1, 0.15)

Custom Prior Distributions

Custom prior distributions are useful when there is a specific substantive reason for the scale of the effect, heterogeneity, or moderator coefficients, or when specifying highly informed hypothesis tests. They are created with prior() and related helper functions.

bcg_custom_priors <- brma(
  yi = yi, vi = vi, measure = "RR",
  prior_effect        = prior(
    distribution = "normal",
    parameters = list(mean = 0, sd = 0.5)
  ),
  prior_heterogeneity = prior(
    distribution = "normal",
    parameters = list(mean = 0, sd = 0.25),
    truncation = list(lower = 0, upper = Inf)
  ),
  data = bcg, only_priors = TRUE
)

print_prior(bcg_custom_priors)
#> mu:
#>   Normal(0, 0.5)
#> tau:
#>   Normal(0, 0.25)[0, Inf]

Prior Distributions and Bayesian Model Averaging

The examples above use brma() because the prior distributions are easiest to understand in a single model. BMA() and RoBMA() use these prior distributions inside Bayesian model averaging: they average across models that include or exclude the effect, heterogeneity, and any moderators. Each component is then represented by a spike-and-slab prior distribution: a point prior distribution at zero (the spike) for the model that excludes the parameter, and a continuous prior distribution (the slab) for the model that includes it.

bcg_bma_priors <- BMA(
  yi = yi, vi = vi, measure = "RR",
  data = bcg, only_priors = TRUE
)

print_prior(bcg_bma_priors)
#> mu:
#>   alternative:
#>     (1/2) * Normal(0, 1)
#>   null:
#>     (1/2) * Spike(0)
#> tau:
#>   alternative:
#>     (1/2) * Normal(0, 0.5)[0, Inf]
#>   null:
#>     (1/2) * Spike(0)
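The (1/2) weights in the output are the prior model probabilities. Sampling from the resulting spike-and-slab prior for \(\mu\) makes the structure concrete: half of the prior draws are exactly zero (the spike of the null model) and half come from the Normal(0, 1) slab. A base-R sketch:

```r
set.seed(1)
n_draws <- 1e5

# With probability 1/2 take the spike at zero, otherwise draw from
# the Normal(0, 1) slab.
include <- rbinom(n_draws, size = 1, prob = 0.5)
draws   <- ifelse(include == 1, rnorm(n_draws, 0, 1), 0)

mean(draws == 0)   # roughly half the prior mass sits exactly at zero
```

The posterior works the same way: the posterior mass at zero becomes the posterior probability of the null model, and averaging over the mixture yields the model-averaged estimate.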

By default, BMA() and RoBMA() include the null hypothesis (a spike at zero) for every component. Removing it (for example by passing prior_effect_null = NULL) drops the no-effect model and yields estimation under an assumed effect rather than a test of effect versus no effect.

bcg_estimation_priors <- BMA(
  yi = yi, vi = vi, measure = "RR",
  prior_effect_null = NULL,
  data = bcg, only_priors = TRUE
)

print_prior(bcg_estimation_priors, parameter = "mu")
#> mu:
#>   alternative:
#>     (1/1) * Normal(0, 1)

Publication-Bias Prior Distributions

Publication-bias prior distributions describe the bias-adjustment component (prior_bias) and are independent of the prior distributions on the effect, heterogeneity, and moderators. Because they typically attach to specialised single-model fits or to model-averaged ensembles, the argument-level details are documented in ?publication_bias_prior_specification, with worked examples in the corresponding vignettes.

For single-model bias adjustment with bPET(), bPEESE(), or bselmodel(), see the Publication-Bias Adjustment vignette: it lists the default prior_bias for each function, the rationale for the one-sided defaults, and the flexible alternatives that drop the truncation or the monotonicity constraint. For model-averaged bias adjustment with RoBMA(), see the Robust Bayesian Meta-Analysis vignette: it describes the model_type presets ("PSMA", "PP", "6w", "2w"), the weight functions and PET/PEESE components they include, and how to build a custom ensemble via prior_bias and prior_bias_null.

General Considerations and Reporting

Specifying prior distributions is a substantive modelling decision rather than a technical detail, and a few principles deserve to be stated explicitly even at the cost of repeating points made above.

Wide prior distributions are not “uninformative” in testing or model averaging. A very wide prior distribution on the alternative can look like a conservative choice for estimation, but it strongly biases hypothesis tests in favour of the null and shrinks model-averaged estimates toward zero, because the alternative model receives little posterior probability. Bayes factors and model-averaged inference depend on the prior distribution under the alternative just as much as on the data.

Report everything. Fully report the prior distributions. Run a prior sensitivity analysis by re-fitting under at least one alternative reasonable prior distribution; if the conclusions are robust, the analysis stands on firmer ground.

Decide prior distributions before looking at the data. Preregistering the analysis plan, including the prior distributions, avoids the temptation to tune prior distributions after seeing partial results and keeps the inference honestly Bayesian.

References

Bartoš, F., Gronau, Q. F., Timmers, B., Otte, W. M., Ly, A., & Wagenmakers, E.-J. (2021). Bayesian model-averaged meta-analysis in medicine. Statistics in Medicine, 40(30), 6743–6761. https://doi.org/10.1002/sim.9170
Bartoš, F., Maier, M., Stanley, T., & Wagenmakers, E.-J. (2025). Robust Bayesian meta-regression: Model-averaged moderation analysis in the presence of publication bias. Psychological Methods. https://doi.org/10.1037/met0000737
Bartoš, F., Maier, M., & Wagenmakers, E.-J. (2026). Robust Bayesian multilevel meta-analysis: Adjusting for publication bias in the presence of dependent effect sizes. Behavior Research Methods. https://doi.org/10.31234/osf.io/9tgp2_v2
Bartoš, F., Otte, W. M., Gronau, Q. F., Timmers, B., Ly, A., & Wagenmakers, E.-J. (2023). Empirical prior distributions for Bayesian meta-analyses of binary and time-to-event outcomes. https://doi.org/10.48550/arXiv.2306.11468
Bartoš, F., & Wagenmakers, E.-J. (2025). Meta-analysis with JASP, Part II: Bayesian approaches. https://doi.org/10.48550/arXiv.2509.09850
Erp, S. van, Verhagen, J., Grasman, R. P., & Wagenmakers, E.-J. (2017). Estimates of between-study heterogeneity for 705 meta-analyses reported in Psychological Bulletin from 1990–2013. Journal of Open Psychology Data, 5(1), 1–5. https://doi.org/10.5334/jopd.33
Gronau, Q. F., Van Erp, S., Heck, D. W., Cesario, J., Jonas, K. J., & Wagenmakers, E.-J. (2017). A Bayesian model-averaged meta-analysis of the power pose effect with informed and default priors: The case of felt power. Comprehensive Results in Social Psychology, 2(1), 123–138. https://doi.org/10.1080/23743603.2017.1326760
Mulder, J., & Aert, R. C. van. (2025). Bayes Factor hypothesis testing in meta-analyses: Practical advantages and methodological considerations. Research Synthesis Methods. https://doi.org/10.1017/rsm.2025.10060
Röver, C., Bender, R., Dias, S., Schmid, C. H., Schmidli, H., Sturtz, S., Weber, S., & Friede, T. (2021). On weakly informative prior distributions for the heterogeneity parameter in Bayesian random-effects meta-analysis. Research Synthesis Methods, 12(4), 448–474. https://doi.org/10.1002/jrsm.1475