---
title: "Tutorial: Adjusting for Publication Bias in JASP and R - Selection Models, PET-PEESE, and Robust Bayesian Meta-Analysis"
author: "František Bartoš, Maximilian Maier, Daniel S. Quintana & Eric-Jan Wagenmakers"
date: "31st of May 2023 (updated: 30th of April 2026)"
output:
  rmarkdown::html_vignette:
    self_contained: yes
bibliography: ../inst/REFERENCES.bib
csl: ../inst/apa.csl
link-citations: true
vignette: >
  %\VignetteIndexEntry{Tutorial: Adjusting for Publication Bias in JASP and R - Selection Models, PET-PEESE, and Robust Bayesian Meta-Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown_notangle}
---

```{r child = "_vignette-nowrap.md", echo = FALSE, eval = TRUE}
```

```{r setup, include = FALSE}
source("vignette-cache.R", local = knitr::knit_global())

cached_fits <- vignette_cache(
  name     = "v30-tutorial",
  objects  = c(
    fit_RoBMA  = "fit_RoBMA_Lui2015.RDS",
    fit_RoBMA2 = "fit_RoBMA_perinull_Lui2015.RDS"
  ),
  packages = c("metafor", "weightr")
)

knitr::opts_chunk$set(
  collapse    = TRUE,
  comment     = "#>",
  eval        = vignette_cache_eval(cached_fits),
  message     = FALSE,
  warning     = FALSE,
  fig.width   = 7,
  fig.height  = 3.5,
  dev         = "png",
  fig.retina  = 3
)
if (.Platform$OS.type == "windows") {
  knitr::opts_chunk$set(dev.args = list(type = "cairo"))
}

prepare_30_tutorial <- function(envir = parent.frame()) {

  library("metafor")
  library("weightr")
  library("RoBMA")

  data("Lui2015", package = "RoBMA")
  df <- Lui2015

  dfz <- metafor::escalc(
    ri = r, ni = n, measure = "ZCOR",
    data = df
  )

  prior_effect_361        <- prior("normal", list(mean = 0, sd = 0.5))
  prior_heterogeneity_361 <- prior("invgamma", list(shape = 1, scale = 0.075))
  prior_bias_361          <- list(
    prior_weightfunction("two-sided", c(0.05),              wf_cumulative(c(1, 1)),       prior_weights = 1/12),
    prior_weightfunction("two-sided", c(0.05, 0.10),        wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
    prior_weightfunction("one-sided", c(0.05),              wf_cumulative(c(1, 1)),       prior_weights = 1/12),
    prior_weightfunction("one-sided", c(0.025, 0.05),       wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
    prior_weightfunction("one-sided", c(0.05, 0.50),        wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
    prior_weightfunction("one-sided", c(0.025, 0.05, 0.50), wf_cumulative(c(1, 1, 1, 1)), prior_weights = 1/12),
    prior_PET("Cauchy",   list(location = 0, scale = 1),  truncation = list(0, Inf), prior_weights = 1/4),
    prior_PEESE("Cauchy", list(location = 0, scale = 10), truncation = list(0, Inf), prior_weights = 1/4)
  )

  prior_effect_perinull_361      <- prior("normal", list(mean = 0.30, sd = 0.10), truncation = list(0, Inf))
  prior_effect_null_perinull_361 <- prior("normal", list(mean = 0, sd = 0.05))

  list2env(
    list(
      df                                = df,
      dfz                               = dfz,
      prior_effect_361                  = prior_effect_361,
      prior_heterogeneity_361           = prior_heterogeneity_361,
      prior_bias_361                    = prior_bias_361,
      prior_effect_perinull_361         = prior_effect_perinull_361,
      prior_effect_null_perinull_361    = prior_effect_null_perinull_361
    ),
    envir = envir
  )

  invisible(NULL)
}
```

```{r load-models, include = FALSE}
prepare_30_tutorial()
vignette_cache_load(cached_fits)
```

```{r precompute-models, include = FALSE, eval = FALSE}
prepare_30_tutorial()
fit_RoBMA <- RoBMA(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect                 = prior_effect_361,
  prior_heterogeneity          = prior_heterogeneity_361,
  prior_bias                   = prior_bias_361,
  data = dfz, seed = 1, parallel = TRUE
)

fit_RoBMA2 <- RoBMA(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect                 = prior_effect_perinull_361,
  prior_effect_null            = prior_effect_null_perinull_361,
  prior_heterogeneity          = prior_heterogeneity_361,
  prior_bias                   = prior_bias_361,
  data = dfz, seed = 2, parallel = TRUE
)

vignette_cache_save(cached_fits)
```

**This R markdown file accompanies the tutorial [Adjusting for publication bias in JASP and R: Selection models, PET-PEESE, and robust Bayesian meta-analysis](https://doi.org/10.1177/25152459221109259) published in *Advances in Methods and Practices in Psychological Science* [@bartos2020adjusting].**


The following R-markdown file illustrates how to:

- Load a CSV file into R,
- Transform effect sizes,
- Perform a random effect meta-analysis,
- Adjust for publication bias with:
  - PET-PEESE [@stanley2014meta; @stanley2017limitations], 
  - Selection models [@iyengar1988selection; @vevea1995general],
  - Robust Bayesian meta-analysis (RoBMA) [@maier2020robust; @bartos2021no].

See the full paper for additional details regarding the data set, methods, and interpretation.


### Set-up

Before we start, we need to install `JAGS` (which is needed for installation of the `RoBMA` R package) and the R packages that we use in the analysis. Specifically, the `RoBMA`, `weightr`, and `metafor` R packages.

JAGS can be downloaded from the [JAGS website](https://sourceforge.net/projects/mcmc-jags/). Subsequently, we install the R packages with the `install.packages()` function.

```r
install.packages(c("RoBMA", "weightr", "metafor"))
```

Once all of the packages are installed, we can load them into the workspace with the `library()` function.

```{r}
library("metafor")
library("weightr")
library("RoBMA")
```


### Lui (2015)
@lui2015intergenerational studied how the acculturation mismatch (AM) that is the result of the contrast between the collectivist cultures of Asian and Latin immigrant groups and the individualist culture in the United States correlates with intergenerational cultural conflict (ICC). 
@lui2015intergenerational meta-analyzed 18 independent studies correlating AM with ICC. 
A standard reanalysis indicates a significant effect of AM on increased ICC, r = 0.250, p < .001.

#### Data manipulation
First, we load the Lui2015.csv file into R with the `read.csv()` function and inspect the first six data entries with the `head()` function (the data set is also included in the package and can be accessed via the `data("Lui2015", package = "RoBMA")` call).

```r
df <- read.csv(file = "Lui2015.csv")
```
```{r}
head(df)
```

We see that the data set contains three columns. 
The first column called `r` contains the effect sizes coded as correlation coefficients, the second column called `n` contains the sample sizes, and the third column called `study` contains names of the individual studies.

We can access the individual variables using the data set name and the dollar (`$`) sign followed by the name of the column. 
For example, we can print all of the effect sizes with the `df$r` command.

```{r}
df$r
```

The printed output shows that the data set contains mostly positive effect sizes with the largest correlation coefficient r = 0.54.

#### Effect size transformations
Before we start analyzing the data, we transform the effect sizes from correlation coefficients $\rho$ to Fisher's *z*. 
Correlation coefficients are not well suited for meta-analysis because (1) they are bounded to a range (-1, 1) with non-linear increases near the boundaries and (2) the standard error of the correlation coefficients is related to the effect size. 
Fisher's *z* transformation mitigates both issues. It unwinds the (-1, 1) range to ($-\infty$, $\infty$), makes the sampling distribution approximately normal, and breaks the dependency between standard errors and effect sizes. 

To apply the transformation, we use the `escalc()` function from the `metafor` package. 
We pass the correlation coefficients into the `ri` argument, the sample sizes to the `ni` argument, and set `measure = "ZCOR"`. 
The function `escalc()` then saves the transformed effect size estimates into a data frame called `dfz`, where the `yi` column corresponds to Fisher's *z* transformation of the correlation coefficient and `vi` column corresponds to the sampling variance of Fisher's *z*.

```{r}
dfz <- metafor::escalc(
  ri = r, ni = n, measure = "ZCOR",
  data = df
)
head(dfz)
```


#### Re-analysis with random effect meta-analysis
We now estimate a random effect meta-analysis with the `rma()` function imported from the `metafor` package [@metafor] and verify that we arrive at the same results as reported in the @lui2015intergenerational paper. 
The `yi` argument is used to pass the column name containing effect sizes, the `vi` argument is used to pass the column name containing sampling variances, and the `data` argument is used to pass the data frame containing both variables.

```{r}
fit_rma <- metafor::rma(yi = yi, vi = vi, data = dfz)
fit_rma
```

Indeed, we find that the effect size estimate from the random effect meta-analysis corresponds to the one reported in @lui2015intergenerational. 
It is important to remember that we used Fisher's *z* to estimate the models; therefore, the estimated results are on the Fisher's *z* scale. 
To transform the effect size estimate to the correlation coefficients, we can use `metafor::transf.ztor()`,

```{r}
metafor::transf.ztor(fit_rma$b)
```

Transforming the effect size estimate results in the correlation coefficient $\rho$ = 0.25.


### PET-PEESE
The first publication bias adjustment that we perform is PET-PEESE. 
PET-PEESE adjusts for the relationship between effect sizes and standard errors. 
To our knowledge, PET-PEESE is not currently implemented in any R package.
However, since PET and PEESE are weighted regressions of effect sizes on standard errors (PET) or standard errors squared (PEESE), we can estimate both PET and PEESE models with the `lm()` function. 
Inside the `lm()` function call, we specify that `yi` is the response variable (left hand side of the `~` sign) and `I(sqrt(vi))` is the predictor (the right-hand side). 
Furthermore, we specify the `weights` argument that allows us to weight the meta-regression by inverse variance and set the `data = dfz` argument, which specifies that all of the variables come from the transformed, `dfz`, data set.

```{r}
fit_PET <- lm(yi ~ I(sqrt(vi)), weights = 1/vi, data = dfz)
summary(fit_PET)
```

The `summary()` function allows us to explore details of the fitted model. 
The `(Intercept)` coefficient refers to the meta-analytic effect size (corrected for the correlation with standard errors). 
Again, it is important to keep in mind that the effect size estimate is on the Fisher's *z* scale. 
We obtain the estimate on correlation scale with the `metafor::transf.ztor()` function (we pass the estimated effect size using the `summary(fit_PET)$coefficients["(Intercept)", "Estimate"]` command, which extracts the estimate from the fitted model).

```{r}
metafor::transf.ztor(summary(fit_PET)$coefficients["(Intercept)", "Estimate"])
```
Since the Fisher's *z* transformation is almost linear around zero, we obtain an almost identical estimate.

More importantly, since the test for the effect size with PET was not significant at $\alpha = .10$, we interpret the PET model. 
However, if the test for effect size were significant, we would fit and interpret the PEESE model. 
The PEESE model can be fitted in an analogous way, by replacing the predictor of standard errors with standard errors squared, which corresponds to the sampling variances in `vi`.

```{r}
fit_PEESE <- lm(yi ~ vi, weights = 1/vi, data = dfz)
summary(fit_PEESE)
```


### Selection models
The second publication bias adjustment that we will perform is selection models. 
Selection models adjust for the different publication probabilities in different *p*-value intervals. 
Selection models are implemented in the `weightr` package (`weightfunct()` function; @weightr) and newly also in the `metafor` package (`selmodel()` function; @metafor).
First, we use the `weightr` implementation and fit the "4PSM" selection model that specifies three distinct *p*-value intervals: 
(1) covering the range of significant *p*-values for effect sizes in the expected direction (0.00-0.025), 
(2) covering the range of "marginally" significant *p*-values for effect sizes in the expected direction (0.025-0.05), 
and (3) covering the range of non-significant *p*-values (0.05-1). We use the Fisher's *z* effect sizes from `dfz`. 
To fit the model, we need to pass the effect sizes (`dfz$yi`) into the `effect` argument and variances (`dfz$vi`) into the `v` argument (note that we need to pass the vector of values directly since the `weightfunct()` function does not allow us to pass the data frame directly as did the previous functions). 
We further set `steps = c(0.025, 0.05)` to specify the appropriate cut-points (note that the steps correspond to one-sided *p*-values), and we set `table = TRUE` to obtain the frequency of *p* values in each of the specified intervals.

```{r}
fit_4PSM <- weightr::weightfunct(
  effect = dfz$yi, v = dfz$vi,
  steps = c(0.025, 0.05), table = TRUE
)
fit_4PSM
```

Note the warning message informing us about the fact that our data do not contain a sufficient number of *p*-values in one of the *p*-value intervals. 
The model output obtained by printing the fitted model object `fit_4PSM` shows that there is only one *p*-value in the (0.025, 0.05) interval. 
We can deal with this issue by joining the "marginally" significant and non-significant *p*-value interval, resulting in the "3PSM" model.

```{r}
fit_3PSM <- weightr::weightfunct(
  effect = dfz$yi, v = dfz$vi,
  steps = c(0.025), table = TRUE
)
fit_3PSM
```

The new model does not suffer from the estimation problem due to the limited number of *p*-values in the intervals, so we can now interpret the results with more confidence. 
First, we check the test for heterogeneity that clearly rejects the null hypothesis `Q(df = 17) = 75.4999, $p$ = 5.188348e-09` (if we did not find evidence for heterogeneity, we could have proceeded by fitting the fixed-effect version of the model by specifying the `fe = TRUE` argument).
We follow by checking the test for publication bias which is a likelihood ratio test comparing the unadjusted and adjusted estimate `X^2(df = 1) = 3.107176, $p$ = 0.077948`.
The result of the test is slightly ambiguous: we would reject the null hypothesis of no publication bias with $\alpha = 0.10$ but not with $\alpha = 0.05$.

If we decide to interpret the estimated effect size, we have to again transform it back to the correlation scale. 
The effect size estimate corresponds to the second value in the `fit_3PSM$adj_est` object for the random effect model.

```{r}
metafor::transf.ztor(fit_3PSM$adj_est[2])
```

Alternatively, we could have conducted the analysis analogously but with the `metafor` package. 
We use the random effect meta-analysis object from above and specify the `type = "stepfun"` argument to obtain a step weight function and set the appropriate steps with the `steps = c(0.025)` argument.

```{r}
fit_sel_z <- metafor::selmodel(fit_rma, type = "stepfun", steps = c(0.025))
fit_sel_z
```

The output verifies the results obtained in the previous analysis.


### Robust Bayesian meta-analysis
The third and final publication bias adjustment that we will perform is robust Bayesian meta-analysis (RoBMA). 
RoBMA uses Bayesian model-averaging to combine inference from both PET-PEESE and selection models.
We use the `RoBMA` R package (and the `RoBMA()` function; @RoBMA) to fit the 36-model ensemble based on an orthogonal combination of models assuming the presence and absence of the effect size, heterogeneity, and publication bias.
The models assuming the presence of publication bias are further split into six weight function models and models utilizing the PET and PEESE publication bias adjustment.
To fit the model, we pass the Fisher's *z* effect sizes and sampling variances into the `yi` and `vi` arguments and set `measure = "ZCOR"`.
We use the `seed` argument to make the analysis reproducible (it uses MCMC sampling, in contrast to the previous methods). We turn on parallel estimation by setting the `parallel = TRUE` argument; if parallel processing fails, try rerunning the model or turning parallel processing off.

The default prior distributions and estimation algorithm changed after version 3.6.1, so we specify the prior distributions explicitly.

```{r prior-specification}
prior_effect_361        <- prior("normal", list(mean = 0, sd = 0.5))
prior_heterogeneity_361 <- prior("invgamma", list(shape = 1, scale = 0.075))
prior_bias_361          <- list(
  prior_weightfunction("two-sided", c(0.05),             wf_cumulative(c(1, 1)),       prior_weights = 1/12),
  prior_weightfunction("two-sided", c(0.05, 0.10),       wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
  prior_weightfunction("one-sided", c(0.05),             wf_cumulative(c(1, 1)),       prior_weights = 1/12),
  prior_weightfunction("one-sided", c(0.025, 0.05),      wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
  prior_weightfunction("one-sided", c(0.05, 0.50),       wf_cumulative(c(1, 1, 1)),    prior_weights = 1/12),
  prior_weightfunction("one-sided", c(0.025, 0.05, 0.50), wf_cumulative(c(1, 1, 1, 1)), prior_weights = 1/12),
  prior_PET("Cauchy",   list(location = 0, scale = 1),  truncation = list(0, Inf), prior_weights = 1/4),
  prior_PEESE("Cauchy", list(location = 0, scale = 10), truncation = list(0, Inf), prior_weights = 1/4)
)
```

These prior distributions reproduce the version 3.6.1 specification after the old approximate Cohen's *d* to Fisher's *z* prior rescaling.
On the Cohen's *d* scale, the effect prior was Normal(0, 1), the heterogeneity prior was InvGamma(1, 0.15), and the PEESE prior scale was 5.

```{r fit-RoBMA, eval = FALSE}
fit_RoBMA <- RoBMA(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect                 = prior_effect_361,
  prior_heterogeneity          = prior_heterogeneity_361,
  prior_bias                   = prior_bias_361,
  data = dfz, seed = 1, parallel = TRUE
)
```
This step used to take some time depending on your CPU.
The 4.0 version of the package uses product-space parameterization, which notably speeds up the sampling and usually takes under a minute for moderately large datasets.

We use the `summary()` function to explore details of the fitted model.

```{r}
summary(fit_RoBMA)
```

The printed output consists of three parts.
The first table called `Component Inclusion` contains information about the fitted models.
The second column summarizes the prior model probabilities of the models assuming the presence of the individual components.
The third column contains information about the posterior probability of models assuming the presence of the components; we can observe that the posterior model probabilities of models assuming the presence of an effect slightly increased.
The last column contains information about the evidence in favor of the presence of any of those components. 
Evidence for the presence of an effect is undecided; the models assuming the presence of an effect are only ~1.2 times more likely given the data than the models assuming the absence of an effect. 
However, we find overwhelming evidence in favor of heterogeneity, with the models assuming the presence of heterogeneity being more than 10,000 times more likely given the data than models assuming the absence of heterogeneity, and moderate evidence in favor of publication bias. 

The second table called `Estimates` contains information about the model-averaged estimates. 
The first row labeled `mu` corresponds to the model-averaged effect size estimate on Fisher's *z* scale and the second row labeled `tau` corresponds to the model-averaged heterogeneity estimates. 
Below are the estimated model-averaged weights for the different *p*-value intervals and the PET and PEESE regression coefficients. 
We obtain the effect size estimate on the correlation scale with `pooled_effect()`.

```{r}
pooled_effect(fit_RoBMA, output_measure = "COR")
```

Now, we have obtained the model-averaged effect size estimate on the correlation scale. 
If we were interested in estimates model-averaged only across the models assuming the presence of an effect, heterogeneity, and publication bias, we could add the `conditional = TRUE` argument to the `summary()` function.

```{r}
summary(fit_RoBMA, conditional = TRUE)
```

We can also obtain summary information about the individual models with `summary_models()`. 
The resulting table shows the prior and posterior model probabilities and inclusion Bayes factors for the individual models.

```{r}
summary_models(fit_RoBMA, type = "individual")
```

MCMC diagnostics are included in the summary output. 
Visual diagnostics can be requested with `plot_diagnostic_trace()`, `plot_diagnostic_density()`, and `plot_diagnostic_autocorrelation()`.

```{r}
plot_diagnostic_trace(fit_RoBMA, parameter = "mu")
```

Finally, we can also plot the model-averaged posterior distribution with the `plot()` function. 
We set the `prior = TRUE` argument to include the prior distribution as a grey line (and arrow for the point density at zero) and `output_measure = "COR"` to transform the posterior distribution to the correlation scale. 
(The `par(mar = c(4, 4, 1, 4))` call increases the left margin of the figure, so the secondary y-axis text is not cut off.)

```{r, fig.width = 6, fig.height = 4, out.width = "75%", fig.align = "center"}
par(mar = c(4, 4, 1, 4))
plot(fit_RoBMA, parameter = "mu", prior = TRUE, output_measure = "COR", xlim = c(-1, 1))
```

#### Specifying Different Prior Distributions
The `RoBMA` R package allows us to fit ensembles of highly customized meta-analytic models.
Here we reproduce the ensemble for the perinull directional hypothesis test from the Appendix (see the R package vignettes for more examples and details).
We explicitly specify the prior distribution for the models assuming the presence of the effect with the `prior_effect` argument, which assigns a Normal(0.30, 0.10) distribution bounded to positive values to the $\mu$ parameter on the Fisher's *z* scale.
Similarly, we replace the default prior distribution for the models assuming absence of the effect with a perinull hypothesis with the `prior_effect_null` argument.

```{r perinull-priors}
prior_effect_perinull_361      <- prior("normal", list(mean = 0.30, sd = 0.10), truncation = list(0, Inf))
prior_effect_null_perinull_361 <- prior("normal", list(mean = 0, sd = 0.05))
```

On the Cohen's *d* scale, these correspond to the version 3.6.1 Appendix prior distributions Normal(0.60, 0.20) truncated to positive values and Normal(0, 0.10).

```{r fit-RoBMA2, eval = FALSE}
fit_RoBMA2 <- RoBMA(
  yi = yi, vi = vi, measure = "ZCOR",
  prior_effect                 = prior_effect_perinull_361,
  prior_effect_null            = prior_effect_null_perinull_361,
  prior_heterogeneity          = prior_heterogeneity_361,
  prior_bias                   = prior_bias_361,
  data = dfz, seed = 2, parallel = TRUE
)
```

As previously, we can use `summary_models()` to inspect the model fit and verify that the specified models correspond to the settings.

```{r}
summary_models(fit_RoBMA2, type = "individual")
```


### References