Estimating Population Average Treatment Effects with the borrowr Package

Introduction

The borrowr package estimates the population average treatment effect (PATE) from a primary data source while borrowing information from supplementary sources.

To adjust for confounding, borrowr fits a model for the conditional mean of the outcome given treatment and confounders. Two models are currently available: a Bayesian linear model with an inverse-gamma prior, and Bayesian Additive Regression Trees (BART). The user must specify a formula for the conditional mean. The Bayesian linear model demands more care here, because the analyst must choose the functional form of the regression relationship. For BART, the right-hand side of the formula need only list the confounders and the treatment variable; no functional form has to be specified.
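
The conditional-mean approach can be illustrated without borrowr. The following is a minimal sketch, assuming simulated data and an ordinary least-squares fit with lm in place of the package's Bayesian models; the variable names and data-generating values are invented for illustration and are not taken from the package. It fits a model for the mean of y given treatment and x, predicts every subject's outcome under both treatment levels, and averages the difference:

```r
set.seed(1)

# Simulated data (illustrative only): treatment assignment depends on x,
# so x confounds the treatment-outcome relationship
n <- 200
x <- runif(n, 20, 60)
treatment <- rbinom(n, 1, plogis((x - 40) / 10))
y <- 80 + 0.3 * x + 3 * treatment + rnorm(n)
dat <- data.frame(y, x, treatment)

# Model for the conditional mean of y given treatment and x
fit <- lm(y ~ treatment * x + treatment * I(x^2), data = dat)

# Predicted mean outcome for every subject under each treatment level
mu1 <- predict(fit, newdata = transform(dat, treatment = 1))
mu0 <- predict(fit, newdata = transform(dat, treatment = 0))

# Average the predicted differences over the observed confounder values
ate_hat <- mean(mu1 - mu0)
ate_hat
```

Because the simulated treatment assignment depends on x, a raw difference in group means would be confounded; averaging model predictions over the observed values of x adjusts for this.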

Borrowing between data sources is done with Multisource Exchangeability Models (MEMs; Kaizer et al., 2018). MEMs borrow by assuming that each supplementary data source is either “exchangeable” or not with the primary data source. Two data sources are considered exchangeable if their model parameters are equal. Because each supplementary source can be exchangeable with the primary data or not, \(r\) supplementary data sources give \(2 ^ r\) possible configurations of the exchangeability assumptions. Each of these configurations corresponds to a single MEM. The parameters of each MEM are estimated, and a posterior probability is computed for each MEM. The posterior density of the PATE is then a weighted average of the MEM-specific posteriors, with the MEM posterior probabilities as weights.
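
The enumeration of configurations can be sketched in base R. This is only an illustration of the counting argument, assuming two supplementary sources with the hypothetical labels Supp1 and Supp2; it is not necessarily how borrowr builds its models internally.

```r
# Labels for the supplementary sources (assumed for illustration)
supp <- c("Supp1", "Supp2")

# Each supplementary source is either exchangeable (1) or not (0)
# with the primary source
configs <- expand.grid(rep(list(c(1, 0)), length(supp)))
names(configs) <- supp

# One row per MEM: 2^r rows for r supplementary sources
n_mem <- nrow(configs)
configs
```

With two supplementary sources this yields four rows, one for each MEM.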

The adapt Data

We illustrate usage of the borrowr package with the adapt data:

library(borrowr)
data(adapt)
head(adapt)
#>           y        x  source treatment
#> 1  97.37711 55.51128 Primary         1
#> 2  87.29579 26.38264 Primary         0
#> 3  93.07157 41.99595 Primary         0
#> 4  83.78115 32.49223 Primary         0
#> 5  96.03471 36.34139 Primary         1
#> 6 100.89876 58.61346 Primary         1

The data come from three sources, with a single confounding variable x and a binary treatment indicator treatment. The plot below shows y against x by treatment group, with one panel per source:

library(ggplot2)
ggplot(data = adapt, mapping = aes(x = x, y = y, color = as.factor(treatment))) +
  geom_point() +
  geom_smooth(se = FALSE) +
  facet_wrap(~ source) +
  theme_classic()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

We will estimate the PATE while adjusting for the confounding variable x.

Using the pate function to estimate the PATE

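pate is the primary function of the borrowr package. Its required arguments, all of which appear in the call below, are formula (the model for the conditional mean), data (the data frame), estimator (the model to fit), src_var (the name of the variable identifying the data source), primary_source (the value of that variable that marks the primary source), and trt_var (the name of the treatment variable).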

We will estimate the PATE using a model that is quadratic in x, allowing the quadratic relationship to differ between treated and untreated subjects:

est <- pate(y ~ treatment*x + treatment*I(x ^ 2), data = adapt, 
  estimator = "bayesian_lm", src_var = "source", primary_source = "Primary", 
  trt_var = "treatment")

The print method reports the posterior mean and standard deviation of the PATE, along with the posterior probability that the PATE is positive:

est
#> Population Average Treatment Effect (PATE)
#> 
#> PATE Posterior Summary Statistics (Treated vs. Untreated)
#> 
#> Mean Treatment Effect             Std. Dev.          Pr(PATE > 0) 
#>              3.496863              1.534991              1.000000

The summary method gives more detail, including the original call, the exchangeability matrix, and the posterior probability of each MEM:

summary(est)
#> 
#> Population Average Treatment Effect (PATE)
#> 
#> Call:
#> 
#> pate(formula = y ~ treatment * x + treatment * I(x^2), estimator = "bayesian_lm", 
#>     data = adapt, src_var = "source", primary_source = "Primary", 
#>     trt_var = "treatment")
#> 
#> PATE Posterior Summary Statistics (Treated vs. Untreated)
#> 
#> Mean Treatment Effect             Std. Dev.          Pr(PATE > 0) 
#>              3.496863              1.534991              1.000000 
#> 
#> Exchangeability Matrix (1 == Exchangeable with primary source):
#> 
#>          MEM
#> source    1 2 3 4
#>   Primary 1 1 1 1
#>   Supp1   1 0 1 0
#>   Supp2   1 1 0 0
#> 
#> MEM Posterior Probability:
#>        MEM_1        MEM_2        MEM_3        MEM_4 
#> 1.255412e-02 9.225502e-01 6.489563e-02 5.065254e-09
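
One way to read this output is to sum the MEM posterior probabilities over the MEMs in which a given supplementary source is marked exchangeable. The sketch below does this by hand, with the exchangeability matrix and probabilities typed in from the summary output rather than extracted from est, since the internal structure of the fitted object is not shown here; the object names are illustrative.

```r
# Exchangeability matrix copied from the summary output
# (rows: sources, columns: MEMs)
exch <- rbind(
  Primary = c(1, 1, 1, 1),
  Supp1   = c(1, 0, 1, 0),
  Supp2   = c(1, 1, 0, 0)
)
colnames(exch) <- paste0("MEM_", 1:4)

# MEM posterior probabilities copied from the summary output
post_prob <- c(MEM_1 = 1.255412e-02, MEM_2 = 9.225502e-01,
               MEM_3 = 6.489563e-02, MEM_4 = 5.065254e-09)

# Posterior probability that each source is exchangeable with the primary
marginal <- drop(exch %*% post_prob)
round(marginal, 3)
```

Under these numbers, Supp2 is exchangeable with the primary source with posterior probability of about 0.94, while Supp1 sits at about 0.08, which suggests the estimate borrows mainly from Supp2.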

References

Chipman, H. A., George, E. I., & McCulloch, R. E. (2010) BART: Bayesian additive regression trees. Annals of Applied Statistics, 4(1): 266-298.

Kaizer, A. M., Koopmeiners, J. S., & Hobbs, B. P. (2018) Bayesian hierarchical modeling based on multisource exchangeability. Biostatistics, 19(2): 169-184.