CRAN Task View: Mixed, Multilevel, and Hierarchical Models in R

Maintainer: Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer
Contact: bolker at mcmaster.ca
Version: 2024-05-08
URL: https://CRAN.R-project.org/view=MixedModels
Source: https://github.com/cran-task-views/MixedModels/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer (2024). CRAN Task View: Mixed, Multilevel, and Hierarchical Models in R. Version 2024-05-08. URL https://CRAN.R-project.org/view=MixedModels.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("MixedModels", coreOnly = TRUE) installs all the core packages or ctv::update.views("MixedModels") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details.

Contributors: Maintainers plus Michael Agronah, Matthew Fidler, Thierry Onkelinx

Mixed (or mixed-effect) models are a broad class of statistical models used to analyze data where observations can be assigned a priori to discrete groups, and where the parameters describing the differences between groups are treated as random (or latent) variables. They are one category of multilevel, or hierarchical, models; longitudinal data are often analyzed in this framework. In econometrics, longitudinal or cross-sectional time series data are often referred to as panel data and are sometimes fitted with mixed models. Mixed models can be fitted in either frequentist or Bayesian frameworks.

This task view only includes models that incorporate continuous (usually, although not always, Gaussian) latent variables. This excludes packages that handle hidden Markov models, latent Markov models, and finite (discrete) mixture models (some of these are covered by the Cluster task view). Dynamic linear models and other state-space models that do not incorporate a discrete grouping variable are also excluded (some of these are covered by the TimeSeries task view). Bioinformatic applications of mixed models hosted on Bioconductor are mostly excluded as well.

Basic model fitting

Linear mixed models

Linear mixed models (LMMs) make the following assumptions:

Frequentist:

The most commonly used packages and/or functions for frequentist LMMs are:

Bayesian:

Most Bayesian R packages use Markov chain Monte Carlo (MCMC) estimation: MCMCglmm, rstanarm, and brms; the latter two packages use the Stan infrastructure. blme, built on lme4, uses maximum a posteriori (MAP) estimation. bamlss provides a flexible set of modular functions for Bayesian regression modeling.
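As a concrete illustration of a linear mixed model, here is a minimal frequentist sketch. It assumes the nlme package (one of the core packages of this task view) and its bundled Orthodont data; the choice of predictors and of a random intercept per subject is purely illustrative.

```r
library(nlme)

# Orthodont: dental distance measured at ages 8, 10, 12, 14 for 27 children
# (bundled with nlme). Model choice below is an illustrative assumption.
# Fixed effects: age and Sex; random effect: one intercept per Subject.
fit_lmm <- lme(distance ~ age + Sex,
               random = ~ 1 | Subject,
               data = Orthodont,
               method = "REML")

fixef(fit_lmm)     # estimated fixed-effect coefficients
VarCorr(fit_lmm)   # random-intercept and residual variance components
```

The same model could be specified in other packages mentioned in this task view; the formula syntax for random effects differs among them.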

Generalized linear mixed models

Generalized linear mixed models (GLMMs) can be described as hierarchical extensions of generalized linear models (GLMs), or as extensions of LMMs to different response distributions, typically in the exponential family. The random-effect distributions are typically assumed to be Gaussian on the scale of the linear predictor.
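The following is a minimal sketch of a binomial GLMM with a Gaussian random intercept on the logit scale. It assumes MASS::glmmPQL (MASS appears in the package list of this task view) and the bacteria data shipped with MASS. Penalized quasi-likelihood is only one approximate estimation approach, used here because MASS is distributed with R; it is not presented as the preferred method.

```r
library(MASS)   # glmmPQL() and the bacteria data; fitting relies on nlme internally

# bacteria: presence/absence (y = "y"/"n") of H. influenzae in children,
# recorded repeatedly over several weeks under three treatments.
fit_glmm <- glmmPQL(y ~ trt + I(week > 2),
                    random = ~ 1 | ID,     # Gaussian random intercept per child
                    family = binomial,
                    data = bacteria,
                    verbose = FALSE)       # suppress per-iteration messages

summary(fit_glmm)$tTable   # fixed effects on the logit (linear predictor) scale
```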

Frequentist:

Bayesian:

Most Bayesian mixed model packages use some form of Markov chain Monte Carlo (or other Monte Carlo methods).

The following packages (in addition to bamlss) find maximum a posteriori fits to Bayesian (G)LMMs by optimization:

vglmer estimates GLMMs by variational Bayesian methods.

Nonlinear mixed models

Nonlinear mixed models incorporate arbitrary nonlinear responses that cannot be accommodated in the framework of GLMMs. Only a few packages can accommodate generalized nonlinear mixed models (i.e., parametric nonlinear mixed models with non-Gaussian responses). However, many packages allow smooth nonparametric components (see “Additive models” below). Otherwise, users may need to implement GNLMMs themselves in a more general hierarchical modeling framework.
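For the Gaussian-response case, a parametric nonlinear mixed model can be sketched as follows. This assumes nlme::nlme and the Loblolly data from the datasets package, and follows the pattern of the example on the nlme help page; the starting values are taken as given there and would need to be chosen afresh for other data. It is not a generalized (non-Gaussian) nonlinear mixed model.

```r
library(nlme)

# Loblolly: height of pine trees by age; a groupedData object whose grouping
# factor (Seed) is picked up automatically by nlme().
# SSasymp() is a self-starting asymptotic regression model with parameters
# Asym (asymptote), R0 (response at age 0), and lrc (log rate constant).
fit_nlmm <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
                 data = Loblolly,
                 fixed = Asym + R0 + lrc ~ 1,   # population-level parameters
                 random = Asym ~ 1,             # asymptote varies among seed sources
                 start = c(Asym = 103, R0 = -8.5, lrc = -3.3))

fixef(fit_nlmm)   # population-level estimates of Asym, R0, lrc
```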

Frequentist:

Bayesian:

Generalized estimating equations

Generalized estimating equations (GEEs) are an alternative approach to fitting clustered, longitudinal, or otherwise correlated data. These models produce estimates of the marginal effects (averaged across the group-level variation) rather than conditional effects (conditioned on group-level information).
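A minimal GEE sketch, assuming the geepack package (a core package of this task view) is installed and using the ChickWeight data from the datasets package. The exchangeable working correlation and the integer id column chick_id are illustrative choices introduced here.

```r
library(geepack)

# ChickWeight: repeated body weights of chicks on four diets.
# geeglm() expects rows of each cluster to be contiguous, which holds here.
cw <- as.data.frame(ChickWeight)
cw$chick_id <- as.integer(as.character(cw$Chick))   # plain integer cluster id

fit_gee <- geeglm(weight ~ Time + Diet,
                  id = chick_id,
                  data = cw,
                  family = gaussian,
                  corstr = "exchangeable")   # working correlation within chick

coef(summary(fit_gee))   # marginal (population-averaged) coefficients, robust SEs
```

The coefficients describe how the population-average weight changes with time and diet; they are not conditional on any particular chick.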

Specialized models/tasks

Hierarchical modeling frameworks

These packages do not directly provide functions to fit mixed models, but instead implement interfaces to general-purpose sampling and optimization toolboxes that can be used to fit mixed models. Although models take extra effort to set up in these frameworks, and often require programming in a domain-specific language other than R, the frameworks are more flexible than most of the other packages listed here.
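To give a flavor of what setting up a model by hand involves, the sketch below writes the marginal log-likelihood of a random-intercept model directly and maximizes it with base R's optim(). This is a stand-in for illustration only: optim() is not one of the frameworks this section covers, and the names nll, groups, and the bounds on the log standard deviations are assumptions made for the example. The result is compared against nlme::lme fitted by maximum likelihood.

```r
library(nlme)   # used only for the Orthodont data and a reference fit

# Model: distance = b0 + b1 * (age - 11) + u_subject + e,
# with u ~ N(0, sd_u^2) and e ~ N(0, sd_e^2).
# par = (b0, b1, log sd_u, log sd_e); log scale keeps the SDs positive.
groups <- split(seq_len(nrow(Orthodont)), Orthodont$Subject)

nll <- function(par) {
  sd_u <- exp(par[3])
  sd_e <- exp(par[4])
  total <- 0
  for (rows in groups) {
    r <- Orthodont$distance[rows] - (par[1] + par[2] * (Orthodont$age[rows] - 11))
    n <- length(r)
    Sigma <- diag(sd_e^2, n) + sd_u^2        # compound-symmetric covariance
    R <- chol(Sigma)                         # Sigma = t(R) %*% R
    z <- backsolve(R, r, transpose = TRUE)   # solves t(R) z = r
    total <- total + sum(log(diag(R))) + 0.5 * sum(z^2) + 0.5 * n * log(2 * pi)
  }
  total   # negative log-likelihood summed over subjects
}

opt <- optim(c(24, 0.6, 0, 0), nll, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, -3, -3), upper = c(Inf, Inf, 3, 3))

ref <- lme(distance ~ I(age - 11), random = ~ 1 | Subject,
           data = Orthodont, method = "ML")

c(hand_coded = -opt$value, lme = as.numeric(logLik(ref)))
```

The two log-likelihoods should agree closely. Dedicated frameworks add what this sketch lacks, such as automatic differentiation, integration over non-Gaussian random effects, or posterior sampling.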

Model diagnostics and summary statistics

Model diagnostics

Summary statistics

Derivatives

The first and second derivatives of log-likelihood with respect to parameters can be useful for various model evaluation tasks (e.g., computing sensitivities, robust variance-covariance matrices, or delta-method variances).
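As one illustration of the delta-method use case, the sketch below computes a standard error for a nonlinear function of the fixed effects. It assumes nlme and the Orthodont data; the target quantity (the age at which expected distance reaches 25 mm) and the names g, est, and se are chosen for the example. The covariance matrix comes from the fitted model, and the derivative taken here is that of the transformation, obtained symbolically with stats::deriv.

```r
library(nlme)

fit <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
beta <- fixef(fit)   # named vector: "(Intercept)", "age"
V <- vcov(fit)       # estimated covariance matrix of the fixed effects

# g(b0, b1) = (25 - b0) / b1: age at which the expected distance equals 25 mm.
# deriv() returns a function whose value carries a "gradient" attribute.
g <- deriv(~ (25 - b0) / b1, c("b0", "b1"), function.arg = TRUE)
val <- g(beta[["(Intercept)"]], beta[["age"]])
grad <- attr(val, "gradient")               # 1 x 2 Jacobian of g

est <- as.numeric(val)
se <- sqrt(drop(grad %*% V %*% t(grad)))    # delta-method standard error
c(estimate = est, se = se)
```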

Data sets

Many packages include small example data sets (e.g., lme4, nlme). These packages provide previously described data sets often used in evaluating mixed models.
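To browse the example data sets bundled with a package, one option is base R's data() function. The sketch below assumes nlme is installed and uses Orthodont only as an example.

```r
# List the data sets shipped with nlme, then load and inspect one of them.
items <- data(package = "nlme")$results[, "Item"]
head(items)

data("Orthodont", package = "nlme")
str(Orthodont)
```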

Model presentation and prediction

Functions and frameworks for convenient tabular and graphical output of mixed model results:

Convenience wrappers

These functions provide convenient frameworks to fit and interpret mixed models.

Inference and model selection

Hypothesis testing

Prediction and estimation

Bootstrapping

Power analysis and simulation

These topics are closely related because there are few available analytical methods for computing statistical power for mixed models; power usually needs to be estimated by simulation.
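A simulation-based power estimate can be sketched with nothing more than a data-generating function and a model-fitting loop. The example below assumes nlme; the function simulate_once, the design (20 groups of 5), and all parameter values are hypothetical choices for illustration, and the number of replicates is kept small so the code runs quickly.

```r
library(nlme)
set.seed(101)

# Simulate one data set from a random-intercept model, fit it, and
# return the p-value for the slope of x.
simulate_once <- function(n_groups = 20, n_per = 5, beta1 = 0.25,
                          sd_group = 1, sd_resid = 1) {
  g <- factor(rep(seq_len(n_groups), each = n_per))
  x <- rnorm(n_groups * n_per)
  u <- rnorm(n_groups, sd = sd_group)               # group-level intercepts
  y <- 1 + beta1 * x + u[as.integer(g)] + rnorm(length(x), sd = sd_resid)
  d <- data.frame(y = y, x = x, g = g)
  fit <- lme(y ~ x, random = ~ 1 | g, data = d)
  summary(fit)$tTable["x", "p-value"]
}

pvals <- replicate(100, simulate_once())
power_est <- mean(pvals < 0.05)   # proportion of simulations rejecting beta1 = 0
power_est
```

With only 100 replicates the estimate is itself noisy; in practice many more replicates would be used, along with a range of sample sizes or effect sizes.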

Model selection

Commercial software interfaces

CRAN packages

Core: brms, broom.mixed, geepack, glmmTMB, lavaan, lme4, MCMCglmm, multilevelmod, nlme, sommer.
Regular: afex, aod, aods3, ARpLMEC, asremlPlus, babelmixr2, bamlss, bcmixed, BGLR, blavaan, blme, blmeco, boot.pval, boxcoxmix, buildmer, cAIC4, car, CARBayesST, CLME, clubSandwich, coxme, CpGassoc, cplm, CRTgeeDR, DHARMa, dhglm, dotwhisker, effects, emmeans, ez, faux, galamm, gamlss, gamm4, gee, geeM, geesmv, ggeffects, ggResidpanel, glmertree, glmm, GLMMadaptive, glmmEP, glmmfields, glmmLasso, glmmrBase, GLMMRR, glmtoolbox, glmulti, gpboost, greta, hglm, HLMdiag, huxtable, iccbeta, influence.ME, influence.SEM, inlabru, insight, JointAI, kinship2, languageR, lmeInfo, LMERConvenienceFunctions, lmeresampler, lmerTest, lmeSplines, lmmpar, longpower, lqmm, marginaleffects, MarginalMediation, margins, MASS, mbest, mclogit, MCMC.qpcr, mdhglm, mdmb, merDeriv, merTools, mgcv, mice, mixedsde, mixlm, mlmhelpr, mlmRev, mlmtools, mmrm, modelsummary, MplusAutomation, mrgsolve, multgee, multilevelTools, MuMIn, mvctm, mvglmmRank, nimble, nlmeU, nlmixr2, nlmixr2data, nlmm, ordbetareg, ordinal, pan, parameters, partR2, pass.lme, pbkrtest, pedigreemm, performance, pez, phyr, piecewiseSEM, PKNCA, PKPDsim, plm, powerEQTL, QGglmm, qrLMM, qrNLMM, QTLRel, R2BayesX, r2glmm, R2jags, R2OpenBUGS, regress, repeated, rjags, RLRsim, robustBLME, robustlmm, rockchalk, rptR, rr2, rrBLUP, rstan, rstanarm, RTMB, RVAideMemoire, rxode2, saemix, SASmixed, sem, semtree, simr, sjPlot, skewlmm, spaMM, sphet, spind, splmm, StroupGLMM, TMB, tmbstan, varTestnlme, VetResearchLMM, vglmer, WeMix, zoib.

Related links

Other resources