| Type: | Package |
| Title: | 'ggplot2' Based Plots with Statistical Details |
| Version: | 1.0.0 |
| Maintainer: | Indrajeet Patil <patilindrajeet.science@gmail.com> |
| Description: | Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>. |
| License: | MIT + file LICENSE |
| URL: | https://www.indrapatil.com/ggstatsplot/, https://github.com/IndrajeetPatil/ggstatsplot |
| BugReports: | https://github.com/IndrajeetPatil/ggstatsplot/issues |
| Depends: | R (≥ 4.3.0) |
| Imports: | correlation (≥ 0.8.8), datawizard (≥ 1.3.0), dplyr (≥ 1.2.1), forcats (≥ 1.0.1), ggcorrplot (≥ 0.1.4.1), ggplot2 (≥ 4.0.2), ggrepel (≥ 0.9.8), ggside (≥ 0.4.1), ggsignif (≥ 0.6.4), glue (≥ 1.8.1), insight (≥ 1.5.0), paletteer (≥ 1.7.0), parameters (≥ 0.28.3), patchwork (≥ 1.3.2), performance (≥ 0.16.0), purrr (≥ 1.2.2), rlang (≥ 1.2.0), statsExpressions (≥ 2.0.0), tidyr (≥ 1.3.2), utils |
| Suggests: | afex, BayesFactor (≥ 0.9.12-4.7), bayestestR, gapminder, knitr, lme4 (≥ 1.1-37), MASS, metaBMA, metafor, metaplus, patrick, psych, rmarkdown, rstantools, stats, survival, testthat (≥ 3.3.2), tibble, vdiffr (≥ 1.0.8), withr, WRS2 |
| VignetteBuilder: | knitr |
| Config/Needs/check: | anthonynorth/roxyglobals |
| Config/roxyglobals/unique: | TRUE |
| Config/testthat/edition: | 3 |
| Config/testthat/parallel: | true |
| Encoding: | UTF-8 |
| Language: | en-US |
| LazyData: | true |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-04-23 14:10:30 UTC; runner |
| Author: | Indrajeet Patil |
| Repository: | CRAN |
| Date/Publication: | 2026-04-23 16:10:03 UTC |
ggstatsplot: 'ggplot2' Based Plots with Statistical Details
Description
{ggstatsplot} is an extension of the {ggplot2} package. It creates
graphics with details from statistical tests included in the plots
themselves. It provides an easier API to generate information-rich plots
for statistical analysis of continuous (violin plots, scatterplots,
histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar
charts) data. Currently, it supports the most common types of statistical
tests: parametric, nonparametric, robust, and Bayesian versions of
t-test/ANOVA, correlation analyses, contingency table analysis,
meta-analysis, and regression analyses.
Details
ggstatsplot
The main functions are:
- ggbetweenstats() function to produce an information-rich comparison plot between different groups or conditions with {ggplot2} and details from the statistical tests in the subtitle.
- ggwithinstats() function to produce an information-rich comparison plot within different groups or conditions with {ggplot2} and details from the statistical tests in the subtitle.
- ggscatterstats() function to produce {ggplot2} scatterplots along with marginal distribution plots from the {ggside} package and details from the statistical tests in the subtitle.
- ggpiestats() function to produce a pie chart with details from the statistical tests in the subtitle.
- ggbarstats() function to produce a stacked bar chart with details from the statistical tests in the subtitle.
- gghistostats() function to produce a histogram for a single variable with results from a one-sample test displayed in the subtitle.
- ggdotplotstats() function to produce Cleveland-style dot plots/charts for a single variable with labels and results from a one-sample test displayed in the subtitle.
- ggcorrmat() function to visualize the correlation matrix.
- ggcoefstats() function to visualize results from regression analyses.
- combine_plots() helper function to combine multiple {ggstatsplot} plots using patchwork::wrap_plots().
References: Patil (2021) doi:10.21105/joss.03236.
For more documentation, see the dedicated Website.
Author(s)
Maintainer: Indrajeet Patil patilindrajeet.science@gmail.com (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/IndrajeetPatil/ggstatsplot/issues
Split data frame into a list by grouping variable
Description
This function splits the data frame into a list, with the length of the list equal to the number of factor levels of the grouping variable.
Usage
.grouped_list(data, grouping.var)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
grouping.var |
A single grouping variable. |
Examples
ggstatsplot:::.grouped_list(ggplot2::msleep, grouping.var = vore)
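The splitting behavior described above can be approximated with base R. The following is a minimal sketch, not the internal implementation of .grouped_list(): the helper name split_by_group() is hypothetical, and it assumes the grouping column is given as a string rather than as an unquoted name.

```r
# Hypothetical helper (not part of {ggstatsplot}): split a data frame into a
# named list with one element per level of the grouping variable.
split_by_group <- function(data, grouping_var) {
  stopifnot(is.data.frame(data), grouping_var %in% names(data))
  # drop unused levels so that no empty list elements are created
  groups <- droplevels(as.factor(data[[grouping_var]]))
  split(data, groups)
}

pieces <- split_by_group(mtcars, "cyl")
length(pieces) # one element per level of `cyl`
names(pieces)
```

Each list element is itself a data frame, so a plotting function can be applied to every element in turn.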
Check if palette has enough number of colors
Description
Aborts with an informative error if the number of factor levels exceeds the number of colors available in the specified palette.
Usage
.is_palette_sufficient(palette, min_length)
Examples
ggstatsplot:::.is_palette_sufficient("ggthemes::gdoc", 6L)
try(ggstatsplot:::.is_palette_sufficient("ggthemes::gdoc", 30L))
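The kind of check described above can be sketched in base R. This is an illustration only: check_palette_size() is a hypothetical helper that takes a vector of colors directly, whereas the internal function takes a palette name.

```r
# Hypothetical helper (not the package's internal code): stop with an
# informative error when a palette has fewer colors than are needed.
check_palette_size <- function(colors, min_length) {
  if (length(colors) < min_length) {
    stop(
      sprintf(
        "Palette has %d colors, but %d are needed.",
        length(colors), min_length
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# an 8-color palette from base R, used here only as example input
pal <- grDevices::hcl.colors(8L, palette = "Dark 3")

check_palette_size(pal, 6L)
result <- try(check_palette_size(pal, 30L), silent = TRUE)
inherits(result, "try-error")
```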
Titanic dataset.
Description
Titanic dataset.
Usage
Titanic_full
Format
A data frame with 2201 rows and 5 variables
id. Dummy identity number for each person.
Class. 1st, 2nd, 3rd, Crew.
Sex. Male, Female.
Age. Child, Adult.
Survived. No, Yes.
Details
This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner 'Titanic', summarized according to economic status (class), sex, age and survival.
This is a modified dataset from the {datasets} package.
Examples
dim(Titanic_full)
head(Titanic_full)
dplyr::glimpse(Titanic_full)
Tidy version of the "Bugs" dataset.
Description
Tidy version of the "Bugs" dataset.
Usage
bugs_long
Format
A data frame with 372 rows and 6 variables
subject. Dummy identity number for each participant.
gender. Participant's gender (Female, Male).
region. Region of the world the participant was from.
education. Level of education.
condition. Condition of the experiment the participant gave rating for (LDLF: low disgustingness and low frighteningness; HDLF: high disgustingness and low frighteningness; LDHF: low disgustingness and high frighteningness; HDHF: high disgustingness and high frighteningness).
desire. The desire to kill an arthropod was indicated on a scale from 0 to 10.
Details
This data set, "Bugs", provides the extent to which men and women want to kill arthropods that vary in frighteningness (low, high) and disgustingness (low, high). Each participant rates their attitudes towards all arthropods. Subset of the data reported by Ryan et al. (2013).
References
Ryan, R. S., Wilde, M., & Crist, S. (2013). Compared to a small, supervised lab experiment, a large, unsupervised web-based experiment on a previously unknown effect has benefits that outweigh its potential costs. Computers in Human Behavior, 29(4), 1295-1301.
Examples
dim(bugs_long)
head(bugs_long)
dplyr::glimpse(bugs_long)
Combining and arranging multiple plots in a grid
Description
Wrapper around patchwork::wrap_plots() that will return a combined grid
of plots with annotations. In case you want to create a grid of plots, it is
highly recommended that you use {patchwork} package directly and not
this wrapper around it which is mostly useful with {ggstatsplot} plots. It
is exported only for backward compatibility.
Usage
combine_plots(
plotlist,
plotgrid.args = list(),
annotation.args = list(),
guides = "collect",
...
)
Arguments
plotlist |
A list containing |
plotgrid.args |
A |
annotation.args |
A |
guides |
A string specifying how guides should be treated in the layout.
|
... |
Currently ignored. |
Value
A combined plot with annotation labels.
Examples
library(ggplot2)

# first plot
p1 <- ggplot(
  data = subset(iris, iris$Species == "setosa"),
  aes(x = Sepal.Length, y = Sepal.Width)
) +
  geom_point() +
  labs(title = "setosa")

# second plot
p2 <- ggplot(
  data = subset(iris, iris$Species == "versicolor"),
  aes(x = Sepal.Length, y = Sepal.Width)
) +
  geom_point() +
  labs(title = "versicolor")

# combining the plots with a title and a caption
combine_plots(
  plotlist = list(p1, p2),
  plotgrid.args = list(nrow = 1),
  annotation.args = list(
    tag_levels = "a",
    title = "Dataset: Iris Flower dataset",
    subtitle = "Edgar Anderson collected this data",
    caption = "Note: Only two species of flower are displayed",
    theme = theme(
      plot.subtitle = element_text(size = 20),
      plot.title = element_text(size = 30)
    )
  )
)
Extracting data frames or expressions from {ggstatsplot} plots
Description
Extracting data frames or expressions from {ggstatsplot} plots
Usage
extract_stats(p)
extract_subtitle(p)
extract_caption(p)
Arguments
p |
A plot from |
Details
These are convenience functions to extract data frames or expressions with
statistical details that are used to create expressions displayed in
{ggstatsplot} plots as subtitle, caption, etc. Note that all of this
analysis is carried out by the {statsExpressions}
package. And so if you
are using these functions only to extract data frames, you are better off
using that package.
The only exception is the ggcorrmat() function. But, if a data frame is
what you want, you shouldn't be using ggcorrmat() anyway. You can use
correlation::correlation() function which provides tidy data frames by
default.
Value
A list of tibbles containing summaries of various statistical analyses. The exact details included will depend on the function.
Examples
set.seed(123)
# non-grouped plot
p1 <- ggbetweenstats(mtcars, cyl, mpg)
# grouped plot
p2 <- grouped_ggbarstats(Titanic_full, Survived, Sex, grouping.var = Age)
# extracting expressions -----------------------------
extract_subtitle(p1)
extract_caption(p1)
extract_subtitle(p2)
extract_caption(p2)
# extracting data frames -----------------------------
extract_stats(p1)
extract_stats(p2)
Stacked bar charts with statistical tests
Description
Bar charts for categorical data with statistical details included in the plot as a subtitle.
Usage
ggbarstats(
data,
x,
y = NULL,
counts = NULL,
type = "parametric",
paired = FALSE,
results.subtitle = TRUE,
label = "percentage",
label.args = list(alpha = 1, fill = "white"),
sample.size.label.args = list(size = 4),
digits = 2L,
proportion.test = results.subtitle,
digits.perc = 0L,
bf.message = TRUE,
ratio = NULL,
alternative = "two.sided",
conf.level = 0.95,
p.adjust.method = "holm",
title = NULL,
subtitle = NULL,
caption = NULL,
legend.title = NULL,
xlab = NULL,
ylab = NULL,
ggtheme = ggstatsplot::theme_ggstatsplot(),
palette = "ggthemes::gdoc",
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The variable to use as the rows in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped. |
y |
The variable to use as the columns in the contingency table.
Please note that if there are empty factor levels in your variable, they
will be dropped. Default is |
counts |
The variable in data containing counts, or |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
paired |
Logical indicating whether data came from a within-subjects or
repeated measures design study (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
label |
Character decides what information needs to be displayed
on the label in each pie slice. Possible options are |
label.args |
Additional aesthetic arguments that will be passed to
|
sample.size.label.args |
Additional aesthetic arguments that will be
passed to |
digits |
Number of digits for rounding or significant figures. May also
be |
proportion.test |
Decides whether proportion test for |
digits.perc |
Numeric that decides number of decimal places for
percentage labels (Default: |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
ratio |
A vector of proportions: the expected proportions for the
proportion test (should sum to |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
conf.level |
Scalar between |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
legend.title |
Title text for the legend. |
xlab |
Label for |
ylab |
Labels for |
ggtheme |
A |
palette |
Name of the palette in |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggpiestats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| bars | ggplot2::geom_bar() | NA |
| descriptive labels | ggplot2::geom_label() | label.args |
| sample size labels | ggplot2::geom_text() | sample.size.label.args |
Contingency table analyses
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
two-way table
Hypothesis testing
| Type | Design | Test | Function used |
| Parametric/Non-parametric | Unpaired | Pearson's chi-squared test | stats::chisq.test() |
| Bayesian | Unpaired | Bayesian Pearson's chi-squared test | BayesFactor::contingencyTableBF() |
| Parametric/Non-parametric | Paired | McNemar's chi-squared test | stats::mcnemar.test() |
| Bayesian | Paired | No | No |
Effect size estimation
| Type | Design | Effect size | CI available? | Function used |
| Parametric/Non-parametric | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
| Bayesian | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
| Parametric/Non-parametric | Paired | Cohen's g | Yes | effectsize::cohens_g() |
| Bayesian | Paired | No | No | No |
one-way table
Hypothesis testing
| Type | Test | Function used |
| Parametric/Non-parametric | Goodness of fit chi-squared test | stats::chisq.test() |
| Bayesian | Bayesian Goodness of fit chi-squared test | (custom) |
Effect size estimation
| Type | Effect size | CI available? | Function used |
| Parametric/Non-parametric | Pearson's C | Yes | effectsize::pearsons_c() |
| Bayesian | No | No | No |
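For orientation, the frequentist tests listed in the tables above can be run directly with stats::chisq.test(). This is a rough sketch using mtcars, not the package's internal code. The effect size here is the uncorrected Cramer's V computed by hand, which is an assumption made for illustration; the value reported in a plot may come from a bias-corrected estimator and so may differ.

```r
# Two-way table: Pearson's chi-squared test of association.
tab <- table(mtcars$vs, mtcars$cyl)
# suppressWarnings() hides any small-expected-count warning for this toy data
two_way <- suppressWarnings(stats::chisq.test(tab))

# Uncorrected Cramer's V computed by hand (illustrative assumption)
n <- sum(tab)
k <- min(dim(tab)) - 1
cramers_v <- sqrt(unname(two_way$statistic) / (n * k))

# One-way table: goodness-of-fit test against equal proportions.
one_way <- stats::chisq.test(table(mtcars$vs))

c(V = cramers_v, p_two_way = two_way$p.value, p_one_way = one_way$p.value)
```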
Pairwise comparisons
When there is a two-way table and x has more than two levels, pairwise
contingency table analyses (Fisher's exact tests) are computed using
statsExpressions::pairwise_contingency_table(). These pairwise results are not
displayed in the plot because bar and pie charts lack a natural visual
representation for pairwise significance annotations (unlike box/violin
plots, which use bracket annotations). Additionally, there is no
established convention for overlaying pairwise comparisons on pie charts,
and both ggpiestats() and ggbarstats() are designed to remain visually
congruent. The pairwise results are available as a data frame via
extract_stats(plot)$pairwise_comparisons_data.
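The idea behind these pairwise analyses can be sketched with base R. The helper pairwise_fisher() below is hypothetical and is not the implementation used by {statsExpressions}. It assumes every pair of x levels still yields a table with at least two rows and two columns, and it uses the "holm" adjustment to mirror the p.adjust.method default.

```r
# Hypothetical helper: Fisher's exact test for every pair of levels of `x`,
# followed by a p-value adjustment for multiple comparisons.
pairwise_fisher <- function(x, y, method = "holm") {
  x <- droplevels(as.factor(x))
  pairs <- utils::combn(levels(x), 2L, simplify = FALSE)
  p_values <- vapply(pairs, function(lv) {
    keep <- x %in% lv
    # restrict the contingency table to the two levels being compared
    stats::fisher.test(table(droplevels(x[keep]), y[keep]))$p.value
  }, numeric(1))
  data.frame(
    group1 = vapply(pairs, `[`, character(1), 1L),
    group2 = vapply(pairs, `[`, character(1), 2L),
    p.value = p_values,
    p.adjusted = stats::p.adjust(p_values, method = method)
  )
}

res <- pairwise_fisher(mtcars$cyl, mtcars$am)
res
```

With three levels of cyl, the result has one row for each of the three possible pairs.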
See Also
grouped_ggbarstats, ggpiestats,
grouped_ggpiestats
Examples
# for reproducibility
set.seed(123)
# one sample goodness of fit proportion test
p <- ggbarstats(mtcars, vs)
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
# association test (or contingency table analysis)
ggbarstats(mtcars, vs, cyl)
# with 3+ x levels, pairwise comparisons are available
ggbarstats(mtcars, cyl, am)
# Bayesian test
ggbarstats(mtcars, vs, cyl, type = "bayes")
# using pre-aggregated data with counts
ggbarstats(as.data.frame(Titanic), x = Survived, y = Sex, counts = Freq)
Box/Violin plots for between-subjects comparisons
Description
A combination of box and violin plots along with jittered data points for between-subjects designs with statistical details included in the plot as a subtitle.
Usage
ggbetweenstats(
data,
x,
y,
type = "parametric",
pairwise.display = "significant",
pairwise.alpha = 0.05,
p.adjust.method = "holm",
bf.prior = 0.707,
bf.message = TRUE,
results.subtitle = TRUE,
xlab = NULL,
ylab = NULL,
caption = NULL,
title = NULL,
subtitle = NULL,
digits = 2L,
conf.level = 0.95,
tr = 0.2,
alternative = "two.sided",
centrality.plotting = TRUE,
centrality.type = type,
centrality.point.args = list(size = 5, color = "darkred"),
centrality.label.args = list(size = 3, nudge_x = 0.4, segment.linetype = 4,
min.segment.length = 0),
point.args = list(position = ggplot2::position_jitterdodge(dodge.width = 0.6), alpha =
0.4, size = 3, stroke = 0, na.rm = TRUE),
boxplot.args = list(width = 0.3, alpha = 0.2, na.rm = TRUE),
violin.args = list(width = 0.5, alpha = 0.2, na.rm = TRUE),
ggsignif.args = list(textsize = 3, tip_length = 0.01, na.rm = TRUE),
ggtheme = ggstatsplot::theme_ggstatsplot(),
palette = "ggthemes::gdoc",
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The grouping (or independent) variable from |
y |
The response (or outcome or dependent) variable from |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
pairwise.display |
Decides which pairwise comparisons to display. Available options are:
You can use this argument to make sure that your plot is not uber-cluttered
when you have multiple groups being compared and scores of pairwise
comparisons being displayed. If set to |
pairwise.alpha |
Numeric alpha threshold used to decide which pairwise
comparisons are displayed when |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
xlab |
Label for |
ylab |
Labels for |
caption |
The text for the plot caption. This argument is relevant only
if |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
digits |
Number of digits for rounding or significant figures. May also
be |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
centrality.plotting |
Logical that decides whether a central tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.point.args, centrality.label.args |
A list of additional aesthetic
arguments to be passed to |
point.args |
A list of additional aesthetic arguments to be passed to
the |
boxplot.args |
A list of additional aesthetic arguments passed on to
|
violin.args |
A list of additional aesthetic arguments to be passed to
the |
ggsignif.args |
A list of additional aesthetic
arguments to be passed to |
ggtheme |
A |
palette |
Name of the palette in |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggbetweenstats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| raw data | ggplot2::geom_point() | point.args |
| box plot | ggplot2::geom_boxplot() | boxplot.args |
| density plot | ggplot2::geom_violin() | violin.args |
| centrality measure point | ggplot2::geom_point() | centrality.point.args |
| centrality measure label | ggrepel::geom_label_repel() | centrality.label.args |
| pairwise comparisons | ggsignif::geom_signif() | ggsignif.args |
Statistical defaults
This function uses statistically justified defaults that are not user-configurable:
- Effect sizes are always unbiased (Hedges' g instead of Cohen's d, omega-squared instead of eta-squared). Unbiased estimators correct for the positive bias present in their biased counterparts, especially in small samples, and are recommended for meta-analytic work.
- Welch's t-test and one-way test are used instead of Student's versions (i.e., equal variances are not assumed). Welch's test performs as well as Student's when variances are equal and is substantially more accurate when they are not, making it the unconditionally better default.
Users who need non-default values for these settings can call
{statsExpressions} directly.
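To make these defaults concrete, the sketch below contrasts Welch's and Student's t-tests with stats::t.test() and derives Hedges' g from Cohen's d by hand. It is an illustration under stated assumptions, not the package's code: the correction factor 1 - 3 / (4 * df - 1) is the common approximation, so the result may differ slightly from an estimator that uses the exact correction.

```r
x <- mtcars$mpg[mtcars$am == 0]
y <- mtcars$mpg[mtcars$am == 1]

# Welch's t-test: var.equal = FALSE is the default in stats::t.test()
welch <- stats::t.test(x, y, var.equal = FALSE)
# Student's t-test assumes equal variances
student <- stats::t.test(x, y, var.equal = TRUE)

# Cohen's d with a pooled standard deviation
n1 <- length(x)
n2 <- length(y)
sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
cohens_d <- (mean(x) - mean(y)) / sd_pooled

# Hedges' g: d shrunk by the approximate small-sample correction factor,
# with df = n1 + n2 - 2
correction <- 1 - 3 / (4 * (n1 + n2 - 2) - 1)
hedges_g <- cohens_d * correction

c(welch_df = unname(welch$parameter), student_df = unname(student$parameter))
c(d = cohens_d, g = hedges_g)
```

Because the correction factor is below 1, g is always slightly smaller in magnitude than d, and the gap narrows as the sample size grows.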
Centrality measures
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
| Type | Measure | Function used |
| Parametric | mean | datawizard::describe_distribution() |
| Non-parametric | median | datawizard::describe_distribution() |
| Robust | trimmed mean | datawizard::describe_distribution() |
| Bayesian | MAP | datawizard::describe_distribution() |
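The first three measures in the table have direct base R counterparts, sketched below. This is for intuition only and does not call datawizard::describe_distribution(). The extra value 60 is made up to show how each measure reacts to an outlier, and the MAP estimate is omitted because it has no base R one-liner.

```r
x <- c(mtcars$mpg, 60) # add one extreme value to show robustness

centrality <- c(
  mean = mean(x),
  median = stats::median(x),
  trimmed_mean = mean(x, trim = 0.2) # same trim level as the `tr` default
)
centrality
```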
Two-sample tests
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | 2 | Student's or Welch's t-test | stats::t.test() |
| Non-parametric | 2 | Mann-Whitney U test | stats::wilcox.test() |
| Robust | 2 | Yuen's test for trimmed means | WRS2::yuen() |
| Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::akp.effect() |
| Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
within-subjects
Data requirement: Paired tests assume exactly one observation per subject per condition. If your data has multiple trials per cell, aggregate first (e.g., take the mean).
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | 2 | Student's t-test | stats::t.test() |
| Non-parametric | 2 | Wilcoxon signed-rank test | stats::wilcox.test() |
| Robust | 2 | Yuen's test on trimmed means for dependent samples | WRS2::yuend() |
| Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::wmcpAKP() |
| Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
One-way ANOVA
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | > 2 | Fisher's or Welch's one-way ANOVA | stats::oneway.test() |
| Non-parametric | > 2 | Kruskal-Wallis one-way ANOVA | stats::kruskal.test() |
| Robust | > 2 | Heteroscedastic one-way ANOVA for trimmed means | WRS2::t1way() |
| Bayesian | > 2 | Fisher's ANOVA | BayesFactor::anovaBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared(), effectsize::eta_squared() |
| Non-parametric | > 2 | rank epsilon squared | Yes | effectsize::rank_epsilon_squared() |
| Robust | > 2 | Explanatory measure of effect size | Yes | WRS2::t1way() |
| Bayesian | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
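The frequentist between-subjects tests in the tables above can be called directly from {stats}, as in the rough sketch below. It uses mtcars with cyl as the grouping factor and is not the code path the plotting function takes internally.

```r
# Welch's one-way ANOVA (equal variances not assumed)
welch_anova <- stats::oneway.test(
  mpg ~ factor(cyl),
  data = mtcars, var.equal = FALSE
)

# Fisher's one-way ANOVA (equal variances assumed)
fisher_anova <- stats::oneway.test(
  mpg ~ factor(cyl),
  data = mtcars, var.equal = TRUE
)

# Kruskal-Wallis rank sum test
kw <- stats::kruskal.test(mpg ~ factor(cyl), data = mtcars)

# the denominator degrees of freedom differ between Welch's and Fisher's tests
rbind(welch = welch_anova$parameter, fisher = fisher_anova$parameter)
kw$p.value
```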
within-subjects
Data requirement: Repeated measures tests assume a complete design with
exactly one observation per subject per condition. If your data has multiple
trials per cell, aggregate first (e.g., take the mean). Verify with
table(data$subject, data$condition) — every cell should equal 1.
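The check and the aggregation step described above can be sketched as follows. The data frame is made up for illustration, and the column names subject, condition, and score are assumptions.

```r
# Toy long-format data (made up for illustration): subject "s1" has two
# trials in condition "a", so the design is not one-observation-per-cell.
df <- data.frame(
  subject = c("s1", "s1", "s1", "s2", "s2"),
  condition = c("a", "a", "b", "a", "b"),
  score = c(1, 3, 5, 2, 4)
)

cell_counts <- table(df$subject, df$condition)
all(cell_counts == 1) # FALSE: at least one cell has multiple trials

# Aggregate to one value per subject and condition by taking the mean
df_agg <- stats::aggregate(score ~ subject + condition, data = df, FUN = mean)
all(table(df_agg$subject, df_agg$condition) == 1) # TRUE
```

The aggregated data frame then satisfies the requirement and can be passed to a repeated measures analysis.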
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | > 2 | One-way repeated measures ANOVA | afex::aov_ez() |
| Non-parametric | > 2 | Friedman rank sum test | stats::friedman.test() |
| Robust | > 2 | Heteroscedastic one-way repeated measures ANOVA for trimmed means | WRS2::rmanova() |
| Bayesian | > 2 | One-way repeated measures ANOVA | BayesFactor::anovaBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared(), effectsize::eta_squared() |
| Non-parametric | > 2 | Kendall's coefficient of concordance | Yes | effectsize::kendalls_w() |
| Robust | > 2 | Algina-Keselman-Penfield robust standardized difference average | Yes | WRS2::wmcpAKP() |
| Bayesian | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
Pairwise comparison tests
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | Equal variance? | Test | p-value adjustment? | Function used |
| Parametric | No | Games-Howell test | Yes | PMCMRplus::gamesHowellTest() |
| Parametric | Yes | Student's t-test | Yes | stats::pairwise.t.test() |
| Non-parametric | No | Dunn test | Yes | PMCMRplus::kwAllPairsDunnTest() |
| Robust | No | Yuen's trimmed means test | Yes | WRS2::lincon() |
| Bayesian | NA | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
within-subjects
Data requirement: Paired pairwise tests assume exactly one observation per subject per condition. If your data has multiple trials per cell, aggregate first (e.g., take the mean).
Hypothesis testing
| Type | Test | p-value adjustment? | Function used |
| Parametric | Student's t-test | Yes | stats::pairwise.t.test() |
| Non-parametric | Durbin-Conover test | Yes | PMCMRplus::durbinAllPairsTest() |
| Robust | Yuen's trimmed means test | Yes | WRS2::rmmcp() |
| Bayesian | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
See Also
grouped_ggbetweenstats, ggwithinstats,
grouped_ggwithinstats
Examples
# for reproducibility
set.seed(123)
p <- ggbetweenstats(mtcars, am, mpg)
p
# extracting details from statistical tests
extract_stats(p)
# show non-significant pairwise comparisons (needs 3+ groups for ggsignif)
ggbetweenstats(mtcars, cyl, mpg, pairwise.display = "non-significant")
# show all pairwise comparisons
ggbetweenstats(mtcars, cyl, mpg, pairwise.display = "all")
# use a stricter alpha threshold for significant pairwise comparisons
ggbetweenstats(mtcars, cyl, mpg, pairwise.alpha = 0.001)
# modifying defaults
ggbetweenstats(
  morley,
  x = Expt,
  y = Speed,
  type = "robust",
  xlab = "The experiment number",
  ylab = "Speed-of-light measurement"
)
# you can remove a specific geom to reduce complexity of the plot
ggbetweenstats(
  mtcars,
  am,
  wt,
  # to remove violin plot
  violin.args = list(width = 0, linewidth = 0, colour = NA),
  # to remove boxplot
  boxplot.args = list(width = 0),
  # to remove points
  point.args = list(alpha = 0)
)
Dot-and-whisker plots for regression analyses
Description
Plot with the regression coefficients' point estimates as dots with confidence interval whiskers and other statistical details included as labels.
Although the statistical models displayed in the plot may differ based on the class of models being investigated, there are a few aspects of the plot that will be invariant across models:
- The dot-whisker plot contains a dot representing the estimate and whiskers representing its confidence interval (95% is the default). The estimate can either be effect sizes (for tests that depend on the F-statistic) or regression coefficients (for tests with t-, chi^2-, and z-statistic), etc. The function will, by default, display a helpful x-axis label that should clear up what estimates are being displayed. The confidence intervals can sometimes be asymmetric if bootstrapping was used.
- The label attached to each dot provides more details from the statistical test carried out and will typically contain the estimate, statistic, and p-value.
- The caption will contain diagnostic information, if available, about models that can be useful for model selection: The smaller the Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC) values, the "better" the model is.
- The output of this function will be a {ggplot2} object and, thus, it can be further modified (e.g. change themes) with {ggplot2}.
Usage
ggcoefstats(
x,
statistic = NULL,
conf.int = TRUE,
conf.level = 0.95,
digits = 2L,
exclude.intercept = FALSE,
effectsize.type = "omega",
meta.analytic.effect = FALSE,
meta.type = "parametric",
bf.message = TRUE,
sort = "none",
xlab = NULL,
ylab = NULL,
title = NULL,
subtitle = NULL,
caption = NULL,
only.significant = FALSE,
point.args = list(size = 3, color = "blue", na.rm = TRUE),
errorbar.args = list(width = 0, na.rm = TRUE),
vline = TRUE,
vline.args = list(linewidth = 1, linetype = "dashed"),
stats.labels = TRUE,
stats.label.color = NULL,
stats.label.args = list(size = 3, direction = "y", min.segment.length = 0, na.rm =
TRUE),
palette = "ggthemes::gdoc",
ggtheme = ggstatsplot::theme_ggstatsplot(),
...
)
Arguments
x |
A model object to be tidied, or a tidy data frame from a regression
model. Function internally uses |
statistic |
Relevant statistic for the model ( |
conf.int |
Logical. Decides whether to display confidence intervals as
error bars (Default: |
conf.level |
Numeric deciding level of confidence or credible intervals
(Default: |
digits |
Number of digits for rounding or significant figures. May also
be |
exclude.intercept |
Logical that decides whether the intercept should be
excluded from the plot (Default: |
effectsize.type |
This is the same as |
meta.analytic.effect |
Logical that decides whether subtitle for
meta-analysis via linear (mixed-effects) models (default: |
meta.type |
Type of statistics used to carry out random-effects
meta-analysis. If |
bf.message |
Logical that decides whether results from running a
Bayesian meta-analysis assuming that the effect size d varies across
studies with standard deviation t (i.e., a random-effects analysis)
should be displayed in caption. Defaults to |
sort |
If |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. The input to this argument
will be ignored if |
caption |
The text for the plot caption. This argument is relevant only
if |
only.significant |
If |
point.args |
A list of additional aesthetic arguments to be passed to
the |
errorbar.args |
Additional arguments that will be passed to
|
vline |
Decides whether to display a vertical line (Default: |
vline.args |
Additional arguments that will be passed to
|
stats.labels |
Logical. Decides whether the statistic and p-values for
each coefficient are to be attached to each dot as a text label using
|
stats.label.color |
Color for the labels. If set to |
stats.label.args |
Additional arguments that will be passed to
|
palette |
Name of the palette in |
ggtheme |
A |
... |
Additional arguments to tidying method. For more, see
|
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggcoefstats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| regression estimate | ggplot2::geom_point() | point.args |
| error bars | ggplot2::geom_errorbarh() | errorbar.args |
| vertical line | ggplot2::geom_vline() | vline.args |
| label with statistical details | ggrepel::geom_label_repel() | stats.label.args |
Random-effects meta-analysis
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
Hypothesis testing and Effect size estimation
| Type | Test | CI available? | Function used |
| Parametric | Meta-analysis via random-effects models | Yes | metafor::rma() |
| Robust | Meta-analysis via robust random-effects models | Yes | metaplus::metaplus() |
| Bayesian | Meta-analysis via Bayesian random-effects models | Yes | metaBMA::meta_random() |
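To make the idea of a random-effects pooled estimate concrete, here is a minimal base-R sketch using the DerSimonian-Laird estimator. It is an illustration only: the study-level values are made up, the object names are hypothetical, and the meta-analysis packages may use other estimators (metafor::rma(), for instance, defaults to REML), so the numbers need not match what the plot subtitle reports.

```r
# hypothetical study-level estimates and standard errors (illustrative values)
studies <- data.frame(
  term = c("study1", "study2", "study3", "study4"),
  estimate = c(0.30, 0.45, 0.10, 0.60),
  std.error = c(0.10, 0.15, 0.12, 0.20)
)

v <- studies$std.error^2 # within-study variances
w <- 1 / v               # fixed-effect (inverse-variance) weights
k <- nrow(studies)

# Cochran's Q around the fixed-effect pooled estimate
mu_fixed <- sum(w * studies$estimate) / sum(w)
Q <- sum(w * (studies$estimate - mu_fixed)^2)

# DerSimonian-Laird estimate of the between-study variance, truncated at zero
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# random-effects weights, pooled estimate, and its standard error
w_re <- 1 / (v + tau2)
mu_random <- sum(w_re * studies$estimate) / sum(w_re)
se_random <- sqrt(1 / sum(w_re))

c(estimate = mu_random, std.error = se_random, tau2 = tau2)
```

The only inputs are an estimate and a std.error per row, which is why a tidy data frame with those two columns is enough for this kind of analysis.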
Note
- In case you want to carry out meta-analysis, you will be asked to install the needed packages ({metafor}, {metaplus}, or {metaBMA}) if they are unavailable.
- All rows of regression estimates where any of the following quantities is NA will be removed if labels are requested: estimate, statistic, p.value.
- Given the rapid pace at which new methods are added to these packages, it is recommended that you install development versions of {easystats} packages using the install_latest() function from {easystats}.
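The NA-removal rule described above can be mimicked before plotting. The following is a minimal sketch, using only base R, that assembles a tidy data frame with the column names used in the examples (term, estimate, std.error, statistic, p.value, conf.low, conf.high) and keeps only rows where estimate, statistic, and p.value are all available. The object names tidy_df and keep are illustrative and not part of the package.

```r
mod <- lm(mpg ~ cyl * am, data = mtcars)

coefs <- coef(summary(mod))      # matrix of estimates, SEs, t-values, p-values
ci <- confint(mod, level = 0.95) # matrix of lower and upper confidence limits

tidy_df <- data.frame(
  term = rownames(coefs),
  estimate = coefs[, "Estimate"],
  std.error = coefs[, "Std. Error"],
  statistic = coefs[, "t value"],
  p.value = coefs[, "Pr(>|t|)"],
  conf.low = ci[, 1],
  conf.high = ci[, 2],
  row.names = NULL
)

# keep only rows where all three label quantities are present
keep <- stats::complete.cases(tidy_df[, c("estimate", "statistic", "p.value")])
tidy_df <- tidy_df[keep, ]

tidy_df
```

A data frame built this way could then be passed as x, together with statistic = "t".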
Examples
# for reproducibility
set.seed(123)
# model object
mod <- lm(formula = mpg ~ cyl * am, data = mtcars)
# creating a plot
p <- ggcoefstats(mod)
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
# exclude intercept from the plot
ggcoefstats(mod, exclude.intercept = TRUE)
# only show significant labels
ggcoefstats(mod, only.significant = TRUE)
# ANOVA model (F-statistic)
ggcoefstats(aov(mpg ~ cyl * am, data = mtcars))
# a tidy data frame can also be passed directly (model-free use)
ggcoefstats(data.frame(term = c("a", "b", "c"), estimate = c(0.5, -0.2, 1.1)))
# without a `term` column (auto-generated)
ggcoefstats(data.frame(estimate = c(0.5, -0.2, 1.1)))
# tidy data frames can also include stats-label inputs directly
df_tidy <- parameters::model_parameters(stats::lm(wt ~ am * cyl, mtcars), ci = 0.95)
names(df_tidy) <- c(
  "term", "estimate", "std.error", "conf.level", "conf.low",
  "conf.high", "statistic", "df.error", "p.value"
)
df_tidy$p.value[2L] <- 0.42
ggcoefstats(
  df_tidy,
  statistic = "t",
  only.significant = TRUE,
  stats.label.color = c("firebrick", "grey50", "forestgreen", "navy")
)
# further arguments can be passed to `parameters::model_parameters()`
library(lme4)
ggcoefstats(lmer(Reaction ~ Days + (Days | Subject), sleepstudy), effects = "fixed")
Visualization of a correlation matrix
Description
Correlation matrix containing results from pairwise correlation tests.
If you want a data frame of (grouped) correlation matrix, use
correlation::correlation() instead. It can also do grouped analysis when
used with output from dplyr::group_by().
Usage
ggcorrmat(
data,
cor.vars = NULL,
cor.vars.names = NULL,
matrix.type = "upper",
type = "parametric",
tr = 0.2,
partial = FALSE,
digits = 2L,
sig.level = 0.05,
conf.level = 0.95,
bf.prior = 0.707,
p.adjust.method = "holm",
colors = c("#EA4335", "white", "#4285F4"),
pch = "cross",
ggcorrplot.args = list(method = "square", outline.color = "black", pch.cex = 14),
ggtheme = ggstatsplot::theme_ggstatsplot(),
ggplot.component = NULL,
title = NULL,
subtitle = NULL,
caption = NULL,
...
)
Arguments
data |
A data frame from which variables specified are to be taken. |
cor.vars |
List of variables for which the correlation matrix is to be
computed and visualized. If |
cor.vars.names |
Optional list of names to be used for |
matrix.type |
Character, |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
tr |
Trim level for the mean when carrying out |
partial |
Can be |
digits |
Number of digits for rounding or significant figures. May also
be |
sig.level |
Significance level (Default: 0.05). |
conf.level |
Scalar between |
bf.prior |
A number between |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
colors |
A character vector of exactly three colors for the gradient:
low (negative correlations), mid (zero), and high (positive correlations).
Must be a diverging palette so that the sign of the correlation is
visually obvious.
Default: c("#EA4335", "white", "#4285F4"). |
pch |
Decides the point shape to be used for insignificant correlation
coefficients (only valid when |
ggcorrplot.args |
A list of additional (mostly aesthetic) arguments that
will be passed to |
ggtheme |
A |
ggplot.component |
A |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggcorrmat.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| correlation matrix | ggcorrplot::ggcorrplot() | ggcorrplot.args |
Correlation analyses
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
Hypothesis testing and Effect size estimation
| Type | Test | CI available? | Function used |
| Parametric | Pearson's correlation coefficient | Yes | correlation::correlation() |
| Non-parametric | Spearman's rank correlation coefficient | Yes | correlation::correlation() |
| Robust | Winsorized Pearson's correlation coefficient | Yes | correlation::correlation() |
| Bayesian | Bayesian Pearson's correlation coefficient | Yes | correlation::correlation() |
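Because a correlation matrix involves many pairwise tests, the p-values are adjusted for multiple comparisons (p.adjust.method, "holm" by default). The sketch below shows the general idea with base R only: stats::cor.test() stands in for the package's internal computation, which is an assumption made for illustration, and the object names are hypothetical.

```r
vars <- c("Sepal.Length", "Petal.Length", "Petal.Width")

# all unique pairs of variables
pairs <- utils::combn(vars, 2, simplify = FALSE)

# unadjusted p-value from a Pearson correlation test for each pair
p_raw <- vapply(
  pairs,
  function(v) stats::cor.test(iris[[v[1]]], iris[[v[2]]])$p.value,
  numeric(1)
)

# Holm adjustment across the whole family of pairwise tests
p_holm <- stats::p.adjust(p_raw, method = "holm")

data.frame(
  pair = vapply(pairs, paste, character(1), collapse = " ~ "),
  p_raw = p_raw,
  p_holm = p_holm
)
```

A Holm-adjusted p-value is never smaller than the corresponding unadjusted one, so a coefficient can be marked as insignificant after adjustment even when its unadjusted p-value is below sig.level.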
See Also
grouped_ggcorrmat, ggscatterstats,
grouped_ggscatterstats
Examples
set.seed(123)
library(ggcorrplot)
ggcorrmat(iris)
# with data containing NAs (uses pairwise complete observations)
ggcorrmat(airquality)
# selecting specific variables
ggcorrmat(iris, cor.vars = c(Sepal.Length, Petal.Length, Petal.Width))
Dot plot/chart for labeled numeric data.
Description
A dot chart (as described by William S. Cleveland) with statistical details from a one-sample test.
The point estimate (and associated uncertainty) displayed depends on the type of statistics selected:
- mean for parametric statistics
- median for non-parametric statistics
- trimmed mean for robust statistics
- MAP estimator for Bayesian statistics
Usage
ggdotplotstats(
data,
x,
y,
xlab = NULL,
ylab = NULL,
title = NULL,
subtitle = NULL,
caption = NULL,
type = "parametric",
test.value = 0,
alternative = "two.sided",
bf.prior = 0.707,
bf.message = TRUE,
conf.int = TRUE,
conf.level = 0.95,
tr = 0.2,
digits = 2L,
results.subtitle = TRUE,
point.args = list(color = "black", size = 3, shape = 16),
errorbar.args = list(width = 0, na.rm = TRUE),
centrality.plotting = TRUE,
centrality.type = type,
centrality.line.args = list(color = "blue", linewidth = 1, linetype = "dashed"),
ggplot.component = NULL,
ggtheme = ggstatsplot::theme_ggstatsplot(),
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
A numeric variable from the data frame |
y |
Label or grouping variable. |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
test.value |
A number indicating the true value of the mean (Default: 0). |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: TRUE). |
conf.int |
Logical. Decides whether to display confidence intervals as
error bars (Default: TRUE). |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE). |
point.args |
A list of additional aesthetic arguments to be passed to
the |
errorbar.args |
Additional arguments that will be passed to
|
centrality.plotting |
Logical that decides whether centrality tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.line.args |
A list of additional aesthetic arguments to be
passed to the |
ggplot.component |
A |
ggtheme |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggdotplotstats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| raw data | ggplot2::geom_point() | point.args |
| error bars | ggplot2::geom_errorbarh() | errorbar.args |
| centrality measure line | ggplot2::geom_vline() | centrality.line.args |
One-sample tests
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
Hypothesis testing
| Type | Test | Function used |
| Parametric | One-sample Student's t-test | stats::t.test() |
| Non-parametric | One-sample Wilcoxon test | stats::wilcox.test() |
| Robust | Bootstrap-t method for one-sample test | WRS2::trimcibt() |
| Bayesian | One-sample Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
| Type | Effect size | CI available? | Function used |
| Parametric | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | trimmed mean | Yes | WRS2::trimcibt() |
| Bayesian | difference | Yes | bayestestR::describe_posterior() |
See Also
grouped_gghistostats, gghistostats,
grouped_ggdotplotstats
Examples
# for reproducibility
set.seed(123)
# creating a plot
p <- ggdotplotstats(
  data = ggplot2::mpg,
  x = cty,
  y = manufacturer,
  title = "Fuel economy data",
  xlab = "city miles per gallon"
)
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
Histogram for distribution of a numeric variable
Description
Histogram with statistical details from one-sample test included in the plot as a subtitle.
Usage
gghistostats(
data,
x,
binwidth = NULL,
xlab = NULL,
title = NULL,
subtitle = NULL,
caption = NULL,
type = "parametric",
test.value = 0,
alternative = "two.sided",
bf.prior = 0.707,
bf.message = TRUE,
conf.level = 0.95,
tr = 0.2,
digits = 2L,
ggtheme = ggstatsplot::theme_ggstatsplot(),
results.subtitle = TRUE,
bin.args = list(color = "black", fill = "grey50", alpha = 0.7),
centrality.plotting = TRUE,
centrality.type = type,
centrality.line.args = list(color = "blue", linewidth = 1, linetype = "dashed"),
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
A numeric variable from the data frame |
binwidth |
The width of the histogram bins. Can be specified as a
numeric value, or a function that calculates width from |
xlab |
Label for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
test.value |
A number indicating the true value of the mean (Default: 0). |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: TRUE). |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
ggtheme |
A |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE). |
bin.args |
A list of additional aesthetic arguments to be passed to the
|
centrality.plotting |
Logical that decides whether centrality tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.line.args |
A list of additional aesthetic arguments to be
passed to the |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/gghistostats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| histogram bin | ggplot2::stat_bin() | bin.args |
| centrality measure line | ggplot2::geom_vline() | centrality.line.args |
One-sample tests
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
Hypothesis testing
| Type | Test | Function used |
| Parametric | One-sample Student's t-test | stats::t.test() |
| Non-parametric | One-sample Wilcoxon test | stats::wilcox.test() |
| Robust | Bootstrap-t method for one-sample test | WRS2::trimcibt() |
| Bayesian | One-sample Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
| Type | Effect size | CI available? | Function used |
| Parametric | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | trimmed mean | Yes | WRS2::trimcibt() |
| Bayesian | difference | Yes | bayestestR::describe_posterior() |
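For orientation, the parametric and non-parametric rows of the hypothesis-testing table can be reproduced directly with the base functions they name. This is a minimal sketch on the ToothGrowth data used in the example below; the variable test_value is a hypothetical stand-in for the test.value argument, and the subtitle produced by the package may format or supplement these results differently.

```r
x <- ToothGrowth$len
test_value <- 0 # plays the role of the test.value argument

# parametric: one-sample Student's t-test
res_t <- stats::t.test(
  x,
  mu = test_value,
  alternative = "two.sided",
  conf.level = 0.95
)

# non-parametric: one-sample Wilcoxon signed-rank test
# (exact = FALSE avoids the warning raised when the data contain ties)
res_w <- stats::wilcox.test(
  x,
  mu = test_value,
  alternative = "two.sided",
  exact = FALSE
)

c(
  t = unname(res_t$statistic),
  df = unname(res_t$parameter),
  p_t = res_t$p.value,
  p_wilcoxon = res_w$p.value
)
```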
See Also
grouped_gghistostats, ggdotplotstats,
grouped_ggdotplotstats
Examples
# for reproducibility
set.seed(123)
# creating a plot
p <- gghistostats(
  data = ToothGrowth,
  x = len,
  xlab = "Tooth length",
  centrality.type = "np"
)
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
Pie charts with statistical tests
Description
Pie charts for categorical data with statistical details included in the plot as a subtitle.
Usage
ggpiestats(
data,
x,
y = NULL,
counts = NULL,
type = "parametric",
paired = FALSE,
results.subtitle = TRUE,
label = "percentage",
label.args = list(direction = "both"),
label.repel = FALSE,
digits = 2L,
proportion.test = results.subtitle,
digits.perc = 0L,
bf.message = TRUE,
ratio = NULL,
alternative = "two.sided",
conf.level = 0.95,
p.adjust.method = "holm",
title = NULL,
subtitle = NULL,
caption = NULL,
legend.title = NULL,
ggtheme = ggstatsplot::theme_ggstatsplot(),
palette = "ggthemes::gdoc",
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The variable to use as the rows in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped. |
y |
The variable to use as the columns in the contingency table.
Please note that if there are empty factor levels in your variable, they
will be dropped. Default is |
counts |
The variable in data containing counts, or |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
paired |
Logical indicating whether data came from a within-subjects or
repeated measures design study (Default: FALSE). |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE). |
label |
Character decides what information needs to be displayed
on the label in each pie slice. Possible options are |
label.args |
Additional aesthetic arguments that will be passed to
|
label.repel |
Whether labels should be repelled using |
digits |
Number of digits for rounding or significant figures. May also
be |
proportion.test |
Decides whether proportion test for |
digits.perc |
Numeric that decides number of decimal places for
percentage labels (Default: 0L). |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: TRUE). |
ratio |
A vector of proportions: the expected proportions for the
proportion test (should sum to 1). |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
conf.level |
Scalar between |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
legend.title |
Title text for the legend. |
ggtheme |
A |
palette |
Name of the palette in |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggpiestats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| pie slices | ggplot2::geom_col() | NA |
| labels | ggplot2::geom_label()/ggrepel::geom_label_repel() | label.args |
Pairwise comparisons
When there is a two-way table and x has more than two levels, pairwise
contingency table analyses (Fisher's exact tests) are computed using
statsExpressions::pairwise_contingency_table(). These pairwise results are not
displayed in the plot because bar and pie charts lack a natural visual
representation for pairwise significance annotations (unlike box/violin
plots, which use bracket annotations). Additionally, there is no
established convention for overlaying pairwise comparisons on pie charts,
and both ggpiestats() and ggbarstats() are designed to remain visually
congruent. The pairwise results are available as a data frame via
extract_stats(plot)$pairwise_comparisons_data.
Contingency table analyses
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
two-way table
Hypothesis testing
| Type | Design | Test | Function used |
| Parametric/Non-parametric | Unpaired | Pearson's chi-squared test | stats::chisq.test() |
| Bayesian | Unpaired | Bayesian Pearson's chi-squared test | BayesFactor::contingencyTableBF() |
| Parametric/Non-parametric | Paired | McNemar's chi-squared test | stats::mcnemar.test() |
| Bayesian | Paired | No | No |
Effect size estimation
| Type | Design | Effect size | CI available? | Function used |
| Parametric/Non-parametric | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
| Bayesian | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
| Parametric/Non-parametric | Paired | Cohen's g | Yes | effectsize::cohens_g() |
| Bayesian | Paired | No | No | No |
one-way table
Hypothesis testing
| Type | Test | Function used |
| Parametric/Non-parametric | Goodness of fit chi-squared test | stats::chisq.test() |
| Bayesian | Bayesian Goodness of fit chi-squared test | (custom) |
Effect size estimation
| Type | Effect size | CI available? | Function used |
| Parametric/Non-parametric | Pearson's C | Yes | effectsize::pearsons_c() |
| Bayesian | No | No | No |
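The frequentist rows of these tables map onto stats::chisq.test(). The following base-R sketch, with hypothetical object names, shows a one-way goodness-of-fit test against expected proportions (the role played by ratio), a two-way test of association, and how pre-aggregated counts can be turned into a table. It is meant as orientation only; the package adds effect sizes and formatting that are not reproduced here.

```r
# one-way table: goodness of fit against expected proportions
tab_one <- table(mtcars$vs)
gof <- stats::chisq.test(tab_one, p = c(0.5, 0.5))

# two-way table: test of association between vs and cyl
# (suppressWarnings() hides the small-expected-count warning for this small dataset)
tab_two <- table(mtcars$vs, mtcars$cyl)
assoc <- suppressWarnings(stats::chisq.test(tab_two))

# pre-aggregated data: counts live in a column instead of one row per case
tab_counts <- stats::xtabs(Freq ~ Survived, data = as.data.frame(Titanic))

list(
  goodness_of_fit_p = gof$p.value,
  association_p = assoc$p.value,
  counts = tab_counts
)
```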
See Also
grouped_ggpiestats, ggbarstats,
grouped_ggbarstats
Examples
# for reproducibility
set.seed(123)
# one sample goodness of fit proportion test
p <- ggpiestats(mtcars, vs)
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
# association test (or contingency table analysis)
ggpiestats(mtcars, vs, cyl)
# Bayesian test
ggpiestats(mtcars, vs, cyl, type = "bayes")
# with repelled labels to avoid overlapping
ggpiestats(mtcars, vs, label.repel = TRUE)
# show counts instead of percentages
ggpiestats(mtcars, vs, label = "counts")
# show both counts and percentages
ggpiestats(mtcars, vs, label = "both")
# using pre-aggregated data with counts
ggpiestats(as.data.frame(Titanic), Survived, counts = Freq)
Scatterplot with marginal distributions and statistical results
Description
Scatterplots from {ggplot2} combined with marginal distribution plots
and statistical details.
Usage
ggscatterstats(
data,
x,
y,
type = "parametric",
conf.level = 0.95,
bf.prior = 0.707,
bf.message = TRUE,
tr = 0.2,
digits = 2L,
results.subtitle = TRUE,
label.var = NULL,
label.expression = NULL,
marginal = TRUE,
point.args = list(size = 3, alpha = 0.4, stroke = 0),
point.width.jitter = 0,
point.height.jitter = 0,
point.label.args = list(size = 3, max.overlaps = 1e+06),
smooth.line.args = list(linewidth = 1.5, color = "blue", method = "lm", formula = y ~
x),
xsidehistogram.args = list(fill = "#4285F4", color = "black", na.rm = TRUE),
ysidehistogram.args = list(fill = "#EA4335", color = "black", na.rm = TRUE),
xsidehistogram.scale = list(),
ysidehistogram.scale = list(),
xlab = NULL,
ylab = NULL,
title = NULL,
subtitle = NULL,
caption = NULL,
ggtheme = ggstatsplot::theme_ggstatsplot(),
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The column in |
y |
The column in |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
conf.level |
Scalar between |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: TRUE). |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE). |
label.var |
Variable to use for points labels entered as a symbol (e.g.
|
label.expression |
An expression evaluating to a logical vector that
determines the subset of data points to label (e.g. |
marginal |
Decides whether marginal distributions will be plotted on
axes using |
point.args |
A list of additional aesthetic arguments to be passed to
the |
point.width.jitter, point.height.jitter |
Degree of jitter in |
point.label.args |
A list of additional aesthetic arguments to be passed
to |
smooth.line.args |
A list of additional aesthetic arguments to be passed
to |
xsidehistogram.args, ysidehistogram.args |
A list of arguments passed to
respective |
xsidehistogram.scale, ysidehistogram.scale |
A list of arguments passed
to |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
ggtheme |
A |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggscatterstats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| raw data | ggplot2::geom_point() | point.args |
| labels for raw data | ggrepel::geom_label_repel() | point.label.args |
| smooth line | ggplot2::geom_smooth() | smooth.line.args |
| marginal histograms | ggside::geom_xsidehistogram(), ggside::geom_ysidehistogram() | xsidehistogram.args, ysidehistogram.args |
Correlation analyses
The table below provides a summary of:
- the statistical test carried out for inferential statistics
- the type of effect size estimate and a measure of uncertainty for this estimate
- the functions used internally to compute these details
Hypothesis testing and Effect size estimation
| Type | Test | CI available? | Function used |
| Parametric | Pearson's correlation coefficient | Yes | correlation::correlation() |
| Non-parametric | Spearman's rank correlation coefficient | Yes | correlation::correlation() |
| Robust | Winsorized Pearson's correlation coefficient | Yes | correlation::correlation() |
| Bayesian | Bayesian Pearson's correlation coefficient | Yes | correlation::correlation() |
Note
The plot uses ggrepel::geom_label_repel() to keep labels from
overlapping as far as possible. As a consequence, plotting slows down
considerably (and the plot file grows in size) when many labels
overlap.
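One way to anticipate this cost is to count how many points a labelling condition would select before plotting. The sketch below uses base R only and mirrors the condition used for label.expression in the example that follows; how the package evaluates the expression internally is not shown here, and the object names are hypothetical.

```r
# rows that satisfy the same condition used for label.expression in the example
to_label <- with(iris, Sepal.Length > 7.6)

# number of labels that would be drawn, and the rows they correspond to
n_labels <- sum(to_label)
labelled_rows <- iris[to_label, c("Sepal.Width", "Petal.Length", "Species")]

n_labels
labelled_rows
```

If n_labels is large, a stricter condition keeps the plot readable and quick to render.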
See Also
grouped_ggscatterstats, ggcorrmat,
grouped_ggcorrmat
Examples
set.seed(123)
# creating a plot
p <- ggscatterstats(
  iris,
  x = Sepal.Width,
  y = Petal.Length,
  label.var = Species,
  label.expression = Sepal.Length > 7.6
) +
  ggplot2::geom_rug(sides = "b")
# looking at the plot
p
# extracting details from statistical tests
extract_stats(p)
# customize marginal histogram bins and scales
ggscatterstats(
  mtcars,
  x = wt,
  y = mpg,
  results.subtitle = FALSE,
  xsidehistogram.args = list(fill = "#4285F4", color = "black", na.rm = TRUE, binwidth = 0.5),
  ysidehistogram.args = list(fill = "#EA4335", color = "black", na.rm = TRUE, bins = 15),
  xsidehistogram.scale = list(breaks = seq(0, 15, 5)),
  ysidehistogram.scale = list(breaks = seq(0, 15, 5))
)
Box/Violin plots for repeated measures comparisons
Description
A combination of box and violin plots along with raw (unjittered) data points for within-subjects designs with statistical details included in the plot as a subtitle.
Usage
ggwithinstats(
data,
x,
y,
type = "parametric",
subject.id = NULL,
pairwise.display = "significant",
pairwise.alpha = 0.05,
p.adjust.method = "holm",
bf.prior = 0.707,
bf.message = TRUE,
results.subtitle = TRUE,
xlab = NULL,
ylab = NULL,
caption = NULL,
title = NULL,
subtitle = NULL,
digits = 2L,
conf.level = 0.95,
tr = 0.2,
alternative = "two.sided",
centrality.plotting = TRUE,
centrality.type = type,
centrality.point.args = list(size = 5, color = "darkred"),
centrality.label.args = list(size = 3, nudge_x = 0.4, segment.linetype = 4),
centrality.path = TRUE,
centrality.path.args = list(linewidth = 1, color = "red", alpha = 0.5),
point.args = list(size = 3, alpha = 0.5, na.rm = TRUE),
point.path = TRUE,
point.path.args = list(alpha = 0.5, linetype = "dashed"),
boxplot.args = list(width = 0.2, alpha = 0.5, na.rm = TRUE),
violin.args = list(width = 0.5, alpha = 0.2, na.rm = TRUE),
ggsignif.args = list(textsize = 3, tip_length = 0.01, na.rm = TRUE),
ggtheme = ggstatsplot::theme_ggstatsplot(),
palette = "ggthemes::gdoc",
ggplot.component = NULL,
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The grouping (or independent) variable from |
y |
The response (or outcome or dependent) variable from |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
subject.id |
Across repeated measures conditions, each row in the
dataset must correspond to a unique unit (e.g., subject or participant).
If your data frame is already in such a format, you can ignore the
|
pairwise.display |
Decides which pairwise comparisons to display. Available options are:
You can use this argument to make sure that your plot is not uber-cluttered
when you have multiple groups being compared and scores of pairwise
comparisons being displayed. If set to |
pairwise.alpha |
Numeric alpha threshold used to decide which pairwise
comparisons are displayed when |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: TRUE). |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: TRUE). |
xlab |
Label for |
ylab |
Labels for |
caption |
The text for the plot caption. This argument is relevant only
if |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
digits |
Number of digits for rounding or significant figures. May also
be |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
alternative |
a character string specifying the alternative
hypothesis, must be one of |
centrality.plotting |
Logical that decides whether centrality tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.point.args, centrality.label.args |
A list of additional aesthetic
arguments to be passed to |
centrality.path.args, point.path.args |
A list of additional aesthetic
arguments passed on to |
point.args |
A list of additional aesthetic arguments to be passed to
the |
point.path, centrality.path |
Logical that decides whether individual
data points and means, respectively, should be connected using
|
boxplot.args |
A list of additional aesthetic arguments passed on to
|
violin.args |
A list of additional aesthetic arguments to be passed to
the |
ggsignif.args |
A list of additional aesthetic
arguments to be passed to |
ggtheme |
A |
palette |
Name of the palette in |
ggplot.component |
A |
... |
Currently ignored. |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggwithinstats.html
Summary of graphics
| graphical element | geom used | argument for further modification |
| raw data | ggplot2::geom_point() | point.args |
| point path | ggplot2::geom_path() | point.path.args |
| box plot | ggplot2::geom_boxplot() | boxplot.args |
| density plot | ggplot2::geom_violin() | violin.args |
| centrality measure point | ggplot2::geom_point() | centrality.point.args |
| centrality measure point path | ggplot2::geom_path() | centrality.path.args |
| centrality measure label | ggrepel::geom_label_repel() | centrality.label.args |
| pairwise comparisons | ggsignif::geom_signif() | ggsignif.args |
Centrality measures
The table below provides summary about:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
| Type | Measure | Function used |
| Parametric | mean | datawizard::describe_distribution() |
| Non-parametric | median | datawizard::describe_distribution() |
| Robust | trimmed mean | datawizard::describe_distribution() |
| Bayesian | MAP | datawizard::describe_distribution() |
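As a rough illustration of how the first three measures in the table differ, the sketch below computes them with base R on a made-up vector that contains one outlier. The data and the trimming fraction of 0.2 are assumptions chosen for illustration; the package itself delegates to datawizard::describe_distribution(), and the MAP estimate is left out because it needs a density-based estimator.

```r
# Hypothetical data with a single large outlier
x <- c(2, 3, 3, 4, 5, 5, 6, 40)

centrality <- c(
  mean         = mean(x),             # Parametric row
  median       = median(x),           # Non-parametric row
  trimmed_mean = mean(x, trim = 0.2)  # Robust row (trim fraction is an assumption)
)

centrality
```

The mean is pulled towards the outlier, while the median and the trimmed mean stay close to the bulk of the data, which is why the displayed measure changes with the type of statistics.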
Two-sample tests
The tables below summarize:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | 2 | Student's or Welch's t-test | stats::t.test() |
| Non-parametric | 2 | Mann-Whitney U test | stats::wilcox.test() |
| Robust | 2 | Yuen's test for trimmed means | WRS2::yuen() |
| Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::akp.effect() |
| Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
within-subjects
Data requirement: Paired tests assume exactly one observation per subject per condition. If your data has multiple trials per cell, aggregate first (e.g., take the mean).
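The aggregation step mentioned above could look like the following base R sketch. The data frame `trials` and its column names are hypothetical, and taking the mean is only one possible way to collapse repeated trials.

```r
# Hypothetical long-format data: 2 subjects x 2 conditions x 2 trials per cell
trials <- data.frame(
  subject   = rep(c("s1", "s2"), each = 4),
  condition = rep(rep(c("A", "B"), each = 2), times = 2),
  score     = c(1, 3, 5, 7, 2, 4, 6, 8)
)

# Each cell holds 2 rows, so the paired-test requirement is not met yet
table(trials$subject, trials$condition)

# Collapse to one mean score per subject per condition
cell_means <- aggregate(score ~ subject + condition, data = trials, FUN = mean)

# Every subject-by-condition cell now holds exactly one observation
stopifnot(all(table(cell_means$subject, cell_means$condition) == 1))
cell_means
```

The aggregated data frame, rather than the raw trial-level data, is what would then be passed on as `data`.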
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | 2 | Student's t-test | stats::t.test() |
| Non-parametric | 2 | Wilcoxon signed-rank test | stats::wilcox.test() |
| Robust | 2 | Yuen's test on trimmed means for dependent samples | WRS2::yuend() |
| Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
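For orientation, the parametric and non-parametric rows of the table above correspond to the following direct calls in base R. This is a minimal sketch on made-up paired scores; it does not reproduce the package's internal wrapper code or the formatted subtitle.

```r
# Hypothetical paired scores: one value per subject per condition,
# listed in the same subject order in both vectors
before <- c(12, 15, 11, 18, 14, 16)
after  <- before + c(1, 2, 3, 4, 5, 6)

# Parametric row: paired Student's t-test
t_res <- t.test(after, before, paired = TRUE)

# Non-parametric row: Wilcoxon signed-rank test
w_res <- wilcox.test(after, before, paired = TRUE)

c(t_test = t_res$p.value, wilcoxon = w_res$p.value)
```

Both functions pair observations by position, which is why the rows must be in repeated-measures order before the tests are run.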
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
| Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
| Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::wmcpAKP() |
| Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
One-way ANOVA
The tables below summarize:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | > 2 | Fisher's or Welch's one-way ANOVA | stats::oneway.test() |
| Non-parametric | > 2 | Kruskal-Wallis one-way ANOVA | stats::kruskal.test() |
| Robust | > 2 | Heteroscedastic one-way ANOVA for trimmed means | WRS2::t1way() |
| Bayesian | > 2 | Fisher's ANOVA | BayesFactor::anovaBF() |
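The parametric and non-parametric rows above can be tried directly in base R. The sketch below uses the built-in `iris` data; it only shows the underlying test functions named in the table, not how the package assembles its output.

```r
# Built-in `iris` data: one numeric outcome, three groups

# Parametric row, unequal variances: Welch's one-way ANOVA
welch <- oneway.test(Sepal.Length ~ Species, data = iris, var.equal = FALSE)

# Parametric row, equal variances: Fisher's one-way ANOVA
fisher <- oneway.test(Sepal.Length ~ Species, data = iris, var.equal = TRUE)

# Non-parametric row: Kruskal-Wallis test
kw <- kruskal.test(Sepal.Length ~ Species, data = iris)

c(welch = welch$p.value, fisher = fisher$p.value, kruskal = kw$p.value)
```

The only difference between the Welch and Fisher variants is the `var.equal` argument of stats::oneway.test().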
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::eta_squared(), effectsize::omega_squared() |
| Non-parametric | > 2 | rank epsilon squared | Yes | effectsize::rank_epsilon_squared() |
| Robust | > 2 | Explanatory measure of effect size | Yes | WRS2::t1way() |
| Bayesian | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
within-subjects
Data requirement: Repeated measures tests assume a complete design with
exactly one observation per subject per condition. If your data has multiple
trials per cell, aggregate first (e.g., take the mean). Verify with
table(data$subject, data$condition) — every cell should equal 1.
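The check described above can be turned into a small diagnostic that names the offending subjects. The data frame `rm_data` is hypothetical, and dropping incomplete subjects is shown only as one possible remedy, not as something the package does for you.

```r
# Hypothetical repeated-measures data; subject "s3" lacks condition "C"
rm_data <- data.frame(
  subject   = c("s1", "s1", "s1", "s2", "s2", "s2", "s3", "s3"),
  condition = c("A", "B", "C", "A", "B", "C", "A", "B"),
  score     = c(4, 6, 5, 3, 7, 6, 5, 8)
)

# Subject-by-condition cell counts; a complete design has 1 everywhere
cells <- table(rm_data$subject, rm_data$condition)

# Subjects with any cell count other than 1
bad_subjects <- rownames(cells)[rowSums(cells != 1) > 0]
bad_subjects

# One option (an assumption): keep only subjects with complete data
complete <- rm_data[!rm_data$subject %in% bad_subjects, ]
stopifnot(all(table(complete$subject, complete$condition) == 1))
```

Cell counts above 1 would instead call for aggregation, whereas counts of 0 indicate missing conditions.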
Hypothesis testing
| Type | No. of groups | Test | Function used |
| Parametric | > 2 | One-way repeated measures ANOVA | afex::aov_ez() |
| Non-parametric | > 2 | Friedman rank sum test | stats::friedman.test() |
| Robust | > 2 | Heteroscedastic one-way repeated measures ANOVA for trimmed means | WRS2::rmanova() |
| Bayesian | > 2 | One-way repeated measures ANOVA | BayesFactor::anovaBF() |
Effect size estimation
| Type | No. of groups | Effect size | CI available? | Function used |
| Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::eta_squared(), effectsize::omega_squared() |
| Non-parametric | > 2 | Kendall's coefficient of concordance | Yes | effectsize::kendalls_w() |
| Robust | > 2 | Algina-Keselman-Penfield robust standardized difference average | Yes | WRS2::wmcpAKP() |
| Bayesian | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
Pairwise comparison tests
The tables below summarize:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
between-subjects
Hypothesis testing
| Type | Equal variance? | Test | p-value adjustment? | Function used |
| Parametric | No | Games-Howell test | Yes | PMCMRplus::gamesHowellTest() |
| Parametric | Yes | Student's t-test | Yes | stats::pairwise.t.test() |
| Non-parametric | No | Dunn test | Yes | PMCMRplus::kwAllPairsDunnTest() |
| Robust | No | Yuen's trimmed means test | Yes | WRS2::lincon() |
| Bayesian | NA | Student's t-test | NA | BayesFactor::ttestBF() |
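The equal-variance parametric row can be reproduced with a direct call to stats::pairwise.t.test(). The sketch below uses the built-in `iris` data; the choice of the Holm adjustment and of a pooled standard deviation are assumptions made for illustration and need not match the package defaults.

```r
# Pairwise Student's t-tests between all species, with adjusted p-values
pw <- pairwise.t.test(
  x = iris$Sepal.Length,
  g = iris$Species,
  p.adjust.method = "holm",  # assumed adjustment method
  pool.sd = TRUE             # pooled SD, i.e. equal variances assumed
)

# Lower-triangular matrix of adjusted p-values (unused cells are NA)
pw$p.value
```

Comparisons whose adjusted p-value falls below the chosen threshold are the ones that would be flagged as significant.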
Effect size estimation
Not supported.
within-subjects
Data requirement: Paired pairwise tests assume exactly one observation per subject per condition. If your data has multiple trials per cell, aggregate first (e.g., take the mean).
Hypothesis testing
| Type | Test | p-value adjustment? | Function used |
| Parametric | Student's t-test | Yes | stats::pairwise.t.test() |
| Non-parametric | Durbin-Conover test | Yes | PMCMRplus::durbinAllPairsTest() |
| Robust | Yuen's trimmed means test | Yes | WRS2::rmmcp() |
| Bayesian | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
See Also
grouped_ggbetweenstats, ggbetweenstats,
grouped_ggwithinstats
Examples
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
# create a plot
p <- ggwithinstats(
data = filter(bugs_long, condition %in% c("HDHF", "HDLF")),
x = condition,
y = desire,
type = "np",
subject.id = subject
)
# looking at the plot
p
# if the data are already arranged in repeated-measures order, `subject.id`
# can be omitted
ggwithinstats(
data = filter(bugs_long, condition %in% c("HDHF", "HDLF")),
x = condition,
y = desire,
pairwise.display = "none",
results.subtitle = FALSE
)
# extracting details from statistical tests
extract_stats(p)
# use a stricter alpha threshold for significant pairwise comparisons
ggwithinstats(
data = bugs_long,
x = condition,
y = desire,
subject.id = subject,
pairwise.alpha = 0.001
)
# modifying defaults
ggwithinstats(
data = bugs_long,
x = condition,
y = desire,
type = "robust",
subject.id = subject
)
# you can remove a specific geom to reduce complexity of the plot
ggwithinstats(
data = bugs_long,
x = condition,
y = desire,
subject.id = subject,
# to remove violin plot
violin.args = list(width = 0, linewidth = 0, colour = NA),
# to remove boxplot
boxplot.args = list(width = 0),
# to remove points
point.args = list(alpha = 0)
)
Grouped bar charts with statistical tests
Description
Helper function that applies ggstatsplot::ggbarstats() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
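The split-and-apply pattern behind the grouped helpers can be sketched in base R. To keep the example free of dependencies, a cross-tabulation stands in for the plot produced for each level, and the final combining step is omitted; this is an illustration of the idea, not the package's implementation.

```r
# Split the data by the grouping variable (here: transmission type `am`)
by_level <- split(mtcars, mtcars$am)

# Apply the same analysis to every subset;
# a contingency table stands in for the per-level plot
per_level <- lapply(by_level, function(d) table(cyl = d$cyl, vs = d$vs))

# One result per level of the grouping variable
names(per_level)
per_level[["0"]]
```

In the grouped helpers, the per-level results are plots, and the list is then arranged into a single figure.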
Usage
grouped_ggbarstats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggpiestats.html
See Also
ggbarstats, ggpiestats,
grouped_ggpiestats
Examples
set.seed(123)
# grouped contingency table analysis (`ggbarstats()` needs both `x` and `y`)
grouped_ggbarstats(
data = mtcars,
x = cyl,
y = vs,
grouping.var = am,
annotation.args = list(title = "Cylinder distribution by transmission type")
)
Violin plots for group or condition comparisons in between-subjects designs repeated across all levels of a grouping variable.
Description
Helper function that applies ggstatsplot::ggbetweenstats() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
Usage
grouped_ggbetweenstats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
See Also
ggbetweenstats, ggwithinstats,
grouped_ggwithinstats
Examples
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
grouped_ggbetweenstats(
data = filter(ggplot2::mpg, drv != "4"),
x = year,
y = hwy,
grouping.var = drv
)
# modifying individual plots using `ggplot.component` argument
grouped_ggbetweenstats(
data = filter(
movies_long,
genre %in% c("Action", "Comedy"),
mpaa %in% c("R", "PG")
),
x = genre,
y = rating,
grouping.var = mpaa,
ggplot.component = scale_y_continuous(
breaks = seq(1, 9, 1),
limits = c(1, 9)
),
annotation.args = list(title = "Ratings by genre for different MPAA ratings")
)
Visualization of a correlogram (or correlation matrix) for all levels of a grouping variable
Description
Helper function that applies ggstatsplot::ggcorrmat() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
Usage
grouped_ggcorrmat(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame from which variables specified are to be taken. |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggcorrmat.html
See Also
ggcorrmat, ggscatterstats,
grouped_ggscatterstats
Examples
set.seed(123)
grouped_ggcorrmat(
data = iris,
grouping.var = Species,
type = "robust",
colors = c("#0072B2", "white", "#D55E00"),
p.adjust.method = "holm",
plotgrid.args = list(ncol = 1L),
annotation.args = list(tag_levels = "i")
)
Grouped dot plots for distribution of a labeled numeric variable
Description
Helper function that applies ggstatsplot::ggdotplotstats() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
Usage
grouped_ggdotplotstats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggdotplotstats.html
See Also
grouped_gghistostats, ggdotplotstats,
gghistostats
Examples
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
# removing the factor level with very few observations
df <- filter(ggplot2::mpg, cyl %in% c("4", "6", "8"))
# plot
grouped_ggdotplotstats(
data = df,
x = cty,
y = manufacturer,
grouping.var = cyl,
test.value = 15.5,
annotation.args = list(title = "City mileage by manufacturer for different cylinders")
)
Grouped histograms for distribution of a numeric variable
Description
Helper function that applies ggstatsplot::gghistostats() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
Usage
grouped_gghistostats(
data,
x,
grouping.var,
binwidth = NULL,
plotgrid.args = list(),
annotation.args = list(),
...
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
x |
A numeric variable from the data frame |
grouping.var |
A single grouping variable. |
binwidth |
The width of the histogram bins. Can be specified as a
numeric value, or a function that calculates width from |
plotgrid.args |
A |
annotation.args |
A |
... |
Arguments passed on to
|
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/gghistostats.html
See Also
gghistostats, ggdotplotstats,
grouped_ggdotplotstats
Examples
# for reproducibility
set.seed(123)
# plot
grouped_gghistostats(
data = iris,
x = Sepal.Length,
test.value = 5,
grouping.var = Species,
plotgrid.args = list(nrow = 1),
annotation.args = list(tag_levels = "i")
)
Grouped pie charts with statistical tests
Description
Helper function that applies ggstatsplot::ggpiestats() across multiple
levels of a given factor and combines the resulting plots using
ggstatsplot::combine_plots().
Usage
grouped_ggpiestats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggpiestats.html
See Also
ggbarstats, ggpiestats,
grouped_ggbarstats
Examples
set.seed(123)
# grouped one-sample proportion test
grouped_ggpiestats(
data = mtcars,
x = cyl,
grouping.var = am,
annotation.args = list(title = "Cylinder distribution by transmission type")
)
Scatterplot with marginal distributions for all levels of a grouping variable
Description
Grouped scatterplots from {ggplot2} combined with marginal distribution
plots with statistical details added as a subtitle.
Usage
grouped_ggscatterstats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
Details
For details, see: https://www.indrapatil.com/ggstatsplot/articles/web_only/ggscatterstats.html
See Also
ggscatterstats, ggcorrmat,
grouped_ggcorrmat
Examples
# to ensure reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
grouped_ggscatterstats(
data = filter(movies_long, genre == "Comedy" | genre == "Drama"),
x = length,
y = rating,
type = "robust",
grouping.var = genre,
ggplot.component = list(geom_rug(sides = "b"))
)
# using labeling
# (also show how to modify basic plot from within function call)
grouped_ggscatterstats(
data = filter(ggplot2::mpg, cyl != 5),
x = displ,
y = hwy,
grouping.var = cyl,
type = "robust",
label.var = manufacturer,
label.expression = hwy > 25 & displ > 2.5,
ggplot.component = scale_y_continuous(sec.axis = dup_axis())
)
# labeling without expression
grouped_ggscatterstats(
data = filter(movies_long, rating == 7, genre %in% c("Drama", "Comedy")),
x = budget,
y = length,
grouping.var = genre,
bf.message = FALSE,
label.var = "title",
annotation.args = list(tag_levels = "a")
)
# customize marginal histogram bins and scales
grouped_ggscatterstats(
data = filter(movies_long, genre %in% c("Drama", "Comedy")),
x = rating,
y = length,
grouping.var = genre,
results.subtitle = FALSE,
xsidehistogram.args = list(fill = "#4285F4", color = "black", na.rm = TRUE, bins = 20),
ysidehistogram.args = list(fill = "#EA4335", color = "black", na.rm = TRUE, binwidth = 10),
xsidehistogram.scale = list(breaks = seq(0, 200, 50)),
ysidehistogram.scale = list(breaks = seq(0, 200, 50))
)
Violin plots for group or condition comparisons in within-subjects designs repeated across all levels of a grouping variable.
Description
A combined plot of comparison plots, one for each level of a grouping variable.
Usage
grouped_ggwithinstats(
data,
...,
grouping.var,
plotgrid.args = list(),
annotation.args = list()
)
Arguments
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array) will not
be accepted. Additionally, grouped data frames from |
... |
Arguments passed on to
|
grouping.var |
A single grouping variable. |
plotgrid.args |
A |
annotation.args |
A |
See Also
ggwithinstats, ggbetweenstats,
grouped_ggbetweenstats
Examples
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
# the most basic function call
grouped_ggwithinstats(
data = filter(bugs_long, condition %in% c("HDHF", "HDLF")),
x = condition,
y = desire,
subject.id = subject,
grouping.var = gender,
type = "np",
# additional modifications for **each** plot using `{ggplot2}` functions
ggplot.component = scale_y_continuous(breaks = seq(0, 10, 1), limits = c(0, 10)),
annotation.args = list(title = "Desire ratings by condition for each gender")
)
Edgar Anderson's Iris Data in long format.
Description
Edgar Anderson's Iris Data in long format.
Usage
iris_long
Format
A data frame with 600 rows and 6 variables
id. Dummy identity number for each flower (150 flowers in total).
Species. The species are Iris setosa, versicolor, and virginica.
condition. Factor giving a detailed description of the attribute (four levels: "Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width").
attribute. What attribute is being measured ("Sepal" or "Petal").
measure. What aspect of the attribute is being measured ("Length" or "Width").
value. Value of the measurement.
Details
This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.
This is a modified dataset from {datasets} package.
Examples
dim(iris_long)
head(iris_long)
dplyr::glimpse(iris_long)
Movie information and user ratings from IMDB.com (long format).
Description
Movie information and user ratings from IMDB.com (long format).
Usage
movies_long
Format
A data frame with 1,579 rows and 8 variables
title. Title of the movie.
year. Year of release.
budget. Total budget (if known) in US dollars.
length. Length in minutes.
rating. Average IMDB user rating.
votes. Number of IMDB users who rated this movie.
mpaa. MPAA rating.
genre. Different genres of movies (action, animation, comedy, drama, documentary, romance, short).
Details
Modified dataset from {ggplot2movies} package.
The Internet Movie Database (IMDB) is a website devoted to collecting movie data supplied by studios and fans. It claims to be the biggest movie database on the web and is run by Amazon.
Source
https://CRAN.R-project.org/package=ggplot2movies
Examples
dim(movies_long)
head(movies_long)
dplyr::glimpse(movies_long)
Default theme used in {ggstatsplot}
Description
Common theme used across all plots generated in {ggstatsplot} and assumed
by the author to be aesthetically pleasing to the user. The theme is a
wrapper around ggplot2::theme_bw().
All {ggstatsplot} functions have a ggtheme parameter that lets you choose
a different theme.
Usage
theme_ggstatsplot()
Value
A ggplot2 theme object.
Examples
library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
theme_ggstatsplot()