desctable usage vignette

Desctable aims to be a simple and expressive interface for building statistical tables in R.

Descriptive tables

Simple

Creating a descriptive table with desctable is as easy as:

library(desctable)
library(dplyr)  # provides %>% and group_by()
iris %>%
  desc_table()
##                   Variables   N        % Min  Q1  Med     Mean  Q3 Max
## 1              Sepal.Length 150       NA 4.3 5.1 5.80 5.843333 6.4 7.9
## 2               Sepal.Width 150       NA 2.0 2.8 3.00 3.057333 3.3 4.4
## 3              Petal.Length 150       NA 1.0 1.6 4.35 3.758000 5.1 6.9
## 4               Petal.Width 150       NA 0.1 0.3 1.30 1.199333 1.8 2.5
## 5               **Species** 150       NA  NA  NA   NA       NA  NA  NA
## 6     **Species**: *setosa*  50 33.33333  NA  NA   NA       NA  NA  NA
## 7 **Species**: *versicolor*  50 33.33333  NA  NA   NA       NA  NA  NA
## 8  **Species**: *virginica*  50 33.33333  NA  NA   NA       NA  NA  NA
##          sd IQR
## 1 0.8280661 1.3
## 2 0.4358663 0.5
## 3 1.7652982 3.5
## 4 0.7622377 1.5
## 5        NA  NA
## 6        NA  NA
## 7        NA  NA
## 8        NA  NA


By default, desc_table selects the most appropriate statistics for the given table, but you can just as easily choose your own:

mtcars %>%
  desc_table(N = length,
             mean,
             sd)
##    Variables  N       mean          sd
## 1        mpg 32  20.090625   6.0269481
## 2        cyl 32   6.187500   1.7859216
## 3       disp 32 230.721875 123.9386938
## 4         hp 32 146.687500  68.5628685
## 5       drat 32   3.596563   0.5346787
## 6         wt 32   3.217250   0.9784574
## 7       qsec 32  17.848750   1.7869432
## 8         vs 32   0.437500   0.5040161
## 9         am 32   0.406250   0.4989909
## 10      gear 32   3.687500   0.7378041
## 11      carb 32   2.812500   1.6152000

As you can see with N = length, you can give a meaningful name to the column instead of the name of the function.
You are not limited in your options, and can use any statistical function that exists in R, even your own!
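
As a minimal sketch of the "even your own" case, the snippet below defines a hypothetical helper, cv (a coefficient of variation, not part of desctable), and passes it to desc_table by name, following the same pattern as N = length above. The desc_table call is guarded so the snippet still runs when desctable is not installed.

```r
# Hypothetical user-defined statistic (not part of desctable):
# coefficient of variation, i.e. standard deviation relative to the mean.
cv <- function(x) sd(x) / mean(x)

# The helper works on any numeric vector.
cv(mtcars$mpg)

# Pass it like any built-in statistic, naming the resulting column "CV".
if (requireNamespace("desctable", quietly = TRUE)) {
  print(desctable::desc_table(mtcars, N = length, CV = cv))
}
```

Any function that takes a vector and returns a single value should fit this pattern; naming the argument (CV = cv) controls the column header.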

You can also use purrr::map-like formulas, for example to get the first and third quartiles:

iris %>%
  desc_table(N = length,
             "%" = percent,
             Q1 = ~ quantile(., .25),
             Med = median,
             Q3 = ~ quantile(., .75))
##                   Variables   N        %  Q1  Med  Q3
## 1              Sepal.Length 150       NA 5.1 5.80 6.4
## 2               Sepal.Width 150       NA 2.8 3.00 3.3
## 3              Petal.Length 150       NA 1.6 4.35 5.1
## 4               Petal.Width 150       NA 0.3 1.30 1.8
## 5               **Species** 150       NA  NA   NA  NA
## 6     **Species**: *setosa*  50 33.33333  NA   NA  NA
## 7 **Species**: *versicolor*  50 33.33333  NA   NA  NA
## 8  **Species**: *virginica*  50 33.33333  NA   NA  NA
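
For readers unfamiliar with the formula shorthand: in a purrr-style formula, the dot stands for the function's argument, so ~ quantile(., .25) behaves like an ordinary one-argument function. The sketch below spells this out with a hypothetical helper name, q1, and checks the equivalence with purrr::as_mapper when purrr is installed. How desc_table converts formulas internally is not shown here and is not assumed.

```r
# Explicit equivalent of the formula `~ quantile(., .25)`;
# `q1` is a hypothetical name used only for this illustration.
q1 <- function(x) quantile(x, .25)

# First quartile of Sepal.Length (compare with the Q1 column above).
q1(iris$Sepal.Length)

# purrr::as_mapper() turns a one-sided formula into a function.
if (requireNamespace("purrr", quietly = TRUE)) {
  q1_formula <- purrr::as_mapper(~ quantile(., .25))
  stopifnot(isTRUE(all.equal(q1_formula(iris$Sepal.Length),
                             q1(iris$Sepal.Length))))
}
```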

By group

You can also create nested descriptive tables by applying group_by to your data frame:

iris %>%
  group_by(Species) %>%
  desc_table()
## # A tibble: 3 × 4
## # Groups:   Species [3]
##   Species    data              .stats       .vars       
##   <fct>      <list>            <list>       <list>      
## 1 setosa     <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
## 2 versicolor <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
## 3 virginica  <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>

Because of the grouping, the resulting object is not a simple data frame but a nested data frame (see tidyr::nest and tidyr::unnest).
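
To make the idea of a nested data frame concrete, here is a small sketch using tidyr directly on iris, independent of desctable. It only illustrates list-columns in general; the exact layout of the object returned by desc_table (the .stats and .vars columns printed above) is not reproduced.

```r
library(tidyr)

# Collapse iris into one row per species; the remaining columns
# are stored in a list-column of data frames named `data`.
nested <- nest(iris, data = -Species)
nested

# Each element of the list-column is an ordinary data frame.
dim(nested$data[[1]])

# unnest() expands the list-column back into regular rows.
flat <- unnest(nested, cols = data)
dim(flat)
```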
desctable provides output functions to format this object to various outputs.
Right now, desctable supports data.frame, pander, and DT outputs. These output functions also round numerical values, including p values for tests (desc_tests is covered a bit later).

mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("df")
##      am = 1 (N = 13)\nMin  Q1 Med Mean  Q3 Max   sd  IQR am = 0 (N = 19)\nMin
## mpg                    15  21  23   24  30  34  6.2  9.4                   10
## cyl                     4   4   4  5.1   6   8  1.6    2                    4
## disp                   71  79 120  144 160 351   87   81                  120
## hp                     52  66 109  127 113 335   84   47                   62
## drat                  3.5 3.9 4.1    4 4.2 4.9 0.36 0.37                  2.8
## wt                    1.5 1.9 2.3  2.4 2.8 3.6 0.62 0.84                  2.5
## qsec                   14  16  17   17  19  20  1.8  2.1                   15
## vs                      0   0   1 0.54   1   1 0.52    1                    0
## gear                    4   4   4  4.4   5   5 0.51    1                    3
## carb                    1   1   2  2.9   4   8  2.2    3                    1
##       Q1 Med Mean  Q3 Max   sd  IQR
## mpg   15  17   17  19  24  3.8  4.2
## cyl    6   8  6.9   8   8  1.5    2
## disp 196 276  290 360 472  110  164
## hp   116 175  160 192 245   54   76
## drat 3.1 3.1  3.3 3.7 3.9 0.39 0.63
## wt   3.4 3.5  3.8 3.8 5.4 0.78 0.41
## qsec  17  18   18  19  23  1.8    2
## vs     0   0 0.37   1   1  0.5    1
## gear   3   3  3.2   3   4 0.42    0
## carb   2   3  2.7   4   4  1.1    2

mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("pander")
      am = 1 (N = 13)                            am = 0 (N = 19)
      Min   Q1  Med Mean   Q3  Max   sd  IQR     Min   Q1  Med Mean   Q3  Max   sd  IQR
mpg    15   21   23   24   30   34  6.2  9.4      10   15   17   17   19   24  3.8  4.2
cyl     4    4    4  5.1    6    8  1.6    2       4    6    8  6.9    8    8  1.5    2
disp   71   79  120  144  160  351   87   81     120  196  276  290  360  472  110  164
hp     52   66  109  127  113  335   84   47      62  116  175  160  192  245   54   76
drat  3.5  3.9  4.1    4  4.2  4.9 0.36 0.37     2.8  3.1  3.1  3.3  3.7  3.9 0.39 0.63
wt    1.5  1.9  2.3  2.4  2.8  3.6 0.62 0.84     2.5  3.4  3.5  3.8  3.8  5.4 0.78 0.41
qsec   14   16   17   17   19   20  1.8  2.1      15   17   18   18   19   23  1.8    2
vs      0    0    1 0.54    1    1 0.52    1       0    0    0 0.37    1    1  0.5    1
gear    4    4    4  4.4    5    5 0.51    1       3    3    3  3.2    3    4 0.42    0
carb    1    1    2  2.9    4    8  2.2    3       1    2    3  2.7    4    4  1.1    2

mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("DT")

Comparative tables

You can add statistical tests to a grouped descriptive table:

iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table() %>%
  desc_tests() %>%
  desc_output("DT")

By default, desc_tests selects the most appropriate statistical tests for the given table, but you can just as easily choose your own. For example, to compare Sepal.Width using a Student’s t test:

iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table(mean, sd, median, IQR) %>%
  desc_tests(Sepal.Width = ~t.test) %>%
  desc_output("DT")

Note that the name of the test must always be preceded by a tilde (~)!

You can also use purrr::map-like formulas to change test options:

iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table(mean, sd, median, IQR) %>%
  desc_tests(Sepal.Width = ~t.test(., var.equal = TRUE)) %>%
  desc_output("DT")
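
To see what the var.equal option changes, the base-R sketch below runs both variants of t.test directly on the same grouping, outside of desctable. The data frame d and its column names are hypothetical, and whether desc_tests calls the test in exactly this way internally is an assumption. By default, t.test performs Welch's test, which does not assume equal variances; var.equal = TRUE requests the pooled-variance Student's test.

```r
# Hypothetical helper data frame mirroring group_by(Petal.Length > 5).
d <- data.frame(
  width      = iris$Sepal.Width,
  long_petal = iris$Petal.Length > 5
)

# Default: Welch's t test (variances not assumed equal).
welch <- t.test(width ~ long_petal, data = d)

# var.equal = TRUE: classic Student's t test with pooled variance.
student <- t.test(width ~ long_petal, data = d, var.equal = TRUE)

# Student's test uses n - 2 degrees of freedom; Welch's are estimated.
c(welch = unname(welch$parameter), student = unname(student$parameter))

# The p values may differ slightly between the two variants.
c(welch = welch$p.value, student = student$p.value)
```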

See the tips and tricks to go further.