Rendering tables with pandoc.table

Roman Tsegelskyi, Gergely Daróczi

2021-06-13

The core functionality of pander is centered around pandoc.table, which renders tables in markdown. For 2D tables, pander calls pandoc.table internally, so in such cases pander and pandoc.table support the same arguments and are used interchangeably in this vignette. pandoc.table has a wide variety of options (highlighting, styles, etc.), and this vignette gives a more detailed overview of the most common ones. pander also comes with a variety of globally adjustable options that affect the output of your reports; you can query and update them with the panderOptions function.

Table styles

Since pander aims at rendering R objects into Pandoc’s markdown, all four of Pandoc’s table formats (multiline, simple, grid, rmarkdown) are supported. Users are advised to stick with the default multiline style, but if you need to change it, either specify the style argument when calling pander/pandoc.table or change the default style using panderOptions.
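The two ways of choosing a style can be sketched as follows. This is a minimal illustration, assuming that the table.style key of panderOptions holds the default style; it uses pandoc.table.return, which returns the table as a string instead of printing it, and the variable names are hypothetical.

```r
library(pander)

m <- mtcars[1:3, 1:4]

# Per-call: only this table is rendered in the grid style
grid_once <- pandoc.table.return(m, style = 'grid')

# Global: remember the current default, change it, render, then restore it
old_style <- panderOptions('table.style')
panderOptions('table.style', 'simple')
simple_default <- pandoc.table.return(m)
panderOptions('table.style', old_style)

cat(grid_once)
cat(simple_default)
```

Restoring the previous value keeps the change from leaking into tables rendered later in the same session.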

multiline tables

multiline tables allow headers and table rows to span multiple lines of text (but cells that span multiple columns or rows of the table are not supported). Also note that, for simplicity, line breaks are removed from cells by default, so multiline cells are typically the result of splitting large cells or setting keep.line.breaks to TRUE:

m <- data.frame('Value\n1', 'Value\n2')
colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2')
pandoc.table(m, keep.line.breaks = TRUE)
#> 
#> -----------------------
#>  Multiline   Multiline 
#>    Col1        Col2    
#> ----------- -----------
#>    Value       Value   
#>      1           2     
#> -----------------------
m <- mtcars[1:3, 1:4]
pandoc.table(m)
#> 
#> ---------------------------------------------
#>                      mpg    cyl   disp   hp  
#> ------------------- ------ ----- ------ -----
#>      Mazda RX4        21     6    160    110 
#> 
#>    Mazda RX4 Wag      21     6    160    110 
#> 
#>     Datsun 710       22.8    4    108    93  
#> ---------------------------------------------

simple tables

simple tables have a more compact syntax than all other styles, but they don’t support multiline cells:

m <- mtcars[1:3, 1:4]
pandoc.table(m, style = 'simple')
#> 
#> 
#>                      mpg    cyl   disp   hp  
#> ------------------- ------ ----- ------ -----
#>      Mazda RX4        21     6    160    110 
#>    Mazda RX4 Wag      21     6    160    110 
#>     Datsun 710       22.8    4    108    93
m <- data.frame('Value\n1', 'Value\n2')
colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2')
pandoc.table(m, keep.line.breaks = TRUE, style='simple')
#> Error in tableExpand_cpp(cells, cols.width, justify, sep.cols, style): Pandoc does not support newlines in simple or Rmarkdown table format!

grid tables

The grid format is really handy for Emacs users (Emacs table mode) and it supports block elements (multiple paragraphs, code blocks, lists, etc.) inside cells, but cells can’t span multiple columns or rows. Most parsers do not support alignment in grid tables, so even though pander produces a table with alignment, it will be lost during conversion from markdown to HTML/PDF/DOCX.

m <- mtcars[1:3, 1:4]
pandoc.table(m, style = 'grid')
#> 
#> 
#> +-------------------+------+-----+------+-----+
#> |                   | mpg  | cyl | disp | hp  |
#> +===================+======+=====+======+=====+
#> |     Mazda RX4     |  21  |  6  | 160  | 110 |
#> +-------------------+------+-----+------+-----+
#> |   Mazda RX4 Wag   |  21  |  6  | 160  | 110 |
#> +-------------------+------+-----+------+-----+
#> |    Datsun 710     | 22.8 |  4  | 108  | 93  |
#> +-------------------+------+-----+------+-----+
m <- data.frame('Value\n1', 'Value\n2')
colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2')
pandoc.table(m, keep.line.breaks = TRUE, style='grid')
#> 
#> 
#> +-----------+-----------+
#> | Multiline | Multiline |
#> |   Col1    |   Col2    |
#> +===========+===========+
#> |   Value   |   Value   |
#> |     1     |     2     |
#> +-----------+-----------+

rmarkdown tables

The rmarkdown (or pipe table) format is often used directly with knitr, since it was supported by the first versions of the markdown package. It is similar to the simple table format in that multiline cells are not supported either. The beginning and ending pipe characters are optional, but pipes are required between all columns:

m <- mtcars[1:3, 1:4]
pandoc.table(m, style = 'rmarkdown')
#> 
#> 
#> |                   | mpg  | cyl | disp | hp  |
#> |:-----------------:|:----:|:---:|:----:|:---:|
#> |     Mazda RX4     |  21  |  6  | 160  | 110 |
#> |   Mazda RX4 Wag   |  21  |  6  | 160  | 110 |
#> |    Datsun 710     | 22.8 |  4  | 108  | 93  |
m <- data.frame('Value\n1', 'Value\n2')
colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2')
pandoc.table(m, keep.line.breaks = TRUE, style='rmarkdown')
#> Error in tableExpand_cpp(cells, cols.width, justify, sep.cols, style): Pandoc does not support newlines in simple or Rmarkdown table format!
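Since the simple and rmarkdown styles raise an error for cells that contain line breaks, one possible way to keep a report from failing is to fall back to a style that supports them. The sketch below is only an illustration of that idea: render_with_fallback is a hypothetical helper, not part of pander, and the choice of grid as the fallback style is an assumption.

```r
library(pander)

m <- data.frame('Value\n1', 'Value\n2')
colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2')

# Hypothetical helper: try the requested style first and, if pander
# signals an error (e.g. newlines in a simple/rmarkdown table),
# render the same table in the grid style instead
render_with_fallback <- function(x, style) {
  tryCatch(
    pandoc.table.return(x, keep.line.breaks = TRUE, style = style),
    error = function(e) {
      pandoc.table.return(x, keep.line.breaks = TRUE, style = 'grid')
    }
  )
}

fallback_out <- render_with_fallback(m, 'rmarkdown')
cat(fallback_out)
```

Alternatively, leaving keep.line.breaks at its default removes the line breaks from the cells, so the table can be rendered in any style.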

Cell alignment

pander allows users to control cell alignment (left, right or center/centre) in a table directly by setting the justify parameter when calling pander/pandoc.table. Note that it is possible to specify alignment for each column separately by supplying a vector to justify:

pandoc.table(head(iris[,1:3], 2), justify = 'right')
#> 
#> -------------------------------------------
#>   Sepal.Length   Sepal.Width   Petal.Length
#> -------------- ------------- --------------
#>            5.1           3.5            1.4
#> 
#>            4.9             3            1.4
#> -------------------------------------------
pandoc.table(head(iris[,1:3], 2), justify = c('right', 'center', 'left'))
#> 
#> -------------------------------------------
#>   Sepal.Length  Sepal.Width  Petal.Length  
#> -------------- ------------- --------------
#>            5.1      3.5      1.4           
#> 
#>            4.9       3       1.4           
#> -------------------------------------------

Another way to define alignment is to set the permanent options table.alignment.default and table.alignment.rownames in panderOptions (the preferred way), or to use the set.alignment function (the legacy way, which sets the alignment either for the next table only or permanently). Both approaches support setting the alignment separately for cells and row names:

set.alignment('left', row.names = 'right') # applies only to the next table, since the permanent parameter defaults to FALSE
pandoc.table(mtcars[1:2,  1:5])
#> 
#> ---------------------------------------------------
#>                     mpg   cyl   disp   hp    drat  
#> ------------------- ----- ----- ------ ----- ------
#>           Mazda RX4 21    6     160    110   3.9   
#> 
#>       Mazda RX4 Wag 21    6     160    110   3.9   
#> ---------------------------------------------------

An interesting application of this functionality is to specify a function that takes the R object as its argument and computes a custom alignment for your table based on e.g. column values or variable types:

panderOptions('table.alignment.default',
    function(df)
        ifelse(sapply(df, mean) > 2, 'left', 'right'))
pandoc.table(head(iris[,1:3], 2))
#> 
#> -------------------------------------------
#> Sepal.Length   Sepal.Width     Petal.Length
#> -------------- ------------- --------------
#> 5.1            3.5                      1.4
#> 
#> 4.9            3                        1.4
#> -------------------------------------------
panderOptions('table.alignment.default', 'center')
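The example above derives the alignment from column values; a variant based on variable types might look like the sketch below. align_by_type is a hypothetical helper introduced here for illustration, and the previous option value is saved and restored so that later tables are unaffected.

```r
library(pander)

# Hypothetical helper: right-align numeric columns, left-align everything else
align_by_type <- function(df) {
  ifelse(sapply(df, is.numeric), 'right', 'left')
}

old_alignment <- panderOptions('table.alignment.default')
panderOptions('table.alignment.default', align_by_type)

# Sepal.Length is numeric, Species is a factor
by_type <- pandoc.table.return(head(iris[, c(1, 5)], 2))

panderOptions('table.alignment.default', old_alignment)
cat(by_type)
```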

Highlighting cells

One of the great features of pander is the ease of highlighting rows, columns and cells in a table. This is a native markdown feature without custom HTML or LaTeX-only tweaks, so all HTML/PDF/MS Word/OpenOffice etc. formats are supported.

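This can be achieved by specifying one of the emphasize.italics, emphasize.strong or emphasize.verbatim arguments (each available with a rows, cols or cells suffix, e.g. emphasize.strong.cells) when calling pander/pandoc.table, or by calling the helper function of the same name before rendering the table.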

The emphasize.italics helpers turn the affected cells italic, emphasize.strong applies a bold style and emphasize.verbatim applies a verbatim style. A cell can also be italic, bold and verbatim at the same time.

The functions and arguments ending in rows or cols take a vector (the columns or rows to emphasize in the table), while the cells argument takes either a vector (for one-dimensional “tables”) or an array-like data structure with two columns holding the row and column indexes of the cells to be emphasized, just like what which(..., arr.ind = TRUE) returns:

t <- mtcars[1:3, 1:5]
emphasize.italics.cols(1)
emphasize.italics.rows(1)
emphasize.strong.cells(which(t > 20, arr.ind = TRUE))
pandoc.table(t)
#> 
#> ----------------------------------------------------
#>                      mpg    cyl   disp   hp    drat 
#> ------------------- ------ ----- ------ ----- ------
#>      Mazda RX4        21     6    160    110   3.9  
#> 
#>    Mazda RX4 Wag      21     6    160    110   3.9  
#> 
#>     Datsun 710       22.8    4    108    93    3.85 
#> ----------------------------------------------------
pandoc.table(t, emphasize.verbatim.rows = 1, emphasize.strong.cells = which(t > 20, arr.ind = TRUE))
#> 
#> ----------------------------------------------------
#>                      mpg    cyl   disp   hp    drat 
#> ------------------- ------ ----- ------ ----- ------
#>      Mazda RX4        21     6    160    110   3.9  
#> 
#>    Mazda RX4 Wag      21     6    160    110   3.9  
#> 
#>     Datsun 710       22.8    4    108    93    3.85 
#> ----------------------------------------------------
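To check which cells were actually marked up, the table can be captured as a string with pandoc.table.return and inspected. The snippet below is a small sketch with hypothetical variable names, assuming that column indexes count data columns only (so column 2 is cyl) and that italic cells are wrapped in single asterisks.

```r
library(pander)

tbl <- mtcars[1:3, 1:5]

# Italicise the second data column (cyl) through an argument
# instead of calling the emphasize.italics.cols() helper first
italic_out <- pandoc.table.return(tbl, emphasize.italics.cols = 2)
cat(italic_out)

# TRUE if a cyl value of 6 was wrapped in markdown emphasis markers
grepl('*6*', italic_out, fixed = TRUE)
```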

For more elaborate examples, please see our blog post, Highlight cells in markdown tables.

Table and Cell width

pander/pandoc.table is able to deal with wide tables. Ever had an issue in LaTeX or MS Word when trying to print a correlation matrix of 40 variables? This problem is addressed by the split.table parameter, which splits a table that is too wide into several narrower ones:

pandoc.table(mtcars[1:2, ], style = "grid", caption = "Wide table to be split!")
#> 
#> 
#> +-------------------+-----+-----+------+-----+------+-------+-------+
#> |                   | mpg | cyl | disp | hp  | drat |  wt   | qsec  |
#> +===================+=====+=====+======+=====+======+=======+=======+
#> |     Mazda RX4     | 21  |  6  | 160  | 110 | 3.9  | 2.62  | 16.46 |
#> +-------------------+-----+-----+------+-----+------+-------+-------+
#> |   Mazda RX4 Wag   | 21  |  6  | 160  | 110 | 3.9  | 2.875 | 17.02 |
#> +-------------------+-----+-----+------+-----+------+-------+-------+
#> 
#> Table: Wide table to be split! (continued below)
#> 
#>  
#> 
#> +-------------------+----+----+------+------+
#> |                   | vs | am | gear | carb |
#> +===================+====+====+======+======+
#> |     Mazda RX4     | 0  | 1  |  4   |  4   |
#> +-------------------+----+----+------+------+
#> |   Mazda RX4 Wag   | 0  | 1  |  4   |  4   |
#> +-------------------+----+----+------+------+

split.table defaults to 80 characters. To turn splitting off, set split.table to Inf:

pandoc.table(mtcars[1:2, ], style = "grid",
             caption = "Wide table to be split!", split.table = Inf)
#> 
#> 
#> +-------------------+-----+-----+------+-----+------+-------+-------+----+----+------+------+
#> |                   | mpg | cyl | disp | hp  | drat |  wt   | qsec  | vs | am | gear | carb |
#> +===================+=====+=====+======+=====+======+=======+=======+====+====+======+======+
#> |     Mazda RX4     | 21  |  6  | 160  | 110 | 3.9  | 2.62  | 16.46 | 0  | 1  |  4   |  4   |
#> +-------------------+-----+-----+------+-----+------+-------+-------+----+----+------+------+
#> |   Mazda RX4 Wag   | 21  |  6  | 160  | 110 | 3.9  | 2.875 | 17.02 | 0  | 1  |  4   |  4   |
#> +-------------------+-----+-----+------+-----+------+-------+-------+----+----+------+------+
#> 
#> Table: Wide table to be split!
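To disable splitting for every table in a report rather than per call, the default can be changed through panderOptions. The following is a minimal sketch, assuming the corresponding option key is table.split.table; the variable names are hypothetical.

```r
library(pander)

wide <- mtcars[1:2, ]
old_split <- panderOptions('table.split.table')

# With the current default, the wide table is split
split_default <- pandoc.table.return(wide, style = 'grid',
                                     caption = 'Wide table')

# Turn splitting off globally, render again, then restore the old value
panderOptions('table.split.table', Inf)
unsplit <- pandoc.table.return(wide, style = 'grid',
                               caption = 'Wide table')
panderOptions('table.split.table', old_split)

cat(unsplit)
```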

pander also tries to split cells that are too wide into multiline cells. The maximum number of characters in a cell is specified by the split.cells parameter (defaults to 30), which can be a single value, a vector (values for each column separately) or a relative vector (percentages of the split.table parameter). Please note that this only works for multiline and grid tables:

df <- data.frame(a = 'Lorem ipsum', b = 'dolor sit', c = 'amet')
pandoc.table(df, split.cells = 5)
#> 
#> ----------------------
#>    a       b      c   
#> ------- ------- ------
#>  Lorem   dolor   amet 
#>  ipsum    sit         
#> ----------------------
pandoc.table(df, split.cells = c(5, 20, 5))
#> 
#> --------------------------
#>    a         b        c   
#> ------- ----------- ------
#>  Lorem   dolor sit   amet 
#>  ipsum                    
#> --------------------------
pandoc.table(df, split.cells = c("80%", "10%", "10%"))
#> 
#> ----------------------------
#>       a          b      c   
#> ------------- ------- ------
#>  Lorem ipsum   dolor   amet 
#>                 sit         
#> ----------------------------
pandoc.table(df, split.cells = 5, style = 'simple')
#> 
#> 
#>       a            b        c   
#> ------------- ----------- ------
#>  Lorem ipsum   dolor sit   amet

In some cases it is also useful to split words that are too long with hyphens, and pander uses the sylly package for that. Just set the use.hyphening argument to TRUE and have sylly installed:

pandoc.table(data.frame(baz = 'foobar', foo='accoutrements'),
             use.hyphening = TRUE, split.cells = 3)
#> 
#> --------------
#>  baz     foo  
#> ------ -------
#>  foo-    ac-  
#>  bar    cou-  
#>         tre-  
#>         ments 
#> --------------

Rounding and number formatting

pander/pandoc.table controls number formatting with four parameters: round, digits, big.mark and decimal.mark.

The round and digits parameters can each be a vector specifying a value for every column (it has to be the same length as the number of columns). Values for non-numeric columns are disregarded.

Now let’s get to some examples:

r <- matrix(c(283764.97430, 29.12345678901, -7.1234, -100.1), ncol = 2)
pandoc.table(r, round = 2)
#> 
#> -------- --------
#>  283765   -7.12  
#> 
#>  29.12    -100.1 
#> -------- --------
pandoc.table(r, round = c(4,2)) # vector for each column
#> 
#> -------- --------
#>  283765   -7.12  
#> 
#>  29.12    -100.1 
#> -------- --------
pandoc.table(r, digits = 2)
#> 
#> -------- ------
#>  283765   -7.1 
#> 
#>    29     -100 
#> -------- ------
pandoc.table(r, digits = c(1, 5)) # vector for each column
#> 
#> ------- ---------
#>  3e+05   -7.1234 
#> 
#>   29     -100.1  
#> ------- ---------
pandoc.table(r, big.mark = ',')
#> 
#> --------- --------
#>  283,765   -7.123 
#> 
#>   29.12    -100.1 
#> --------- --------
pandoc.table(r, decimal.mark = ',')
#> 
#> -------- --------
#>  283765   -7,123 
#> 
#>  29,12    -100,1 
#> -------- --------
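These parameters can also be combined in a single call. The snippet below is a small sketch using the same matrix, with a hypothetical variable name for the captured output; it rounds to one decimal place and adds a thousands separator at the same time.

```r
library(pander)

r <- matrix(c(283764.97430, 29.12345678901, -7.1234, -100.1), ncol = 2)

# Round to one decimal place and use a comma as the thousands separator
formatted <- pandoc.table.return(r, round = 1, big.mark = ',')
cat(formatted)
```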

Other options

The functionality described in the other sections is the most notable, but pander/pandoc.table also has smaller nifty features worth mentioning, such as the plain.ascii, caption and missing arguments:

pandoc.table(mtcars[1:3, 1:4])
#> 
#> ---------------------------------------------
#>                      mpg    cyl   disp   hp  
#> ------------------- ------ ----- ------ -----
#>      Mazda RX4        21     6    160    110 
#> 
#>    Mazda RX4 Wag      21     6    160    110 
#> 
#>     Datsun 710       22.8    4    108    93  
#> ---------------------------------------------
pandoc.table(mtcars[1:3, 1:4], plain.ascii = TRUE)
#> 
#> ---------------------------------------------
#>                      mpg    cyl   disp   hp  
#> ------------------- ------ ----- ------ -----
#>      Mazda RX4        21     6    160    110 
#> 
#>    Mazda RX4 Wag      21     6    160    110 
#> 
#>     Datsun 710       22.8    4    108    93  
#> ---------------------------------------------
pandoc.table(mtcars[1:3, 1:5], style = "grid", caption = "My caption!")
#> 
#> 
#> +-------------------+------+-----+------+-----+------+
#> |                   | mpg  | cyl | disp | hp  | drat |
#> +===================+======+=====+======+=====+======+
#> |     Mazda RX4     |  21  |  6  | 160  | 110 | 3.9  |
#> +-------------------+------+-----+------+-----+------+
#> |   Mazda RX4 Wag   |  21  |  6  | 160  | 110 | 3.9  |
#> +-------------------+------+-----+------+-----+------+
#> |    Datsun 710     | 22.8 |  4  | 108  | 93  | 3.85 |
#> +-------------------+------+-----+------+-----+------+
#> 
#> Table: My caption!
m <- mtcars[1:3, 1:5]
m$mpg <- NA
pandoc.table(m, missing = '?')
#> 
#> ---------------------------------------------------
#>                      mpg   cyl   disp   hp    drat 
#> ------------------- ----- ----- ------ ----- ------
#>      Mazda RX4        ?     6    160    110   3.9  
#> 
#>    Mazda RX4 Wag      ?     6    160    110   3.9  
#> 
#>     Datsun 710        ?     4    108    93    3.85 
#> ---------------------------------------------------
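The options shown in this section can be used together. The following sketch, with hypothetical variable names and caption text, renders a table without markdown markup, with a caption and with a custom placeholder for missing values.

```r
library(pander)

m <- mtcars[1:3, 1:5]
m$mpg <- NA

# plain.ascii drops the markup, missing replaces NA values,
# and caption adds a caption line below the table
combined <- pandoc.table.return(m, plain.ascii = TRUE,
                                missing = 'n/a',
                                caption = 'Fuel data')
cat(combined)
```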