List comprehension is an alternative method of writing `for`

loops or `lapply`

*(and similar)* functions that is readable, quick, and easy to use. Many newcomers to R struggle with using the `lapply`

family of functions and default to using `for`

. However, `for`

loops are extremely slow in R and should be avoided. Comprehensions in the `eList`

package bridge the gap by allowing users to write `lapply`

or `sapply`

functions using `for`

loops.

This vignette describes how to use list and other comprehensions. It also explores comprehensions with multiple variables, `if-else`

conditions, nested comprehensions, and parallel comprehensions.

Let us start with a simple example: we have a vector of fruit names and want to create a sequence for each based on the number of characters. One common method is to write a `for`

statement where we loop across each fruit name, create a sequence from the numbers, and place it in a list.

```
x <- c("apples", "bananas", "pears")
fruit_chars <- vector("list", length(x))
for (i in seq_along(x)){
fruit_chars[[i]] <- 1:nchar(x[i])
}
fruit_chars
#> [[1]]
#> [1] 1 2 3 4 5 6
#>
#> [[2]]
#> [1] 1 2 3 4 5 6 7
#>
#> [[3]]
#> [1] 1 2 3 4 5
```

Alternatively, we could use a faster, vectorized function such as `lapply`

. This is similar to the `for`

loop, except that a specified function is applied to each fruit name and the results are wrapped in a list.

The `eList`

package allows for a hybrid variant of the two using the `List()`

function. The `for`

loop evaluates the expression and merges the results into a list. Beneath the scenes, `List`

is parsing the statement into `lapply`

and evaluating it. As we will see, though, there are additional features in `List`

.

```
library(eList)
#>
#> Attaching package: 'eList'
#> The following object is masked from 'package:utils':
#>
#> zip
fruit_chars <- List(for (i in x) 1:nchar(i))
```

If you are coming from Python or another language that uses list comprehension, you will notice that this syntax is a bit different; it is written more similar to standard `for`

loops in R. The `for`

statement comes first, then the variable name and sequence are placed in parentheses. Finally the expression is placed afterwards. The expression can be wrapped in braces `{}`

if desired.

The lists of numbers in the previous examples may be confusing. Luckily, `List`

and other comprehension functions allow us to assign names to each result using the notation: `name = expr`

.

```
List(for (i in x) i = 1:nchar(i))
#> $apples
#> [1] 1 2 3 4 5 6
#>
#> $bananas
#> [1] 1 2 3 4 5 6 7
#>
#> $pears
#> [1] 1 2 3 4 5
```

The name can be any type of expression. Though complex expressions should be wrapped in parentheses so that the parser does not confuse who has the `=`

sign.

The results of a comprehension can be filtered using standard `if`

statements after naming the variables and sequence, making sure that the condition is placed in parentheses. The statement below only returns the sequence for `"bananas"`

.

Any statement that evaluates to `NULL`

is automatically filtered from the results. `else`

statements can also be included so that the results are not filtered out *(unless the else statement evaluates to NULL)*.

```
List(for (i in x) if (i == "bananas") "delicious" else "ewww")
#> [[1]]
#> [1] "ewww"
#>
#> [[2]]
#> [1] "delicious"
#>
#> [[3]]
#> [1] "ewww"
```

Each entry can still be assigned a name. Furthermore, the expression can be as complex as necessary for the task with any number of `else if`

checks.

Comprehensions can have multiple variables if the variables are separated using `"."`

within a single name. To see how this works, let us use the `enum`

function in the `eList`

package. When `enum`

is used on a variable, the first value becomes its index in the loop and the second is the value of the vector at that index. Now, when we use `(i.j in enum(x))`

, `i =`

the index number of each item in `x`

, while `j =`

the value of the item in `x`

*(the name of the fruit)*.

Let us inspect `enum(x)`

to see what is going on beneath the surface. `enum`

took the vector `x`

and created a new list at each index. The first list contains two elements: the first being `1`

*(the index number)*, the second being `"apples"`

*(the first value in x)*.

```
enum(x)
#> [[1]]
#> [[1]][[1]]
#> [1] 1
#>
#> [[1]][[2]]
#> [1] "apples"
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] 2
#>
#> [[2]][[2]]
#> [1] "bananas"
#>
#>
#> [[3]]
#> [[3]][[1]]
#> [1] 3
#>
#> [[3]][[2]]
#> [1] "pears"
```

When multiple variables like `i.j`

are passed to the `for`

loop, then `i`

is assigned to the first element and `j`

to the second element for item in the sequence. There can be any number of variables as long as they are separated by `"."`

. There does not have to be a variable for each element; but there does need to be an element for each item or else an “out-of-bounds” error will be produced.

```
y <- list(a = 1:3, b = 4:6, c = 7:9)
List(for (i.j.k in y) (i+j)/k)
#> $a
#> [1] 1
#>
#> $b
#> [1] 1.5
#>
#> $c
#> [1] 1.666667
```

Similarly, variables can be skipped. The function below extracts the first and third elements of each item in the list.

If only the first or second item is needed, though, you should use two dots: `i..`

for the first, `..i`

for the second.

Another convenient function that can be used on the sequence is `items()`

. It is similar to `enum`

, except that the name of each item is used instead of its index number. See the documentation for `eList`

for other helper functions.

```
List(for (i.j in items(y)) paste0(i, j))
#> [[1]]
#> [1] "a1" "a2" "a3"
#>
#> [[2]]
#> [1] "b4" "b5" "b6"
#>
#> [[3]]
#> [1] "c7" "c8" "c9"
```

Variables can also be separated using a comma as long as the variables are surrounded in backticks.

All of the examples so far have returned a list. However, `eList`

supports a variety of different types of comprehension. For example, `Num`

returns a numeric vector, while `Chr`

returns a character vector. `Env`

can be used to produce an environment, similar to dictionary comprehensions in Python. Note that each entry in an environment must have a unique name. These work by coercing the result into particular type, producing an error if unable.

```
Num(for (i.j.k in y) (i+j)/k)
#> a b c
#> 1.000000 1.500000 1.666667
Chr(for (i.j.k in y) (i+j)/k)
#> a b c
#> "1" "1.5" "1.66666666666667"
```

One convenience function is `..`

. It can be used as either `..[for ...]`

or `..(for ...)`

. By default, it attempts to simplify the results in an array, but can mimic any other type of comprehension.

Multiple `for`

loops can be used within a comprehension. As long the subsequent `for`

statements immediately follow the variables & sequence of the previous one, or immediately follow a `if-else`

statement, then it will be parsed into a vectorized `lapply`

style function. Nested loops should be avoided unless necessary since they can be difficult to understand. The following are a couple examples using matrix and numeric comprehension.

```
Mat(for (i in 1:3) for (j in 1:6) i*j)
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 2 4 6
#> [3,] 3 6 9
#> [4,] 4 8 12
#> [5,] 5 10 15
#> [6,] 6 12 18
```

Vector Comprehensions can also be nested within each other. The code below nests a numeric comprehension within a character comprehension.

Vector comprehensions also for parallel computations using the `parallel`

package. All comprehensions allow the user to supply a cluster, which activates the parallel version of `sapply`

or `lapply`

. `eList`

comes with a function that allows for the quick creation of a cluster based on the user’s operating system and by auto-detecting the number of cores available. Users are recommended to explicitly create a cluster though the functions in the `parallel`

package unless a quick parallel calculation is needed.

```
cluster <- auto_cluster(2)
Num(for (i in 1:100) sample(1:100, 1), clust=cluster)
close_cluster(cluster)
```

Note that the additional overhead involved with parallelization means that the gain in performance will be relatively small *(and often negative)* unless the sequence is sufficiently large. Large comprehensions, though, can experience significant gains by specifying a cluster.

The `eList`

package contains a number of summary functions that accept vector comprehensions and/or normal vectors. These functions allow users to combine multiple comprehensions and other vectors in a single function, then apply a summary function to all entries. Each summary comprehension accepts a cluster for parallel computations and the `na.rm`

argument. Some examples:

```
# Are no values TRUE? (combines comprehension with other values)
None(for (i in 1:10) i < 0, TRUE, FALSE)
#> [1] FALSE
```

```
# Summary statistics from a random draw of 1000 observations from normal distribution
Stats(for (i in rnorm(1000)) i)
#> $min
#> [1] -2.975354
#>
#> $q1
#> [1] -0.6803642
#>
#> $med
#> [1] 0.09313939
#>
#> $q3
#> [1] 0.809412
#>
#> $max
#> [1] 3.731964
#>
#> $mean
#> [1] 0.06918359
#>
#> $sd
#> [1] 1.04056
```