pipeR provides various styles of function chaining methods:

- Pipe operator
- Pipe object
- pipeline function

Each of them represents a distinct pipeline model, but they share almost the same set of features. A value can be piped to the next expression:

- As the first unnamed argument of the function
- As dot symbol (`.`) in the expression
- As a named variable defined by a formula
- For side-effect that carries over the input to the next
- For assignment that saves an intermediate value

The syntax is designed to make the pipeline more readable and friendly to a wide variety of operations.

**pipeR Tutorial is a highly recommended complete guide to pipeR.**

This document is also translated into 日本語 (by @hoxo_m).

Install the latest development version from GitHub:

`devtools::install_github("renkun-ken/pipeR")`

Install from CRAN:

`install.packages("pipeR")`

The following code is an example written in the traditional approach. It performs a bootstrap on the `mpg` values in the built-in dataset `mtcars` and plots the density function estimated by a Gaussian kernel:

```
plot(density(sample(mtcars$mpg, size = 10000, replace = TRUE),
kernel = "gaussian"), col = "red", main="density of mpg (bootstrap)")
```

The code is deeply nested and can be hard to read and maintain. In the following examples, the traditional code is rewritten using the Pipe operator, the `Pipe()` function and the `pipeline()` function, respectively.

- Operator-based pipeline

```
mtcars$mpg %>>%
  sample(size = 10000, replace = TRUE) %>>%
  density(kernel = "gaussian") %>>%
  plot(col = "red", main = "density of mpg (bootstrap)")
```

- Object-based pipeline (`Pipe()`)

```
Pipe(mtcars$mpg)$
sample(size = 10000, replace = TRUE)$
density(kernel = "gaussian")$
plot(col = "red", main = "density of mpg (bootstrap)")
```

- Argument-based pipeline

```
pipeline(mtcars$mpg,
sample(size = 10000, replace = TRUE),
density(kernel = "gaussian"),
plot(col = "red", main = "density of mpg (bootstrap)"))
```

- Expression-based pipeline

```
pipeline({
  mtcars$mpg
  sample(size = 10000, replace = TRUE)
  density(kernel = "gaussian")
  plot(col = "red", main = "density of mpg (bootstrap)")
})
```

`%>>%`

Pipe operator `%>>%` basically pipes the left-hand side value forward to the right-hand side expression, which is evaluated according to its syntax.

Many R functions are pipe-friendly: they take some data as the first argument and transform it in a certain way. This arrangement allows operations to be streamlined by pipes, that is, one data source can be put into the first argument of a function, get transformed, and be put into the first argument of the next function. In this way, a chain of commands is connected, and it is called a pipeline.
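As a minimal sketch of this idea, assuming pipeR is installed and attached, the nested call and the pipeline below compute the same value. The vector `x` is an illustrative input and not part of the original examples.

```r
library(pipeR)

# Illustrative input (an assumption for this sketch)
x <- c(4, 9, 16)

# Nested form: has to be read inside-out
nested <- sum(sqrt(x))

# Pipeline form: read left to right; each value becomes the
# first argument of the next function
piped <- x %>>% sqrt %>>% sum
```

Both `nested` and `piped` hold the same number; only the reading order of the code differs.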

On the right-hand side of `%>>%`, whenever a function name or call is supplied, the left-hand side value will always be passed as the first unnamed argument to that function.

```
rnorm(100) %>>%
plot
```

```
rnorm(100) %>>%
plot(col="red")
```

Sometimes the value on the left is needed at multiple places. One can use `.` to represent it anywhere in the function call.

```
rnorm(100) %>>%
plot(col="red", main=length(.))
```

There are situations where one calls a function in a namespace with `::`. In this case, the call must end with `()`.

```
rnorm(100) %>>%
  stats::median()

rnorm(100) %>>%
  graphics::plot(col = "red")
```

`.` in an expression

Not all functions are pipe-friendly in every case: you may find that some functions do not take the data produced by a pipeline as the first argument. In this case, you can enclose your expression in `{}` or `()` so that `%>>%` will use `.` to represent the value on the left.

```
mtcars %>>%
  { lm(mpg ~ cyl + wt, data = .) }
```

```
mtcars %>>%
  ( lm(mpg ~ cyl + wt, data = .) )
```

Sometimes, it may look confusing to use `.` to represent the value being piped. For example,

```
mtcars %>>%
  (lm(mpg ~ ., data = .))
```

Although it works perfectly, it may look ambiguous if `.` has several meanings in one line of code.

`%>>%` accepts a lambda expression to direct its piping behavior. A lambda expression is characterized by a formula enclosed within `()`, for example, `(x ~ f(x))`. It contains a user-defined symbol to represent the value being piped and the expression to be evaluated.

```
mtcars %>>%
  (df ~ lm(mpg ~ ., data = df))
```

```
mtcars %>>%
  subset(select = c(mpg, wt, cyl)) %>>%
  (x ~ plot(mpg ~ ., data = x))
```

In a pipeline, one may be interested not only in the final outcome but sometimes also in intermediate results. Printing, plotting or saving an intermediate result must be done as a side effect to avoid breaking the mainstream pipeline. For example, calling `plot()` to draw a scatter plot returns `NULL`, and if one directly calls `plot()` in the middle of a pipeline, it would break the pipeline by changing the subsequent input to `NULL`.

A one-sided formula that starts with `~` indicates that the right-hand side expression will only be evaluated for its side effect: its value will be ignored, and the input value will be returned instead.

```
mtcars %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg, 0.95)) %>>%
  (~ cat("rows:",nrow(.),"\n")) %>>%   # cat() returns NULL
  summary
```

```
mtcars %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg, 0.95)) %>>%
  (~ plot(mpg ~ wt, data = .)) %>>%    # plot() returns NULL
  (lm(mpg ~ wt, data = .)) %>>%
  summary()
```

With `~`, side-effect operations can be easily distinguished from the mainstream pipeline.

An easier way to print an intermediate value is to use the `(? expr)` syntax, like asking a question.

```
mtcars %>>%
  (? ncol(.)) %>>%
  summary
```

In addition to printing and plotting, one may need to save an intermediate value to the environment by assigning the value to a variable (symbol).

If one needs to assign the value to a symbol, just insert a step like `(~ symbol)`, then the input value of that step will be assigned to `symbol` in the current environment.

```
mtcars %>>%
  (lm(formula = mpg ~ wt + cyl, data = .)) %>>%
  (~ lm_mtcars) %>>%
  summary
```

If the input value is not to be saved directly but after some transformation, then one can use `=`, `<-`, or the more natural `->` to specify a lambda expression that tells what is to be saved (thanks @yanlinlin82 for the suggestion).

```
mtcars %>>%
  (~ summ = summary(.)) %>>%  # side-effect assignment
  (lm(formula = mpg ~ wt + cyl, data = .)) %>>%
  (~ lm_mtcars) %>>%
  summary
```

```
# save with `->`
mtcars %>>%
  (~ summary(.) -> summ)

# save with `<-`
mtcars %>>%
  (~ summ <- summary(.))
```

An easier way to save an intermediate value that is to be piped further is to use the `(symbol = expression)` syntax:

```
mtcars %>>%
  (~ summ = summary(.)) %>>%  # side-effect assignment
  (lm_mtcars = lm(formula = mpg ~ wt + cyl, data = .)) %>>%  # continue piping
  summary
```

or the `(expression -> symbol)` syntax:

```
mtcars %>>%
  (~ summary(.) -> summ) %>>%  # side-effect assignment
  (lm(formula = mpg ~ wt + cyl, data = .) -> lm_mtcars) %>>%  # continue piping
  summary
```

`x %>>% (y)` means extracting the element named `y` from object `x`, where `y` must be a valid symbol name and `x` can be a vector, list, environment or anything else for which `[[]]` is defined, or an S4 object.

```
mtcars %>>%
  (lm(mpg ~ wt + cyl, data = .)) %>>%
  (~ lm_mtcars) %>>%
  summary %>>%
  (r.squared)
```

- Working with dplyr:

```
library(dplyr)
mtcars %>>%
  filter(mpg <= mean(mpg)) %>>%
  select(mpg, wt, cyl) %>>%
  (~ plot(.)) %>>%
  (model = lm(mpg ~ wt + cyl, data = .)) %>>%
  (summ = summary(.)) %>>%
  (coefficients)
```

- Working with ggvis:

```
library(ggvis)
mtcars %>>%
  ggvis(~mpg, ~wt) %>>%
  layer_points()
```

- Working with rlist:

```
library(rlist)
1:100 %>>%
list.group(. %% 3) %>>%
list.mapv(g ~ mean(g))
```

`Pipe()`

`Pipe()` creates a Pipe object that supports light-weight chaining without any external operator. Typically, start with `Pipe()` and end with `$value` or `[]` to extract the final value of the Pipe.
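The start-and-extract pattern can be sketched as follows, assuming pipeR is attached. The input vector and the names `p`, `total_value` and `total_bracket` are illustrative and not part of the original examples.

```r
library(pipeR)

# Chain two base functions; the result is still wrapped in a Pipe object
p <- Pipe(c(4, 9, 16))$
  sqrt()$
  sum()

# Two ways of unwrapping the final value, as described above
total_value   <- p$value
total_bracket <- p[]
```

Until `$value` or `[]` is used, `p` remains a Pipe object, so further function calls can still be chained onto it with `$`.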

A Pipe object provides an internal function `.(...)` that works in exactly the same way as `x %>>% (...)`, and it has more features than `%>>%`.

NOTE: `.()` does not support assignment with `=` but supports `~`, `<-` and `->`.

```
Pipe(rnorm(1000))$
density(kernel = "cosine")$
plot(col = "blue")
```

```
Pipe(mtcars)$
  .(mpg)$
  summary()
```

```
Pipe(mtcars)$
  .(~ summary(.) -> summ)$
  lm(formula = mpg ~ wt + cyl)$
  summary()$
  .(coefficients)
```

```
pmtcars <- Pipe(mtcars)
pmtcars[c("mpg","wt")]$
  lm(formula = mpg ~ wt)$
  summary()
pmtcars[["mpg"]]$mean()
```

```
plist <- Pipe(list(a=1,b=2))
plist$a <- 0
plist$b <- NULL
```

```
Pipe(mtcars)$
  .(? ncol(.))$
  .(~ plot(mpg ~ ., data = .))$ # side effect: plot
  lm(formula = mpg ~ .)$
  .(~ lm_mtcars)$ # side effect: assign
  summary()
```

- Working with dplyr:

```
Pipe(mtcars)$
  filter(mpg >= mean(mpg))$
  select(mpg, wt, cyl)$
  lm(formula = mpg ~ wt + cyl)$
  summary()$
  .(coefficients)$
  value
```

- Working with ggvis:

```
Pipe(mtcars)$
ggvis(~ mpg, ~ wt)$
layer_points()
```

- Working with rlist:

```
Pipe(1:100)$
list.group(. %% 3)$
list.mapv(g ~ mean(g))$
value
```

`pipeline()`

`pipeline()` provides argument-based and expression-based pipeline evaluation mechanisms. Its behavior depends on how its arguments are supplied. If only the first argument is supplied, it expects an expression enclosed in `{}` in which each line represents a pipeline step. If, instead, multiple arguments are supplied, it regards each argument as a pipeline step. For all pipeline steps, the expressions will be transformed to be connected by `%>>%` so that they behave exactly the same.
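The equivalence of the two `pipeline()` forms with the operator form can be sketched as follows, assuming pipeR is attached. The vector `x` and the result names are illustrative and not part of the original examples.

```r
library(pipeR)

# Illustrative input (an assumption for this sketch)
x <- c(4, 9, 16)

# Operator-based: steps joined explicitly by %>>%
by_operator <- x %>>% sqrt %>>% sum

# Argument-based: each argument is one pipeline step
by_arguments <- pipeline(x, sqrt, sum)

# Expression-based: each line inside {} is one pipeline step
by_expression <- pipeline({
  x
  sqrt
  sum
})
```

All three variables should hold the same value, since each form is described above as being connected by `%>>%` internally.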

One notable difference is that in `pipeline()`'s argument or expression, the special symbols that perform specially defined pipeline tasks (e.g. side-effect) do not need to be enclosed within `()` because no operator priority issues arise as they do when using `%>>%`.

```
pipeline({
  mtcars
  lm(formula = mpg ~ cyl + wt)
  ~ lmodel
  summary
  ? .$r.squared
  coef
})
```

Thanks @hoxo_m for the idea presented in this post.

This package is under MIT License.