# santoku

santoku is a versatile cutting tool for R. It provides `chop()`, a replacement for `base::cut()`.

Here are some advantages of santoku:

• By default, `chop()` always covers the whole range of the data, so you won’t get unexpected `NA` values.

• `chop()` can handle single values as well as intervals. For example, `chop(x, breaks = c(1, 2, 2, 3))` will create a separate factor level for values exactly equal to 2.

• `chop()` can handle many kinds of data, including numbers, dates and times, and units.

• `chop_*` functions create intervals in many ways, using quantiles of the data, standard deviations, fixed-width intervals, equal-sized groups, or pretty intervals for use in graphs.

• It’s easy to label intervals: use names for your breaks vector, or use a `lbl_*` function to create interval notation like `[1, 2)`, dash notation like `1-2`, or arbitrary styles using `glue::glue()`.

• `tab_*` functions quickly chop data, then tabulate it.

These advantages make santoku especially useful for exploratory analysis, where you may not know the range of your data in advance.
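As a minimal sketch of the first point above, the comparison below runs `base::cut()` and `chop()` on the same data with the same breaks. The variable names are illustrative only, and the example assumes santoku is installed.

```r
library(santoku)

x <- 1:5

# base::cut() with breaks c(2, 4) only covers the interval (2, 4],
# so 1, 2 and 5 fall outside it and become NA.
cut_result <- cut(x, breaks = c(2, 4))
anyNA(cut_result)
#> [1] TRUE

# chop() extends the breaks to cover the whole range of x by default,
# so every value is assigned to a level.
chop_result <- chop(x, breaks = c(2, 4))
anyNA(chop_result)
#> [1] FALSE
```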

## Examples

```r
library(santoku)
```

`chop()` returns a factor:

```r
chop(1:5, c(2, 4))
#> [1] [1, 2) [2, 4) [2, 4) [4, 5] [4, 5]
#> Levels: [1, 2) [2, 4) [4, 5]
```

Include a number twice to match it exactly:

```r
chop(1:5, c(2, 2, 4))
#> [1] [1, 2) {2}    (2, 4) [4, 5] [4, 5]
#> Levels: [1, 2) {2} (2, 4) [4, 5]
```

Use names in breaks for labels:

```r
chop(1:5, c(Low = 1, Mid = 2, High = 4))
#> [1] Low  Mid  Mid  High High
#> Levels: Low Mid High
```

Or use `lbl_*` functions:

```r
chop(1:5, c(2, 4), labels = lbl_dash())
#> [1] 1—2 2—4 2—4 4—5 4—5
#> Levels: 1—2 2—4 4—5
```
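For arbitrary label styles, a glue-style template can be passed through `lbl_glue()`. The sketch below assumes that `{l}` and `{r}` stand for the left and right endpoints of each interval. The template text itself is just an illustration.

```r
library(santoku)

# Label each interval with a custom template built from its endpoints.
custom <- chop(1:5, c(2, 4), labels = lbl_glue("{l} to {r}"))
levels(custom)
```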

Chop into fixed-width intervals:

```r
chop_width(runif(10), 0.1)
#>  [1] [0.1405, 0.2405)  [0.3405, 0.4405)  [0.5405, 0.6405)  [0.4405, 0.5405)
#>  [5] [0.04047, 0.1405) [0.04047, 0.1405) [0.3405, 0.4405)  [0.8405, 0.9405]
#>  [9] [0.6405, 0.7405)  [0.4405, 0.5405)
#> 7 Levels: [0.04047, 0.1405) [0.1405, 0.2405) ... [0.8405, 0.9405]
```

Or into fixed-size groups:

```r
chop_n(1:10, 5)
#>  [1] [1, 6)  [1, 6)  [1, 6)  [1, 6)  [1, 6)  [6, 10] [6, 10] [6, 10] [6, 10]
#> [10] [6, 10]
#> Levels: [1, 6) [6, 10]
```
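The quantile and standard-deviation variants mentioned earlier follow the same pattern. This is a rough sketch on simulated data: the seed and sample are arbitrary choices, and `chop_mean_sd()` is called with its default settings. Output is omitted because the exact labels depend on the santoku version.

```r
library(santoku)

set.seed(42)      # arbitrary seed so the sketch is repeatable
x <- rnorm(100)   # simulated data, for illustration only

# Split at the 25th and 75th percentiles of the data.
by_quantile <- chop_quantiles(x, c(0.25, 0.75))
table(by_quantile)

# Split at standard deviations around the mean, using the defaults.
by_sd <- chop_mean_sd(x)
table(by_sd)
```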

Chop dates by calendar month, then tabulate:

``````library(lubridate)