---
title: "bean: an overview"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{bean: an overview}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  fig.width  = 7,
  fig.height = 5,
  out.width  = "90%"
)
```

`bean` reduces sampling bias in species occurrence data by thinning records in
**environmental space** rather than in geographic space. The result is a less
clustered training set for species distribution models (SDMs), also called
ecological niche models (ENMs).
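
To make the idea of thinning in environmental space concrete, here is a minimal
base R sketch. It is not `bean`'s implementation: the data are simulated, the
variable names (`temp`, `precip`) and cell widths are hypothetical, and the rule
of keeping one random record per occupied grid cell is an assumption chosen for
illustration.

```{r thinning-sketch}
# Illustrative only: base R, not bean's implementation.
set.seed(42)

# Hypothetical occurrences: a dense, over-sampled cluster plus a sparse background.
occ <- data.frame(
  temp   = c(rnorm(200, mean = 15, sd = 0.5), runif(50, min = 5, max = 25)),
  precip = c(rnorm(200, mean = 800, sd = 30), runif(50, min = 300, max = 1500))
)

# Hypothetical cell widths, one per environmental variable.
cell_width <- c(temp = 1, precip = 50)

# Assign each record to a grid cell in environmental space.
cell_id <- paste(
  floor(occ$temp / cell_width[["temp"]]),
  floor(occ$precip / cell_width[["precip"]]),
  sep = "_"
)

# Keep one randomly chosen record per occupied cell.
keep <- vapply(
  split(seq_len(nrow(occ)), cell_id),
  function(idx) idx[sample.int(length(idx), 1)],
  integer(1)
)
thinned_sketch <- occ[sort(keep), ]

# Record counts before and after thinning.
c(before = nrow(occ), after = nrow(thinned_sketch))
```

Records from the over-sampled cluster fall into a handful of cells and are
reduced to one record each, while the sparse background is largely retained.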

The protocol is:

1. **Prepare** raw occurrences with `prepare_bean()`.
2. **Choose a grid resolution** with `find_env_resolution()`, which selects a
   kernel-density bandwidth for each environmental variable.
3. **Thin** occurrences with `thin_env_nd()` (stochastic) or
   `thin_env_center()` (deterministic).
4. **Fit a niche ellipsoid** with `fit_ellipsoid()`.
5. **Predict suitability** with `predict()` on the fitted ellipsoid.
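
Steps 4 and 5 can be pictured with a generic ellipsoid niche model. The sketch
below is not `fit_ellipsoid()` or its `predict()` method: it assumes the
ellipsoid is defined by the sample centroid and covariance matrix, that its
boundary is a 95% chi-squared quantile, and that suitability decays as
`exp(-0.5 * d2)` with squared Mahalanobis distance `d2`. The data and variable
names are hypothetical.

```{r ellipsoid-sketch}
# Illustrative only: base R, not bean's implementation.
set.seed(7)

# Hypothetical thinned records described by two environmental variables.
env <- cbind(
  temp   = rnorm(100, mean = 15, sd = 3),
  precip = rnorm(100, mean = 800, sd = 150)
)

# "Fit": the ellipsoid is summarised by a centroid and a covariance matrix.
centroid <- colMeans(env)
covmat   <- cov(env)

# Assumed boundary: 95% chi-squared quantile of squared Mahalanobis distance.
cutoff <- qchisq(0.95, df = ncol(env))

# "Predict": score new environmental conditions by distance from the centroid.
new_env <- rbind(
  at_centroid = centroid,
  far_away    = centroid + c(15, 750)
)
d2 <- mahalanobis(new_env, center = centroid, cov = covmat)

data.frame(
  d2          = d2,
  suitability = exp(-0.5 * d2),
  inside      = d2 <= cutoff
)
```

Conditions at the centroid receive the maximum suitability of 1, and the score
drops toward 0 as conditions move away from it.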

```{r setup}
library(bean)
```

## Quickstart

```{r quickstart}
data(origin_dat_prepared, package = "bean")
env_vars <- c("bio_1", "bio_4", "bio_12", "bio_15")

# Protocol step 2: choose a data-driven grid resolution
res <- find_env_resolution(origin_dat_prepared, env_vars = env_vars)
res

# Protocol step 3: thin in environmental space
thinned <- thin_env_nd(
  data = origin_dat_prepared,
  env_vars = env_vars,
  grid_resolution = res$suggested_resolution,
  seed = 1
)
thinned
```

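The quickstart starts from the bundled `origin_dat_prepared` dataset, so it
covers only steps 2 and 3 of the protocol. The remaining vignettes walk through
each step in detail.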