---
title: "Certifying outputs and detecting drift"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Certifying outputs and detecting drift}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>"
)
library(reproducr)
cert_file <- tempfile() # shared across all chunks
```

This vignette covers Tier 2 of the `reproducr` workflow in depth:
`certify()`, `check_drift()`, and `list_certs()`. These three functions
together form the *baseline and drift detection* system.

## The problem they solve — a real scenario

### Scenario — The revision drift problem

You submit a paper in March. Before submission you run the analysis and note
the key results: hazard ratio 0.582 (95% CI: 0.446–0.760, p < 0.001).

In May a reviewer asks for a revision. While working on the response you
upgrade your packages — including `lme4`, which adjusted its default optimizer
tolerances between versions 1.1.29 and 1.1.30. You re-run the analysis:
hazard ratio 0.591 (95% CI: 0.452–0.768).

The numbers are slightly different, yet no error was thrown and the analysis
code is identical. Without a record of what the March run produced, you might
not notice the change at all, let alone tell whether it came from your
revision work or from the package upgrade.

```
[DRIFTED] hr:       0.582 → 0.591
[DRIFTED] ci_lower: 0.446 → 0.452
[DRIFTED] ci_upper: 0.760 → 0.768
```

With `certify()` and `check_drift()`, the change is flagged as soon as you
re-run the analysis, so you can investigate before sending the revision back.

More broadly, packages change hands, maintainers push silent fixes,
platform-level libraries (BLAS, LAPACK) get updated by system administrators,
and R itself occasionally changes defaults (R 3.6.0, for example, changed the
default algorithm used by `sample()`). Any of these can alter your numerical
results without producing an error.

`certify()` and `check_drift()` detect this. The idea is simple:

1. After a successful analysis run, hash the key outputs and store the hashes.
2. Later — after any change to the environment — re-run the analysis and
   compare the new hashes against the stored ones.
3. Any mismatch is reported explicitly, by output name.
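
The three steps above can be sketched in base R. This is a minimal
illustration of the hash-and-compare idea, not `reproducr`'s implementation:
`hash_object()` is a hypothetical helper, and MD5 over the serialized object is
an assumed choice of hash.

```r
# Hypothetical helper (not part of reproducr): hash any R object with base R.
# The object is serialized to a temporary file and the file is hashed with MD5.
hash_object <- function(x) {
  path <- tempfile()
  on.exit(unlink(path), add = TRUE)
  saveRDS(x, path, compress = FALSE)
  unname(tools::md5sum(path))
}

# Step 1: hash the key outputs of a successful run and keep the hashes.
baseline <- vapply(
  list(estimate = c(0.582, 0.446, 0.760), n_obs = 32L),
  hash_object, character(1)
)

# Step 2: after the environment has changed, re-run and hash again.
current <- vapply(
  list(estimate = c(0.591, 0.452, 0.768), n_obs = 32L),
  hash_object, character(1)
)

# Step 3: report mismatches by output name.
changed <- names(current)[current != baseline]
changed
#> [1] "estimate"
```

Because the comparison is made on hashes, a difference in the last decimal
place counts as a mismatch just as much as a large change does.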

---

## `certify()` — creating a baseline

### What gets hashed

Pass a fully named list of the R objects you want to protect. Each element is
hashed and reported under its name. Common choices:

```{r certify-examples}
model <- lm(mpg ~ wt + cyl, data = mtcars)

certify(
  outputs = list(
    coefs       = coef(model),
    r_squared   = summary(model)$r.squared,
    sigma       = sigma(model),
    n_obs       = nrow(mtcars),
    n_complete  = sum(complete.cases(mtcars)),
    group_means = aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
  ),
  tag = "baseline-v1",
  script = "analysis.R",
  file = cert_file
)
```

### Choosing what to certify

Certify outputs that are:

- **Conclusions** — the numbers that appear in your paper or report
- **Stable** — not random session artefacts like timestamps or row ordering
- **Interpretable** — so a drift report tells you something meaningful

Avoid certifying objects that are expected to differ across runs by design,
such as `proc.time()` outputs or `Sys.time()` values.
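
Row ordering is a common source of spurious differences. The sketch below
shows one way to put a data frame into a canonical order before certifying it.
`stabilise()` is a hypothetical helper written for this example, not a
`reproducr` function.

```r
# Hypothetical helper: sort a data frame by its key columns and reset the
# row names, so that two frames with the same rows compare as identical.
stabilise <- function(df, keys) {
  ord <- do.call(order, unname(as.list(df[keys])))
  df <- df[ord, , drop = FALSE]
  rownames(df) <- NULL
  df
}

a <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
b <- a[rev(seq_len(nrow(a))), ]  # same rows, reversed order

identical(a, b)
#> [1] FALSE
identical(stabilise(a, "cyl"), stabilise(b, "cyl"))
#> [1] TRUE
```

Applying the same normalisation before both `certify()` and `check_drift()`
keeps a reordering from being reported as drift.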

### Tags and the certification store

Every certification requires a `tag` — a human-readable label:

```{r tags}
certify(
  outputs = list(coefs = coef(model)),
  tag     = "pre-peer-review",
  file    = cert_file
)

certify(
  outputs = list(coefs = coef(model)),
  tag     = "post-revision",
  file    = cert_file
)
```

Passing a duplicate tag overwrites the existing record with a warning:

```{r duplicate-tag, warning = TRUE}
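# Reuse "pre-peer-review", which already holds the same single output,
# so the full "baseline-v1" record is left intact for later chunks.
certify(
  outputs = list(coefs = coef(model)),
  tag     = "pre-peer-review",
  file    = cert_file
)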
```
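
If overwriting is not what you want, one option is to build tags that are
unlikely to collide, for example by appending a date. `dated_tag()` below is a
hypothetical helper, not a `reproducr` function.

```r
# Hypothetical helper: combine a label with an ISO date to form a tag.
dated_tag <- function(prefix, date = Sys.Date()) {
  paste0(prefix, "-", format(as.Date(date), "%Y-%m-%d"))
}

dated_tag("submitted", "2026-01-15")
#> [1] "submitted-2026-01-15"
```

The resulting string can be passed as the `tag` argument of `certify()`.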

---

## `list_certs()` — inspecting the store

```{r list-certs}
list_certs(file = cert_file)
```

---

## `check_drift()` — comparing against a baseline

### Basic usage

```{r drift-basic}
model2 <- lm(mpg ~ wt + cyl, data = mtcars)

result <- check_drift(
  outputs = list(
    coefs       = coef(model2),
    r_squared   = summary(model2)$r.squared,
    sigma       = sigma(model2),
    n_obs       = nrow(mtcars),
    n_complete  = sum(complete.cases(mtcars)),
    group_means = aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
  ),
  against = "baseline-v1",
  file = cert_file
)
```

### The four statuses

```{r statuses}
certify(
  outputs = list(
    stays_same  = 42L,
    will_change = coef(lm(mpg ~ wt, data = mtcars)),
    will_vanish = "this output disappears next run"
  ),
  tag = "four-statuses",
  file = cert_file
)

demo_result <- check_drift(
  outputs = list(
    stays_same  = 42L,
    will_change = coef(lm(mpg ~ hp, data = mtcars)),
    brand_new   = "this output is new"
  ),
  against = "four-statuses",
  file = cert_file
)

print(demo_result)
```

| Status | Meaning |
|---|---|
| `ok` | Hash matches the baseline exactly |
| `drifted` | Hash differs — output has changed |
| `missing` | Present in baseline, not supplied to `check_drift()` |
| `new` | Supplied to `check_drift()`, not in baseline |

### Using `"latest"`

```{r latest}
certify(outputs = list(x = 1L), tag = "run-1", file = cert_file)
certify(outputs = list(x = 1L), tag = "run-2", file = cert_file)
certify(outputs = list(x = 1L), tag = "run-3", file = cert_file)

check_drift(outputs = list(x = 1L), against = "latest", file = cert_file)
```

### Using drift results programmatically

```{r drift-programmatic, eval = FALSE}
result <- check_drift(outputs = current_outputs, against = "latest")

n_drifted <- sum(result$status == "drifted")
if (n_drifted > 0L) {
  drifted_names <- result$output[result$status == "drifted"]
  stop(sprintf(
    "%d output(s) have drifted since last certification: %s",
    n_drifted,
    paste(drifted_names, collapse = ", ")
  ))
}
```
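
The same check can be wrapped in a reusable function, for example for a
scheduled or CI run. The sketch below assumes only what the chunk above uses,
namely a result with `output` and `status` columns. `assert_no_drift()` and
`mock_result` are hypothetical and stand in for a real `check_drift()` result.

```r
# Hypothetical helper: fail if any output has one of the given statuses.
assert_no_drift <- function(result, fail_on = c("drifted", "missing")) {
  bad <- result[result$status %in% fail_on, , drop = FALSE]
  if (nrow(bad) > 0L) {
    stop(sprintf(
      "%d output(s) failed the drift check: %s",
      nrow(bad),
      paste(sprintf("%s [%s]", bad$output, bad$status), collapse = ", ")
    ), call. = FALSE)
  }
  invisible(TRUE)
}

# Mock data frame shaped like the result used above (an assumption).
mock_result <- data.frame(
  output = c("coefs", "sigma", "n_obs"),
  status = c("ok", "drifted", "missing"),
  stringsAsFactors = FALSE
)

msg <- tryCatch(assert_no_drift(mock_result),
                error = function(e) conditionMessage(e))
msg
#> [1] "2 output(s) failed the drift check: sigma [drifted], n_obs [missing]"
```

Treating `missing` as a failure is a design choice: an output that silently
disappears from the check is often as worrying as one that changed.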

---

## Recommended workflow

### At submission

```r
# `model`, `data`, and `compute_d()` are placeholders for your own objects.
certify(
  outputs = list(
    primary_coef = coef(model)[2],
    primary_pval = summary(model)$coefficients[2, 4],
    n            = nrow(data),
    effect_size  = compute_d(model)
  ),
  tag    = "submitted-2026-01-15",
  script = "main_analysis.R"
)
```

### After reviewer comments

```r
check_drift(
  outputs = list(
    primary_coef = coef(model)[2],
    primary_pval = summary(model)$coefficients[2, 4],
    n            = nrow(data),
    effect_size  = compute_d(model)
  ),
  against = "submitted-2026-01-15"
)
```
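
Both calls need exactly the same named list, so it can help to build it in one
place. The sketch below uses a hypothetical `key_outputs()` helper and `mtcars`
as stand-in data. The `reproducr` calls are left as comments and use only the
arguments shown earlier in this vignette.

```r
# Hypothetical helper: define the certified outputs once, reuse them twice.
key_outputs <- function(model, data) {
  list(
    primary_coef = coef(model)[2],
    primary_pval = summary(model)$coefficients[2, 4],
    n            = nrow(data)
  )
}

fit     <- lm(mpg ~ wt + cyl, data = mtcars)
outputs <- key_outputs(fit, mtcars)
names(outputs)
#> [1] "primary_coef" "primary_pval" "n"

# At submission:
# certify(outputs = outputs, tag = "submitted-2026-01-15",
#         script = "main_analysis.R")
# After reviewer comments, on the re-fitted model:
# check_drift(outputs = key_outputs(fit, mtcars),
#             against = "submitted-2026-01-15")
```

With a single definition, an output cannot be reported as `missing` or `new`
just because the two lists were edited separately.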

---

## Version control

Commit `.reproducr.rds` to your Git repository. This gives you a permanent,
auditable history of what each certified run produced, and lets you compare
against any past milestone.

Add the following line to `.gitattributes` so that Git treats the store as
binary and does not attempt noisy text diffs:

```
.reproducr.rds binary
```

```{r cleanup, include = FALSE}
# Remove the store whether or not an ".rds" extension was appended to the path.
unlink(c(cert_file, paste0(cert_file, ".rds")))
```