| Title: | Behavioural Reproducibility Auditing for R Projects |
| Version: | 0.2.0 |
| Description: | Audits R scripts for behavioural reproducibility risk. Scans scripts for qualified package::function calls and checks them against a curated database of known silent breaking changes across popular CRAN packages. Flags stochastic calls lacking set.seed() and detects locale-sensitive operations that may produce different results across systems. Supports baseline certification of analytical outputs so that silent numerical drift can be detected across package upgrades or platform changes. Generates human-readable audit reports suitable for academic submission or pharmaceutical QC workflows. For more details see https://github.com/repro-stats/reproducr. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Language: | en-GB |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 4.0.0) |
| Imports: | utils |
| Suggests: | digest (≥ 0.6.0), jsonlite, commonmark, testthat (≥ 3.0.0), knitr, rmarkdown, covr |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| URL: | https://github.com/repro-stats/reproducr |
| BugReports: | https://github.com/repro-stats/reproducr/issues |
| NeedsCompilation: | no |
| Packaged: | 2026-06-15 15:03:22 UTC; ndohpenn |
| Author: | Ndoh Penn |
| Maintainer: | Ndoh Penn <ndohpenn9@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-20 14:10:02 UTC |
reproducr: Behavioural Reproducibility Auditing for R Projects
Description
You finish an analysis. The code runs. The numbers look right. But are they stable?
reproducr makes behavioural reproducibility risks visible and trackable.
It scans your scripts for known silent breaking changes, flags stochastic
calls missing set.seed(), certifies analytical outputs as baselines, and
detects numerical drift across runs.
Workflow
Tier 1 – Scan & score
report <- audit_script("analysis.R")
risks <- risk_score(report)
print(risks)
Tier 2 – Baseline & drift
model <- lm(mpg ~ wt, data = mtcars)
certify(list(coefs = coef(model)), tag = "submission-v1")
# Later, after any environment change:
check_drift(list(coefs = coef(model)), against = "submission-v1")
Tier 3 – Report & export
repro_report(report, risks, format = "html", style = "pharma")
repro_badge(report, risks, output = "README")
Key functions
| Function | Purpose |
| audit_script() | Parse a script and extract all pkg::fn calls |
| risk_score() | Check calls against the breaking-changes database |
| certify() | Hash and store analytical outputs as a baseline |
| check_drift() | Compare current outputs against a stored baseline |
| list_certs() | List all certifications in a .reproducr file |
| repro_report() | Render a human-readable audit report |
| repro_badge() | Generate a reproducibility status badge |
| check_db_staleness() | Check database entries against current CRAN versions |
The breaking-changes database
The internal database covers known silent breaking changes in:
dplyr, tidyr, ggplot2, readr, purrr, stringr, broom,
data.table, lme4, lubridate, and base R. Community contributions
are welcome – see vignette("contributing-to-the-database").
The database is kept current via a weekly GitHub Actions workflow that
calls check_db_staleness() and opens an issue automatically
when any entry's to_version ceiling falls below the current CRAN release.
Author(s)
Maintainer: Ndoh Penn ndohpenn9@gmail.com (ORCID)
See Also
Useful links:
Report bugs at https://github.com/repro-stats/reproducr/issues
Audit an R script for reproducibility risks
Description
Parses one or more R source files and extracts every qualified
package::function call, resolving the installed version of each package.
The resulting audit_report object is the entry point for the rest of the
reproducr workflow.
Usage
audit_script(path = ".", renv = TRUE, verbose = TRUE)
## S3 method for class 'audit_report'
print(x, ...)
## S3 method for class 'audit_report'
summary(object, ...)
Arguments
path |
|
renv |
|
verbose |
|
x |
An |
... |
Additional arguments (currently unused). |
object |
An |
Value
An S3 object of class "audit_report", a list containing:
- calls: A data.frame with one row per detected pkg::fn call, columns file, line, pkg, fn, pkg_version.
- env: A list with R version, platform, OS, locale, and timezone.
- renv_used: logical – were versions sourced from a lockfile?
- timestamp: POSIXct timestamp of when the audit was run.
- paths: Character vector of files that were scanned.
Detection approach
audit_script() uses regular-expression matching on source text to extract
qualified calls of the form pkg::fn or pkg:::fn. It intentionally skips
comment lines (lines beginning with #, after trimming whitespace). For more
robust analysis, tools that operate on the parse tree (e.g. lintr) should
be used alongside reproducr.
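As an illustration of the approach described above, here is a minimal base-R sketch of regex-based extraction. It is not reproducr's implementation: the helper name `extract_qualified_calls()` and the exact pattern are assumptions made for this example, and, like any text-based matcher, it would also pick up `pkg::fn` text inside string literals.

```r
# Hypothetical helper: a sketch of regex-based extraction, not reproducr's code.
extract_qualified_calls <- function(lines) {
  # Package name, two or three colons, then a function name.
  pattern <- "[A-Za-z][A-Za-z0-9.]*:{2,3}[A-Za-z.][A-Za-z0-9._]*"
  out <- list()
  for (i in seq_along(lines)) {
    text <- trimws(lines[i])
    if (startsWith(text, "#")) next  # skip whole-line comments
    hits <- regmatches(text, gregexpr(pattern, text))[[1]]
    for (hit in hits) {
      parts <- strsplit(hit, ":{2,3}")[[1]]
      out[[length(out) + 1]] <- data.frame(
        line = i, pkg = parts[1], fn = parts[2], stringsAsFactors = FALSE
      )
    }
  }
  if (length(out) == 0) {
    return(data.frame(line = integer(), pkg = character(), fn = character()))
  }
  do.call(rbind, out)
}

calls <- extract_qualified_calls(c(
  "# dplyr::mutate() in a comment is ignored",
  "x <- dplyr::filter(mtcars, cyl == 4)",
  "y <- stats::rnorm(nrow(x))"
))
print(calls)
```

The commented first line contributes no rows, which mirrors the comment-skipping behaviour described above.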
What counts as a qualifying call?
Only qualified calls – those using :: or ::: – are detected.
Unqualified calls (e.g. filter(df, x > 0) without dplyr::) are not
detected because the package cannot be determined unambiguously from source
text alone. This is by design: qualifying calls is also a reproducibility
best practice.
See Also
risk_score() to check detected calls against the
breaking-changes database; repro_report() to render the
full audit; certify() to lock a set of outputs as a baseline.
Examples
# Write a temporary script to audit
script <- tempfile(fileext = ".R")
writeLines(c(
  "set.seed(237)",
  "x <- dplyr::filter(mtcars, cyl == 4)",
  "y <- dplyr::summarise(x, mean_mpg = mean(mpg))",
  "z <- stats::rnorm(nrow(y))"
), script)
report <- audit_script(script, renv = FALSE, verbose = FALSE)
print(report)
# See the detected calls as a data frame
report$calls
Certify analytical outputs as a reproducibility baseline
Description
Hashes a named list of R objects (model coefficients, summary statistics,
key scalars, data frames) and saves them alongside full environment metadata
to a local certification file (.reproducr.rds by default). Later runs
can call check_drift() to verify that results have not changed.
Think of certify() as a "signed receipt" for a completed analysis run.
Usage
certify(outputs, tag, script = NULL, file = ".reproducr")
Arguments
outputs |
A fully named list of R objects to certify. Each element is
hashed using SHA-256 (or a base-R fallback if |
tag |
|
script |
|
file |
|
Value
Invisibly returns the certification record (a list). Prints a one-line summary to the console.
Certification store
All certifications for a project are accumulated in a single .reproducr.rds
file. You can have multiple tags representing different stages (e.g. before
and after peer review). Use list_certs() to inspect stored tags.
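The idea of accumulating several tagged baselines in one file can be sketched in base R as follows. This is an illustrative sketch only: `hash_object()` and `store_baseline()` are hypothetical helpers, the stored fields are assumptions, and MD5 via `tools::md5sum()` stands in for the SHA-256 hashing mentioned in the arguments.

```r
# Hypothetical helpers; MD5 via tools::md5sum() stands in for SHA-256.
hash_object <- function(x) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(x, NULL), tmp)  # hash the serialized bytes of the object
  unname(tools::md5sum(tmp))
}

store_baseline <- function(outputs, tag, file) {
  # Every output must be named so it can be matched up later.
  stopifnot(!is.null(names(outputs)), all(nzchar(names(outputs))))
  store <- if (file.exists(file)) readRDS(file) else list()
  store[[tag]] <- list(
    hashes    = vapply(outputs, hash_object, character(1)),
    r_version = R.version.string,
    timestamp = Sys.time()
  )
  saveRDS(store, file)  # one file accumulates all tags
  invisible(store[[tag]])
}

store_file <- tempfile(fileext = ".rds")
model <- lm(mpg ~ wt, data = mtcars)
store_baseline(list(coefs = coef(model)), tag = "before-review", file = store_file)
store_baseline(list(coefs = coef(model)), tag = "after-review", file = store_file)
stored <- readRDS(store_file)
names(stored)
```

Because both tags were created from the same coefficients in the same session, their stored hashes are identical.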
Version control
Commit .reproducr.rds to your project's version control repository.
This makes the certification auditable and shareable with collaborators.
See Also
check_drift() to compare current outputs against a baseline;
list_certs() to inspect stored certifications.
Examples
model <- lm(mpg ~ wt, data = mtcars)
cert_file <- tempfile()
certify(
  outputs = list(
    coefs = coef(model),
    r_squared = summary(model)$r.squared,
    n_obs = nrow(mtcars)
  ),
  tag = "baseline-v1",
  script = "analysis.R",
  file = cert_file
)
# See what is stored
list_certs(file = cert_file)
Check whether breaking-changes database entries are stale
Description
Compares the to_version ceiling and from_version floor of each entry
in the breaking-changes database against the current version of that
package on CRAN. Two types of staleness are detected:
- stale_ceiling – the package has released a new version above the to_version ceiling. The window may need extending.
- stale_floor – the current CRAN version is so far ahead of from_version that the window captures users who are already well past the breaking-change transition. The entry may need closing or the from_version floor raising.
This function is primarily intended for use by reproducr maintainers
and contributors. It is also run as a scheduled GitHub Actions workflow
on the reproducr repository to automatically open issues when staleness
is detected.
Usage
check_db_staleness(
  packages = NULL,
  verbose = TRUE,
  source = "cran",
  from_version_major_threshold = 1L
)
Arguments
packages |
|
verbose |
|
source |
Default |
from_version_major_threshold |
|
Value
A data.frame of class c("staleness_report", "data.frame")
with one row per database entry. Columns:
- key: The pkg::fn key.
- pkg: Package name.
- fn: Function name.
- from_version: The floor version currently in the database.
- to_version: The ceiling version currently in the database.
- current_version: The current version on CRAN or installed.
- status: One of "ok", "stale_ceiling", "stale_floor", or "unknown".
- gap: Description of the version gap. NA when status is "ok" or "unknown".
Rows are ordered: stale_ceiling first, stale_floor second, then ok, then unknown.
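To make the four status values concrete, the sketch below classifies a single entry using base R's `package_version()`. The helper `classify_staleness()` is hypothetical. In particular, the stale_floor rule (a major-version gap of at least `major_threshold` between the current version and from_version) is an assumption inferred from the argument name `from_version_major_threshold`, not a documented rule.

```r
# Hypothetical helper; the stale_floor rule below is an assumption.
classify_staleness <- function(from_version, to_version, current_version,
                               major_threshold = 1L) {
  if (is.na(current_version)) return("unknown")  # version could not be resolved
  current <- package_version(current_version)
  # A release above the ceiling means the window may need extending.
  if (current > package_version(to_version)) return("stale_ceiling")
  # Assumed rule: a large major-version gap means the floor may be too old.
  major_gap <- current$major - package_version(from_version)$major
  if (major_gap >= major_threshold) return("stale_floor")
  "ok"
}

classify_staleness("1.0.0", "1.1.0", "1.1.4")
classify_staleness("0.8.0", "2.0.0", "1.1.4")
classify_staleness("1.0.0", "1.2.0", "1.1.4")
classify_staleness("1.0.0", "1.2.0", NA_character_)
```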
See Also
risk_score() which uses the database at runtime;
vignette("contributing-to-the-database") for the database schema and
version window design principles.
Examples
# Check all tracked packages against CRAN
report <- check_db_staleness()
print(report)
# Check specific packages only
check_db_staleness(packages = c("dplyr", "tidyr"))
# Offline check using installed versions
check_db_staleness(source = "installed")
# Filter to stale entries only
report <- check_db_staleness()
report[report$status != "ok", ]
Check analytical outputs for drift against a certified baseline
Description
Re-hashes a set of named R objects and compares them against a previously
stored certification. Reports which outputs are unchanged ("ok"), have
changed ("drifted"), are present in the baseline but not supplied
("missing"), or are new outputs not in the baseline ("new").
Usage
check_drift(
  outputs,
  against = "latest",
  file = ".reproducr",
  tolerance = 1e-10
)
Arguments
outputs |
A fully named list of current R objects – the same names used
in the |
against |
|
file |
|
tolerance |
|
Value
Invisibly returns a data.frame of class
c("drift_report", "data.frame") with columns output, status
("ok", "drifted", "missing", "new"), max_delta, and note.
Also emits a summary via message().
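The four statuses and the max_delta column can be illustrated with a small base-R comparison. This is a sketch under stated assumptions: `compare_outputs()` is a hypothetical helper that compares values directly, whereas check_drift() works from stored hashes, and how the package combines hashing with `tolerance` is not specified here.

```r
# Hypothetical helper: compares values directly rather than stored hashes.
compare_outputs <- function(baseline, current, tolerance = 1e-10) {
  all_names <- union(names(baseline), names(current))
  rows <- lapply(all_names, function(nm) {
    status <- "ok"
    max_delta <- NA_real_
    if (!nm %in% names(current)) {
      status <- "missing"                 # in the baseline, not supplied now
    } else if (!nm %in% names(baseline)) {
      status <- "new"                     # supplied now, not in the baseline
    } else if (is.numeric(baseline[[nm]]) && is.numeric(current[[nm]]) &&
               length(baseline[[nm]]) == length(current[[nm]])) {
      max_delta <- max(abs(as.numeric(baseline[[nm]]) - as.numeric(current[[nm]])))
      if (max_delta > tolerance) status <- "drifted"
    } else if (!identical(baseline[[nm]], current[[nm]])) {
      status <- "drifted"                 # non-numeric or shape change
    }
    data.frame(output = nm, status = status, max_delta = max_delta,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

baseline <- list(coefs = c(37.285, -5.344), n_obs = 32L, label = "v1")
current  <- list(coefs = c(37.285, -5.344 + 1e-6), n_obs = 32L, extra = 1)
result <- compare_outputs(baseline, current)
print(result)
```

Here coefs differs by about 1e-6, which exceeds the default tolerance, so it is reported as drifted; raising `tolerance` to 1e-5 would report it as ok.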
See Also
certify() to create a baseline; list_certs() to see available
tags.
Examples
cert_file <- tempfile()
model <- lm(mpg ~ wt, data = mtcars)
certify(list(coefs = coef(model)), tag = "v1", file = cert_file)
# Same outputs -- should report "ok"
result <- check_drift(list(coefs = coef(model)),
  against = "v1", file = cert_file
)
print(result)
# Different model -- should report "drifted"
model2 <- lm(mpg ~ hp, data = mtcars)
check_drift(list(coefs = coef(model2)),
  against = "v1", file = cert_file
)
List all certifications stored in a certification file
Description
A convenience function to inspect what certification tags are stored and
their key metadata, without needing to read the raw .rds file.
Usage
list_certs(file = ".reproducr")
Arguments
file |
|
Value
A data.frame with columns tag, timestamp, r_version,
os, n_outputs, script – one row per certification.
Returns an empty data frame if no certifications exist.
Examples
cert_file <- tempfile()
model <- lm(mpg ~ wt, data = mtcars)
certify(list(coefs = coef(model)), tag = "v1", file = cert_file)
certify(list(coefs = coef(model)), tag = "v2", file = cert_file)
list_certs(file = cert_file)
Generate a reproducibility status badge
Description
Produces a shields.io Markdown badge reflecting the current reproducibility status of a project. The badge is colour-coded:
- Green (reproducible) – no risks detected.
- Yellow (caution) – medium-severity risks only.
- Red (at risk) – one or more high-severity risks or drifted outputs.
- Grey (unknown) – no risk information supplied.
Can be inserted automatically into a README.md (e.g. from a GitHub
Actions workflow).
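For readers unfamiliar with shields.io static badges, the sketch below shows how a status could be turned into badge Markdown. It is illustrative only: `badge_markdown()` is a hypothetical helper, the colour names are ordinary shields.io colours chosen for this example, and treating low-severity-only findings as "caution" is an assumption, since the list above does not cover that case.

```r
# Hypothetical helper; low-severity-only findings map to "caution" by assumption.
badge_markdown <- function(risk_levels = NULL, any_drift = FALSE) {
  status <- if (any_drift || "high" %in% risk_levels) {
    "at risk"
  } else if (is.null(risk_levels)) {
    "unknown"        # no risk information supplied
  } else if (length(risk_levels) > 0) {
    "caution"
  } else {
    "reproducible"   # risk information supplied and empty
  }
  colour <- c(reproducible = "brightgreen", caution = "yellow",
              `at risk` = "red", unknown = "lightgrey")[[status]]
  # shields.io static badge paths encode spaces as underscores.
  message_part <- gsub(" ", "_", status, fixed = TRUE)
  url <- sprintf("https://img.shields.io/badge/reproducibility-%s-%s",
                 message_part, colour)
  sprintf("![reproducibility: %s](%s)", status, url)
}

cat(badge_markdown(character(0)), "\n")
cat(badge_markdown(c("low", "high")), "\n")
```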
Usage
repro_badge(
  audit,
  risks = NULL,
  drift = NULL,
  output = "markdown",
  readme_path = "README.md"
)
Arguments
audit |
An |
risks |
A |
drift |
A |
output |
|
readme_path |
|
Value
Invisibly returns the badge Markdown string.
See Also
repro_report(), risk_score(),
check_drift()
Examples
script <- tempfile(fileext = ".R")
writeLines("x <- dplyr::filter(mtcars, cyl == 4)", script)
report <- audit_script(script, renv = FALSE, verbose = FALSE)
risks <- risk_score(report)
badge <- repro_badge(report, risks)
cat(badge)
Generate a human-readable reproducibility report
Description
Renders a reproducibility audit report from an audit_script() result
and optionally a risk_score() result and check_drift() result. Three
style presets are available:
- "minimal" – compact summary suitable for console review or internal project documentation.
- "academic" – generates a ready-to-paste methods paragraph for journal submissions, listing all packages with versions and summarising risk findings.
- "pharma" – structured QC document with a risk register and sign-off fields, suitable for pharmaceutical or regulated analytical workflows.
Usage
repro_report(
  audit,
  risks = NULL,
  drift = NULL,
  format = "text",
  style = "minimal",
  output_file = NULL
)
Arguments
audit |
An |
risks |
A |
drift |
A |
format |
|
style |
|
output_file |
|
Value
Invisibly returns the report content as a character string. For file-based formats, the file is also written to disk.
See Also
audit_script(), risk_score(),
check_drift(), repro_badge()
Examples
script <- tempfile(fileext = ".R")
writeLines(c(
  "set.seed(237)",
  "x <- dplyr::filter(mtcars, cyl == 4)",
  "y <- stats::rnorm(10)"
), script)
report <- audit_script(script, renv = FALSE, verbose = FALSE)
risks <- risk_score(report)
# Console summary
repro_report(report, risks, format = "text", style = "minimal")
# Academic methods paragraph (printed, not written to file)
cat(repro_report(report, risks, format = "text", style = "academic"))
Score function calls for reproducibility risk
Description
Takes an audit_report and checks every detected pkg::fn call against
three independent checks:
- "changelog" – matches against a curated database of known breaking changes in popular CRAN packages, flagging calls where the installed version falls in a known-risky version window.
- "seed_check" – flags stochastic functions (rnorm, sample, etc.) where no set.seed() appears within 50 lines above the call.
- "locale_check" – flags functions whose output is locale-sensitive (sort(), format(), tolower(), etc.).
Usage
risk_score(
  audit,
  methods = c("changelog", "seed_check", "locale_check"),
  min_risk = "low",
  major_version_grace = 1L
)
## S3 method for class 'risk_report'
print(x, ...)
## S3 method for class 'risk_report'
as.data.frame(x, ...)
## S3 method for class 'risk_report'
x[i, j, ...]
Arguments
audit |
An |
methods |
|
min_risk |
|
major_version_grace |
|
x |
A |
... |
Additional arguments (currently unused). |
i |
Row index. |
j |
Column index. When columns are subsetted and required columns are
removed, the |
Value
A data.frame of class c("risk_report", "data.frame") with one
row per flagged call. Columns:
- file: Source file path.
- line: Line number of the call.
- call: The pkg::fn string.
- pkg_version: Installed or lockfile-resolved version.
- risk: "high", "medium", or "low".
- check: Which check flagged it: "changelog", "seed_check", or "locale_check".
- description: Plain-English explanation of the risk.
- reference: URL to the relevant changelog or documentation.
Rows are ordered by risk severity (high first), then by file and line. If no risks are found, an empty data frame with the same columns is returned.
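The "seed_check" rule from the Description (no set.seed() within 50 lines above a stochastic call) can be sketched in base R as follows. This is not the package's code: `flag_unseeded()` is a hypothetical helper, and its short list of stochastic functions is illustrative only.

```r
# Hypothetical helper: returns line numbers of stochastic calls that have no
# set.seed() in the preceding `window` lines. The function list is illustrative.
flag_unseeded <- function(lines, window = 50L,
                          stochastic = c("rnorm", "runif", "sample")) {
  seed_lines <- grep("set\\.seed\\s*\\(", lines, perl = TRUE)
  call_pattern <- paste0("\\b(", paste(stochastic, collapse = "|"), ")\\s*\\(")
  call_lines <- grep(call_pattern, lines, perl = TRUE)
  unseeded <- vapply(call_lines, function(i) {
    # Seeded only if some set.seed() sits strictly above, within the window.
    !any(seed_lines < i & seed_lines >= i - window)
  }, logical(1))
  call_lines[unseeded]
}

script_lines <- c(
  "set.seed(1)",
  "a <- stats::rnorm(5)",
  rep("# filler", 60),
  "b <- sample(1:10)"
)
flag_unseeded(script_lines)
```

The rnorm() call on line 2 is covered by the seed on line 1, while the sample() call on line 63 is more than 50 lines below it and is therefore flagged.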
Version windows
The "changelog" check uses a half-open version window (from_ver, to_ver]:
a call is flagged only if the installed version is greater than
from_ver and at most to_ver. This means the risk is scoped to
versions where the breaking change is known to apply.
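The half-open window can be expressed directly with base R's `package_version()` comparisons. The helper name `in_risk_window()` is hypothetical and the version numbers are made up for illustration.

```r
# Hypothetical helper: TRUE when from_ver < installed <= to_ver.
in_risk_window <- function(installed, from_ver, to_ver) {
  v <- package_version(installed)
  v > package_version(from_ver) && v <= package_version(to_ver)
}

in_risk_window("1.1.0", from_ver = "1.0.0", to_ver = "1.1.4")  # inside the window
in_risk_window("1.0.0", from_ver = "1.0.0", to_ver = "1.1.4")  # floor is excluded
in_risk_window("1.1.4", from_ver = "1.0.0", to_ver = "1.1.4")  # ceiling is included
```

Using `package_version()` rather than string comparison matters because "1.10.0" sorts before "1.9.0" as text but after it as a version.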
Major version grace
When an installed version is major_version_grace or more major versions
ahead of from_version, the entry is suppressed entirely. The user is
already past the breaking-change transition – flagging it at any severity
would be a false positive. The database staleness check
(check_db_staleness()) handles the maintenance concern of
identifying entries whose from_version floor is too old.
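A minimal sketch of the grace rule, assuming it compares only the major component of each version: `past_grace()` is a hypothetical helper, not part of the package, and the version numbers are made up.

```r
# Hypothetical helper: TRUE when the installed version is at least
# `major_version_grace` major versions ahead of from_version.
past_grace <- function(installed, from_version, major_version_grace = 1L) {
  gap <- package_version(installed)$major - package_version(from_version)$major
  gap >= major_version_grace
}

past_grace("2.0.1", "1.0.0")  # one major version ahead: entry would be suppressed
past_grace("1.4.0", "1.0.0")  # same major version: entry still applies
```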
See Also
audit_script() to generate the input;
repro_report() to render the results;
check_db_staleness() to identify database entries with
windows that are too wide.
Examples
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- dplyr::summarise(mtcars, n = dplyr::n())",
  "y <- stats::rnorm(100)",
  "z <- base::sort(letters)"
), script)
report <- audit_script(script, renv = FALSE, verbose = FALSE)
risks <- risk_score(report)
print(risks)
# High-severity items only
risk_score(report, min_risk = "high")
# Only the changelog check
risk_score(report, methods = "changelog")