Once risks is released on CRAN,
`install.packages("risks")`

will be available. Currently, the
development version of `risks`

can be installed from GitHub using:

We use a cohort of women with breast cancer, as used by Spiegelman
and Hertzmark (Am J
Epidemiol 2005) and Greenland (Am J Epidemiol
2004). The the categorical exposure is `stage`

, the
binary outcome is `death`

, and the binary confounder is
`receptor`

.

```
library(risks) # provides riskratio(), riskdiff(), postestimation functions
library(dplyr) # For data handling
library(broom) # For tidy() model summaries
data(breastcancer) # Load sample data
breastcancer %>% # Display the sample data
group_by(receptor, stage) %>%
summarize(deaths = sum(death), total = n(), risk = deaths/total)
#> # A tibble: 6 × 5
#> # Groups: receptor [2]
#> receptor stage deaths total risk
#> <chr> <fct> <dbl> <int> <dbl>
#> 1 High Stage I 5 55 0.0909
#> 2 High Stage II 17 74 0.230
#> 3 High Stage III 9 15 0.6
#> 4 Low Stage I 2 12 0.167
#> 5 Low Stage II 9 22 0.409
#> 6 Low Stage III 12 14 0.857
```

The risk of death is higher among women with higher-stage and hormone
receptor-low cancers, which also tend to be of higher stage. Using
`risks`

models to obtain (possibly multivariable-adjusted)
risk ratios or risk differences is similar to the standard code for
logistic models in R. As customary in R, log(RR) is returned; see below
for how to obtain exponentiated values. No options for model
`family`

or `link`

need to be supplied:

```
fit_rr <- riskratio(formula = death ~ stage + receptor, data = breastcancer)
summary(fit_rr)
#>
#> Risk ratio model, fitted via marginal standardization of a logistic model with delta method (margstd_delta).
#> Call:
#> stats::glm(formula = death ~ stage + receptor, family = binomial(link = "logit"),
#> data = breastcancer, start = "(no starting values)")
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#>
#>
#> Coefficients: (3 not defined because of singularities)
#> Estimate Std. Error z value Pr(>|z|)
#> stageStage I 0.0000 0.0000 NaN NaN
#> stageStage II 0.8989 0.3875 2.320 0.0203 *
#> stageStage III 1.8087 0.3783 4.781 1.75e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 228.15 on 191 degrees of freedom
#> Residual deviance: 185.88 on 188 degrees of freedom
#> AIC: 193.88
#>
#> Number of Fisher Scoring iterations: 4
#>
#> Confidence intervals for coefficients: (delta method)
#> 2.5 % 97.5 %
#> stageStage I 0.0000000 0.000000
#> stageStage II 0.1395299 1.658324
#> stageStage III 1.0671711 2.550242
```

To obtain risk differences, use `riskdiff`

, which has the
same syntax:

```
fit_rd <- riskdiff(formula = death ~ stage + receptor, data = breastcancer)
summary(fit_rd)
#>
#> Risk difference model, fitted via marginal standardization of a logistic model with delta method (margstd_delta).
#> Call:
#> stats::glm(formula = death ~ stage + receptor, family = binomial(link = "logit"),
#> data = breastcancer, start = "(no starting values)")
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#>
#>
#> Coefficients: (3 not defined because of singularities)
#> Estimate Std. Error z value Pr(>|z|)
#> stageStage I 0.00000 0.00000 NaN NaN
#> stageStage II 0.16303 0.05964 2.734 0.00626 **
#> stageStage III 0.57097 0.09962 5.732 9.95e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 228.15 on 191 degrees of freedom
#> Residual deviance: 185.88 on 188 degrees of freedom
#> AIC: 193.88
#>
#> Number of Fisher Scoring iterations: 4
#>
#> Confidence intervals for coefficients: (delta method)
#> 2.5 % 97.5 %
#> stageStage I 0.00000000 0.0000000
#> stageStage II 0.04614515 0.2799187
#> stageStage III 0.37571719 0.7662158
```

For example, the risk of death was 57 percentage points higher in women with stage III breast cancer compared to stage I (95% confidence interval, 38 to 77 percentage points), adjusting for hormone receptor status.

The model summary in `riskratio()`

and
`riskdiff()`

includes to two additions compared to a regular
`glm()`

model:

- In the first line of
`summary()`

, the type of risk regression model is displayed (in the example, “`Risk ratio model, fitted as binomial model...`

”). - At the end of the output, confidence intervals for the model coefficients are printed.

The risks package provides an interface to
`broom::tidy()`

, which returns a data frame of all
coefficients (risk differences in this example), their standard errors,
*z* statistics, and confidence intervals.

```
tidy(fit_rd)
#> # A tibble: 3 × 8
#> term estimate std.error statistic p.value conf.low conf.high model
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 stageStage I 0 0 NaN NaN 0 0 marg…
#> 2 stageStage II 0.163 0.0596 2.73 6.26e-3 0.0461 0.280 marg…
#> 3 stageStage III 0.571 0.0996 5.73 9.95e-9 0.376 0.766 marg…
```

In accordance with `glm()`

standards, coefficients for
relative risks are shown on the logarithmic scale. Exponentiated
coefficients for risk ratios are easily obtained via
`tidy(..., exponentiate = TRUE)`

:

```
tidy(fit_rr, exponentiate = TRUE)
#> # A tibble: 3 × 8
#> term estimate std.error statistic p.value conf.low conf.high model
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 stageStage I 1 0 NaN NaN 1 1 marg…
#> 2 stageStage II 2.46 0.387 2.32 2.03e-2 1.15 5.25 marg…
#> 3 stageStage III 6.10 0.378 4.78 1.75e-6 2.91 12.8 marg…
```

For example, the risk of death was 6 times higher in women with stage III breast cancer compared to stage I (95% confidence interval, 3 to 13 times), adjusting for hormone receptor status.

Typical R functions that build on regression models can further
process fitted `risks`

models. In addition to
`tidy()`

, examples are:

`coef(fit)`

or`coefficients(fit)`

return model coefficients (*i.e.*, RDs or log(RR)) as a numeric vector`confint(fit, level = 0.9)`

returns*90%*confidence intervals.`predict(fit, type = "response")`

or`fitted.values(fit)`

return predicted probabilities of the binary outcome.`residuals(fit)`

returns model residuals.