This vignette features functions that are not covered in other vignettes.
library(actxps)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
The pol_() family of functions calculates policy years, months, quarters,
weeks, or any other arbitrary duration. Each function accepts a vector of
dates and a vector of issue dates.
Example: assume a policy was issued on 2022-05-10 and we are interested in calculating various policy duration values at the end of calendar years 2022-2032.
dates <- ymd("2022-12-31") + years(0:10)
# policy years
pol_yr(dates, "2022-05-10")
#> [1] 1 2 3 4 5 6 7 8 9 10 11
# policy quarters
pol_qtr(dates, "2022-05-10")
#> [1] 3 7 11 15 19 23 27 31 35 39 43
# policy months
pol_mth(dates, "2022-05-10")
#> [1] 8 20 32 44 56 68 80 92 104 116 128
# policy weeks
pol_wk(dates, "2022-05-10")
#> [1] 34 86 139 191 243 295 347 399 452 504 556
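The calendar-based durations above can be cross-checked by hand. The following is a minimal base R sketch, not the package's implementation: `months_elapsed()` is a hypothetical helper, and it assumes that durations start at 1 on the issue date and ignores end-of-month edge cases (for example, an issue date on the 31st).

```r
# Hypothetical helper (not part of actxps): completed months between two dates
months_elapsed <- function(date, issue_date) {
  d <- as.POSIXlt(as.Date(date))
  i <- as.POSIXlt(as.Date(issue_date))
  whole <- (d$year - i$year) * 12 + (d$mon - i$mon)
  # subtract one month if the monthly anniversary has not been reached yet
  whole - (d$mday < i$mday)
}

# year-end dates for 2022 through 2032, built without lubridate
dates <- seq(as.Date("2022-12-31"), by = "year", length.out = 11)
m <- months_elapsed(dates, "2022-05-10")

m + 1         # policy months
m %/% 3 + 1   # policy quarters
m %/% 12 + 1  # policy years
```

For this issue date, the three vectors agree with the pol_mth(), pol_qtr(), and pol_yr() results shown above.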
The more general pol_interval() function calculates any arbitrary duration.
This function has a third argument where the length of the policy duration
can be specified. This argument must be a period object. See
lubridate::period() for more information.
# days
pol_interval(dates, "2022-05-10", days(1))
#> [1] 236 601 967 1332 1697 2062 2428 2793 3158 3523 3889
# fortnights
pol_interval(dates, "2022-05-10", weeks(2))
#> [1] 17 43 70 96 122 148 174 200 226 252 278
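For fixed-length periods such as days, weeks, or fortnights, the results are consistent with integer division of the elapsed days, plus one. The sketch below uses base R only and is an illustration under that assumption, not a description of how pol_interval() works internally.

```r
issue <- as.Date("2022-05-10")
dates <- seq(as.Date("2022-12-31"), by = "year", length.out = 11)

# whole days elapsed since the issue date
days_elapsed <- as.integer(dates - issue)

days_elapsed + 1          # policy days
days_elapsed %/% 7 + 1    # policy weeks
days_elapsed %/% 14 + 1   # policy fortnights
```

These vectors match the pol_interval() and pol_wk() outputs shown above for the same dates.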
The add_predictions() function attaches predictions from any model with a
predict() method.
Below, a very simple logistic regression model is fit to surrender
experience in the first ten policy years. Predictions from this model
are then added to exposure records using add_predictions().
This function only requires a data frame of exposure records and a model
with a predict() method. Often, it is necessary to specify additional
model-specific arguments like type to ensure predict() returns the desired
output. In the example below, type is set to “response” to return
probabilities instead of the default predictions on the log-odds scale.
The col_expected argument is used to rename the column(s) containing
predicted values. If no names are specified, the default name is “expected”.
# create exposure records
exposed_data <- expose(census_dat, end_date = "2019-12-31",
                       target_status = "Surrender") |>
  filter(pol_yr <= 10) |>
  # add a response column for surrenders
  mutate(surrendered = status == "Surrender")
# create a simple logistic model
mod <- glm(surrendered ~ pol_yr, data = exposed_data,
           family = "binomial", weights = exposure)
exp_res <- exposed_data |>
  # attach predictions
  add_predictions(mod, type = "response", col_expected = "logistic") |>
  # summarize results
  group_by(pol_yr) |>
  exp_stats(expected = "logistic")
# create a plot
plot_termination_rates(exp_res)
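To see what type = "response" changes for a logistic model, here is a small sketch that uses only base R and the built-in mtcars data. The model is an arbitrary stand-in chosen for illustration and is unrelated to the surrender model above.

```r
# arbitrary illustrative model: transmission type as a function of weight
toy_mod <- glm(am ~ wt, data = mtcars, family = "binomial")

log_odds <- predict(toy_mod)                     # default, type = "link"
probs    <- predict(toy_mod, type = "response")  # probability scale

# applying the inverse logit to the log-odds recovers the probabilities
all.equal(plogis(log_odds), probs)
range(probs)
```

Without type = "response", the values attached to the exposure records would be log-odds, which cannot be compared directly with observed termination rates.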
In addition, for users of the tidymodels framework, the actxps package
includes a recipe step function, step_expose(), that can apply the expose()
function during data preprocessing.
library(recipes)
recipe(~ ., data = census_dat) |>
step_expose(end_date = "2019-12-31", target_status = "Surrender")
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> predictor: 11
#>
#> ── Operations
#> • Exposed data based on policy years for target status Surrender: <none>