---
title: "Evaluating performance in time-to-event data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{time-to-event}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


```{r setup}
library(ipeval)
library(survival)
```

This vignette demonstrates how to estimate counterfactual performance metrics in right censored survival data. 

Just like in the binary outcome, an IP-weighted pseudopopulation is created which represents a situation in which everybody gets the treatment level of interest. On top of that, inverse probability of censoring weights are used such that the pseudopopulation also represents a situation where nobody gets censored. 

## Example 1, non informative censoring

We simulate time-to-event data, where approximately half of patients get censored. In this example, we use a more complex causal structure, with two confounders ($L1$ and $L2$) and two prognostic variables ($P1$ and $P2$). $L1$ and $P1$ are standard normal variables, and $L2$ and $P2$ are binary variables. 

```{r}
simulate_time_to_event <- function(n, constant_baseline_haz, LP) {
  u <- runif(n)
  -log(u) / (constant_baseline_haz * exp(LP))
}

simulate_data <- function(n, seed) {
  set.seed(seed)
  data <- data.frame(
    L1 = rnorm(n),
    L2 = rbinom(n, 1, 0.5),
    P1 = rnorm(n),
    P2 = rbinom(n, 1, 0.5)
  )
  data$A <- rbinom(n, 1, plogis(0.2 + 1.2*data$L1 - 0.3*data$L2))
  
  LP <- 0.8*data$L1 + 0.3*data$L2 + 0.5*data$P1 + 0.7*data$P2
  
  # time_0 is the uncensored untreated survival time,
  # time_1 is the uncensored treated   survival time 
  data$time_0 <- simulate_time_to_event(n, 0.04, LP)
  data$time_1 <- simulate_time_to_event(n, 0.04, LP - 0.5)
  
  # time_A is the uncensored survival time corresponding to assigned treatment 
  data$time_A <- ifelse(data$A == 1, data$time_1, data$time_0)
  
  # uninformative censoring
  data$censor_time <- simulate_time_to_event(n, 0.05, 0)
  
  # in practice, we only observe (time, status):
  data$status <- ifelse(data$time_A <= data$censor_time, TRUE, FALSE)
  data$time <- ifelse(data$status == TRUE,
                      data$time_A,
                      data$censor_time)
  return(data)
}

df_dev <- simulate_data(n = 10000, seed = 123)

summary(df_dev$time)
summary(df_dev$status)
```
Fit some models to validate:

```{r}
model_naive <- coxph(
  formula = Surv(time, status) ~ P1 + P2 + A,
  data = df_dev
)

coefficients(model_naive)

trt_model <- glm(A ~ L1 + L2, family = "binomial", data = df_dev)
propensity_score <- predict(trt_model, type = "response")

df_dev$iptw <- 1 / ifelse(df_dev$A == 1, propensity_score, 1 - propensity_score)

model_causal <- coxph(
  formula = Surv(time, status) ~ P1 + P2 + A,
  data = df_dev,
  weights = iptw
)

coefficients(model_causal)
```

The naive model estimates higher risk for treated patients than for untreated patients. The causal model correctly infers that treatment benefits patients (as the coefficient for $A$ is negative). Note that the 'true' effect of $A$ was generated within a model that also conditions on $L1$ and $L2$. Due to non-collapsibility, the estimated coefficient is not expected to coincide with the effect used in the data-generating mechanism.

We also simulate some validation data. In this example, we use the same data generating mechanism. We are interested in the predictive performance if every patient were to be assigned to treatment and remained uncensored, with a prediction horizon of 5 years. 

```{r}
df_val <- simulate_data(n = 10000, seed = 234)
```

To account for the time to event data, we specify a survival object as the outcome. As we have non-informative censoring in this example, we specify our censoring model to be estimated with Kaplan-Meier:

```{r, fig.width=6, fig.height=6}
cfs <- ip_score(
  object = list("naive model" = model_naive, "causal model" = model_causal),
  data = df_val,
  outcome = Surv(time, status),
  treatment_formula = A ~ L1 + L2,
  treatment_of_interest = 1,
  time_horizon = 5,
  cens_model = "KM"
)
cfs
```

## Example 2, informative censoring
We can also account for informative censoring. In this example, we keep the same prediction model as in example 1, but add an informative-censoring mechanism to the validation dataset.

```{r}
df_val$censortime <- simulate_time_to_event(10000, 0.04, 0.1*df_val$L1 + 0.5*df_val$P2)
summary(df_val$censortime)

df_val$status <- ifelse(df_val$time_A <= df_val$censortime, TRUE, FALSE)
df_val$time <- ifelse(df_val$status == TRUE,
                      df_val$time_A,
                      df_val$censortime)

summary(df_val$time)
summary(df_val$status)
```

We then set censoring model to "cox", and supply the formula needed for the censoring model as the right hand side of cens_formula. Internally, the censoring distribution is estimated with this expression, from which the inverse probability of **censoring** weights are computed.

```{r, fig.width=6, fig.height=6}
ip_score(
  object = list("naive model" = model_naive, "causal model" = model_causal),
  data = df_val,
  outcome = Surv(time, status),
  treatment_formula = A ~ L1 + L2,
  treatment_of_interest = 1,
  time_horizon = 5,
  cens_model = "cox",
  cens_formula = ~ L1 + P2,
  bootstrap = 100,
  bootstrap_progress = FALSE
)
```

Note that the performance metrics we found in this example are approximately equal to the setting with non informative censoring. This is reassuring, as we correctly adjust for the informative censoring mechanism in this setting.