% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/Coxmos_coxEN.R
\name{cv.coxEN}
\alias{cv.coxEN}
\title{coxEN Cross-Validation}
\usage{
cv.coxEN(
  X,
  Y,
  EN.alpha.list = seq(0, 1, 0.2),
  max.variables = NULL,
  n_run = 3,
  k_folds = 10,
  x.center = TRUE,
  x.scale = FALSE,
  remove_near_zero_variance = TRUE,
  remove_zero_variance = TRUE,
  toKeep.zv = NULL,
  remove_variance_at_fold_level = FALSE,
  remove_non_significant_models = FALSE,
  remove_non_significant = FALSE,
  alpha = 0.05,
  w_AIC = 0,
  w_C.Index = 0,
  w_AUC = 1,
  w_I.BRIER = 0,
  times = NULL,
  max_time_points = 15,
  MIN_AUC_INCREASE = 0.01,
  MIN_AUC = 0.8,
  MIN_COMP_TO_CHECK = 3,
  pred.attr = "mean",
  pred.method = "cenROC",
  fast_mode = FALSE,
  MIN_EPV = 5,
  return_models = FALSE,
  returnData = FALSE,
  PARALLEL = FALSE,
  n_cores = NULL,
  verbose = FALSE,
  seed = 123
)
}
\arguments{
\item{X}{Numeric matrix or data.frame. Explanatory variables. Qualitative variables must be
transformed into binary variables.}

\item{Y}{Numeric matrix or data.frame. Response variables. Object must have two columns named
"time" and "event". For the event column, accepted values are 0/1 or FALSE/TRUE for censored and
event observations, respectively.}

\item{EN.alpha.list}{Numeric vector. Elastic net mixing parameter values to test in cross
validation. EN.alpha = 1 is the lasso penalty, and EN.alpha = 0 the ridge penalty
(default: seq(0,1,0.2)).}

\item{max.variables}{Numeric. Maximum number of variables to keep in the cox model. If
NULL, the number of columns of the X matrix is used. When MIN_EPV is not met, the value is
changed automatically (default: NULL).}

\item{n_run}{Numeric. Number of runs for cross validation (default: 3).}

\item{k_folds}{Numeric. Number of folds for cross validation (default: 10).}

\item{x.center}{Logical. If x.center = TRUE, X matrix is centered to zero means (default: TRUE).}

\item{x.scale}{Logical. If x.scale = TRUE, X matrix is scaled to unit variances (default: FALSE).}

\item{remove_near_zero_variance}{Logical. If remove_near_zero_variance = TRUE, near zero variance
variables will be removed (default: TRUE).}

\item{remove_zero_variance}{Logical. If remove_zero_variance = TRUE, zero variance variables will
be removed (default: TRUE).}

\item{toKeep.zv}{Character vector. Names of variables in X that must not be removed by (near) zero
variance filtering (default: NULL).}

\item{remove_variance_at_fold_level}{Logical. If remove_variance_at_fold_level = TRUE, (near) zero
variance variables will be removed at fold level. Not recommended (default: FALSE).}

\item{remove_non_significant_models}{Logical. If remove_non_significant_models = TRUE,
non-significant models are removed before computing the evaluation. A non-significant model is a
model with at least one component/variable with a P-Value higher than the alpha cutoff
(default: FALSE).}

\item{remove_non_significant}{Logical. If remove_non_significant = TRUE, non-significant
variables/components in final cox model will be removed until all variables are significant by
forward selection (default: FALSE).}

\item{alpha}{Numeric. Significance threshold. P-Values below this threshold are regarded as
significant (default: 0.05).}

\item{w_AIC}{Numeric. Weight for AIC evaluator. All weights must sum to 1 (default: 0).}

\item{w_C.Index}{Numeric. Weight for C-Index evaluator. All weights must sum to 1 (default: 0).}

\item{w_AUC}{Numeric. Weight for AUC evaluator. All weights must sum to 1 (default: 1).}

\item{w_I.BRIER}{Numeric. Weight for Brier Score evaluator. All weights must sum to 1 (default: 0).}

\item{times}{Numeric vector. Time points where the AUC will be evaluated. If NULL, up to
'max_time_points' equally distributed time points will be selected (default: NULL).}

\item{max_time_points}{Numeric. Maximum number of time points to use for evaluating the model
(default: 15).}

\item{MIN_AUC_INCREASE}{Numeric. Minimum improvement between different cross validation models to
continue evaluating higher values in the multiple tested parameters. If it is not reached for the
next 'MIN_COMP_TO_CHECK' models and the minimum 'MIN_AUC' value is reached, the evaluation stops
(default: 0.01).}

\item{MIN_AUC}{Numeric. Minimum AUC that the cross-validation models are expected to reach. Once
this minimum is reached, the evaluation could stop if subsequent models do not improve the AUC by
at least the 'MIN_AUC_INCREASE' value (default: 0.8).}

\item{MIN_COMP_TO_CHECK}{Numeric. Number of penalties/components to evaluate to check if the AUC
improves. If the AUC does not improve for the next 'MIN_COMP_TO_CHECK' models and the 'MIN_AUC' is
met, the evaluation could stop (default: 3).}

\item{pred.attr}{Character. Way to evaluate the metric selected. Must be one of the following:
"mean" or "median" (default: "mean").}

\item{pred.method}{Character. AUC evaluation algorithm used to evaluate the model performance.
Must be one of the following: "risksetROC", "survivalROC", "cenROC", "nsROC", "smoothROCtime_C",
"smoothROCtime_I" (default: "cenROC").}

\item{fast_mode}{Logical. If fast_mode = TRUE, for each run, only one fold is evaluated
at a time. If fast_mode = FALSE, for each run, all linear predictors are computed for the test
observations. Once all of them have their linear predictors, the evaluation is performed across
all the observations together (default: FALSE).}

\item{MIN_EPV}{Numeric. Minimum number of Events Per Variable (EPV) you want to reach for the final
cox model. Used to restrict the number of variables/components that can be computed in final cox
models. If the minimum is not met, the model cannot be computed (default: 5).}

\item{return_models}{Logical. Return all models computed in cross validation (default: FALSE).}

\item{returnData}{Logical. Return original and normalized X and Y matrices (default: FALSE).}

\item{PARALLEL}{Logical. Run the cross validation with multicore option. As many cores as your
total cores - 1 will be used. It could lead to higher RAM consumption (default: FALSE).}

\item{n_cores}{Numeric. Number of cores to use for parallel processing. This parameter is only
used if \code{PARALLEL} is \code{TRUE}. If \code{NULL}, it will use all available cores minus one. Otherwise,
it will use the minimum between the value specified and the total number of cores - 1. The fewer
cores used, the less RAM will be used (default: NULL).}

\item{verbose}{Logical. If verbose = TRUE, extra messages could be displayed (default: FALSE).}

\item{seed}{Numeric. Seed value for performing runs/folds divisions (default: 123).}
}
\value{
Instance of class "Coxmos" and model "cv.coxEN". The class contains the following elements:
\code{best_model_info}: A data.frame with the information for the best model.
\code{df_results_folds}: A data.frame with fold-level information.
\code{df_results_runs}: A data.frame with run-level information.
\code{df_results_comps}: A data.frame with component-level information (for cv.coxEN, EN.alpha
information).

\code{lst_models}: If return_models = TRUE, a list of all cross-validated models.
\code{pred.method}: AUC evaluation algorithm used to evaluate the model performance.

\code{opt.EN.alpha}: Optimal EN.alpha value selected by the best_model.
\code{opt.nvar}: Optimal number of variables selected by the best_model.

\code{plot_AIC}: AIC plot by each hyper-parameter.
\code{plot_C.Index}: C-Index plot by each hyper-parameter.
\code{plot_I.BRIER}: Integrative Brier Score plot by each hyper-parameter.
\code{plot_AUC}: AUC plot by each hyper-parameter.

\code{class}: Cross-Validated model class.

\code{lst_train_indexes}: List (of lists) of indexes for the observations used in each run/fold
to train the models.
\code{lst_test_indexes}: List (of lists) of indexes for the observations used in each run/fold
to test the models.

\code{time}: Time taken to run the cross-validated function.
}
\description{
This function performs cross-validated CoxEN (coxEN).
The function returns the optimal EN penalty value based on cross-validation.
The performance can be based on multiple metrics such as Area Under the Curve (AUC), I. Brier
Score or C-Index. Furthermore, the user can combine more than one metric simultaneously.
}
\details{
The \verb{coxEN Cross-Validation} function provides a robust mechanism to optimize the hyperparameters
of the cox elastic net model through cross-validation. By systematically evaluating a range of
elastic net mixing parameters (\code{EN.alpha.list}), this function identifies the optimal balance
between lasso and ridge penalties for survival analysis.

The cross-validation process is structured across multiple runs (\code{n_run}) and folds (\code{k_folds}),
so that each parameter value is assessed on several data partitions. Users can prioritize specific
evaluation metrics (AIC, C-Index, AUC, or I. Brier Score) by assigning weights (\code{w_AIC},
\code{w_C.Index}, \code{w_AUC}, \code{w_I.BRIER}), which must sum to 1. The function also offers flexibility in
the AUC evaluation method (\code{pred.method}) and the attribute for metric evaluation (\code{pred.attr}).
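
As a minimal sketch of how the weights might be combined (the 0.5/0.5 split is an illustrative
assumption, not a recommendation, and \code{X_train} and \code{Y_train} stand for user-supplied data),
AUC and C-Index could be given equal importance while the four weights still sum to 1:

\preformatted{
cv.coxEN(X_train, Y_train,
         EN.alpha.list = c(0.1, 0.5),
         w_AIC = 0, w_C.Index = 0.5,  # illustrative weights; all four must sum to 1
         w_AUC = 0.5, w_I.BRIER = 0)
}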

One of the distinguishing features of this function is its adaptive evaluation process. The
function can stop the cross-validation early once the minimum AUC (\code{MIN_AUC}) has been reached and
the next \code{MIN_COMP_TO_CHECK} models do not improve the AUC by at least \code{MIN_AUC_INCREASE}. This
avoids evaluating further parameter values that are unlikely to improve the result.

Data preprocessing options are integrated into the function. Near-zero and zero variance variables
can be removed either globally or at the fold level (the latter is not recommended). The function
also supports multicore processing (\code{PARALLEL} option) to speed up the cross-validation process.
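
A hedged usage sketch of these options follows. The argument values are illustrative assumptions,
and \code{X_train} and \code{Y_train} stand for user-supplied data:

\preformatted{
cv.coxEN(X_train, Y_train,
         remove_near_zero_variance = TRUE,       # filter (near) zero variance variables
         remove_zero_variance = TRUE,
         remove_variance_at_fold_level = FALSE,  # filter globally, not per fold
         PARALLEL = TRUE, n_cores = 2)           # n_cores is capped at total cores - 1
}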

Upon execution, the function returns a detailed output, encompassing information about the best
model, performance metrics at various granularities (fold, run, component), and if desired, all
cross-validated models.
}
\examples{
data("X_proteomic")
data("Y_proteomic")
X_proteomic <- X_proteomic[1:30,1:40]
Y_proteomic <- Y_proteomic[1:30,]
set.seed(123)
index_train <- caret::createDataPartition(Y_proteomic$event, p = .5, list = FALSE, times = 1)
X_train <- X_proteomic[index_train,]
Y_train <- Y_proteomic[index_train,]
cv.coxEN_model <- cv.coxEN(X_train, Y_train, EN.alpha.list = c(0.1,0.5),
x.center = TRUE, x.scale = TRUE)
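# Inspect the result; these element names are the ones listed in the Value section.
cv.coxEN_model$opt.EN.alpha  # EN.alpha value selected by the best model
cv.coxEN_model$opt.nvar      # number of variables selected by the best model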
}
\author{
Pedro Salguero Garcia. Maintainer: pedsalga@upv.edu.es
}
