% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sim.LPA.R
\name{sim.LPA}
\alias{sim.LPA}
\title{Simulate Data for Latent Profile Analysis}
\usage{
sim.LPA(
  N = 1000,
  I = 5,
  L = 2,
  constraint = "VV",
  distribution = "random",
  mean.range = c(-2, 2),
  covs.range = c(0.01, 4),
  params = NULL,
  is.sort = TRUE
)
}
\arguments{
\item{N}{Integer; total number of observations to simulate. Must be \eqn{\geq} \code{L} (Default = 1000).}

\item{I}{Integer; number of continuous observed variables. Must be \eqn{\geq 1} (Default = 5).}

\item{L}{Integer; number of latent profiles (classes). Must be \eqn{\geq 1} (Default = 2).}

\item{constraint}{Character string or list specifying covariance constraints. See detailed description below.
Default is \code{"VV"} (fully heterogeneous covariances).}

\item{distribution}{Character; distribution of class sizes. Options: \code{"random"} (default) or \code{"uniform"}.}

\item{mean.range}{Numeric vector of length 2; range for sampling class-specific means.
Each variable's means are sampled uniformly from \code{mean.range[1]} to \code{mean.range[2]}.
Default: \code{c(-2, 2)}.}

\item{covs.range}{Numeric vector of length 2; range for sampling variance parameters (diagonal elements).
Must satisfy \code{covs.range[1] > 0} and \code{covs.range[2] > covs.range[1]}. Off-diagonal covariances
are derived from correlations scaled by these variances. Default: \code{c(0.01, 4)}.}

\item{params}{List with fixed parameters for simulation:
\describe{
\item{\code{par}}{\eqn{L \times I \times K_{\max}} array of conditional response probabilities per latent class.}
\item{\code{P.Z}}{Vector of length \eqn{L} with latent class prior probabilities.}
\item{\code{Z}}{Vector of length \eqn{N} containing the latent class of each observation. A fixed
class assignment vector \code{Z} is used directly to simulate data only when \code{P.Z}
is \code{NULL} and \code{Z} is a vector of length \code{N}.}
}}

\item{is.sort}{A logical value. If \code{TRUE} (Default), the latent classes will be ordered in descending
order according to \code{P.Z}. All other parameters will be adjusted accordingly
based on the reordered latent classes.}
}
\value{
A list containing:
\describe{
\item{response}{Numeric matrix (\eqn{N \times I}) of simulated observations. Rows are observations,
columns are variables named \code{"V1"}, \code{"V2"}, ..., or \code{"UV"} for univariate data.}
\item{means}{Numeric matrix (\eqn{L \times I}) of true class-specific means.
Row names: \code{"Class1"}, \code{"Class2"}, ...; column names match \code{response}.}
\item{covs}{Array (\eqn{I \times I \times L}) of true class-specific covariance matrices.
Dimensions: variables x variables x classes. Constrained parameters have identical values across class slices.
Dimension names match \code{response} and class labels.}
\item{P.Z.Xn}{Numeric matrix (\eqn{N \times L}) of true class membership probabilities (one-hot encoded).
Row \code{i}, column \code{l} = 1 if observation \code{i} belongs to class \code{l}, else 0.
Row names: \code{"O1"}, \code{"O2"}, ...; column names: \code{"Class1"}, \code{"Class2"}, ...}
\item{P.Z}{Numeric vector (length \eqn{L}) of true class proportions.
Named with class labels (e.g., \code{"Class1"}).}
\item{Z}{Integer vector (length \eqn{N}) of true class assignments (1 to L).
Named with observation IDs (e.g., \code{"O1"}).}
\item{constraint}{Original constraint specification (character string or list) passed to the function.}
}
}
\description{
Generates synthetic multivariate continuous data from a latent profile model with \code{L} latent classes.
Supports flexible covariance structure constraints (including custom equality constraints) and
class size distributions. All covariance matrices are ensured to be positive definite.
}
\details{
Mean Generation: For each variable, \eqn{3L} candidate means are sampled uniformly from \code{mean.range}.
\eqn{L} distinct means are selected without replacement to ensure separation between classes.

Covariance Generation:
\itemize{
\item \strong{Positive Definiteness:} All covariance matrices are adjusted using \code{Matrix::nearPD}
and eigenvalue thresholds (\eqn{> 10^{-8}}) to guarantee validity. Failed attempts trigger explicit errors.
\item \strong{Univariate Case (\code{I=1}):} Constraints \code{"UE"} and \code{"UV"} are enforced automatically.
Predefined constraints like \code{"E0"} map to \code{"UE"}.
\item \strong{VE Constraint:} Requires special handling: the base off-diagonal elements are fixed, and the diagonals
are sampled above a minimum threshold to maintain positive definiteness. May fail if \code{covs.range} is too narrow.
}

Class Assignment:
\itemize{
\item \code{"random"}: Uses a Dirichlet distribution (\eqn{\alpha = 3}) to avoid extremely small classes.
Sizes are rounded and adjusted to sum exactly to \code{N}.
\item \code{"uniform"}: Simple random sampling with equal probability. May produce empty classes if \code{N} is small.
}
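
The \code{"random"} scheme can be sketched in base R as follows. This is an illustration under stated
assumptions, not the package's implementation: the Dirichlet draw is obtained by normalizing gamma variates,
the rounding remainder is assumed to be absorbed by the largest class, and the names \code{g}, \code{p},
\code{n}, and \code{i} are hypothetical.
\preformatted{
set.seed(1)
N <- 100; L <- 3
g <- rgamma(L, shape = 3)    # normalized gamma draws give a Dirichlet(3, ..., 3) sample
p <- g / sum(g)              # class proportions
n <- round(N * p)            # integer class sizes; may not sum to N yet
i <- which.max(n)
n[i] <- n[i] + (N - sum(n))  # assumed adjustment: largest class absorbs the remainder
Z <- sample(rep(seq_len(L), times = n))  # shuffled class labels
stopifnot(sum(n) == N, length(Z) == N)
}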

Data Generation: Observations are simulated using \code{mvtnorm::rmvnorm} per class.
Final data and class labels are shuffled to remove ordering artifacts.
}
\section{Covariance Constraints}{

The \code{constraint} parameter controls equality constraints on covariance parameters across classes:
\describe{
\item{Predefined Constraints (Character Strings):}{
\describe{
\item{\code{"UE"} (Univariate only)}{Equal variance across all classes.}
\item{\code{"UV"} (Univariate only)}{Varying variances across classes.}
\item{\code{"E0"}}{Equal variances across classes, zero covariances (diagonal matrix with shared variances).}
\item{\code{"V0"}}{Varying variances across classes, zero covariances (diagonal matrix with free variances).}
\item{\code{"EE"}}{Equal full covariance matrix across all classes (homogeneous).}
\item{\code{"EV"}}{Equal variances but varying covariances (equal diagonal, free off-diagonal).}
\item{\code{"VE"}}{Varying variances but equal correlations (free diagonal, equal correlation structure).}
\item{\code{"VV"}}{Varying full covariance matrices across classes (heterogeneous; default).}
}
}
\item{Custom Constraints (List of integer vectors):}{
Each element specifies a pair of variables whose covariance parameters are constrained equal across classes:
\describe{
\item{\code{c(i,i)}}{Constrains variance of variable \code{i} to be equal across all classes.}
\item{\code{c(i,j)}}{Constrains covariance between variables \code{i} and \code{j} to be equal across all classes
(symmetric: automatically includes \code{c(j,i)}).}
}
Unconstrained parameters vary freely. The algorithm ensures positive definiteness by:
\enumerate{
\item Generating a base positive definite matrix \code{S0}.
\item Applying constraints via a logical mask.
\item Adjusting unconstrained variances to maintain positive definiteness.
}
Critical requirements for custom constraints:
\itemize{
\item At least one variance must be unconstrained if any off-diagonal covariance is unconstrained.
\item All indices must be between 1 and \code{I}.
\item For univariate data (\code{I=1}), only \code{list(c(1,1))} is valid.
}
}
}
}

\section{Class Size Distribution}{

\describe{
\item{\code{"random"}}{(Default) Class proportions are drawn from a Dirichlet distribution (\eqn{\alpha = 3} for all classes),
ensuring no empty classes. Sizes are rounded to integers and adjusted to sum exactly to \code{N}.}
\item{\code{"uniform"}}{Equal probability of class membership (\eqn{1/L} per class), sampled with replacement.}
}
}

\examples{
# Example 1: Bivariate data, 3 classes, heterogeneous covariances (default)
sim_data <- sim.LPA(N = 500, I = 2, L = 3, constraint = "VV")
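# A minimal sketch of inspecting the result; it assumes the components
# documented under Value (response, means, covs, P.Z, Z, P.Z.Xn).
dim(sim_data$response)    # observations x variables
sim_data$means            # true class-specific means
table(sim_data$Z)         # realized class sizes
# P.Z.Xn is documented as one-hot, so its row-wise argmax should recover Z
all(max.col(sim_data$P.Z.Xn) == sim_data$Z)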

# Example 2: Univariate data, equal variances
# 'E0' automatically maps to 'UE' for I=1
sim_uni <- sim.LPA(N = 200, I = 1, L = 2, constraint = "E0")
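# Hedged check, assuming 'covs' is indexed variables x variables x classes
# as documented under Value: with shared variances, the first variance
# should be the same in every class slice.
sim_uni$covs[1, 1, ]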

# Example 3: Custom constraints
# - Equal covariance between V1 and V2 across classes
# - Equal variance for V3 across classes
sim_custom <- sim.LPA(
  N = 300,
  I = 3,
  L = 4,
  constraint = list(c(1, 2), c(3, 3))
)
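# Hedged check of the constraints above, assuming the documented 'covs' layout
# (variables x variables x classes): the constrained entries should be identical
# across the four class slices, while an unconstrained entry may differ.
sim_custom$covs[1, 2, ]   # covariance of V1 and V2, per class
sim_custom$covs[3, 3, ]   # variance of V3, per class
sim_custom$covs[1, 1, ]   # variance of V1 (unconstrained), per class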

# Example 4: VE constraint (varying variances, equal correlations)
sim_ve <- sim.LPA(N = 400, I = 3, L = 3, constraint = "VE")
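# Hedged check: convert each class covariance matrix to a correlation matrix
# with stats::cov2cor. Under "VE" the resulting columns (one per class) should
# agree up to numerical error, while the per-class variances may differ.
apply(sim_ve$covs, 3, cov2cor)   # 9 x 3: flattened correlation matrix per class
apply(sim_ve$covs, 3, diag)      # 3 x 3: variances of V1..V3 per class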

# Example 5: Uniform class sizes
sim_uniform <- sim.LPA(N = 300, I = 4, L = 5, distribution = "uniform")
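# Hedged sketch: tabulate class sizes over all five labels so that an empty
# class, which the Details section notes is possible for small N, shows as 0.
table(factor(sim_uniform$Z, levels = 1:5))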

}
