% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gen_bin_data.R
\name{gen_bin_data}
\alias{gen_bin_data}
\title{Generate the data used for the model experiment}
\usage{
gen_bin_data(beta, N, nclass, seed)
}
\arguments{
\item{beta}{A numeric vector of the true coefficients used to generate the
synthesized data.}

\item{N}{The number of synthesized data points to generate. It should be an
integer.}

\item{nclass}{The number of clusters into which the generated data are
grouped. It should be an integer.}

\item{seed}{The seed for the random number generator.}
}
\value{
A list of seven elements, including:
\item{data.clust}{A list with the clustering results. Samples in the same
list element are closer to each other.}
\item{X}{The samples with the smallest variance from each cluster. Note that
the length of X equals the number of elements in data.clust.}
\item{y}{The target values (0 or 1) corresponding to X.}
}
\description{
\code{gen_bin_data} generates the data used for the model experiment.
}
\details{
The function \code{gen_bin_data} generates N points. The first column of the
design matrix is 1, the second column follows a normal distribution with
mean 1 and variance 1, and the remaining columns follow a normal distribution
with mean 0 and variance 1. The points are then clustered into classes to
reduce the computational cost. The number of classes is set by the parameter
\code{nclass}.
}
\examples{
## For an example, see example(seq_bin_model)
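
## A minimal sketch of the design matrix described in Details, written in
## base R. It is an illustration under stated assumptions, not the package's
## internal code: the names N_sketch, p_sketch and X_sketch, and the values
## assigned to them, are hypothetical and chosen only for this example.
set.seed(1)
N_sketch <- 100  # number of synthesized points (illustrative value)
p_sketch <- 4    # number of columns, including the column of ones (illustrative value)
X_sketch <- cbind(
  1,                                     # first column: constant 1
  rnorm(N_sketch, mean = 1, sd = 1),     # second column: mean 1, variance 1
  matrix(rnorm(N_sketch * (p_sketch - 2), mean = 0, sd = 1),
         nrow = N_sketch)                # remaining columns: mean 0, variance 1
)
dim(X_sketch)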
}
\references{
{
Wang, Z., & Chang, Y. I. (2013). Sequential estimate for linear regression
models with uncertain number of effective variables. \emph{Metrika}, 76(7), 949–978.
doi:10.1007/s00184-012-0426-4
}
}
\seealso{
{
   \code{\link{gen_multi_data}} for the categorical and ordinal cases.

   \code{\link{gen_GEE_data}} for the generalized estimating equations case.
}
}
