Ex. 3 - Retrieving regression coefficients

library(lsasim)
packageVersion("lsasim")
[1] '2.1.2'

With the arguments theta = TRUE, full_output = TRUE and family = "gaussian", the output will automatically contain the $$\beta$$ vector and the $$R$$ matrix (i.e., beta_gen will be called automatically from within questionnaire_gen).

We generate one latent trait and four covariates: two continuous, one binary, and one with three categories. The data are generated from a multivariate normal distribution, and the logical argument full_output is set to TRUE.

set.seed(1234)
bg <- questionnaire_gen(n_obs = 1000, n_X = 2, n_W = list(2, 3), theta = TRUE, family = "gaussian",
full_output = TRUE)
str(bg$bg)
'data.frame':	1000 obs. of  6 variables:
 $ subject: int  1 2 3 4 5 6 7 8 9 10 ...
 $ theta  : num  -1.732 0.707 0.911 1.509 -0.5 ...
 $ q1     : num  -0.491 0.16 -0.39 -1.307 0.602 ...
 $ q2     : num  0.5499 0.0669 0.087 -1.628 0.2559 ...
 $ q3     : Factor w/ 2 levels "1","2": 2 2 2 1 2 2 1 1 2 1 ...
 $ q4     : Factor w/ 3 levels "1","2","3": 1 2 3 2 1 2 1 1 1 3 ...

linear_regression is a list that contains two elements. The first element, betas, summarizes the true regression coefficients $$\beta$$. The second element, vcov_YXW, shows the $$R$$ matrix.

bg$linear_regression
$betas
     theta         q1         q2       q3.2       q4.2       q4.3 
-0.8174844 -0.5836818 -0.5402292 -0.1049199  0.8699444  1.7398887 

$vcov_YXW
theta          q1          q2          q3.2          q4.2        q4.3
theta  1.00000000 -0.17564245 -0.26487195 -7.889541e-02  0.000000e+00  0.12421718
q1    -0.17564245  1.00000000 -0.28027150  3.541595e-02  0.000000e+00  0.14963273
q2    -0.26487195 -0.28027150  1.00000000  2.556079e-01  0.000000e+00  0.07965237
q3.2  -0.07889541  0.03541595  0.25560791  2.500000e-01 -2.775558e-17  0.06097692
q4.2   0.00000000  0.00000000  0.00000000 -2.775558e-17  2.400000e-01 -0.12000000
q4.3   0.12421718  0.14963273  0.07965237  6.097692e-02 -1.200000e-01  0.21000000

beta_gen uses the output from questionnaire_gen to generate linear regression coefficients.

beta_gen(bg)
     theta         x1         x2        w12        w22        w23
-0.8174844 -0.5836818 -0.5402292 -0.1049199  0.8699444  1.7398887 
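
To make the link between vcov_YXW and betas concrete, the following base-R sketch recomputes the coefficients from the printed covariance matrix by solving $$\Sigma_{XX}\beta = \sigma_{X\theta}$$ for the slopes. It is only an illustration, not the internal code of beta_gen. The matrix is copied by hand from the output above (so it is rounded), the names S_XX, s_XY and means_X are hypothetical, and the covariate means are assumptions: zero for theta and the continuous variables, and the category proportions implied by the printed variances and covariances of the dummy variables. Under those assumptions, the entry labeled theta in betas behaves like the intercept.

```r
# Covariance matrix copied from the printed bg$linear_regression$vcov_YXW.
# The -2.775558e-17 entries are floating-point noise and are entered as 0.
vars <- c("theta", "q1", "q2", "q3.2", "q4.2", "q4.3")
vcov_YXW <- matrix(c(
   1.00000000, -0.17564245, -0.26487195, -0.07889541,  0.00,  0.12421718,
  -0.17564245,  1.00000000, -0.28027150,  0.03541595,  0.00,  0.14963273,
  -0.26487195, -0.28027150,  1.00000000,  0.25560791,  0.00,  0.07965237,
  -0.07889541,  0.03541595,  0.25560791,  0.25000000,  0.00,  0.06097692,
   0.00000000,  0.00000000,  0.00000000,  0.00000000,  0.24, -0.12000000,
   0.12421718,  0.14963273,  0.07965237,  0.06097692, -0.12,  0.21000000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

# Split into the covariate block and the covariances with theta.
covariates <- vars[-1]
S_XX <- vcov_YXW[covariates, covariates]
s_XY <- vcov_YXW[covariates, "theta"]

# Population slopes: solve S_XX %*% slopes = s_XY.
slopes <- as.numeric(solve(S_XX, s_XY))
names(slopes) <- covariates

# ASSUMED means (not part of the printed output): 0 for theta, q1 and q2;
# 0.5, 0.4 and 0.3 for the dummies, as implied by var = p * (1 - p)
# and cov(q4.2, q4.3) = -p2 * p3.
means_X <- c(q1 = 0, q2 = 0, q3.2 = 0.5, q4.2 = 0.4, q4.3 = 0.3)
intercept <- 0 - sum(slopes * means_X)

# Same ordering as the betas vector printed above.
round(c(theta = intercept, slopes), 4)
```

Up to the rounding of the copied matrix, the result should agree with the betas reported by questionnaire_gen and beta_gen(bg).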

If the logical argument MC is TRUE in beta_gen, Monte Carlo simulation is used to estimate regression coefficients. If the logical argument rename_to_q is TRUE, the background variables are all labeled as q to match the default behavior of questionnaire_gen.

With MC = TRUE, the output is a table. The first column, cov_matrix, contains the true $$\beta$$ as calculated from the covariance matrix, which is always the same for the same data. The MC column reports the Monte Carlo estimates of $$\beta$$, which are sample-dependent and change each time beta_gen is called. The next two columns give the lower and upper bounds of the 99% confidence interval for these estimates. The cov_in_CI column indicates whether each cov_matrix value falls within its respective confidence interval (“1” corresponds to “yes” and “0” to “no”).

beta_gen(bg, MC = TRUE, MC_replications = 100, rename_to_q = TRUE)
      cov_matrix         MC       0.5%       99.5% cov_in_CI
theta -0.8174844 -0.7528447 -0.8380719 -0.64147366         1
q1    -0.5836818 -0.5627695 -0.6279581 -0.49666666         1
q2    -0.5402292 -0.4950472 -0.5571855 -0.44193911         1
q3.2  -0.1049199 -0.1876017 -0.3357658 -0.08231601         1
q4.2   0.8699444  0.8165117  0.7057719  0.92812704         1
q4.3   1.7398887  1.7389447  1.5867220  1.88949655         1
beta_gen(bg, MC = TRUE, MC_replications = 100, rename_to_q = TRUE)
      cov_matrix         MC       0.5%       99.5% cov_in_CI
theta -0.8174844 -0.7485083 -0.8550206 -0.63323259         1
q1    -0.5836818 -0.5650178 -0.6309527 -0.51627209         1
q2    -0.5402292 -0.4954573 -0.5544919 -0.43396311         1
q3.2  -0.1049199 -0.1793640 -0.3039317 -0.05262306         1
q4.2   0.8699444  0.8092411  0.6481412  0.95344844         1
q4.3   1.7398887  1.7205485  1.5569970  1.90153362         1
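
As a rough check on how cov_in_CI can be read, the sketch below rebuilds the indicator by hand from the first Monte Carlo table above. The values are copied from that printed output, the column names lower and upper are hypothetical stand-ins for the 0.5% and 99.5% columns, and treating the interval bounds as inclusive is an assumption rather than something confirmed by the package.

```r
# Values copied from the first beta_gen(bg, MC = TRUE, ...) output above.
mc <- data.frame(
  cov_matrix = c(-0.8174844, -0.5836818, -0.5402292, -0.1049199, 0.8699444, 1.7398887),
  MC         = c(-0.7528447, -0.5627695, -0.4950472, -0.1876017, 0.8165117, 1.7389447),
  lower      = c(-0.8380719, -0.6279581, -0.5571855, -0.3357658, 0.7057719, 1.5867220),
  upper      = c(-0.64147366, -0.49666666, -0.44193911, -0.08231601, 0.92812704, 1.88949655),
  row.names  = c("theta", "q1", "q2", "q3.2", "q4.2", "q4.3")
)

# 1 if the covariance-based coefficient lies inside the 99% interval, 0 otherwise
# (inclusive bounds are an assumption).
inside <- as.integer(mc$cov_matrix >= mc$lower & mc$cov_matrix <= mc$upper)
names(inside) <- rownames(mc)
inside
```

For this table every entry is 1, matching the cov_in_CI column shown above.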