This paper addresses the problem of determining the number of latent classes in latent class models applied to ordinal categorical data, a common challenge in the social sciences. It proposes a test statistic that takes the largest singular value of a normalized residual matrix and centers it by a simple sample-size adjustment. The statistic exhibits dichotomous behavior: under the null hypothesis of a correctly specified number of classes, its upper bound converges to zero in probability, whereas under an under-fitted alternative it exceeds a fixed positive constant with probability approaching one. This separation enables consistent estimation of the true number of latent classes.
A simple sample-size adjustment to the largest singular value of a normalized residual matrix provably and practically identifies the correct number of latent classes in ordinal categorical data.
Ordinal categorical data are widely collected in psychology, education, and other social sciences, appearing commonly in questionnaires, assessments, and surveys. Latent class models provide a flexible framework for uncovering unobserved heterogeneity by grouping individuals into homogeneous classes based on their response patterns. A fundamental challenge in applying these models is determining the number of latent classes, which is unknown and must be inferred from data. In this paper, we propose a test statistic for this problem. The test statistic centers the largest singular value of a normalized residual matrix by a simple sample-size adjustment. Under the null hypothesis that the candidate number of latent classes is correct, its upper bound converges to zero in probability. Under an under-fitted alternative, the statistic itself exceeds a fixed positive constant with probability approaching one. This sharp dichotomous behavior of the test statistic yields two sequential testing algorithms that consistently estimate the true number of latent classes. Extensive experimental studies confirm the theoretical findings and demonstrate the accuracy and reliability of the proposed algorithms in determining the number of latent classes.
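To make the sequential testing idea concrete, the following is a minimal sketch of one plausible shape such a procedure could take. It is not either of the paper's two algorithms, whose details the abstract does not give. The function name `estimate_num_classes`, the `threshold` parameter, and the `statistic` callable are all hypothetical: `statistic(k)` stands in for the adjusted largest-singular-value statistic computed after fitting a model with `k` classes, and the toy values below are made up purely to show the control flow.

```python
from typing import Callable, Dict


def estimate_num_classes(
    statistic: Callable[[int], float],
    threshold: float,
    k_max: int,
) -> int:
    """Return the smallest candidate K whose statistic falls below `threshold`.

    Assumption: `statistic(k)` is large (bounded away from zero) when k is
    smaller than the true number of classes, and small when k is correct,
    mirroring the dichotomy described in the abstract.
    """
    for k in range(1, k_max + 1):
        if statistic(k) < threshold:
            # First candidate that is not rejected as under-fitted.
            return k
    raise ValueError("no candidate K up to k_max fell below the threshold")


# Made-up statistic values for illustration only (not results from the paper):
# candidates 1 and 2 are under-fitted, candidate 3 is the first accepted.
toy_values: Dict[int, float] = {1: 0.9, 2: 0.7, 3: 0.02, 4: 0.01}

estimated_k = estimate_num_classes(lambda k: toy_values[k], threshold=0.1, k_max=4)
print(estimated_k)
```

The loop stops at the first candidate that is not rejected, which is why the dichotomy matters: an under-fitted candidate keeps the statistic above a fixed constant, so a threshold below that constant separates the two regimes as the sample size grows.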