This paper introduces a novel, unsupervised method for measuring psychological states from language using semantic projection, operationalizing constructs like depression and anxiety as interpretable semantic axes derived from clinical scales. Sentence-BERT embeddings of participant text are projected onto these axes to generate continuous psychological scores. Experiments across various response formats (selected words, phrases, free-text) demonstrate strong correlations between projection scores and standardized clinical measures, particularly with structured formats and sentence-level aggregation for free-text.
Ditch the black box: This unsupervised semantic projection method rivals supervised models in psychological assessment, offering interpretability and generalizability that supervised methods lack.
Recent advances in natural language processing have enabled increasingly accurate estimation of psychological traits from language. However, most existing approaches rely on supervised models trained to predict questionnaire scores, limiting interpretability and generalizability across contexts. The present study introduces a theory-driven and fully unsupervised framework for measuring psychological states directly from natural language using semantic projection. Psychological constructs were operationalized as interpretable semantic axes derived from lexical anchors and items from validated clinical scales assessing depression, anxiety, and worry. Participants' textual responses were embedded using Sentence-BERT and projected onto these axes to generate continuous psychological scores across multiple response formats, including selected words, generated words, phrases, and free-text responses. Projection scores were evaluated through correlations with standardized clinical measures, split-half reliability analyses, attenuation corrections, distributional similarity using Wasserstein distance, and comparisons with lexicon-based sentiment analysis (VADER). Results showed strong associations between projection scores and clinical measures, particularly for structured formats such as selected words, written words, and phrases. Free-text responses produced weaker results when analyzed as whole texts, but performance improved substantially when sentence-level aggregation strategies were applied. These findings support semantic projection as an interpretable and scalable alternative to supervised language models for psychological assessment and highlight the importance of response format and text-processing strategies in language-based mental health measurement.
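To make the projection step concrete, the following is a minimal sketch of how a semantic axis and projection score could be computed, not the authors' implementation. It assumes the axis is the normalized difference between the mean embedding of high-construct anchors and the mean embedding of low-construct anchors, and that sentence-level aggregation is a simple mean of per-sentence scores; the abstract does not specify either choice. The function names (`build_axis`, `project`, `project_sentences`) are hypothetical, and the toy 2-dimensional vectors stand in for real Sentence-BERT embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_axis(high_anchors: Sequence[Vector], low_anchors: Sequence[Vector]) -> List[float]:
    """Unit-length axis pointing from the low-anchor mean to the high-anchor mean.

    Assumption: anchors are embeddings of lexical anchors or scale items,
    e.g. "hopeless" (high) versus "content" (low) for a depression axis.
    """
    high = mean_vector(high_anchors)
    low = mean_vector(low_anchors)
    diff = [h - l for h, l in zip(high, low)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("High and low anchor means coincide; the axis is undefined.")
    return [d / norm for d in diff]


def project(embedding: Vector, axis: Vector) -> float:
    """Scalar projection of one text embedding onto the unit-length axis."""
    return sum(e * a for e, a in zip(embedding, axis))


def project_sentences(sentence_embeddings: Sequence[Vector], axis: Vector) -> float:
    """Score a free-text response as the mean of its per-sentence projections.

    Assumption: mean aggregation; other strategies (max, median) are possible.
    """
    scores = [project(e, axis) for e in sentence_embeddings]
    return sum(scores) / len(scores)


# Toy example with made-up 2-D "embeddings".
toy_axis = build_axis(
    high_anchors=[[1.0, 0.0], [0.8, 0.2]],
    low_anchors=[[0.0, 1.0], [0.2, 0.8]],
)
toy_score = project([0.9, 0.1], toy_axis)  # positive: closer to the high anchors
```

In practice the input vectors would come from a sentence encoder rather than being written by hand, and the resulting scores would then be correlated with the clinical scale totals.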