This paper introduces concept training, an alternative to next-token prediction (NTP) that trains language models to predict sets of semantically related tokens rather than single continuations. By treating semantically similar words as equally valid continuations, concept training improves alignment with human semantic similarity judgments on lexical benchmarks and lowers perplexity on semantically meaningful words, at the cost of only a modest increase in global token-level perplexity.
LLMs can better capture human semantic similarity by predicting sets of related concepts instead of single next tokens.
The next-token prediction (NTP) objective trains language models to predict a single continuation token at each step. In natural language, however, a prefix can be continued in many valid ways, and even similar meanings may differ in surface form. For example, the sentence "this website is safe to browse" could plausibly end with words such as browse, search, visit, surf, or navigate. While standard NTP training treats these alternatives as mutually exclusive targets, we explore a framework that instead predicts concepts, approximated as sets of semantically related tokens. We show that models trained with concept supervision exhibit stronger alignment with human semantic similarity judgments on multiple lexical benchmarks. These gains are accompanied by lower perplexity on semantically meaningful words (definition in Section 3.1) and a modest increase in global token-level perplexity, reflecting a tradeoff between standard NTP optimization and concept-level supervision. Our results suggest that concept-level objectives can improve semantic alignment while maintaining competitive language modeling performance.
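The contrast between the two objectives can be sketched as a change of target distribution in the cross-entropy loss: standard NTP uses a one-hot target on the observed token, while concept supervision spreads the target mass over a set of related tokens. The snippet below is a minimal illustration of that idea, not the paper's actual implementation; the toy vocabulary, uniform weighting over the concept set, and logit values are all assumptions made for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def ntp_loss(logits, target_id):
    # Standard next-token prediction: cross-entropy against a one-hot target.
    p = softmax(logits)
    return -np.log(p[target_id])

def concept_loss(logits, concept_ids):
    # Concept-style supervision (sketch): spread the target mass uniformly
    # over a set of semantically related tokens instead of a single token.
    p = softmax(logits)
    q = np.zeros_like(p)
    q[np.asarray(concept_ids)] = 1.0 / len(concept_ids)
    return -(q * np.log(p)).sum()

# Toy vocabulary (hypothetical ids): 0=browse, 1=visit, 2=surf, 3=banana
logits = np.array([2.0, 1.8, 1.7, -1.0])
print("NTP loss:    ", ntp_loss(logits, 0))
print("Concept loss:", concept_loss(logits, [0, 1, 2]))
```

Note that a singleton concept set recovers the ordinary NTP loss, which is one way to see concept training as a relaxation of next-token prediction rather than a replacement for it.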