This paper investigates why transformer language model probabilities are sometimes poor predictors of human reading time, despite the models' strong next-word prediction performance. The authors hypothesize that reading time is more sensitive to simple n-gram statistics than to the complex patterns learned by transformers. They show that neural language models whose predictions correlate more strongly with n-gram probabilities also correlate more strongly with eye-tracking-based reading time metrics.
State-of-the-art language models might be too sophisticated: simpler n-gram statistics better explain human reading times.
Recent work has found that contemporary language models such as transformers can become so good at next-word prediction that the probabilities they calculate become worse predictors of reading time. In this paper, we propose that this can be explained by reading time being sensitive to simple n-gram statistics rather than the more complex statistics learned by state-of-the-art transformer language models. We demonstrate that the neural language models whose predictions are most correlated with n-gram probability are also those whose probabilities are most correlated with eye-tracking-based metrics of reading time on naturalistic text.
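As a rough illustration of the kind of comparison the abstract describes, the sketch below computes n-gram log-probabilities for a word sequence and a correlation coefficient that could be applied to any two per-word series (for example, neural model log-probabilities against n-gram log-probabilities, or against reading times). The abstract does not state the n-gram order, the smoothing method, or the correlation statistic used, so the bigram order, add-alpha smoothing, Pearson correlation, the function names `bigram_logprobs` and `pearson`, and the toy sentence are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_logprobs(train_tokens, test_tokens, alpha=1.0):
    """Return add-alpha smoothed log P(w_i | w_{i-1}) for each word in test_tokens[1:].

    Assumption: a bigram model with add-alpha smoothing stands in for the
    n-gram model; the source does not specify order or smoothing.
    """
    vocab_size = len(set(train_tokens) | set(test_tokens))
    # Count each token only where it acts as a context (every position but the last).
    context_counts = Counter(train_tokens[:-1])
    bigram_counts = Counter(zip(train_tokens, train_tokens[1:]))
    return [
        math.log(
            (bigram_counts[(prev, word)] + alpha)
            / (context_counts[prev] + alpha * vocab_size)
        )
        for prev, word in zip(test_tokens, test_tokens[1:])
    ]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy usage: a made-up sentence, not data from the paper.
train = "the cat sat on the mat".split()
ngram_lp = bigram_logprobs(train, "the cat sat".split())
```

Given per-word log-probabilities from a neural model and per-word reading times aligned to the same tokens, `pearson(model_lp, ngram_lp)` and `pearson(model_lp, reading_times)` would give the two quantities being compared across models. How reading time measures are aligned to tokens is left out of this sketch.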