Feb 15, 2026

arXiv:2602.14100

Character-aware Transformers Learn an Irregular Morphological Pattern Yet None Generalize Like Humans

Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney

AI Summary

This paper investigates whether character-aware transformer models can learn and generalize the irregular Spanish L-shaped morphome, where the first-person singular indicative shares a stem with all subjunctive forms. Five encoder-decoder transformer variants were trained and evaluated, varying positional encoding (sequential vs. position-invariant) and tag representations (atomic vs. decomposed). The key finding is that while position-invariant models can recover the L-shaped paradigm clustering, none of the models generalize the pattern productively to novel forms in a human-like manner, revealing a gap between statistical learning and morphological abstraction.

Key Contribution

Despite capturing irregular morphological patterns during training, transformers fail to generalize these patterns to novel forms in a human-like way, highlighting a key difference between statistical learning and true morphological abstraction.

Abstract

Whether neural networks can serve as cognitive models of morphological learning remains an open question. Recent work has shown that encoder-decoder models can acquire irregular patterns, but evidence that they generalize these patterns like humans is mixed. We investigate this using the Spanish L-shaped morphome, where only the first-person singular indicative (e.g., pongo 'I put') shares its stem with all subjunctive forms (e.g., ponga, pongas) despite lacking apparent phonological, semantic, or syntactic motivation. We compare five encoder-decoder transformers varying along two dimensions: sequential vs. position-invariant positional encoding, and atomic vs. decomposed tag representations. Positional encoding proves decisive: position-invariant models recover the correct L-shaped paradigm clustering even when L-shaped verbs are scarce in training, whereas sequential positional encoding models only partially capture the pattern. Yet none of the models productively generalize this pattern to novel forms. Position-invariant models generalize the L-shaped stem across subjunctive cells but fail to extend it to the first-person singular indicative, producing a mood-based generalization rather than the L-shaped morphomic pattern. Humans do the opposite, generalizing preferentially to the first-person singular indicative over subjunctive forms. None of the models reproduce the human pattern, highlighting the gap between statistical pattern reproduction and morphological abstraction.
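The stem distribution described in the abstract can be made concrete with a small sketch. It is only an illustration, not the authors' code or data format: the cell labels (`ind`, `sbjv`, `1sg`, ...), the restriction to the present tense, and the helper `cells_with_stem` are assumptions introduced here. The verb forms themselves are the standard present-tense forms of poner.

```python
# Illustrative sketch only: cell labels and helper names are assumptions,
# not taken from the paper. Forms are the present tense of Spanish "poner".
PERSONS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]

PONER = {
    ("ind", "1sg"): "pongo",
    ("ind", "2sg"): "pones",
    ("ind", "3sg"): "pone",
    ("ind", "1pl"): "ponemos",
    ("ind", "2pl"): "ponéis",
    ("ind", "3pl"): "ponen",
    ("sbjv", "1sg"): "ponga",
    ("sbjv", "2sg"): "pongas",
    ("sbjv", "3sg"): "ponga",
    ("sbjv", "1pl"): "pongamos",
    ("sbjv", "2pl"): "pongáis",
    ("sbjv", "3pl"): "pongan",
}

# The L-shaped set: first-person singular indicative plus every subjunctive cell.
L_SHAPED_CELLS = {("ind", "1sg")} | {("sbjv", p) for p in PERSONS}

# A purely mood-based set: subjunctive cells only (the pattern the abstract
# attributes to the position-invariant models on novel forms).
MOOD_BASED_CELLS = {("sbjv", p) for p in PERSONS}


def cells_with_stem(paradigm, stem):
    """Return the set of paradigm cells whose form begins with the given stem."""
    return {cell for cell, form in paradigm.items() if form.startswith(stem)}


if __name__ == "__main__":
    pong_cells = cells_with_stem(PONER, "pong")
    print(pong_cells == L_SHAPED_CELLS)
    # The cell that separates the L-shaped pattern from a mood-based one.
    print(L_SHAPED_CELLS - MOOD_BASED_CELLS)
```

The single cell in `L_SHAPED_CELLS - MOOD_BASED_CELLS` is the first-person singular indicative, which is exactly where the abstract reports that models and humans diverge.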

Architecture Design (Transformers, SSMs, MoE)

Natural Language Processing


Year: 2026

Venue: N/A
