This paper investigates whether character-aware transformer models can learn and generalize the irregular Spanish L-shaped morphome, where the first-person singular indicative shares a stem with all subjunctive forms. Five encoder-decoder transformer variants were trained and evaluated, varying positional encoding (sequential vs. position-invariant) and tag representations (atomic vs. decomposed). The key finding is that while position-invariant models can recover the L-shaped paradigm clustering, none of the models generalize the pattern productively to novel forms in a human-like manner, revealing a gap between statistical learning and morphological abstraction.
Despite capturing irregular morphological patterns during training, transformers fail to generalize these patterns to novel forms in a human-like way, highlighting a key difference between statistical learning and true morphological abstraction.
Whether neural networks can serve as cognitive models of morphological learning remains an open question. Recent work has shown that encoder-decoder models can acquire irregular patterns, but evidence that they generalize these patterns as humans do is mixed. We investigate this using the Spanish \emph{L-shaped morphome}, where only the first-person singular indicative (e.g., \textit{pongo} `I put') shares its stem with all subjunctive forms (e.g., \textit{ponga, pongas}) despite lacking apparent phonological, semantic, or syntactic motivation. We compare five encoder-decoder transformers varying along two dimensions: sequential vs. position-invariant positional encoding, and atomic vs. decomposed tag representations. Positional encoding proves decisive: position-invariant models recover the correct L-shaped paradigm clustering even when L-shaped verbs are scarce in training, whereas models with sequential positional encoding only partially capture the pattern. Yet none of the models productively generalize this pattern to novel forms. Position-invariant models generalize the L-shaped stem across subjunctive cells but fail to extend it to the first-person singular indicative, producing a mood-based generalization rather than the L-shaped morphomic pattern. Humans do the opposite, generalizing preferentially to the first-person singular indicative over subjunctive forms. This mismatch highlights the gap between statistical pattern reproduction and morphological abstraction.
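To make the contrast between the L-shaped pattern and a mood-based generalization concrete, the following is a minimal illustrative sketch, not the paper's code or evaluation. It lists the standard present-tense forms of \textit{poner}, finds which paradigm cells carry the stem \textit{pong-}, and labels the resulting cell set. The cell labels (`IND`, `SBJV`, `1SG`, ...) and the helper names `cells_with_stem` and `classify` are assumptions introduced here for illustration.

```python
# Illustrative sketch only: cell labels and helper names are assumptions,
# not taken from the paper.

PERSONS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Standard Spanish present indicative and present subjunctive of "poner".
PONER = {
    ("IND", "1SG"): "pongo",
    ("IND", "2SG"): "pones",
    ("IND", "3SG"): "pone",
    ("IND", "1PL"): "ponemos",
    ("IND", "2PL"): "ponéis",
    ("IND", "3PL"): "ponen",
    ("SBJV", "1SG"): "ponga",
    ("SBJV", "2SG"): "pongas",
    ("SBJV", "3SG"): "ponga",
    ("SBJV", "1PL"): "pongamos",
    ("SBJV", "2PL"): "pongáis",
    ("SBJV", "3PL"): "pongan",
}

# All subjunctive cells: the domain of a purely mood-based generalization.
SUBJUNCTIVE = {("SBJV", p) for p in PERSONS}

# The L-shaped domain: 1SG indicative plus every subjunctive cell.
L_SHAPE = {("IND", "1SG")} | SUBJUNCTIVE


def cells_with_stem(paradigm, stem):
    """Return the set of paradigm cells whose form begins with `stem`."""
    return {cell for cell, form in paradigm.items() if form.startswith(stem)}


def classify(cells):
    """Label a set of cells by the distribution it matches."""
    if cells == L_SHAPE:
        return "L-shaped"
    if cells == SUBJUNCTIVE:
        return "mood-based"
    return "other"


if __name__ == "__main__":
    print(classify(cells_with_stem(PONER, "pong")))
```

Under these assumptions, the same `classify` check could be applied to the forms a model produces for a novel verb: a stem that appears in every subjunctive cell but not in the first-person singular indicative would be labeled mood-based rather than L-shaped.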