This paper introduces a framework that uses the English Grammar Profile (EGP) to analyze L2 learners' grammatical competence by detecting and classifying their attempts at specific grammatical constructs as successful or unsuccessful. The framework uses both rule-based and LLM-based classifiers, finding that LLMs excel at semantically nuanced constructs while rule-based methods are better for morphological/syntactic features. A hybrid approach combining rule-based pre-filtering with LLMs achieves the best performance in proficiency assessment, and a fully automated pipeline using grammatical error correction approaches the performance of semi-automated systems.
LLMs beat rule-based systems at understanding nuanced grammar in language learners, but good old-fashioned rules still win on pure syntax.
Evaluating the grammatical competence of second language (L2) learners is essential both for providing targeted feedback and for assessing proficiency. To achieve this, we propose a novel framework leveraging the English Grammar Profile (EGP), a taxonomy of grammatical constructs mapped to the proficiency levels of the Common European Framework of Reference (CEFR), to detect learners' attempts at grammatical constructs and classify them as successful or unsuccessful. This detection can then be used to provide fine-grained feedback. Moreover, automatically detected attempts at these constructs serve as predictors of holistic CEFR proficiency. For a selection of grammatical constructs derived from the EGP, we compare rule-based and LLM-based classifiers. We show that LLMs outperform rule-based methods on semantically and pragmatically nuanced constructs, while rule-based approaches remain competitive for constructs that rely purely on morphological or syntactic features and do not require semantic interpretation. For proficiency assessment, we evaluate both rule-based and hybrid pipelines and show that a hybrid approach combining a rule-based pre-filter with an LLM consistently yields the strongest performance. Since our framework operates on pairs of original learner sentences and their corrected counterparts, we also evaluate a fully automated pipeline using automatic grammatical error correction. This pipeline closely approaches the performance of semi-automated systems based on manual corrections, particularly for the detection of successful attempts at grammatical constructs. Overall, our framework emphasises learners' successful attempts in addition to unsuccessful ones, enabling positive, formative feedback and providing actionable insights into grammatical development.