This paper benchmarks Large Language Models (LLMs) against Knowledge Tracing (KT) models on the task of predicting student responses to questions. The study evaluates predictive performance, deployment cost, and inference speed. Results indicate that KT models achieve higher accuracy and F1 scores, while also being orders of magnitude faster and cheaper to deploy than LLMs on this domain-specific task.
Domain-specific Knowledge Tracing models beat LLMs on accuracy, speed, and cost when predicting student responses, showing that bigger isn't always better.
Predicting future student responses to questions is particularly valuable for educational learning platforms, where it enables effective interventions. One of the key approaches to this task has been knowledge tracing (KT) models. These are small, domain-specific, temporal models trained on student question-response data. KT models are optimised for high accuracy on specific educational domains and offer fast inference and scalable deployment. The rise of Large Language Models (LLMs) motivates us to ask the following questions: (1) How well can LLMs perform at predicting students' future responses to questions? (2) Are LLMs scalable for this domain? (3) How do LLMs compare to KT models on this domain-specific task? In this paper, we compare multiple LLMs and KT models across predictive performance, deployment cost, and inference speed to answer these questions. We show that KT models outperform LLMs with respect to accuracy and F1 scores on this domain-specific task. Further, we demonstrate that LLMs are orders of magnitude slower than KT models and cost orders of magnitude more to deploy. This highlights the importance of domain-specific models for education prediction tasks and shows that current closed-source LLMs should not be treated as a universal solution for all tasks.
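The abstract compares models on accuracy and F1 but does not say how these are computed. As a minimal sketch only, the following assumes each student response is labelled 1 (correct) or 0 (incorrect) and that "correct" is treated as the positive class for F1; the function name `accuracy_and_f1` and the sample labels are hypothetical illustrations, not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary response predictions.

    Assumption: labels are 1 (correct response) or 0 (incorrect response),
    and 1 is the positive class. This is an illustrative convention, not
    necessarily the one used in the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical example: six observed responses and one model's predictions.
observed = [1, 0, 1, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]
acc, f1 = accuracy_and_f1(observed, predicted)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

Reporting both metrics matters when correct responses dominate the data: a model that always predicts "correct" can score high accuracy, while F1 also accounts for the false positives and false negatives it produces.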