This paper evaluates the performance of ChatGPT and Gemini on pronoun resolution in Punjabi, a low-resource language, to assess their cross-lingual transfer capabilities. The study uses a specifically constructed Punjabi pronoun resolution dataset to benchmark the models. Results show that ChatGPT outperforms Gemini, but both models still require further refinement for robust performance in low-resource settings.
Despite advances in LLMs, pronoun resolution in low-resource languages like Punjabi remains a challenge, with even top models like ChatGPT and Gemini showing limitations.
In artificial intelligence, Large Language Models (LLMs) are becoming essential, especially in Natural Language Processing (NLP). These models enable researchers to develop more complex and effective solutions for various NLP tasks, including text summarization, sentiment analysis, question answering, and more. Reference resolution (pronoun resolution) is essential for the functionality of systems such as question-answering platforms and machine translation tools. While LLMs have demonstrated remarkable performance on reference resolution in resource-rich languages such as English, their effectiveness in low-resource languages remains largely unexplored. This paper evaluates LLMs, specifically ChatGPT and Gemini, on their ability to resolve references in Punjabi, a resource-poor language. Beyond assessing the current capabilities of these models, our research aims to identify their shortcomings and areas for improvement. The experiment indicates that ChatGPT outperforms Gemini, but both models need additional refinement to produce more consistent and reliable outcomes.
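The benchmarking described above reduces to scoring each model's predicted antecedent against a gold label. A minimal sketch of such an accuracy computation is below; the field names (`sentence`, `pronoun`, `antecedent`) and the toy English examples are illustrative assumptions, not the paper's actual Punjabi dataset or evaluation harness.

```python
# Hypothetical sketch of scoring a pronoun resolver against gold labels.
# In practice, `predict` would wrap a call to ChatGPT or Gemini; here a
# trivial baseline stands in so the sketch is self-contained.

def evaluate(examples, predict):
    """Return exact-match accuracy of predict(sentence, pronoun) -> antecedent."""
    correct = 0
    for ex in examples:
        guess = predict(ex["sentence"], ex["pronoun"])
        if guess.strip() == ex["antecedent"]:
            correct += 1
    return correct / len(examples) if examples else 0.0

# Toy examples (assumed schema, English for readability).
examples = [
    {"sentence": "Ravi called Simran because he was late.",
     "pronoun": "he", "antecedent": "Ravi"},
    {"sentence": "The cup fell off the table and it broke.",
     "pronoun": "it", "antecedent": "The cup"},
]

def first_word_baseline(sentence, pronoun):
    # Naive stand-in resolver: always return the sentence's first word.
    return sentence.split()[0]

accuracy = evaluate(examples, first_word_baseline)  # 0.5 on the toy set
```

Exact-match scoring is strict; a real evaluation might normalize surface forms (case, honorifics, inflection) before comparison, which matters for a morphologically rich language like Punjabi.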