This paper evaluates the performance of ChatGPT and Gemini on pronoun resolution in Punjabi, a low-resource language, to assess their cross-lingual transfer capabilities. The study uses a specifically constructed Punjabi pronoun resolution dataset to benchmark the models. Results show that ChatGPT outperforms Gemini, but both models still require further refinement for robust performance in low-resource settings.
Despite advances in LLMs, pronoun resolution in low-resource languages like Punjabi remains a challenge, with even top models like ChatGPT and Gemini showing limitations.
In artificial intelligence, Large Language Models (LLMs) are becoming essential, especially in Natural Language Processing (NLP). These models enable researchers to develop more complex and effective solutions for various NLP tasks, including text summarization, sentiment analysis, question answering, and more. Reference resolution (pronoun resolution) is essential for the functionality of systems such as question-answering platforms and machine translation tools. While LLMs have demonstrated remarkable performance on reference resolution in resource-rich languages such as English, their effectiveness in low-resource languages remains largely unexplored. This paper evaluates LLMs, specifically ChatGPT and Gemini, on their ability to resolve references in Punjabi, a resource-poor language. Beyond assessing the current capabilities of these models, our research aims to identify their shortcomings and areas for improvement. The experiment indicates that ChatGPT outperforms Gemini, but both models need additional refinement to produce more consistent and reliable outcomes.
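The benchmarking described above reduces to scoring each model's predicted antecedent against a gold label. A minimal sketch of such an accuracy computation is below; the field names (`sentence`, `pronoun`, `antecedent`) and the toy English examples are illustrative assumptions, not the paper's actual Punjabi dataset or evaluation harness.

```python
# Hypothetical sketch of scoring a pronoun resolver against gold labels.
# In practice, `predict` would wrap a call to ChatGPT or Gemini; here a
# trivial baseline stands in so the sketch is self-contained.

def evaluate(examples, predict):
    """Return exact-match accuracy of predict(sentence, pronoun) -> antecedent."""
    correct = 0
    for ex in examples:
        guess = predict(ex["sentence"], ex["pronoun"])
        if guess.strip() == ex["antecedent"]:
            correct += 1
    return correct / len(examples) if examples else 0.0

# Toy examples (assumed schema, English for readability).
examples = [
    {"sentence": "Ravi called Simran because he was late.",
     "pronoun": "he", "antecedent": "Ravi"},
    {"sentence": "The cup fell off the table and it broke.",
     "pronoun": "it", "antecedent": "The cup"},
]

def first_word_baseline(sentence, pronoun):
    # Naive stand-in resolver: always return the sentence's first word.
    return sentence.split()[0]

accuracy = evaluate(examples, first_word_baseline)  # 0.5 on the toy set
```

Exact-match scoring is strict; a real evaluation might normalize surface forms (case, honorifics, inflection) before comparison, which matters for a morphologically rich language like Punjabi.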