This paper benchmarks ChatGPT-4o, GeminiAI, and Perplexity AI on their ability to answer maternal health queries in Telugu, a low-resource language. Performance was evaluated using BERTScore and expert gynecologist assessments across metrics like accuracy, fluency, and relevance. GeminiAI performed best when prompted in English, while Perplexity AI excelled with Telugu prompts, highlighting the importance of prompt language selection for LLM performance in low-resource settings.
Prompting language significantly impacts the accuracy and coherence of LLM responses for maternal health queries in Telugu, with GeminiAI favoring English prompts and Perplexity AI preferring Telugu.
Large Language Models (LLMs) have been progressively demonstrating their capabilities in various areas of research. Their performance in acute maternal healthcare, particularly in low-resource languages such as Telugu, Hindi, Tamil, and Urdu, remains unstudied. This study examines how ChatGPT-4o, GeminiAI, and Perplexity AI respond to pregnancy-related questions asked in different languages. A bilingual dataset is used, and the responses are evaluated with a semantic similarity metric (BERTScore) and assessments by expert gynecologists. The gynecologists rate the LLM-generated responses on multiple parameters, including accuracy, fluency, relevance, coherence, and completeness. Gemini outperforms the other LLMs in producing accurate and coherent pregnancy-related responses in Telugu, while Perplexity performs well when the prompts are in Telugu. ChatGPT's performance leaves room for improvement. The results indicate that both the choice of LLM and the prompting language play a crucial role in retrieving information. Overall, we emphasize the need to improve LLM assistance in regional languages for healthcare purposes.
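The BERTScore metric mentioned above compares a generated answer with a reference answer by greedily matching token embeddings on cosine similarity. The following is a minimal Python sketch of that matching step only, not the authors' evaluation pipeline. The function names `cosine` and `greedy_bertscore` are hypothetical, and the two-dimensional toy vectors stand in for the contextual embeddings that a real setup would obtain from a BERT-style model. The sketch also omits the optional IDF weighting and baseline rescaling.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_bertscore(candidate_vecs, reference_vecs):
    """Return (precision, recall, f1) from greedy cosine matching.

    candidate_vecs: one embedding per token of the generated answer.
    reference_vecs: one embedding per token of the reference answer.
    Both lists are assumed to be non-empty.
    """
    # Precision: match each candidate token to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)

    # Recall: match each reference token to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)

    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy embeddings (assumption): the candidate covers only one of the
    # two reference tokens, so recall is lower than precision.
    candidate = [[1.0, 0.0]]
    reference = [[1.0, 0.0], [0.0, 1.0]]
    print(greedy_bertscore(candidate, reference))
```

In this toy case the single candidate token matches a reference token exactly, so precision is 1.0, while the second reference token has no counterpart, so recall drops to 0.5. A response that is accurate but incomplete would show the same pattern.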