MKSSS's Cummins College of Engg. for Women

Jan 16, 2025

“Shabda Anveshak”: Comparative Review and Analysis of Marathi Part of Speech Tagging Approaches

Purva Sarda, Dhanashree Kokare, Ritika Mokashi, Tejaswini Patkar, Pranjali Deshpande, Sunita Jahirabadkar

AI Summary

This paper reviews and analyzes various Part-of-Speech (POS) tagging approaches for the low-resource language Marathi, focusing on rule-based, statistical (HMM), hybrid, machine learning, and deep learning methods. The study addresses the challenges posed by Marathi's complex morphology and the limitations of data availability for ML/DL techniques. The review identifies significant and emerging directions for improving Marathi POS taggers based on a comparative analysis of recent studies.

Key Contribution

The difficulties of Marathi POS tagging illustrate the broader challenge of applying NLP techniques to low-resource languages, where data scarcity limits the effectiveness of ML/DL approaches.

Abstract

In Natural Language Processing (NLP), Part-of-Speech (POS) tagging is an essential task in which each word in a sentence is assigned a grammatical category, such as noun, verb, or adjective. It enables applications such as information retrieval, text summarization, and machine translation. POS tagging poses special difficulties for low-resource languages like Marathi because of their complex morphology. A variety of techniques address these difficulties, including rule-based techniques that rely on linguistic rules and statistical models such as Hidden Markov Models (HMMs), whose probabilities are estimated from large annotated datasets. Hybrid models combine these two approaches and show enhanced robustness. Despite their potential, machine learning (ML) and deep learning (DL) approaches encounter challenges because of insufficient data. This review compares recent studies and highlights significant and emerging directions for further improving POS taggers for Marathi.
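To make the statistical approach mentioned in the abstract concrete, the following is a minimal sketch of an HMM tagger with Viterbi decoding. It is not taken from the paper or from any of the reviewed systems: the toy corpus `TRAIN` is hypothetical (English words are used only for readability), the tag names are illustrative, and add-one smoothing is an assumption chosen for simplicity. The function names `train_hmm`, `log_prob`, and `viterbi` are likewise invented for this example.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus; a real tagger would need a large annotated dataset.
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]


def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab


def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability of `key` given a count table."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence for `words`."""
    tags = list(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words
    # best[i][t] = (log score of best path ending in tag t at position i, previous tag)
    best = [{}]
    for t in tags:
        score = log_prob(trans["<s>"], t, n_tags) + log_prob(emit[t], words[0], n_words)
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            emission = log_prob(emit[t], words[i], n_words)
            prev, score = max(
                ((p, best[i - 1][p][0] + log_prob(trans[p], t, n_tags)) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = (score + emission, prev)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


if __name__ == "__main__":
    trans, emit, vocab = train_hmm(TRAIN)
    print(viterbi(["the", "dog", "sleeps"], trans, emit, vocab))
```

With only three training sentences, every probability here rests on a handful of counts. That mirrors, at a very small scale, the data-scarcity problem the abstract describes for statistical and ML/DL taggers.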

Natural Language Processing

Citation Metrics

Citations: 1

Influential citations: 0

References: 32

Year: 2025

Venue: 2025 1st International Conference on AIML-Applications for Engineering & Technology (ICAET)
