Korea U, NAVER Labs | Apr 28, 2026 | arXiv:2604.25297

LegalMidm: Use-Case-Driven Legal Domain Specialization for Korean Large Language Model

Youngjoon Jang, Chanhee Park, Hyeonseok Moon, Young-kyoung Ham, Jiwon Moon, Jinhyeon Kim, JuKyung Jung, Heuiseok Lim

AI Summary

The paper introduces LegalMidm, a Korean legal-domain LLM trained with a framework built around real-world legal use cases. The authors construct high-quality datasets through collaboration with legal professionals and rigorous data curation, focusing on relevance and factual accuracy. Experiments demonstrate LegalMidm's effectiveness in key legal tasks, suggesting that use-case-driven training matters for domain specialization.

Key Contribution

Forget generic legal LLMs – LegalMidm shows that focusing on specific Korean legal use cases, with data curated by legal pros, unlocks real-world performance gains.

Abstract

In recent years, the rapid proliferation of open-source large language models (LLMs) has spurred efforts to turn general-purpose models into domain specialists. However, many domain-specialized LLMs are developed using datasets and training protocols that are not aligned with the nuanced requirements of real-world applications. In the legal domain, where precision and reliability are essential, this lack of consideration limits practical utility. In this study, we propose a systematic training framework grounded in the practical needs of the legal domain, with a focus on Korean law. We introduce LegalMidm, a Korean legal-domain LLM, and present a methodology for constructing high-quality, use-case-driven legal datasets and optimized training pipelines. Our approach emphasizes collaboration with legal professionals and rigorous data curation to ensure relevance and factual accuracy, and demonstrates effectiveness in key legal tasks.

Data Curation & Synthetic Data | Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
