Mar 19, 2026 · arXiv:2603.18788

Mi:dm K 2.5 Pro

AI Summary

Mi:dm K 2.5 Pro, a 32B parameter LLM, was developed using a reasoning-focused optimization strategy tailored for enterprise-grade complexity, particularly in Korean-language and domain-specific scenarios. The model's training involved a quality-centric data curation pipeline, layer-predictor-based Depth Upscaling (DuS) for pre-training, and a multi-stage post-training pipeline including Reasoning SFT, model merging, and asynchronous reinforcement learning. Evaluations demonstrate competitive performance against leading models, state-of-the-art results on Korean-specific benchmarks, and validated safety against attacks.

Key Contribution

Mi:dm K 2.5 Pro shows that targeted training pipelines and careful data curation can let a 32B parameter model compete with leading models on enterprise reasoning tasks and set state-of-the-art results on Korean-specific benchmarks, in settings where scaling alone is insufficient.

Abstract

The evolving LLM landscape requires capabilities beyond simple text generation, prioritizing multi-step reasoning, long-context understanding, and agentic workflows. This shift challenges existing models in enterprise environments, especially in Korean-language and domain-specific scenarios where scaling is insufficient. We introduce Mi:dm K 2.5 Pro, a 32B parameter flagship LLM designed to address enterprise-grade complexity through reasoning-focused optimization. Our methodology builds a robust data foundation via a quality-centric curation pipeline utilizing abstract syntax tree (AST) analysis for code, gap-filling synthesis for mathematics, and an LLM-based quality evaluator. Pre-training scales the model via layer-predictor-based Depth Upscaling (DuS) and a progressive strategy supporting a 128K token context window. Post-training introduces a specialized multi-stage pipeline, including Reasoning SFT, model merging, and asynchronous reinforcement learning (RL), to develop complex problem-solving skills. "Fusion Training" then rebalances these capabilities with conversational fluency, consistent response styling, and reliable tool-use. The evaluations show that Mi:dm K 2.5 Pro achieves competitive performance against leading global and domestic models. In addition, it sets state-of-the-art results on Korean-specific benchmarks, showcasing deep linguistic and cultural understanding. Finally, Responsible AI evaluations validate safety against attacks, ensuring a secure profile for deployment with a balance of harmlessness and responsiveness.
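As a rough illustration of what AST-based analysis for code curation can look like, the sketch below parses a Python snippet and derives a few structural signals. This is not the paper's pipeline: the function names `ast_quality_signals` and `keep_sample`, the chosen signals, and the `min_nodes` threshold are all hypothetical, and the abstract does not say which languages or criteria the authors actually use.

```python
import ast
from typing import Optional


def ast_quality_signals(source: str) -> Optional[dict]:
    """Return simple structural signals for a Python snippet.

    Returns None when the snippet does not parse. The signals here are
    illustrative assumptions, not the criteria used in the paper.
    """
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        return None

    nodes = list(ast.walk(tree))
    functions = [
        n for n in nodes
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    documented = [f for f in functions if ast.get_docstring(f) is not None]
    return {
        "node_count": len(nodes),
        "function_count": len(functions),
        "documented_functions": len(documented),
    }


def keep_sample(source: str, min_nodes: int = 5) -> bool:
    """Keep a sample only if it parses and is not trivially small.

    `min_nodes` is a hypothetical threshold chosen for this sketch.
    """
    signals = ast_quality_signals(source)
    return signals is not None and signals["node_count"] >= min_nodes
```

A filter of this kind would typically be one signal among several; in the pipeline described above it would sit alongside the LLM-based quality evaluator rather than replace it.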

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 85

Year: 2026

Venue: N/A
