Center for Machine Learning (MCML) | Henan University | LMU | Apr 13, 2026 | arXiv:2604.11012

Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics

Yuanhao Ding, Meimingwei Li, Esteban Garces Arias, Matthias Aßenmacher, Christian Heumann, Chongsheng Zhang

AI Summary

This paper introduces Min-$k$ sampling, a novel decoding strategy for LLMs that dynamically truncates the logit distribution based on local shape analysis to identify semantic cliffs. Unlike Top-$k$ or Top-$p$ sampling, Min-$k$ is temperature invariant and robust to hyperparameter choices, addressing a key limitation of existing probability-space truncation methods. Empirical results across reasoning, creative writing, and human evaluations demonstrate that Min-$k$ consistently improves text quality, especially under extreme temperature settings.

Key Contribution

Forget temperature tuning: Min-$k$ sampling finds the "semantic cliff" in your LLM's logits, delivering robust and high-quality text even when other methods fall apart.

Abstract

The quality of text generated by large language models depends critically on the decoding sampling strategy. While mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ achieve a balance between diversity and accuracy through probability-space truncation, they share an inherent limitation: extreme sensitivity to the temperature parameter. Recent logit-space approaches like Top-$nσ$ achieve temperature invariance but rely on global statistics that are susceptible to long-tail noise, failing to capture fine-grained confidence structures among top candidates. We propose Min-$k$ Sampling, a novel dynamic truncation strategy that analyzes the local shape of the sorted logit distribution to identify "semantic cliffs": sharp transitions from high-confidence core tokens to uncertain long-tail tokens. By computing a position-weighted relative decay rate, Min-$k$ dynamically determines truncation boundaries at each generation step. We formally prove that Min-$k$ achieves strict temperature invariance and empirically demonstrate its low sensitivity to hyperparameter choices. Experiments on multiple reasoning benchmarks, creative writing tasks, and human evaluation show that Min-$k$ consistently improves text quality, maintaining robust performance even under extreme temperature settings where probability-based methods collapse. We make our code, models, and analysis tools publicly available.
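The abstract does not state the Min-$k$ formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical score: each drop between neighbouring sorted logits is divided by the total logit span (making it scale-free) and then down-weighted by the square root of its position. The function name `keep_count` and this particular weighting are illustrative assumptions. The sketch shows why a truncation rule built on relative logit gaps would not change when all logits are divided by a temperature.

```python
import math


def keep_count(logits):
    """Return how many top tokens to keep (hypothetical rule, not the paper's).

    Assumption: the cut is placed after the position whose relative drop,
    weighted by 1/sqrt(position), is largest.
    """
    z = sorted(logits, reverse=True)
    if len(z) < 2:
        return len(z)
    span = z[0] - z[-1]
    if span == 0.0:
        # All logits are equal: there is no cliff, so keep everything.
        return len(z)

    best_index, best_score = 0, -1.0
    for i in range(len(z) - 1):
        # Relative drop: dividing by the span cancels any positive rescaling.
        relative_drop = (z[i] - z[i + 1]) / span
        # Hypothetical position weight favouring cliffs near the top.
        score = relative_drop / math.sqrt(i + 1)
        if score > best_score:
            best_index, best_score = i, score
    return best_index + 1


# Three close candidates followed by a sharp drop into the tail.
logits = [5.0, 4.8, 4.5, 1.0, 0.9, 0.5, 0.2]
for temperature in (0.1, 1.0, 10.0):
    scaled = [x / temperature for x in logits]
    print(temperature, keep_count(scaled))
```

Dividing every logit by a positive temperature rescales each gap and the span by the same factor, so the ratio, and therefore the chosen cut, stays the same. A fixed Top-$p$ threshold, by contrast, operates on softmax probabilities, which do change with temperature.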

Inference & Quantization | Natural Language Processing

