The paper introduces LOCUS, a distribution-free wrapper that produces a per-input loss-scale reliability score for a fixed prediction function by modeling the realized loss of its predictions. It targets models that are accurate on average yet incur disproportionate cost from large errors on specific inputs. A split-calibration step turns the predicted loss distribution into a distribution-free score that supports risk ranking and, optionally, thresholding to control large-loss events. Across 13 regression benchmarks, the paper reports effective risk ranking and fewer large-loss events than standard heuristics.
Stop relying on heuristics: LOCUS offers a distribution-free, per-input loss-scale reliability score that lets you directly quantify and control the risk of large losses from your ML models.
Modern machine learning models can be accurate on average yet still make mistakes that dominate deployment cost. We introduce Locus, a distribution-free wrapper that produces a per-input loss-scale reliability score for a fixed prediction function. Rather than quantifying uncertainty about the label, Locus models the realized loss of the prediction function using any engine that outputs a predictive distribution for the loss given an input. A simple split-calibration step turns this function into a distribution-free interpretable score that is comparable across inputs and can be read as an upper loss level. The score is useful on its own for ranking, and it can optionally be thresholded to obtain a transparent flagging rule with distribution-free control of large-loss events. Experiments across 13 regression benchmarks show that Locus yields effective risk ranking and reduces large-loss frequency compared to standard heuristics.
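The abstract does not spell out the calibration procedure, so the following is only a minimal sketch of one way a split-calibrated upper loss level could be built, not the authors' implementation. It assumes an absolute-error loss, a toy one-dimensional dataset, a binned empirical quantile as the "loss engine", and a one-sided split-conformal correction in the style of conformalized quantile regression. The names `make_data`, `predictor`, `fit_binned_quantile`, and `split_calibrate` are hypothetical.

```python
import numpy as np

def make_data(n, rng):
    # Toy heteroscedastic data (assumption): noise grows with x.
    x = rng.uniform(0.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 0.1 + x, n)
    return x, y

def predictor(x):
    # Stand-in for the fixed prediction function being wrapped.
    return 2.0 * x

def realized_loss(x, y):
    # Absolute error is an assumed choice of loss.
    return np.abs(y - predictor(x))

def fit_binned_quantile(x, loss, level, n_bins=10):
    # Toy loss engine: per-bin empirical quantile of the loss, for x in [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    fallback = np.quantile(loss, level)
    q = np.array([
        np.quantile(loss[idx == b], level) if np.any(idx == b) else fallback
        for b in range(n_bins)
    ])

    def predict(x_new):
        j = np.clip(np.digitize(x_new, edges) - 1, 0, n_bins - 1)
        return q[j]

    return predict

def split_calibrate(engine, x_cal, loss_cal, alpha):
    # Shift the engine's quantile by the conformal quantile of its residuals,
    # so that P(loss <= score(x)) >= 1 - alpha under exchangeability.
    resid = np.sort(loss_cal - engine(x_cal))
    n = len(resid)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    correction = resid[k - 1] if k <= n else np.inf
    return lambda x_new: engine(x_new) + correction

rng = np.random.default_rng(0)
alpha = 0.1
x_fit, y_fit = make_data(2000, rng)    # used to fit the loss engine
x_cal, y_cal = make_data(2000, rng)    # held out for calibration
x_test, y_test = make_data(5000, rng)  # fresh inputs

engine = fit_binned_quantile(x_fit, realized_loss(x_fit, y_fit), 1 - alpha)
score = split_calibrate(engine, x_cal, realized_loss(x_cal, y_cal), alpha)

# Fraction of test points whose realized loss stays below the score.
coverage = np.mean(realized_loss(x_test, y_test) <= score(x_test))
print(f"empirical coverage: {coverage:.3f}")

# Optional flagging rule: flag inputs whose upper loss level exceeds a budget.
tau = 1.0  # hypothetical loss budget
flagged = score(x_test) > tau
print(f"flagged fraction: {flagged.mean():.3f}")
```

In this sketch the score is expressed in the same units as the loss, so it can be used directly to rank inputs or compared against a loss budget such as `tau`. The coverage statement here is marginal over inputs and relies on the calibration and test points being exchangeable; the specific guarantee the paper proves may differ.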