This paper presents a two-stage Optical Music Recognition (OMR) pipeline and focuses on its second stage, which decodes symbol and event candidates into an editable score structure for complex polyphonic music, particularly piano scores. The core contribution is to formulate this step as a structure decoding problem and to solve it with topology recognition and probability-guided search (BeadSolver). A data strategy that combines procedural generation with recognition-feedback annotations supports the approach, and the result is a practical OMR decoding component.
A practical OMR system can handle complex polyphonic music, such as piano scores, by decoding detected visual symbols into editable scores.
We propose a new approach for the second stage of a practical two-stage Optical Music Recognition (OMR) pipeline. Given symbol and event candidates from the visual pipeline, we decode them into an editable, verifiable, and exportable score structure. We focus on complex polyphonic staff notation, especially piano scores, where voice separation and intra-measure timing are the main bottlenecks. Our approach formulates second-stage decoding as a structure decoding problem and uses topology recognition with probability-guided search (BeadSolver) as its core method. We also describe a data strategy that combines procedural generation with recognition-feedback annotations. The result is a practical decoding component for real OMR systems and a path to accumulate structured score data for future end-to-end, multimodal, and RL-style methods.
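To make the idea of probability-guided structure decoding concrete, the following is a minimal sketch and not the paper's BeadSolver, whose details are not given here. It assumes a toy setting in which an upstream model supplies per-voice probabilities for each event in a measure, and the only structural constraint is that events in the same voice may not overlap in time. The names `Event`, `voice_probs`, and `decode_voices` are hypothetical and chosen only for illustration.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    onset: float        # onset in beats from the start of the measure
    duration: float     # duration in beats
    voice_probs: tuple  # assumed per-voice probabilities from an upstream model


def decode_voices(events):
    """Best-first search for the most probable overlap-free voice assignment.

    Events are processed in onset order. Returns (voices, probability), where
    voices[i] is the voice index of the i-th event in onset order, or
    (None, 0.0) if no assignment satisfies the constraint.
    """
    events = sorted(events, key=lambda e: e.onset)
    # Each heap entry is (cost, partial assignment); cost is the summed
    # negative log-probability, so the cheapest entry is the most probable.
    heap = [(0.0, ())]
    while heap:
        cost, assigned = heapq.heappop(heap)
        k = len(assigned)
        if k == len(events):
            # Costs never decrease along a path, so the first complete
            # assignment popped from the heap is optimal.
            return list(assigned), math.exp(-cost)
        ev = events[k]
        for voice, p in enumerate(ev.voice_probs):
            if p <= 0.0:
                continue
            # Structural constraint: no earlier event in this voice may
            # still be sounding when the new event starts.
            overlaps = any(
                v == voice
                and events[i].onset + events[i].duration > ev.onset + 1e-9
                for i, v in enumerate(assigned)
            )
            if not overlaps:
                heapq.heappush(heap, (cost - math.log(p), assigned + (voice,)))
    return None, 0.0


# Toy measure: a half note sounding against two quarter notes.
measure = [
    Event(onset=0.0, duration=2.0, voice_probs=(0.9, 0.1)),
    Event(onset=0.0, duration=1.0, voice_probs=(0.6, 0.4)),
    Event(onset=1.0, duration=1.0, voice_probs=(0.3, 0.7)),
]
voices, prob = decode_voices(measure)
print(voices, round(prob, 3))  # [0, 1, 1] 0.252
```

In this toy measure, picking the most probable voice for each event independently would place the first two events in voice 0 even though they overlap. The search instead returns the most probable assignment that respects the constraint, which is the general pattern the abstract describes: candidate probabilities guide the search, while structural validity prunes it.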