This paper introduces a novel RNA design algorithm that combines motif-level divide-and-conquer with structure-level rival search to efficiently identify RNA sequences folding into a target secondary structure. The algorithm decomposes the target structure into a tree of motifs, designs partial sequences for each motif, and recursively combines them using cube pruning to optimize ensemble-based metrics. A final rival search step refines sequences to suppress misfolded alternatives and improve MFE-based performance, achieving state-of-the-art results on standard benchmarks with significant speedups.
Achieve an order-of-magnitude speedup in RNA design while nearly doubling the folding probability of long RNA structures by combining motif-level design with whole-structure rival search.
RNA design aims to identify RNA sequences that fold into a target secondary structure. The task is computationally challenging. Most existing methods focus on either minimum free energy (MFE)-based or ensemble-based metrics, leaving a gap for a unified approach that performs well across both. We introduce a fast and versatile RNA design algorithm inspired by our previous work on the undesignability of RNA structures and motifs (i.e., sets of contiguous structural loops). Our approach decomposes a target structure into a tree of sub-targets in which each leaf node corresponds to a motif and each internal node corresponds to a substructure. We first design partial sequences for each motif; these partial sequences are then selectively and recursively combined via cube pruning, a strategy borrowed from computational linguistics, enabling effective optimization of ensemble-based metrics. Finally, a novel whole-structure rival search further refines sequences to suppress misfolded alternatives and enhance MFE-based performance. Our method is highly efficient and achieves state-of-the-art results on native RNAsolo structures and the Eterna100 benchmark, excelling in both ensemble- and MFE-based metrics. Additionally, it substantially improves design on a long-structure benchmark derived from 16S rRNA, increasing average folding probability from 0.18 to 0.39 with an order-of-magnitude speedup, demonstrating its effectiveness and scalability. Availability: Source code and data are available at: https://github.com/shanry/FastDesign.
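To make the combination step more concrete, the following is a minimal sketch of generic cube pruning, not the FastDesign implementation. It assumes that each sub-target contributes a list of candidate partial sequences sorted by descending score, that scores add when two candidates are combined, and that combining is plain string concatenation. The names `Candidate` and `cube_prune` are hypothetical. In the actual method the ensemble-based objective is presumably not additive, which is why cube pruning serves as a selective heuristic and not an exact search.

```python
import heapq
from typing import List, Tuple

# (score, partial sequence); a higher score is better. Illustrative type only.
Candidate = Tuple[float, str]


def cube_prune(left: List[Candidate], right: List[Candidate], k: int) -> List[Candidate]:
    """Return up to k best combinations of two candidate lists.

    Both lists must be sorted by descending score. Scores are assumed to add,
    and sequences are assumed to combine by concatenation (a simplification).
    """
    if not left or not right or k <= 0:
        return []
    # heapq is a min-heap, so scores are negated to pop the best entry first.
    # The search starts at the best-with-best corner of the candidate grid.
    heap = [(-(left[0][0] + right[0][0]), 0, 0)]
    seen = {(0, 0)}
    best: List[Candidate] = []
    while heap and len(best) < k:
        neg_score, i, j = heapq.heappop(heap)
        best.append((-neg_score, left[i][1] + right[j][1]))
        # Only the two grid neighbours of the popped cell are explored,
        # so most of the len(left) * len(right) grid is never scored.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(left[ni][0] + right[nj][0]), ni, nj))
    return best


# Made-up scores and partial sequences, for illustration only.
left = [(-0.1, "GC"), (-0.5, "AU"), (-2.0, "GU")]
right = [(-0.2, "GAAA"), (-1.0, "UUCG")]
top3 = cube_prune(left, right, 3)
print([seq for _, seq in top3])
```

Applied bottom-up over the decomposition tree, a routine of this kind would keep only the k most promising combinations at each internal node and avoid enumerating every pairing of child candidates.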