The paper introduces auto-WHATMD, an algorithm that uses optimal transport (Wasserstein distance) and simulated annealing to automatically identify key residues that distinguish multiple protein systems in molecular dynamics simulations. This method addresses the challenge of selecting relevant features from high-dimensional trajectory data, which typically relies on domain expertise and can introduce bias. Applying auto-WHATMD to bromodomain 4 systems with different ligands, the authors identified discriminative residues in the loop region and demonstrated that these residues capture the correlation with ligand-binding affinities.
Ditch the guesswork in molecular dynamics analysis: auto-WHATMD automatically pinpoints the key residues that differentiate protein systems, revealing insights previously buried in high-dimensional data.
Comparing multiple protein systems that differ in, for example, bound ligands or mutations, and understanding the effects of those differences, is a common objective of molecular dynamics simulations. Representing these systems by a few features enables quantitative comparison. However, because molecular dynamics trajectories are high-dimensional spatiotemporal data, selecting key features relies on domain expertise and can introduce arbitrary assumptions. Here, we present an approach that uses the optimal transport distance to compare high-dimensional trajectory data and employs simulated annealing to identify the residues that best distinguish multiple systems. We term this algorithm auto-WHATMD (automated Wasserstein-based High-dimensional feature extraction Analysis for Trajectories of Molecular Dynamics). We applied auto-WHATMD to multiple protein-ligand systems of bromodomain 4 with different ligands and identified the most discriminative residues in the loop region. Moreover, even a few selected residues were sufficient to capture the correlation with ligand-binding affinities, indicating that auto-WHATMD effectively prioritizes the most informative residues. Our approach can be used to efficiently determine key residues and design features for multiple analogous systems.
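The general idea of pairing a Wasserstein distance with simulated annealing to pick discriminative residues can be sketched on toy data. This is a minimal illustration, not the auto-WHATMD implementation: the abstract does not specify the feature representation, the objective, or the annealing schedule, so everything here is an assumption. Each residue is reduced to one hypothetical scalar per frame, the subset score is simply the sum of per-residue 1D Wasserstein distances over all pairs of systems, and the function names (`wasserstein_1d`, `subset_score`, `anneal`, `make_toy_systems`) and parameters are invented for the example.

```python
import itertools
import math
import random


def wasserstein_1d(a, b):
    """Exact 1D Wasserstein-1 distance between two equal-size samples."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def subset_score(systems, residues):
    """Assumed objective: summed per-residue distance over all system pairs."""
    total = 0.0
    for s, t in itertools.combinations(systems, 2):
        for r in residues:
            total += wasserstein_1d(s[r], t[r])
    return total


def anneal(systems, n_residues, k, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over size-k residue subsets using swap moves."""
    rng = random.Random(seed)
    current = set(rng.sample(range(n_residues), k))
    current_score = subset_score(systems, current)
    best, best_score = set(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose swapping one selected residue for one unselected residue.
        out = rng.choice(sorted(current))
        inn = rng.choice(sorted(set(range(n_residues)) - current))
        candidate = (current - {out}) | {inn}
        cand_score = subset_score(systems, candidate)
        delta = cand_score - current_score
        # Metropolis rule: keep improvements, occasionally keep worse subsets.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return sorted(best), best_score


def make_toy_systems(n_systems=3, n_residues=8, n_frames=200, shifted=(2, 5), seed=1):
    """Synthetic data: systems[i][r] is a list of per-frame values for residue r.

    Only the residues in `shifted` differ between systems (hypothetical setup).
    """
    rng = random.Random(seed)
    systems = []
    for i in range(n_systems):
        system = []
        for r in range(n_residues):
            offset = 3.0 * i if r in shifted else 0.0
            system.append([rng.gauss(offset, 1.0) for _ in range(n_frames)])
        systems.append(system)
    return systems


systems = make_toy_systems()
selected, score = anneal(systems, n_residues=8, k=2)
print("selected residues:", selected)
```

Because this toy objective is additive over residues, a simple ranking would find the same subset. Annealing becomes useful when the distance is computed jointly over the selected residues, as the abstract's reference to high-dimensional trajectory data suggests. In that case `subset_score` would be replaced by a multi-dimensional optimal transport distance.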