Scientific Discovery & Drug Design Applications
AI for scientific research, protein structure prediction, drug discovery, materials science, and climate modeling.
Recent Papers
The paper addresses the problem of efficiently estimating atmospheric particle properties from routine observations in a heteroscedastic regression setting, where noise varies with input. The authors introduce Confidence-Aware Active Learning (CAAL), which decouples the optimization of the predictive mean and noise level during training and uses a confidence-aware acquisition function that weights epistemic uncertainty by predicted aleatoric uncertainty. Experiments on simulations and real data demonstrate that CAAL outperforms standard active learning baselines in expanding atmospheric particle property databases.
Introduces a confidence-aware active learning framework (CAAL) that dynamically weights epistemic uncertainty with predicted aleatoric uncertainty to improve sample selection in heteroscedastic regression problems.
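The acquisition rule above lends itself to a small sketch. This is one plausible reading, not the paper's implementation: epistemic uncertainty is taken as ensemble disagreement, the predicted aleatoric variance down-weights candidates the model already expects to be noisy, and `caal_score`/`select_batch` are illustrative names.

```python
import statistics

def caal_score(member_means, member_noises, eps=1e-8):
    """Confidence-aware acquisition score for one candidate point.

    member_means:  per-ensemble-member predictive means mu_k(x)
    member_noises: per-member predicted aleatoric variances sigma_k^2(x)

    Epistemic uncertainty is the disagreement between members; the
    predicted aleatoric level down-weights points whose noise the model
    already expects to be high. The exact weighting in CAAL may differ.
    """
    epistemic = statistics.pvariance(member_means)
    aleatoric = statistics.fmean(member_noises)
    return epistemic / (aleatoric + eps)

def select_batch(candidates, scores, k):
    """Pick the k candidate indices with the highest acquisition score."""
    order = sorted(range(len(candidates)), key=lambda i: -scores[i])
    return order[:k]
```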
This paper introduces a calibrated Bayesian deep learning framework for medical imaging decision support, addressing the critical need for reliable uncertainty quantification in AI-assisted diagnostics. The framework combines a novel Confidence-Uncertainty Boundary Loss (CUB-Loss) during training, which penalizes high-confidence errors and low-confidence correct predictions, with a post-hoc Dual Temperature Scaling (DTS) strategy to refine the posterior distribution. Validated on pneumonia screening, diabetic retinopathy detection, and skin lesion identification, the approach demonstrates improved calibration, robust performance in data-scarce scenarios, and effectiveness on imbalanced datasets.
Introduces a novel Confidence-Uncertainty Boundary Loss (CUB-Loss) and Dual Temperature Scaling (DTS) strategy to improve calibration and uncertainty quantification in Bayesian deep learning models for medical imaging.
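As background for the post-hoc step, plain single-temperature scaling can be sketched in a few lines; the paper's Dual Temperature Scaling refines this idea, and its exact parameterization is not reproduced here.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens overconfident logits."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def nll(logits_batch, labels, T):
    """Average negative log-likelihood at temperature T."""
    total = 0.0
    for logits, y in zip(logits_batch, labels):
        total -= math.log(softmax(logits, T)[y])
    return total / len(labels)

def fit_temperature(logits_batch, labels, grid=None):
    """Grid-search the scalar temperature minimising validation NLL.
    DTS extends this with a second temperature; see the paper for the
    exact parameterisation."""
    grid = grid or [0.5 + 0.1 * i for i in range(26)]  # 0.5 .. 3.0
    return min(grid, key=lambda T: nll(logits_batch, labels, T))
```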
The paper introduces KAN-FIF, a lightweight neural network architecture leveraging Kolmogorov-Arnold Networks (KANs) with spline parameterization to estimate tropical cyclone intensity from meteorological satellite data. KAN-FIF addresses the limitations of existing physics-guided models, which require high parameter counts and incur computational inefficiency because they cannot capture complex feature interactions compactly. Experiments demonstrate that KAN-FIF achieves superior accuracy with significantly reduced parameters and faster inference speed compared to baseline models like Phy-CoCo, making it suitable for deployment on resource-constrained edge devices.
Introduces KAN-FIF, a novel and lightweight neural network architecture for tropical cyclone intensity estimation that integrates spline-parameterized KAN layers to efficiently capture complex feature interactions.
The paper introduces KeplerAgent, an LLM-based agent designed for symbolic equation discovery that mimics the scientific reasoning process of inferring physical properties before guessing equations. KeplerAgent coordinates physics-based tools to extract intermediate structure from data and uses this information to configure symbolic regression engines like PySINDy and PySR. Experiments on physical equation benchmarks demonstrate that KeplerAgent achieves significantly higher symbolic accuracy and robustness to noisy data compared to existing LLM and traditional baselines.
Introduces KeplerAgent, an agentic framework that enhances symbolic equation discovery by explicitly modeling the scientific reasoning process of inferring physical properties and using them to constrain the search space of candidate equations.
This paper introduces PuYun-LDM, a latent diffusion model for high-resolution ensemble weather forecasting that addresses the limited diffusability of LDMs in this domain. To improve diffusability, the authors incorporate weather-state evolution features encoded by a 3D Masked AutoEncoder (3D-MAE) as additional conditioning. They also propose a Variable-Aware Masked Frequency Modeling (VA-MFM) strategy to adaptively regularize the spectral energy distribution of each variable, leading to improved performance compared to ENS at short lead times.
Introduces a novel latent diffusion model, PuYun-LDM, incorporating 3D-MAE conditioning and Variable-Aware Masked Frequency Modeling to enhance diffusability and improve high-resolution ensemble weather forecasting.
The paper investigates the phenomenon of "benchmark illusion," where LLMs with similar benchmark accuracy exhibit significant disagreement on individual data points. Using MMLU-Pro and GPQA benchmarks, the authors quantify the disagreement rates between various LLMs, including top-performing frontier models. They demonstrate that this disagreement can lead to substantial variability in scientific research outcomes when LLMs are used for data annotation and inference, impacting the reproducibility of results.
Demonstrates that seemingly convergent benchmark accuracy among LLMs masks substantial disagreement on individual data points, leading to significant consequences for scientific reproducibility.
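The core measurement is easy to make concrete. A toy sketch (illustrative data, not from the paper) showing two models with identical benchmark accuracy yet maximal per-item disagreement:

```python
def accuracy(preds, gold):
    """Fraction of items answered correctly."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

def disagreement_rate(preds_a, preds_b):
    """Fraction of items on which two models give different answers."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Identical benchmark accuracy can mask total per-item disagreement --
# the "benchmark illusion" the paper quantifies on MMLU-Pro and GPQA.
gold    = ["A", "B", "C", "D", "A", "B", "C", "D"]
model_1 = ["A", "B", "C", "D", "B", "C", "D", "A"]  # right on items 0-3
model_2 = ["B", "C", "D", "A", "A", "B", "C", "D"]  # right on items 4-7
```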
The paper introduces PhyNiKCE, a neurosymbolic agentic framework that addresses the limitations of LLMs in autonomous CFD by decoupling neural planning from symbolic validation. PhyNiKCE uses a Symbolic Knowledge Engine to enforce physical constraints via a Deterministic RAG Engine, treating simulation setup as a Constraint Satisfaction Problem. Experiments using OpenFOAM and Gemini-2.5-Pro/Flash demonstrate a 96% improvement over baselines, a 59% reduction in self-correction loops, and a 17% decrease in LLM token consumption.
Introduces PhyNiKCE, a neurosymbolic framework that integrates neural planning with symbolic constraint enforcement to improve the reliability and efficiency of autonomous CFD agents.
The paper introduces a supervise-assisted multi-modality fusion diffusion model (MFdiff) to restore standard-dose PET (SPET) images from low-dose PET (LPET) and MR images. MFdiff uses a multi-modality feature fusion module to learn optimized fusion features from MR images and incorporates these features as additional conditions in a diffusion model for iterative SPET image generation. A two-stage supervise-assisted learning strategy leverages both generalized priors from simulated data and specific priors from in-vivo data to improve restoration quality, demonstrating superior performance compared to existing methods.
Introduces a novel supervise-assisted multi-modality fusion diffusion model (MFdiff) that effectively leverages MR images to restore high-quality SPET images from LPET data by using a two-stage training approach.
This paper introduces a geometric model for optimal locomotion of slender bodies based on sub-Riemannian geodesics, accounting for both environmental displacement and internal shape-change energy dissipation. The authors formulate Lagrangian least-dissipation principles as boundary value problems and solve them numerically using a consistent time and space discretization for various boundary conditions. The resulting optimal gaits match observed biological motion and provide insights into locomotion mechanisms, particularly for generalized Purcell's swimmers.
Introduces a novel geometric model for optimal locomotion that accounts for both environmental displacement and internal shape-change energy dissipation, enabling the computation of optimal gaits for slender bodies.
The paper extends Markov State Models (MSMs) to analyze molecular dynamics (MD) simulations of hydrogen dynamics on rhodium catalysts, specifically slab and nanoparticle geometries. This approach is motivated by the limitations of transition state theory (TST) in complex catalytic systems with structural fluctuations and many interacting species. The key finding is that nanoparticle features slow down hydrogen association/dissociation, and cooperative hydrogen-hydrogen interactions lead to a non-monotonic concentration dependence of reaction rates, contradicting TST predictions.
Demonstrates the application of MSMs to MD simulation data for capturing complex reaction dynamics on catalytic nanoparticles, revealing deviations from TST predictions.
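The MSM building block the paper extends is a lag-time transition matrix estimated from a discretized trajectory; a minimal stdlib sketch:

```python
from collections import Counter

def transition_matrix(traj, n_states, lag=1):
    """Row-normalised transition probabilities estimated by counting
    state-to-state jumps at a fixed lag time in a discretised MD
    trajectory (the basic Markov State Model estimator)."""
    counts = Counter(zip(traj[:-lag], traj[lag:]))
    T = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total:
            for j in range(n_states):
                T[i][j] = counts[(i, j)] / row_total
    return T
```

From such a matrix one would then extract stationary populations and implied timescales; those steps are omitted here.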
This paper introduces a dissipative ground state preparation protocol tailored for simulating chemical reactions, specifically targeting strongly correlated transition states that are difficult for traditional methods. The protocol propagates a state along a discretized reaction coordinate using Procrustes-aligned orbital rotations, stabilized by engineered dissipative cooling. The authors demonstrate that for reaction paths satisfying a localized Eigenstate Thermalization Hypothesis (ETH) drift condition, the algorithm achieves ground state preparation with a gate complexity of $\widetilde{O}(N_o^{3}/\epsilon_E)$, and provide resource estimates for relevant chemical systems.
Introduces a dissipative ground state preparation protocol leveraging Procrustes-aligned orbital rotations and engineered dissipation to efficiently prepare ground states at chemical transition states.
This paper investigates the generation of stationary Fano coherence in a V-type three-level quantum system driven by polarized incoherent radiation, aiming to establish its potential for energy conversion. The authors derive the Bloch-Redfield equation from first principles by quantizing the incoherent radiation to analyze the system's reduced dynamics and determine the conditions for stationary Fano coherence. They characterize distinct dynamical regimes and quantify the magnitude of the generated coherence, also assessing the impact of symmetric/asymmetric decay rates and discussing experimental challenges using Rubidium atoms.
Demonstrates the possibility of achieving steady-state Fano coherence in a V-type three-level quantum system using polarized incoherent radiation, without requiring near-zero energy difference between excited levels.
This paper introduces a reciprocal-space generative pipeline for crystalline materials, representing crystals via a truncated Fourier transform of the species-resolved unit-cell density. This Fourier representation inherently handles periodic boundary conditions and crystallographic symmetries, while also supporting variable atomic multiplicities. The pipeline is instantiated using a transformer variational autoencoder and a latent diffusion model, demonstrating effective reconstruction and unconditional generation of crystal structures.
Introduces a novel reciprocal-space generative pipeline using Fourier transforms to represent and generate crystalline materials, inherently addressing periodicity, symmetry, and variable atomic multiplicities.
The paper introduces LRBTC, a modular LLM and VLM-driven architecture for quality control in pharmaceutical content, addressing the need for scalable and verifiable validation in regulated domains. LRBTC employs a Student-Teacher dual model architecture combined with a human-in-the-loop workflow and waterfall rule filtering. The approach achieves significant improvements on AIReg-Bench (83.0% F1, 97.5% recall) and CSpelling (26.7% accuracy improvement), demonstrating its effectiveness in reducing missed violations and improving content quality.
Introduces LRBTC, a novel LLM and VLM-driven quality control architecture that leverages a Student-Teacher dual model and HITL workflow for pharmaceutical content optimization.
The paper introduces ProtoMech, a framework for mechanistic interpretability of protein language models (pLMs) that uses cross-layer transcoders to learn sparse latent representations capturing the model's full computational circuitry. By jointly analyzing representations across layers of ESM2, ProtoMech identifies compressed circuits that retain significant performance on protein family classification and function prediction while using only a small fraction of the latent space. Steering along these identified circuits enables high-fitness protein design, demonstrating the framework's utility in understanding and manipulating pLM behavior.
Introduces ProtoMech, a novel framework that discovers computational circuits in protein language models by learning sparse, cross-layer latent representations.
The authors introduce ADRD-Bench, a new benchmark dataset for evaluating LLMs on Alzheimer's Disease and Related Dementias (ADRD), comprising a unified QA set from existing medical benchmarks and a novel QA set derived from the Aging Brain Care (ABC) program. They aim to address the lack of ADRD-specific evaluation resources and practical caregiving context in existing benchmarks. Evaluating 33 state-of-the-art LLMs, they found that while some models achieve high accuracy, inconsistencies in reasoning quality and stability remain a significant limitation.
Introduces ADRD-Bench, the first ADRD-specific benchmark dataset designed for rigorous evaluation of LLMs, incorporating both unified clinical knowledge and practical caregiving questions.
This paper introduces a universal diffusion-based downscaling framework that converts low-resolution weather forecasts into high-resolution probabilistic predictions without model-specific fine-tuning. A conditional diffusion model is trained on coarse-resolution inputs and high-resolution reanalysis targets and then applied in a zero-shot manner to deterministic forecasts from various weather models. The downscaled forecasts consistently improve upon the raw deterministic forecasts, with significant gains in probabilistic skill (CRPS) when evaluated against independent station observations.
Demonstrates a scalable, model-agnostic probabilistic interface for enhancing spatial resolution and uncertainty representation in operational weather forecasting pipelines via diffusion-based downscaling.
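The probabilistic metric mentioned, CRPS, has a standard ensemble estimator; a minimal sketch (not tied to the paper's evaluation code):

```python
def crps_ensemble(members, obs):
    """Continuous Ranked Probability Score of an ensemble forecast
    against a scalar observation (lower is better):
        CRPS = E|X - y| - 0.5 * E|X - X'|
    estimated with plain sample averages over ensemble members. For a
    single member it reduces to absolute error."""
    n = len(members)
    term1 = sum(abs(x - obs) for x in members) / n
    term2 = sum(abs(a - b) for a in members for b in members) / (n * n)
    return term1 - 0.5 * term2
```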
This paper introduces a latent-variable approach for learning linear stochastic partial differential equations (SPDEs) with additive Gaussian noise directly from spatiotemporal solution observations. The method combines spectral Galerkin projection with a truncated Wiener chaos expansion to separate deterministic evolution from stochastic forcing, reducing the SPDE to a finite-dimensional latent ODE system. Variational learning is then used to jointly infer the latent dynamics and stochastic forcing, enabling recovery of the underlying stochastic structure without noise observations.
Introduces a novel latent-variable model that learns SPDE dynamics from solution trajectories by integrating spectral Galerkin projection, Wiener chaos expansion, and variational inference.
The paper introduces AlphaPROBE, a novel framework for alpha factor mining in quantitative finance that represents the factor pool as a Directed Acyclic Graph (DAG) to capture the evolutionary relationships between factors. AlphaPROBE employs a Bayesian Factor Retriever to identify promising seed factors and a DAG-aware Factor Generator to produce context-aware and non-redundant optimizations based on the full ancestral trace of factors. Experiments on Chinese stock market datasets demonstrate that AlphaPROBE outperforms existing methods in predictive accuracy, return stability, and training efficiency by leveraging the global evolutionary topology.
Introduces a DAG-based framework for alpha factor mining that explicitly models the evolutionary relationships between factors to improve search efficiency and factor diversity.
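The "full ancestral trace" over a factor DAG is a plain graph traversal; a sketch with an illustrative child-to-parents mapping (factor names hypothetical, not from the paper):

```python
def ancestral_trace(dag, factor):
    """All ancestors of a factor in the evolution DAG, i.e. the full
    lineage a DAG-aware generator could condition on.
    `dag` maps each factor to the list of factors it was derived from."""
    seen, stack = set(), list(dag.get(factor, []))
    while stack:
        f = stack.pop()
        if f not in seen:
            seen.add(f)
            stack.extend(dag.get(f, []))
    return seen

# Hypothetical pool: f3 was refined from f1 and f2; f2 from f1.
factor_dag = {"f1": [], "f2": ["f1"], "f3": ["f1", "f2"]}
```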
This paper introduces Electrostatics-Inspired Surface Reconstruction (EISR), a novel method for 3D surface reconstruction that represents shapes as solutions to Poisson's equation. By drawing an analogy to electrostatics and utilizing Green's functions, the method derives a closed-form parametric expression for the implicit field. The key result is improved reconstruction of high-frequency details compared to existing SDF-based methods, even with limited shape priors, by leveraging the superposition principle of Poisson's equation solutions.
Formulates 3D surface reconstruction as solving Poisson's equation using Green's functions and superposition, enabling improved high-frequency detail recovery.
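The electrostatic analogy can be made concrete with the free-space Green's function of the 3D Laplacian; a sketch of a superposed implicit field (charge placement and level-set extraction, which carry the method's actual substance, are not shown):

```python
import math

def greens_3d(x, p):
    """Free-space Green's function of the 3D Laplacian,
    G(x, p) = 1 / (4*pi*|x - p|): the potential of a unit point
    charge at p in the electrostatic analogy."""
    return 1.0 / (4.0 * math.pi * math.dist(x, p))

def implicit_field(x, charges):
    """Superposition of point-source potentials; a surface is then
    recovered as a level set of this field. `charges` is a list of
    (position, charge) pairs."""
    return sum(q * greens_3d(x, p) for p, q in charges)
```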
The paper introduces MaxExp, a decision-driven binarization framework for species distribution models (SDMs) that selects the most probable species assemblage by directly maximizing a chosen evaluation metric without requiring calibration data. They also propose Set Size Expectation (SSE), a computationally efficient alternative predicting assemblages based on expected species richness. Empirical evaluations across three case studies demonstrate that MaxExp consistently matches or surpasses widely used thresholding and calibration methods, particularly under class imbalance and high rarity, while SSE provides a simpler, competitive option.
Introduces MaxExp, a novel decision-driven binarization framework for multispecies presence-absence predictions that directly optimizes a chosen evaluation metric.
This paper introduces a semantically conditioned latent diffusion model (LDM) for synthesizing arterial-phase cerebral digital subtraction angiography (DSA) images, addressing the scarcity of DSA data due to its invasive nature. The LDM is conditioned on text embeddings representing anatomical circulation (anterior/posterior) and C-arm positions, enabling explicit control over the synthesis process. Evaluation by medical experts showed high clinical realism with Likert scores of 3.1-3.3 and a low Fréchet inception distance (FID) of 15.27, demonstrating the potential for generating realistic synthetic DSAs for research and training.
Demonstrates semantically controlled synthesis of realistic cerebral DSA images using a latent diffusion model conditioned on anatomical and geometric parameters.
This paper introduces GR-Diffusion, a novel framework for 3D whole-body PET reconstruction that combines a 3D Gaussian representation (GR) with diffusion models. GR is used to generate a reference 3D PET image from projection data, providing a geometric prior to guide the diffusion process. A hierarchical guidance mechanism refines local details and corrects deviations, enabling the diffusion model to integrate the GR prior and recover sub-voxel information.
Introduces a GR-Diffusion framework that leverages 3D Gaussian representations to guide diffusion models for improved 3D whole-body PET reconstruction, achieving state-of-the-art performance.
The paper introduces VasoMIM, a vascular anatomy-aware masked image modeling framework for self-supervised learning on X-ray angiograms, addressing the scarcity of annotated data in this domain. VasoMIM uses an anatomy-guided masking strategy and an anatomical consistency loss to improve the learning of vascular semantics and structural consistency. The framework is pre-trained on XA-170K, a newly curated large-scale X-ray angiogram dataset, and achieves state-of-the-art performance on four downstream tasks across six datasets, demonstrating its transferability.
Introduces VasoMIM, a novel self-supervised learning framework incorporating anatomy-guided masking and anatomical consistency loss, specifically designed for X-ray angiogram analysis.
This paper investigates the thermodynamic stability of mixed halide perovskites using ab initio molecular dynamics to decompose the free energy of mixing into enthalpic, configurational, and rotational entropic contributions. The study finds that while the enthalpy of mixing is generally positive, the large configurational entropy arising from random cation and halide substitution leads to thermodynamic stability against phase separation. Furthermore, the analysis reveals that hydrogen bonding does not control thermodynamic stability; instead, stability is governed by the balance between configurational and rotational entropy.
Demonstrates that configurational entropy, rather than hydrogen bonding, is the primary driver of thermodynamic stability in mixed halide perovskites, counteracting positive mixing enthalpies.
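The entropic argument can be illustrated with ideal mixing thermodynamics (a textbook toy, not the paper's ab initio decomposition, and omitting the rotational term):

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV / K

def ideal_config_entropy(x):
    """Ideal configurational mixing entropy per mixed site (eV/K) for a
    binary random substitution with mixing fraction x."""
    if x in (0.0, 1.0):
        return 0.0
    return -K_B * (x * math.log(x) + (1 - x) * math.log(1 - x))

def mixing_free_energy(dH, x, T):
    """dG_mix = dH_mix - T * dS_config per site. A positive mixing
    enthalpy can still yield dG < 0 once -T*dS dominates, which is the
    stabilisation mechanism the paper identifies."""
    return dH - T * ideal_config_entropy(x)
```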
The paper reports the on-surface synthesis and characterization of antiferromagnetic S=1/2 quantum spin rings composed of [2]triangulene units on Au(111). Cyclic five- and six-membered spin rings were constructed using stepwise on-surface synthesis followed by STM tip-induced dehydrogenation, and their spin states were probed using scanning probe microscopy and multireference calculations. The six-membered ring exhibits a uniform excitation gap well-described by a Heisenberg spin model, while the five-membered ring shows asymmetric spin distributions due to structural distortion-induced degeneracy lifting.
Demonstrates the on-surface synthesis of cyclic five- and six-membered quantum spin rings composed of [2]triangulene units and elucidates their distinct magnetic properties arising from structural differences.
The paper introduces a few-shot design optimization setting where high-dimensional auxiliary information $h(x)$ is generated alongside the performance measure $f(x)$ and a history of related tasks is available. They propose a neural model that predicts $f(x)$ for new designs using few-shot context containing observations of $h(x)$, effectively leveraging auxiliary feedback. Experiments on robotic hardware design and neural network hyperparameter tuning demonstrate that the method achieves more accurate few-shot prediction and faster optimization compared to multi-task optimization baselines.
Introduces a novel few-shot design optimization framework that leverages auxiliary information to accelerate optimization in black-box settings.
The paper introduces ArGEnT, a novel Transformer-based architecture for operator learning that directly encodes geometric information from point cloud representations of arbitrary domains. ArGEnT integrates into DeepONet as the trunk network, enabling the learning of operator mappings dependent on both geometric and non-geometric inputs without explicit geometry parametrization. Experiments on fluid dynamics, solid mechanics, and electrochemical systems demonstrate that ArGEnT achieves significantly improved prediction accuracy and generalization compared to standard DeepONet and other geometry-aware surrogates.
Introduces ArGEnT, a geometry-aware Transformer architecture that learns operator mappings on arbitrary domains by directly encoding geometric information from point cloud representations.
The paper introduces Fun-DDPS, a generative framework for carbon capture and storage (CCS) modeling that combines function-space diffusion models with differentiable neural operator surrogates for both forward and inverse problems. By decoupling the learning of a prior over geological parameters from the physics-consistent guidance provided by a Local Neural Operator (LNO) surrogate, Fun-DDPS effectively handles data sparsity and ensures physically realistic solutions. Experiments on synthetic CCS datasets demonstrate that Fun-DDPS significantly outperforms standard surrogates in forward modeling with sparse observations and achieves comparable accuracy to rejection sampling in inverse modeling, while also generating physically consistent realizations with improved sample efficiency.
Introduces a function-space decoupled diffusion framework (Fun-DDPS) that improves both the accuracy and physical realism of forward and inverse modeling in carbon capture and storage.
This paper investigates the impact of incorporating quantum-chemical bonding descriptors into machine learning models for predicting materials properties. They leverage an extended Quantum-Chemical Bonding Database for Solid-State Materials, encompassing approximately 13,000 materials, to derive a new set of bonding descriptors. Their systematic assessment demonstrates that including these descriptors enhances the predictive performance of models for elastic, vibrational, and thermodynamic properties and facilitates the discovery of intuitive expressions for properties like the projected force constant and lattice thermal conductivity through symbolic regression.
Demonstrates the utility of quantum-chemical bonding descriptors in improving the performance and interpretability of machine learning models for predicting materials properties.
This paper introduces a novel data augmentation framework for cardiac scar segmentation using implicit neural representations (INRs) and denoising diffusion models to synthesize late gadolinium enhancement (LGE) images and corresponding segmentation masks. INRs are trained to capture continuous spatial representations of LGE data and masks, compressed into latent embeddings, and then used by a diffusion model to generate new representations that are decoded into synthetic LGE images with anatomically consistent segmentation masks. Experiments demonstrate that augmenting training data with synthetic volumes improves fibrosis segmentation performance, increasing the Dice score from 0.509 to 0.524.
Introduces a novel annotation-free data augmentation method for cardiac scar segmentation by synthesizing LGE images and segmentation masks using INRs and diffusion models.
This paper investigates the problem of unstable feature importance estimates in expressive machine learning models, which hinders their use in scientific discovery. The authors theoretically analyze the bias-variance tradeoff in aggregating feature importance estimates, demonstrating that ensembling at the model level yields more accurate estimates by reducing excess risk. They empirically validate their theoretical findings on benchmark datasets and a large-scale proteomic study from the UK Biobank.
Demonstrates theoretically and empirically that ensembling at the model level, rather than aggregating individual model explanations, provides more accurate feature importance estimates, especially for expressive models.
The paper introduces Sci-CoE, a two-stage co-evolution framework for scientific reasoning LLMs that transitions from sparse supervision to unsupervised learning. Sci-CoE uses a small labeled dataset to bootstrap a Verifier and then employs a geometric reward mechanism incorporating consensus, reliability, and diversity to drive self-iteration on unlabeled data. Experiments on scientific benchmarks demonstrate that Sci-CoE improves complex reasoning capabilities and evaluation robustness.
Introduces a geometric reward mechanism that jointly considers consensus, reliability, and diversity to drive the co-evolution of scientific reasoning LLMs in an unsupervised manner.
This paper introduces PLOT-CT, a novel framework for low-dose CT reconstruction that operates in the pre-log domain using Voronoi decomposition to address noise amplification from logarithmic transformation. The method decomposes pre-log sinograms into distinct components embedded in separate latent spaces, enhancing feature learning and noise mitigation. Experiments demonstrate that PLOT-CT achieves state-of-the-art performance, with a 2.36dB PSNR improvement over existing methods at the 1e4 incident photon level.
Introduces a pre-log domain CT reconstruction framework using Voronoi decomposition to disentangle sinogram data and improve noise resilience.
The paper re-examines single-minus tree-level n-gluon scattering amplitudes, demonstrating that they do not vanish for specific "half-collinear" configurations in Klein space or with complexified momenta, contrary to common assumptions. The authors derive a closed-form, piecewise-constant expression for the decay of a single minus-helicity gluon into n-1 plus-helicity gluons as a function of their momenta. This derived formula is shown to satisfy Weinberg's soft theorem, confirming its consistency.
Discovers and formulates a non-zero solution for single-minus gluon tree amplitudes under specific kinematic conditions.
This paper introduces SpaTeoGL, a spatiotemporal graph learning framework that constructs window-level spatial graphs of iEEG electrode interactions and a temporal graph linking time windows based on spatial graph similarity. The method uses a smooth graph signal processing formulation solved via alternating block coordinate descent, providing convergence guarantees. Experiments on a multicenter iEEG dataset demonstrate that SpaTeoGL achieves competitive SOZ localization performance compared to horizontal visibility graphs and logistic regression, while also enhancing non-SOZ identification and offering interpretable insights into seizure dynamics.
Introduces a novel spatiotemporal graph learning framework, SpaTeoGL, to model and interpret seizure onset zone dynamics from iEEG data.
The paper introduces iUzawa-Net, a novel optimization-informed deep neural network designed for real-time solutions of nonsmooth optimal control problems governed by linear PDEs. iUzawa-Net unrolls an inexact Uzawa method, substituting traditional preconditioners and PDE solvers with learnable neural networks. The authors demonstrate universal approximation properties and asymptotic ε-optimality, showcasing numerical efficiency on elliptic and parabolic optimal control problems.
Introduces iUzawa-Net, a deep learning architecture that learns to solve nonsmooth PDE-constrained optimal control problems in real-time by unrolling an inexact Uzawa method and replacing key components with learned neural networks.
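For orientation, the classical Uzawa iteration that iUzawa-Net unrolls can be sketched on a scalar equality-constrained quadratic; the paper replaces the inner solves and preconditioners with learned networks, which this sketch does not attempt:

```python
def uzawa_scalar(a, f, b, g, rho=0.5, iters=200):
    """Classical (exact-solve) Uzawa iteration for the scalar saddle
    point problem: minimise 0.5*a*u**2 - f*u subject to b*u = g.
    Each sweep solves the primal subproblem exactly, then performs
    dual ascent on the constraint residual."""
    lam = 0.0
    u = 0.0
    for _ in range(iters):
        u = (f - b * lam) / a        # primal step: minimise the Lagrangian in u
        lam += rho * (b * u - g)     # dual step: ascend on the constraint residual
    return u, lam
```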
The paper introduces GRXForm, a Graph Transformer model for amortized molecular optimization that sequentially adds atoms and bonds to a molecule. To improve generalization, the authors identify and address the high variance in rewards caused by heterogeneous starting structures by using Group Relative Policy Optimization (GRPO). GRXForm demonstrates strong generalization to out-of-distribution molecular scaffolds, achieving competitive performance with instance optimizers in multi-objective optimization without requiring inference-time oracle calls or refinement.
Introduces Group Relative Policy Optimization (GRPO) to normalize rewards relative to the starting structure, thereby mitigating variance and improving generalization in amortized molecular optimization.
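The reward normalization GRPO performs is compact enough to sketch (group statistics taken over rollouts that share a starting structure; illustrative, not the paper's training code):

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: normalise each rollout's reward by the mean
    and standard deviation of its own group, so heterogeneous reward
    scales across starting molecules do not dominate the policy gradient."""
    mu = statistics.fmean(rewards)
    sigma = statistics.pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```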
The paper introduces Iskra, a system for automatically differentiating through geometry processing algorithms implemented using imperative code. Iskra leverages the adjoint method and scatter-gather mesh processing to efficiently compute gradients for algorithms using local-global and ADMM solvers. The system enables inverse geometry processing applications by providing a low-effort, fast, and memory-efficient alternative to generic differentiable optimization.
Introduces Iskra, a system that automatically generates efficient backward passes for existing geometry processing algorithms by applying the adjoint method to imperative code.
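The adjoint trick Iskra relies on can be shown on the smallest possible case, a 2x2 linear solve: the gradient of a scalar loss with respect to the right-hand side costs one extra transposed solve, instead of differentiating through every step of the forward solver (illustrative sketch, not Iskra's code generator):

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def grad_b_via_adjoint(A, dL_dx):
    """Adjoint method for x = A^{-1} b: dL/db solves the transposed
    system A^T lam = dL/dx -- one extra solve, regardless of how the
    forward solve was implemented."""
    At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    return solve2(At, dL_dx)
```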
This paper presents a hardware implementation of semi-empirical electronic structure methods, specifically Extended Hückel Theory (EHT) and non-self-consistent Density Functional Tight Binding (DFTB0), on a field-programmable gate array (FPGA). By implementing Hamiltonian construction and diagonalization directly on the FPGA using a streaming dataflow architecture, the design achieves deterministic execution and eliminates host intervention. The FPGA-based DFTB0 Hamiltonian generator demonstrates a greater than fourfold throughput improvement compared to a server-class CPU on a mid-range Artix-7 FPGA, highlighting the potential for significant acceleration.
Demonstrates a hardware-native implementation of semi-empirical electronic structure theory on an FPGA, achieving superior throughput compared to a CPU.
The authors introduce LiveMedBench, a dynamically updated medical benchmark designed to address data contamination and temporal misalignment in LLM evaluation by continuously harvesting real-world clinical cases from online medical communities. They employ a Multi-Agent Clinical Curation Framework to filter noise and validate clinical integrity, and an Automated Rubric-based Evaluation Framework for granular, case-specific assessment. Evaluation of 38 LLMs on LiveMedBench reveals significant performance degradation on post-cutoff cases and identifies contextual application as a major bottleneck, highlighting the limitations of current LLMs in clinical reasoning.
Introduces LiveMedBench, a novel, continuously updated medical benchmark with automated rubric evaluation, to mitigate data contamination and improve the reliability of LLM evaluation in clinical settings.
This scoping review examines the applications of Large Language Models (LLMs) and Vision-Language Models (VLMs) in glaucoma care, focusing on patient education, diagnosis/risk prediction, and surgical management. The review analyzed 27 studies from a pool of 316 records across five databases (PubMed, Scopus, Web of Science, arXiv, and IEEE Xplore). The findings indicate that LLMs show promise as assistive tools, particularly in patient communication and text-based clinical decision support, but require further development in accuracy, multimodal integration, and ophthalmology-specific fine-tuning.
Synthesizes current evidence on the applications of LLMs and VLMs in glaucoma, highlighting their potential and limitations across various clinical tasks.
This paper introduces a real-time, low-latency Named Entity Recognition (NER) system, built on deep learning architectures, for cancer therapy-related clinical records and Traditional Chinese Medicine (TCM) texts. The study addresses the challenges of applying NER to complex medical terminology and the demand for high accuracy in clinical contexts, particularly in cross-lingual speech-to-text applications. The authors propose a semi-supervised approach that integrates TCM-specific corpora with biomedical resources, demonstrating improved recognition accuracy for real-time clinical applications.
Introduces a semi-supervised NER approach that leverages TCM-specific corpora and biomedical resources to enhance recognition accuracy in real-time clinical applications.
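The semi-supervised ingredient, self-training with confidence-filtered pseudo-labels, can be sketched generically. A nearest-centroid classifier stands in for the paper's deep NER model here; only the data flow is meant to match:

```python
import numpy as np

def fit_centroids(X, y):
    """One centroid per class, classes in sorted label order."""
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict_conf(cent, X):
    """Predicted labels plus a softmax-over-distances confidence."""
    d = ((X[:, None, :] - cent[None, :, :]) ** 2).sum(-1)
    p = np.exp(-(d - d.min(axis=1, keepdims=True)))
    p /= p.sum(axis=1, keepdims=True)
    return p.argmax(axis=1), p.max(axis=1)

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=3):
    """Repeatedly pseudo-label unlabeled data, keeping confident points."""
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        cent = fit_centroids(X, y)
        yhat, conf = predict_conf(cent, X_unlab)
        keep = conf >= threshold
        if not keep.any():
            break
        X = np.vstack([X_lab, X_unlab[keep]])
        y = np.concatenate([y_lab, yhat[keep]])
    return fit_centroids(X, y)
```

In the paper's setting, the unlabeled pool would be the TCM-specific corpora and the model a token-level deep network, but the loop is the same.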
This paper addresses the singularity issue in harmonic-to-anharmonic thermodynamic integration (TI) for solids with diffusive degrees of freedom, which hinders accurate free energy estimation. The authors introduce a regularization technique, Regularized Endpoint Gradient (REG) TI, that eliminates the singularity and produces a well-behaved integrand. They demonstrate the effectiveness of REG TI on a model system and in predicting the relative stability of paracetamol polymorphs, showcasing its ability to simplify anharmonic free energy calculations.
Introduces Regularized Endpoint Gradient (REG) TI, a novel regularization method, to resolve the singularity problem in harmonic-to-anharmonic thermodynamic integration for solids with diffusive degrees of freedom.
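For orientation, plain (unregularized) TI estimates the anharmonic free energy correction as ΔF = ∫₀¹ ⟨U_anh − U_harm⟩_λ dλ via quadrature over sampled ensemble averages. A minimal sketch with a smooth toy integrand; the paper's REG scheme, which tames the endpoint divergence arising for diffusive systems, is not reproduced here:

```python
import numpy as np

def ti_free_energy(mean_dU, lam_weights):
    """Quadrature estimate of Delta F from sampled <dU/dlambda> values."""
    return float(np.dot(lam_weights, mean_dU))

# 8-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1]
nodes, weights = np.polynomial.legendre.leggauss(8)
lam = 0.5 * (nodes + 1.0)
w = 0.5 * weights

# Smooth toy integrand; a diffusive system's integrand would instead
# diverge as lambda -> 0, which is the failure mode REG TI addresses.
mean_dU = 1.0 + lam ** 2
dF = ti_free_energy(mean_dU, w)  # int_0^1 (1 + x^2) dx = 4/3
```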
This paper optimizes the YOLOv8 architecture for cataract detection using fundus and anterior segment images. The optimization focuses on enhancing feature extraction and employs techniques like five-fold cross-validation, color magnification, and stochastic weight averaging (SWA) during training. The resulting model achieves a 98.9% F1-score and 0.995 mAP50 on an external test set, demonstrating improved accuracy and real-time performance compared to ResNet- and MobileNet-based approaches.
Improves cataract detection by optimizing the YOLOv8 architecture, achieving state-of-the-art accuracy and inference speed on low-cost GPUs.
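Of the training techniques listed, stochastic weight averaging is easy to sketch framework-agnostically: keep a running mean of checkpoint weights and use the averaged weights at inference. A minimal version (a real YOLOv8 run would average the torch state dict entries the same way):

```python
import numpy as np

class SWA:
    """Running average of flattened model weights across checkpoints."""

    def __init__(self):
        self.n = 0
        self.avg = None

    def update(self, weights):
        """Fold one checkpoint's weights into the running mean."""
        w = np.asarray(weights, dtype=float)
        self.n += 1
        if self.avg is None:
            self.avg = w.copy()
        else:
            self.avg += (w - self.avg) / self.n  # incremental mean update
```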
The paper introduces BioAgent Bench, a new benchmark and evaluation suite for assessing AI agents on end-to-end bioinformatics tasks like RNA-seq and variant calling. It uses prompts that specify concrete output artifacts to support automated assessment and stress testing under controlled perturbations such as corrupted inputs and prompt bloat. The authors evaluated both closed-source and open-weight models using an LLM-based grader, finding that while frontier agents can complete multi-step pipelines, they exhibit failure modes under perturbations, and open-weight models may be preferable in privacy-sensitive settings.
Introduces BioAgent Bench, a novel benchmark dataset and evaluation suite for assessing the performance and robustness of AI agents in bioinformatics tasks.
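Artifact-specified grading of this kind can be sketched simply: the prompt names concrete output files, and the harness checks that each exists and is non-empty before any content-level scoring. File names and task structure below are illustrative, not the benchmark's:

```python
from pathlib import Path

def check_artifacts(workdir, expected):
    """Map each expected relative path to True iff it exists non-empty."""
    workdir = Path(workdir)
    return {rel: (workdir / rel).is_file()
                 and (workdir / rel).stat().st_size > 0
            for rel in expected}
```

An LLM-based grader, as used in the paper, would then score only the artifacts that pass this structural check.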
This study evaluated the readability and quality of patient education materials (PEMs) generated by five AI chatbots (ChatGPT, Microsoft Copilot, Google Gemini, Perplexity, and Claude AI) in response to questions about familial adenomatous polyposis (FAP). The PEMs exhibited above-average quality as measured by DISCERN and PEMAT scores, but demonstrated poor readability, with a mean reading grade level of 12.44, significantly exceeding the recommended level for patient education. These findings suggest that while AI chatbots can provide valuable information, adjustments are needed to improve the accessibility of AI-generated PEMs for patients with varying literacy levels.
Reveals that AI chatbots generate patient education materials on familial adenomatous polyposis with acceptable quality but poor readability, highlighting a need for improved accessibility.
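The summary does not name the readability formula; the Flesch-Kincaid grade level is the usual choice in such studies (an assumption here), and it is a one-line computation from raw counts:

```python
def fk_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw text counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# e.g. a passage with 100 words, 5 sentences, 150 syllables
grade = fk_grade(100, 5, 150)
```

A value above 12 on this scale, as reported, corresponds to college-level text, well beyond typical patient-education recommendations.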
The authors present a refactored and optimized Python framework, built upon the LAMOST Atmospheric Parameter Pipeline (LASP), for scalable stellar parameter inference from large spectroscopic datasets. The framework includes a CPU-optimized module (LASP-CurveFit) and a GPU-accelerated module (LASP-Adam-GPU) that uses grouped optimization to process multiple spectra simultaneously. Applied to 10 million LAMOST spectra, the framework achieves significant speedups (reducing runtime to 7 hours on an NVIDIA A100 GPU) while maintaining accuracy and demonstrating improved transferability to the DESI DR1 dataset compared to the DESI pipeline, particularly for effective temperature and surface gravity of cool giants.
Introduces a modular, parallelized, and GPU-accelerated Python framework for stellar parameter inference that achieves significant speedups and improved accuracy compared to existing pipelines, particularly for cool giants.
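The grouped-optimization idea can be sketched in a few lines: a single Adam loop updates an independent parameter for every spectrum at once by batching gradients, so many fits share each update step. The one-parameter template-scaling model below is purely illustrative, not the LASP stellar model:

```python
import numpy as np

def batched_adam_fit(T, Y, steps=2000, lr0=0.1):
    """Fit per-spectrum scales a_i minimizing ||a_i * T - Y_i||^2."""
    n = Y.shape[0]
    a = np.ones(n)                 # one parameter per spectrum
    m = np.zeros(n)
    v = np.zeros(n)
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        resid = a[:, None] * T[None, :] - Y          # (n_spec, n_pix)
        g = 2.0 * (resid * T[None, :]).sum(axis=1)   # batched gradient
        m = b1 * m + (1 - b1) * g                    # Adam moments
        v = b2 * v + (1 - b2) * g * g
        mhat = m / (1 - b1 ** t)
        vhat = v / (1 - b2 ** t)
        a -= (lr0 / np.sqrt(t)) * mhat / (np.sqrt(vhat) + eps)
    return a
```

On a GPU the same vectorized update amortizes kernel launches across thousands of spectra, which is the source of the reported speedup.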
This paper investigates the novelty of AI-generated research plans using multi-workflow LLM pipelines, addressing concerns about "smart plagiarism" in single-step prompting. The authors benchmarked five reasoning architectures (reflection-based refinement, evolutionary algorithms, a multi-agent framework, recursive decomposition, and a multimodal long-context pipeline), evaluating each on novelty, feasibility, and impact. The results show that decomposition-based and long-context workflows achieve significantly higher novelty scores than reflection-based approaches, suggesting that multi-stage agentic workflows can enhance AI-assisted research ideation.
Demonstrates that multi-stage LLM agentic workflows, particularly those employing decomposition or long-context reasoning, generate more novel and feasible research plans compared to simpler reflection-based approaches.
This paper introduces a Deep Recurrent Reinforcement Learning (DRL) framework using a CNN-LSTM architecture for dynamic scheduling of radio telescopes to mitigate radio frequency interference (RFI). The DRL agent learns to control the KMITL radio telescope in a custom simulation environment, optimizing for survey coverage while avoiding RFI. Results show the recurrent DRL agent achieves a 72.7% improvement in mean effective survey coverage compared to a non-recurrent baseline, with minimal performance degradation in real-world deployment.
Demonstrates a DRL-based approach for radio telescope scheduling that significantly improves survey coverage and robustness against RFI compared to non-recurrent methods.
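The recurrent-policy control loop is the transferable part: the agent threads a hidden state through successive observations so scheduling decisions can depend on RFI history. A toy sketch with a linear recurrent cell standing in for the paper's CNN-LSTM, and random vectors standing in for the telescope simulator's observations:

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(8, 8))  # hidden-to-hidden weights
W_x = rng.normal(scale=0.1, size=(8, 4))  # observation-to-hidden weights
W_a = rng.normal(scale=0.1, size=(3, 8))  # hidden-to-action weights

def policy_step(obs, h):
    """One recurrent update, then a greedy choice over 3 pointings."""
    h = np.tanh(W_h @ h + W_x @ obs)
    return int(np.argmax(W_a @ h)), h

h = np.zeros(8)
for t in range(5):
    obs = rng.normal(size=4)        # stand-in for spectrogram features
    action, h = policy_step(obs, h)  # hidden state carries RFI history
```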

