This paper addresses the critical problem of antimicrobial resistance (AMR) by developing a machine learning framework to forecast population-level resistance trends using WHO GLASS surveillance data. The study benchmarks six models, finding that XGBoost outperforms others with a test MAE of 7.07% and R-squared of 0.854, and identifies prior-year resistance rate as the most important predictor. Furthermore, the authors implement a Retrieval-Augmented Generation (RAG) pipeline using ChromaDB and Phi-3 Mini to provide evidence-grounded policy decision support.
XGBoost forecasts antimicrobial resistance trends from WHO GLASS surveillance data with a test MAE of 7.07% and R-squared of 0.854, offering a data-driven approach to combating a growing health crisis.
Antimicrobial resistance (AMR) is a growing global crisis projected to cause 10 million deaths per year by 2050. While the WHO Global Antimicrobial Resistance and Use Surveillance System (GLASS) provides standardized surveillance data across 44 countries, few studies have applied machine learning to forecast population-level resistance trends from this data. This paper presents a two-component framework for AMR trend forecasting and evidence-grounded policy decision support. We benchmark six models -- Naive, Linear Regression, Ridge Regression, XGBoost, LightGBM, and LSTM -- on 5,909 WHO GLASS observations across six WHO regions (2021-2023). XGBoost achieved the best performance with a test MAE of 7.07% and R-squared of 0.854, outperforming the naive baseline by 83.1%. Feature importance analysis identified the prior-year resistance rate as the dominant predictor (50.5% importance), while regional MAE ranged from 4.16% (European Region) to 10.14% (South-East Asia Region). We additionally implemented a Retrieval-Augmented Generation (RAG) pipeline combining a ChromaDB vector store of WHO policy documents with a locally deployed Phi-3 Mini language model, producing source-attributed, hallucination-constrained policy answers. Code and data are available at https://github.com/TanvirTurja
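The abstract names the prior-year resistance rate as the dominant predictor and reports MAE, R-squared, and an improvement over a naive baseline. The sketch below illustrates how such a lag feature and these metrics could be computed. It is not the authors' code. The row keys (`country`, `pathogen`, `antibiotic`, `year`, `rate`), the helper names, and the toy numbers are all hypothetical. It also assumes the naive model is a persistence forecast (predict last year's rate) and that the 83.1% figure is a relative reduction in MAE; the abstract confirms neither.

```python
def add_prior_year_rate(rows):
    """Attach the previous year's rate for the same series as 'prior_rate'.

    A series is identified here by (country, pathogen, antibiotic), which is
    an assumed grouping. Rows with no observation in the preceding year are
    dropped.
    """
    rate_by_key = {
        (r["country"], r["pathogen"], r["antibiotic"], r["year"]): r["rate"]
        for r in rows
    }
    lagged_rows = []
    for r in rows:
        key = (r["country"], r["pathogen"], r["antibiotic"], r["year"] - 1)
        prior = rate_by_key.get(key)
        if prior is not None:
            lagged_rows.append({**r, "prior_rate": prior})
    return lagged_rows


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the rates (percentage points)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Toy rows with made-up rates; these are NOT WHO GLASS observations.
rows = [
    {"country": "A", "pathogen": "E. coli", "antibiotic": "ciprofloxacin", "year": 2021, "rate": 40.0},
    {"country": "A", "pathogen": "E. coli", "antibiotic": "ciprofloxacin", "year": 2022, "rate": 44.0},
    {"country": "A", "pathogen": "E. coli", "antibiotic": "ciprofloxacin", "year": 2023, "rate": 46.0},
    {"country": "B", "pathogen": "E. coli", "antibiotic": "ciprofloxacin", "year": 2022, "rate": 20.0},
    {"country": "B", "pathogen": "E. coli", "antibiotic": "ciprofloxacin", "year": 2023, "rate": 18.0},
]

lagged = add_prior_year_rate(rows)
y_true = [r["rate"] for r in lagged]
y_persistence = [r["prior_rate"] for r in lagged]  # persistence: predict last year's rate

baseline_mae = mae(y_true, y_persistence)
baseline_r2 = r_squared(y_true, y_persistence)
print(f"persistence MAE: {baseline_mae:.2f}, R-squared: {baseline_r2:.3f}")
```

In a setup like this, a learned model such as XGBoost would take `prior_rate` together with other covariates as input features, and `relative_improvement(baseline_mae, model_mae)` would express its gain over the baseline on the same test rows.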