The authors developed TACDPPM, a Transformer-based model incorporating input-aware, expert knowledge, and multi-scale time-aware modules, to predict Crohn's Disease (CD) progression events using longitudinal electronic health records (EHRs). This is important because prior CD prediction models often neglect the dynamic nature of the disease by only using baseline data, resulting in lower prediction accuracy. TACDPPM demonstrated superior performance, achieving AUROCs of 0.910-0.979 for predicting disease behavior progression and 0.729-0.823 for predicting surgery, outperforming LSTM, GRU, and conventional models across multiple prediction horizons (1, 3, and 5 years).
Forget recurrent models: a Transformer-based model leverages longitudinal EHR data to predict Crohn's disease progression with significantly higher accuracy.
Crohn’s disease (CD) has a non-negligible prevalence worldwide, especially in developed countries. Because it affects a large population and imposes a heavy medical burden once the disease progresses, numerous studies have focused on risk factors and prediction models for CD. However, CD is a chronic, dynamically progressive disease, and previous models were mostly conventional models that used only baseline data and ignored medical events during the disease course, which led to low predictive performance and no capacity for real-time prediction. Here we developed a Transformer-based prediction framework, the Time-Aware CD Progression Prediction Model (TACDPPM), which uses all of a patient’s longitudinal electronic health records (EHRs) to predict future CD-related events, including disease behavior progression, surgery, and medication usage. We compared its predictive performance with that of a Long Short-Term Memory (LSTM) model, a Gated Recurrent Unit (GRU) model, a logistic regression model, and a time-varying Cox regression model. Building on the Transformer architecture, TACDPPM adds an input-aware module, an expert knowledge module, and a multi-scale time-aware module. We collected 66 static and dynamic variables from 761 CD patients at Peking Union Medical College Hospital as the internal training/validation cohort, 74 CD patients from Guizhou Provincial People’s Hospital and Zunyi Medical University Affiliated Hospital as external validation cohort 1, and 170 CD patients from Nanjing Jing Hospital as external validation cohort 2. TACDPPM forecasts the risk of the three types of events above within 1, 3, and 5 years after a given starting time point. SHAP values were also calculated for each variable to show its association with the outcome.
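As a rough illustration of the setup described above, the sketch below shows one way to derive a binary target for a given starting time point and horizon, and how an AUROC can be computed from predicted risks. It is not the authors' code: the function names, the time axis in years, and the rule for patients whose follow-up ends before the window closes are all assumptions made for this example.

```python
from typing import Optional, Sequence

HORIZONS_YEARS = (1, 3, 5)  # prediction horizons named in the abstract


def horizon_label(event_time: Optional[float], followup_end: float,
                  start: float, horizon: float) -> Optional[int]:
    """Binary target for one patient, one starting point, one horizon.

    All times are in years on a shared axis (hypothetical convention).
    Returns 1 if the event occurs in (start, start + horizon], 0 if the
    patient was followed through the whole window without an event in it,
    and None if follow-up ended too early to tell. The None rule is an
    assumed way of handling censoring, not one taken from the paper.
    """
    window_end = start + horizon
    if event_time is not None and start < event_time <= window_end:
        return 1
    if followup_end >= window_end:
        return 0
    return None


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: (event_time, followup_end) per patient, starting point at year 0.
patients = [(2.0, 6.0), (None, 6.0), (4.5, 6.0), (None, 1.0)]
labels_3y = [horizon_label(e, f, start=0.0, horizon=3) for e, f in patients]
```

In this toy cohort the last patient gets no 3-year label because follow-up stops after one year; such patients would have to be excluded or handled by a survival-style method before computing an AUROC.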
In internal validation, TACDPPM achieved an AUROC of 0.910-0.979 (0.835-0.955 for LSTM; 0.821-0.955 for GRU) in predicting 1-, 3-, and 5-year disease behavior progression, with a Macro-AUROC of 0.811 for disease behavior progression; 0.729-0.823 (0.698-0.715 for LSTM; 0.704-0.724 for GRU) in predicting 1-, 3-, and 5-year surgery; 0.813-0.930 (0.738-0.884 for LSTM; 0.748-0.876 for GRU) in predicting 1-, 3-, and 5-year glucocorticoid usage; 0.796-0.901 (0.735-0.855 for LSTM; 0.738-0.854 for GRU) in predicting 1-, 3-, and 5-year immunosuppressant usage; and 0.939-0.943 (0.844-0.873 for LSTM; 0.831-0.874 for GRU) in predicting 1-, 3-, and 5-year biologics usage. TACDPPM predicted CD-related medical events considerably better than LSTM, GRU, and conventional models. However, heterogeneity among CD patients, a lack of EHR data, and differing treatment practices may reduce the performance of TACDPPM.
Conflict of interest: Dr. Wang, Beiming: No conflict of interest; Yang, Yingliang: No conflict of interest; Liu, Honglei: No conflict of interest; Yang, Hong: No conflict of interest; Bai, Xiaoyin: No conflict of interest; Xu, Hui: No conflict of interest; Ruan, Gechong: No conflict of interest; Xu, Zhiwei: No conflict of interest; Cui, Dejun: No conflict of interest; Yan, Fang: No conflict of interest.