This paper introduces an LLM-RAG pipeline built on GPT-4o that reconciles international IBD guidelines from ACG, ECCO, BSG, and ACPGBI to improve readability and usability for clinicians. The system segments guidelines, enriches the segments with metadata, and retrieves relevant content to synthesize recommendations, highlighting consensus and disagreement. Benchmarked against expert summaries and rated by four reviewers on 5-point Likert scales, the tool distilled complex text effectively, with mean scores of 4.34 for consensus recognition and 4.61 for disagreement detection, suggesting potential to support clinical decision-making.
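The segment, enrich, and retrieve steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Chunk` fields, the function names, the bag-of-words cosine similarity (a stand-in for whatever retriever the study used), and the placeholder guideline text are all assumptions made for illustration, and no LLM is called.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    society: str  # metadata: issuing body (hypothetical labels below)
    section: str  # metadata: hypothetical section label
    text: str


def tokenize(text: str) -> Counter:
    """Lowercased bag of words; a toy stand-in for an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(society: str, section: str, document: str) -> list[Chunk]:
    """Split a guideline section into paragraph-sized chunks with metadata."""
    parts = [p.strip() for p in document.split("\n\n") if p.strip()]
    return [Chunk(society, section, p) for p in parts]


def retrieve(query: str, chunks: list[Chunk], k: int = 1) -> dict[str, list[Chunk]]:
    """Return the top-k chunks per society so guidelines can be compared."""
    q = tokenize(query)
    by_society: dict[str, list[Chunk]] = {}
    for c in chunks:
        by_society.setdefault(c.society, []).append(c)
    return {
        s: sorted(cs, key=lambda c: cosine(q, tokenize(c.text)), reverse=True)[:k]
        for s, cs in by_society.items()
    }


def build_prompt(query: str, hits: dict[str, list[Chunk]]) -> str:
    """Assemble a hypothetical synthesis prompt; sending it to a model is out of scope."""
    lines = [f"Question: {query}", "Excerpts:"]
    for society, found in hits.items():
        for c in found:
            lines.append(f"[{society} / {c.section}] {c.text}")
    lines.append("List points of consensus, then points of disagreement.")
    return "\n".join(lines)


# Placeholder text only; not taken from any real guideline.
corpus = segment(
    "Guideline A",
    "induction",
    "Treatment X is suggested for induction.\n\nMonitoring every 12 weeks is suggested.",
) + segment(
    "Guideline B",
    "induction",
    "Treatment Y is suggested for induction.\n\nDiet advice is outside scope.",
)

query = "Which treatment is suggested for induction?"
hits = retrieve(query, corpus)
prompt = build_prompt(query, hits)
print(prompt)
```

Grouping the retrieved chunks by issuing body before building the prompt keeps each guideline's position separate, which is what makes consensus and disagreement visible to the synthesis step.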
Clinicians drowning in IBD guidelines, rejoice: GPT-4o can now distill multiple international recommendations into concise, actionable statements, flagging areas of consensus and controversy with impressive accuracy.
Clinical guidelines for Inflammatory Bowel Disease (IBD) are essential for standardizing care, but their length, technical language, and inconsistent recommendations make them difficult for busy clinicians to use at the point of care. Manually comparing and reconciling multiple guidelines is labor-intensive and often impractical in real-world clinical practice. We aimed to develop and evaluate a proof-of-concept tool using large language models (LLMs) with retrieval-augmented generation (RAG) to improve guideline readability by harmonizing recommendations, highlighting consensus and controversy, and generating concise, actionable statements.
We designed an LLM-RAG pipeline (GPT-4o) to automatically segment guideline documents into manageable units, enrich them with metadata and summaries, and retrieve relevant content in response to clinical queries. The system synthesizes recommendations across guidelines, presenting areas of consensus and disagreement in a structured format. Four major international IBD guidelines (ACG, ECCO, BSG, ACPGBI) were analyzed across eight common clinical questions spanning Crohn’s disease and ulcerative colitis. Tool-generated outputs were benchmarked against expert summaries and evaluated by four independent reviewers using 5-point Likert scales for completeness, accuracy, relevance, coherence, and conciseness.
The tool consistently improved guideline readability by distilling complex text into shorter, structured responses. It achieved mean scores of 4.34 (95% CI, 4.20–4.48) for consensus recognition and 4.61 (95% CI, 4.46–4.77) for disagreement detection. Completeness, accuracy, and relevance all scored >4.0. Although conciseness was lower (3.84, 95% CI, 3.50–4.19), reviewers noted that outputs captured essential information while substantially reducing textual burden. Outline generation performance was moderate (3.25, 95% CI, 2.85–3.65), reflecting challenges in extracting all relevant subtopics.
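For readers who want to see how a Likert mean with a 95% confidence interval of this form can be produced, here is a minimal sketch. The abstract does not state how the intervals were computed, so the normal approximation (mean ± 1.96 × standard error), the pooling of four reviewers' scores over eight questions, and the ratings themselves are assumptions for illustration only.

```python
import math
import statistics


def mean_ci95(ratings: list[int]) -> tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal-approximation 95% CI."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    # Standard error from the sample standard deviation (n - 1 denominator).
    se = statistics.stdev(ratings) / math.sqrt(n)
    half_width = 1.96 * se
    return mean, mean - half_width, mean + half_width


# Invented ratings: 4 reviewers x 8 questions = 32 scores on a 1-5 scale.
ratings = [5, 4, 5, 4, 4, 5, 4, 4] * 4
mean, lo, hi = mean_ci95(ratings)
print(f"{mean:.2f} (95% CI, {lo:.2f}-{hi:.2f})")

# Agreement rate for 7 of 8 clinical scenarios.
agreement = 7 / 8
print(f"{agreement:.1%}")  # prints 87.5%
```

With small samples, a t-distribution multiplier in place of 1.96 would give slightly wider intervals; the choice is left open here because the source does not specify it.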
In 7 of 8 clinical scenarios (87.5%), the tool’s recommendations aligned with expert conclusions.
This proof-of-concept study demonstrates that an LLM-RAG framework can systematically reconcile international IBD guidelines and present them in a more readable, clinically usable format. By reducing complexity and making consensus and controversy explicit, such tools can help clinicians access key evidence more efficiently, support faster decision-making at the bedside, and reduce practice variation. With further refinement, this approach could contribute to “living guidelines” that are continuously updated and more accessible to end-users, ultimately enhancing patient care.
Conflict of interest:
Salahi-Niri, Aryan: No conflict of interest
Safavi-Naini, Seyed Amir Ahmad: No conflict of interest
Devi, Jalpa: No conflict of interest
Naderi, Nariman: No conflict of interest
Sebastian, Shaji: No conflict of interest
Adamina, Michel: No conflict of interest
Nadkarni, Girish: No conflict of interest
Soroush, Ali: Advisory Board member and equity holder for Virgo Surgical Video Solutions.
Dr. El-Hussuna, Alaa: No conflict of interest