ChatGPT Follows Standardized Terminology & Practices on Urodynamics’ Interpretation: The New Virtual Functional Urologist

Almazeedi A1, AlBoloushi N1, Abdullah A1, Almarzouq A2, AL-shaiji T1, Yaiesh S1

Research Type

Clinical

Abstract Category

Urodynamics

Abstract 237
Urology 8 - Innovation in Clinical and Surgical Technology
Scientific Podium Short Oral Session 20
Saturday 20th September 2025
10:00 - 10:07
Parallel Hall 2
Voiding Dysfunction, Stress Urinary Incontinence, Urgency Urinary Incontinence, Underactive Bladder, Urgency/Frequency
1. Jaber Alahmad Hospital, Kuwait; 2. Sabah Alahmad Urology Center

Abstract

Hypothesis / aims of study
Since becoming publicly available, ChatGPT and other generative artificial intelligence (gAI) and large language model services have been tested for their ability and utility in a number of medical diagnostic procedures and in surgical education. Over time, ChatGPT’s ability to provide concise assessments and evaluations has improved, as has its ability to learn from and cite genuine resources and references, while previous concerns about shortcomings such as hallucinations and fallacies are easing. Good practices and terminology for urodynamic studies have been described and published by the International Continence Society (ICS) and are recommended for practitioners [1-3]. We report on ChatGPT’s ability to analyze urodynamic traces correctly before and after teaching it the ICS best practices and terms for urodynamics, cystometry, and pressure-flow studies.
Study design, materials and methods
We randomly selected urodynamic traces, including cystometries and pressure-flow studies, conducted at our institution by a single functional urology expert. The traces were then reassessed by a second expert, and agreement between the assessments was noted. Any disagreement was resolved by a third expert. ChatGPT 4o was asked to analyze the same traces twice: first without any instruction to use particular resources or teaching, and then after it was taught and instructed to use the ICS standards [1-3]. Parameters analyzed included components of the urodynamic traces as well as select nomograms, the overall diagnosis, and a corresponding management plan. Concordance with expert opinion before and after teaching ChatGPT was calculated, and the statistical significance of the effect of teaching was assessed using McNemar’s test. Institutional review board approval was obtained.
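As a minimal sketch of how a paired before/after comparison of this kind can be tested, the Python below computes a two-sided exact (binomial) McNemar p-value from the discordant pairs. The abstract does not state whether the exact or the chi-square form of the test was used, so the exact form is an assumption here. The function names and the example lists are hypothetical illustrations and are not the study data.

```python
from math import comb


def discordant_counts(before, after):
    """Count discordant pairs between two paired lists of booleans.

    Each element says whether ChatGPT agreed with the expert on one trace.
    b: agreed before teaching but not after.
    c: disagreed before teaching but agreed after.
    """
    b = sum(1 for x, y in zip(before, after) if x and not y)
    c = sum(1 for x, y in zip(before, after) if not x and y)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value (binomial test on discordant pairs).

    Under the null hypothesis, each discordant pair is equally likely to
    fall in either direction, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical concordance flags for six traces (illustration only).
before = [True, True, False, False, True, False]
after = [True, True, True, True, True, False]
b, c = discordant_counts(before, after)
print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs enter the test; traces on which ChatGPT was concordant (or discordant) with the expert both before and after teaching carry no information about the effect of teaching.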
Results
We analyzed a total of 100 urodynamic traces of different etiologies, of which 75% were from female patients. The most common presenting complaint was mixed urinary incontinence, while the most common diagnosis was non-neurogenic dry detrusor overactivity. With regard to filling cystometry, teaching did not produce a statistically significant change in ChatGPT’s ability to correctly identify dysfunctions in sensation (p=1), cystometric capacity (p=0.1336), bladder compliance (p=0.4795), urinary incontinence (p=0.3711), or electromyography (EMG) synergy or dyssynergia (p=0.7728). The only parameter that approached statistical significance was ChatGPT’s ability to identify uninhibited detrusor contractions after teaching (p=0.07). Similarly, ChatGPT’s ability to assess voided volume, detrusor pressure at maximum flow (Pdet@Qmax), and maximum urinary flow rate (Qmax), and to identify after-contractions, remained unchanged after instruction and teaching (p>0.1), as did its ability to calculate the bladder contractility index (p=0.25), provide an overall diagnosis (p=1), and formulate an appropriate management plan (p=0.13).
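For readers unfamiliar with the bladder contractility index (BCI), the sketch below uses its commonly cited definition, BCI = Pdet@Qmax + 5 × Qmax, with the usual cut-offs of below 100 (weak), 100 to 150 (normal), and above 150 (strong). The abstract does not state which formula or thresholds were applied, so both are assumptions here, and the function names are hypothetical.

```python
def bladder_contractility_index(pdet_qmax_cmh2o, qmax_ml_s):
    """BCI = Pdet@Qmax + 5 * Qmax (assumed, commonly cited definition).

    pdet_qmax_cmh2o: detrusor pressure at maximum flow, in cmH2O.
    qmax_ml_s: maximum urinary flow rate, in mL/s.
    """
    return pdet_qmax_cmh2o + 5.0 * qmax_ml_s


def classify_bci(bci):
    """Map a BCI value to the conventional contractility category."""
    if bci > 150:
        return "strong"
    if bci >= 100:
        return "normal"
    return "weak"


# Hypothetical values for illustration only, not study data.
bci = bladder_contractility_index(pdet_qmax_cmh2o=40, qmax_ml_s=10)
print(bci, classify_bci(bci))
```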
Interpretation of results
This study suggests that ChatGPT-4o has a consistent baseline ability to interpret urodynamic traces irrespective of formal instruction. Its capacity to identify uninhibited detrusor contractions showed a trend toward improvement following exposure to ICS terminology and standards (p = 0.07), although this did not reach conventional statistical significance, suggesting possible promise for refinement with targeted training.
Concluding message
Our findings underscore ChatGPT’s emerging utility in functional urology as a consistent, adaptable tool with the potential to support clinical education and decision-making. To enhance ChatGPT’s accuracy in interpreting urodynamic studies, future development should focus on multimodal fine-tuning using annotated trace-image datasets aligned with ICS standards. Generative AI may eventually play a valuable role in standardizing and scaling access to urodynamic interpretation expertise.
References
  1. Rosier, Peter F W M et al. “International Continence Society Good Urodynamic Practices and Terms 2016: Urodynamics, uroflowmetry, cystometry, and pressure-flow study.” Neurourology and urodynamics vol. 36,5 (2017): 1243-1260. doi:10.1002/nau.23124
  2. Rosier, Peter F W M et al. “ICS-SUFU standard: Theory, terms, and recommendations for pressure-flow studies performance, analysis, and reporting. Part 1: Background theory and practice.” Neurourology and urodynamics vol. 42,8 (2023): 1590-1602. doi:10.1002/nau.25192
  3. Rosier, Peter F W M et al. “ICS-SUFU standard: Theory, terms, and recommendations for pressure-flow studies performance, analysis, and reporting. Part 2: Analysis of PFS, reporting, and diagnosis.” Neurourology and urodynamics vol. 42,8 (2023): 1603-1627. doi:10.1002/nau.25187
Disclosures
Funding: None. Clinical Trial: No. Subjects: Human. Ethics Committee: IRB, Jaber Alahmad Hospital. Helsinki: Yes. Informed Consent: Yes.