Integrating AI into clinical education: evaluating general practice trainees’ proficiency in distinguishing AI-generated hallucinations and impacting factors | BMC Medical Education
