Evaluation of DeepSeek-R1 and ChatGPT-4o on the Chinese national medical licensing examination: a multi-year comparative study
Appendix Tables 1 and 2 show the accuracy of ChatGPT-4o and DeepSeek-R1 on NMLE questions. The data are grouped by...
Abstract
Advanced general-purpose large language models (LLMs), including OpenAI’s Chat Generative Pre-trained Transformer (ChatGPT), Google’s Gemini, and Anthropic’s Claude, have demonstrated...
This study compares the performance of GPT-3.5, GPT-4, and GPT-4o on the 2020 and 2021 Chinese NMLE, focusing on the...
Numerical scores for the Comprehensive Osteopathic Medical Licensing Examination (COMLEX) Level 1 were mistakenly made visible to ob/gyn programs despite...
Tamblyn R, Abrahamowicz M, Dauphinee WD, Hanley JA, Norcini J, Girard N, et al. Association between licensure examination scores and...
Study selection
The initial literature search yielded 433 articles. After removing duplicates and performing title/abstract screening, a total...