A Survey of Research on the Role of AI, ChatGPT, and Chatbots in Modern Medicine

Welcome to one of the first research posts on the Caduceus Blog. This post introduces large language models (LLMs) such as OpenAI’s ChatGPT and gives an overview of the potential benefits and risks of LLM-based AI systems and chatbots in healthcare. Potential uses include drafting notes, handling administrative tasks, summarizing patient records, and facilitating patient communication.

You may have heard recently about ChatGPT, a large language model developed by OpenAI that generates human-like text, offering potential applications in various industries, including healthcare. Its ability to generate contextually relevant responses makes it a promising tool for answering patient questions, offering practical advice, and providing emotional support.

A landmark study recently published in JAMA Internal Medicine, led by John W. Ayers, PhD, MA, of the University of California San Diego, examined how chatbots respond to patient questions. Evaluators rated the answers from ChatGPT (GPT-3.5) as both higher in quality and more empathetic than answers from physicians.

Additionally, in the context of the COVID-19 pandemic, the need for accessible mental health care has become increasingly important, as a separate study highlights a significant increase in depression and anxiety symptoms. AI-based tools like ChatGPT could potentially help address this need by offering support and resources to vulnerable populations.

Large language models, such as ChatGPT, are trained on vast amounts of text data, enabling them to generate contextually appropriate responses in various domains, including medical information. These models can be fine-tuned to answer specific types of questions or provide information on particular topics. However, it is important to note that AI-generated information should be treated with caution, as it may not always be accurate or up-to-date.

| Simulated exam | GPT-4 (estimated percentile) | GPT-4, no vision (estimated percentile) | GPT-3.5 (estimated percentile) |
|---|---|---|---|
| SAT Math | 700 / 800 (~89th) | 690 / 800 (~89th) | 590 / 800 (~70th) |
| Graduate Record Examination (GRE) Quantitative | 163 / 170 (~80th) | 157 / 170 (~62nd) | 147 / 170 (~25th) |
| Graduate Record Examination (GRE) Verbal | 169 / 170 (~99th) | 165 / 170 (~96th) | 154 / 170 (~63rd) |
| Graduate Record Examination (GRE) Writing | 4 / 6 (~54th) | 4 / 6 (~54th) | 4 / 6 (~54th) |
| Medical Knowledge Self-Assessment Program | 75% | 75% | 53% |
Results from OpenAI’s website about the capabilities of their GPT models.

In the realm of medical education, GPT-4 from OpenAI has scored a 75% on the Medical Knowledge Self-Assessment Program.

This suggests the potential for ChatGPT to be an innovative medical education tool, possibly useful in small group settings like problem-based learning or clinical problem-solving. Further research is needed to evaluate its efficacy in these contexts.

As AI technologies like ChatGPT continue to advance, their applications in healthcare have the potential to reshape patient care, medical education, and mental health support. However, it is crucial to acknowledge their limitations and to conduct further research to ensure responsible adoption and the provision of accurate, reliable information. That’s why we’re building a system that lets healthcare providers use OpenAI’s technology to automate parts of their communication with patients.

AI Technologies in Healthcare and Education

An examination of the impact of chatbots and AI systems on modern medicine and education, including their benefits, limitations, and the importance of responsible adoption by physicians and educators.

Chatbots and AI systems are beginning to affect modern medicine and education, offering a range of benefits and opportunities for growth. However, these technologies come with limitations that physicians and educators must weigh before adopting them.

Benefits:

  • Improved access to information: AI chatbots like ChatGPT can provide relevant information on a range of medical conditions. When supervised by phone nurses or similar public-facing clinicians, they can answer common patient questions, offer practical advice, and provide emotional support, serving as useful adjunct tools for patients.
  • Enhanced medical education: According to OpenAI, GPT-4 scored 75% on the Medical Knowledge Self-Assessment Program, and ChatGPT has performed at a level comparable to a third-year medical student on USMLE questions. AI systems could be valuable in small group settings like problem-based learning or clinical problem-solving.
  • Efficient patient communication: AI chatbots can generate quality, empathetic responses to patient questions, potentially improving workflow and patient outcomes and reducing clinical visits.
  • Personalized education: AI language models like ChatGPT can offer automated scoring, teaching assistance, personalized learning, research help, information access, case scenario generation, content creation, and language translation in educational settings.

Limitations:

  • Propagation of false information and biases: AI systems may inadvertently generate misleading or biased information, which could have detrimental effects on patient care and education.
  • Lack of human-like understanding: AI technologies do not possess the same level of understanding and empathy as human healthcare providers or educators, which may limit their effectiveness in certain situations.
  • Outdated data input: AI models rely on the data they have been trained on, and may not be up-to-date with the latest medical research or guidelines.
  • Dependence on EMR access: to provide contextually relevant patient information, ChatGPT-like models need a connection to electronic medical records (EMRs). Without it, they could provide information that is technically correct but wrong for the specific patient’s needs.
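
The EMR limitation above can be made concrete with a minimal sketch. Everything here is an assumption for illustration: `PatientContext` and `build_prompt` are hypothetical names, not part of any real EMR system or of OpenAI’s API. The sketch only shows how patient-specific facts might be placed ahead of a question so that a model’s answer has a chance to account for them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PatientContext:
    """Hypothetical, minimal slice of an EMR record (illustrative only)."""
    age: int
    allergies: list[str] = field(default_factory=list)
    medications: list[str] = field(default_factory=list)


def build_prompt(question: str, context: Optional[PatientContext]) -> str:
    """Prefix the patient's question with EMR facts when they are available."""
    if context is None:
        # Without EMR access the model only ever sees the bare question.
        return f"Patient question: {question}"
    allergies = ", ".join(context.allergies) or "none recorded"
    medications = ", ".join(context.medications) or "none recorded"
    return (
        f"Patient age: {context.age}\n"
        f"Known allergies: {allergies}\n"
        f"Current medications: {medications}\n"
        f"Patient question: {question}"
    )


if __name__ == "__main__":
    ctx = PatientContext(age=67, allergies=["penicillin"], medications=["warfarin"])
    print(build_prompt("Can I take ibuprofen for my knee pain?", ctx))
```

With the context attached, the question about ibuprofen arrives alongside the fact that the patient takes warfarin; without it, the same question could receive a generic answer that misses that interaction.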

Responsible Adoption:

To ensure responsible adoption of AI technologies in healthcare and education, physicians and educators must:

  • Understand the capabilities and limitations of AI systems and use them as complementary tools rather than replacements for human providers.
  • Continuously update and refine AI models to ensure they remain current with the latest medical research and guidelines.
  • Implement strict data security measures to protect patient privacy and maintain trust in digital health technologies.
  • Foster open dialogue and collaboration between AI developers, healthcare providers, and educators to address ethical concerns and maintain transparency in the use of AI technologies.
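
One way to picture the first point, using AI as a complement rather than a replacement, is a review gate: the model drafts, a clinician approves or edits, and nothing is sent without approval. The sketch below is a hypothetical illustration under that assumption. The names `Draft`, `draft_reply`, `review`, and `send` are invented here, and the text generator is passed in as a plain function so no particular vendor API is implied.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Draft:
    question: str
    text: str
    status: Status = Status.PENDING


def draft_reply(question: str, generate: Callable[[str], str]) -> Draft:
    """Create a draft with any text generator; every draft starts unapproved."""
    return Draft(question=question, text=generate(question))


def review(draft: Draft, approve: bool, edited_text: Optional[str] = None) -> Draft:
    """Record the clinician's decision, optionally replacing the draft text."""
    if edited_text is not None:
        draft.text = edited_text
    draft.status = Status.APPROVED if approve else Status.REJECTED
    return draft


def send(draft: Draft) -> str:
    """Release a reply only after a clinician has approved it."""
    if draft.status is not Status.APPROVED:
        raise PermissionError("Draft has not been approved by a clinician.")
    return draft.text


if __name__ == "__main__":
    # Placeholder generator standing in for a language model call.
    d = draft_reply("Is a mild fever normal after a flu shot?",
                    lambda q: f"[model draft for: {q}]")
    review(d, approve=True, edited_text="A mild fever can occur. Call us if it lasts.")
    print(send(d))
```

The design choice is that `send` refuses anything not explicitly approved, so the human provider remains the final author of every message.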

AI technologies like ChatGPT offer significant potential in revolutionizing healthcare and education. However, responsible adoption by physicians and educators is crucial to ensure these technologies are used effectively and ethically, and do not compromise the quality of care or education provided.

Survey of Articles Related to Using ChatGPT in Healthcare

Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum

John W. Ayers, PhD, MA; Adam Poliak, PhD; Mark Dredze, PhD; Eric C. Leas, PhD, MPH; Zechariah Zhu, BS; Jessica B. Kelley, MSN; Dennis J. Faix, MD; Aaron M. Goodman, MD; Christopher A. Longhurst, MD, MS; Michael Hogarth, MD; Davey M. Smith, MD, MAS

https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2804309

A cross-sectional study found that AI chatbots generated quality and empathetic responses to patient questions, with evaluators preferring chatbot responses to physician responses in 78.6% of cases. These AI assistants may help draft messages for clinicians, which could improve workflow and patient outcomes and reduce clinical visits. However, further research is needed to assess their impact in clinical settings.

Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma

Yee Hui Yeo; Jamil S. Samaan; Wee Han Ng; et al

https://www.medrxiv.org/content/10.1101/2023.02.06.23285449v1

The study evaluates the accuracy and reproducibility of ChatGPT in providing information on cirrhosis and hepatocellular carcinoma (HCC). It demonstrates the model’s potential in answering common patient questions, offering practical advice, and providing emotional support. However, limitations include the inability to provide specific cut-off values and guideline recommendations. As ChatGPT continues to improve, it may serve as a useful adjunct tool for patients, but not a replacement for healthcare providers.

How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment

Aidan Gilson, BS; Conrad W Safranek, BS; Thomas Huang, BS; Vimig Socrates, MS; Ling Chi, BSE; Richard Andrew Taylor, MD, MHS; David Chartash, PhD

https://mededu.jmir.org/2023/1/e45312

This study assessed ChatGPT’s performance on the United States Medical Licensing Examination (USMLE) Step 1 and Step 2 questions, finding that it performed more accurately on Step 1 questions and outperformed other language models like GPT-3 and InstructGPT. The results suggest ChatGPT has the potential to be an innovative medical education tool, comparable to a third-year medical student’s knowledge level, and could be useful in small group settings like problem-based learning or clinical problem-solving. However, further research is needed to evaluate its efficacy in these contexts.

Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

Tiffany H. Kung, Morgan Cheatham, Arielle Medenilla, Czarina Sillos, Lorie De Leon, Camille Elepaño, Maria Madriaga, Rimel Aggabao, Giezel Diaz-Candido, James Maningo, Victor Tseng

https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000198

This study evaluates the performance of ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. The researchers found that ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education and potentially clinical decision-making. The authors highlight the importance of trust and explainability in AI systems for medical care and believe that ChatGPT’s performance on the USMLE is a notable milestone in AI maturation.

How Chatbots and Large Language Model Artificial Intelligence Systems Will Reshape Modern Medicine: Fountain of Creativity or Pandora’s Box?

Ron Li, MD; Andre Kumar, MD, MEd; Jonathan H. Chen, MD, PhD

https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2804310

The rapid development of chatbots and large language model AI systems offers potential to reshape modern medicine. By easing physicians’ burdens, these technologies can enhance understanding and engagement. However, they come with limitations such as propagating false information and biases. To ensure responsible adoption, physicians and educators must understand and drive the conversation around these technologies.

Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge

Annette Flanagin, RN, MA; Kirsten Bibbins-Domingo, PhD, MD, MAS; Michael Berkwits, MD, MSCE; Stacy L. Christiansen, MA

https://jamanetwork.com/journals/jama/fullarticle/2801170

AI technologies, such as ChatGPT, are revolutionizing scientific publishing and the way that authors write. While these advancements offer improved manuscript quality, concerns have been raised about potential misuse and lack of transparency in scientific publications. As a result, policies have been implemented to regulate AI-generated content. Overall, AI technologies are rapidly advancing and transforming multiple sectors, but it is essential to address the ethical concerns and maintain transparency in their use.

Performance of an Artificial Intelligence Chatbot in Ophthalmic Knowledge Assessment

Andrew Mihalache, BMSc(C); Marko M. Popovic, MD, MPH(C); Rajeev H. Muni, MD, MSc

https://jamanetwork.com/journals/jamaophthalmology/article-abstract/2804364

In a study evaluating ChatGPT’s ability to answer ophthalmology board certification practice questions, the chatbot achieved a 46% success rate, with its performance varying across different categories. Although ChatGPT may not be ready to provide substantial assistance in board certification preparation, its application in revolutionizing ophthalmic literature is evident. AI chatbots, including ChatGPT, are enhancing the publication process by providing personalized content recommendations, streamlining communication, and improving efficiency in screening and ranking submissions. Medical professionals and students should be aware of the potential benefits of AI advancements in medicine while acknowledging the current limitations.

Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model

Ashish Sarraju, MD; Dennis Bruemmer, MD, PhD; Erik Van Iterson, PhD; Leslie Cho, MD; Fatima Rodriguez, MD, MPH; Luke Laffin, MD

https://jamanetwork.com/journals/jama/article-abstract/2801244

This study assessed the quality of cardiovascular disease (CVD) prevention advice provided by ChatGPT, which has a user base of over a million. The researchers evaluated the appropriateness of its responses to 25 fundamental questions on CVD prevention to determine the accuracy and reliability of the information provided. The chatbot answered 21 of the 25 questions appropriately, and none of the responses were considered unreliable.

Harnessing the Promise of Artificial Intelligence Responsibly

David A. Dorr, MD, MS; Laura Adams, MS; Peter Embí, MD, MS

https://jamanetwork.com/journals/jama/article-abstract/2803078

Artificial intelligence (AI) algorithms, such as ChatGPT and DALL-E, show promise in revolutionizing healthcare by providing insights, optimizing systems, and guiding medical decisions. However, they also pose potential risks, such as harm, inequity, and failure to perform. It is crucial to emphasize that true AI enhances human cognition rather than replacing it. To harness the full potential of AI in learning health systems, responsible implementation is necessary, focusing on improving care effectiveness, reliability, and efficiency while mitigating potential risks.

Comparison Between ChatGPT and Google Search as Sources of Postoperative Patient Instructions

Noel F. Ayoub, MD, MBA; Yu-Jin Lee, MD, MS; David Grimm, BS; Karthik Balakrishnan, MD, MPH

https://jamanetwork.com/journals/jamaotolaryngology/article-abstract/2804300

A recent study in JAMA Otolaryngology-Head & Neck Surgery compared ChatGPT with Google Search as a source of postoperative patient instructions. The study focused on ChatGPT’s potential for enhancing patient knowledge and providing postoperative guidance for populations with low educational or health literacy levels. ChatGPT scored slightly lower than both the Google Search results and the pre-existing institutional (Stanford University) postoperative instructions. Although ChatGPT cannot replace human clinicians, it shows promise as a supplementary medical knowledge resource.

ChatGPT – Reshaping medical education and clinical management

Rehan Ahmed Khan, Masood Jawaid, Aymen Rehan Khan, Madiha Sajjad

https://pubmed.ncbi.nlm.nih.gov/36950398/

AI language models like ChatGPT are revolutionizing industries with their ability to generate human-like text, proving particularly useful in medical education and clinical management. In clinical management, it assists in patient data management, documentation, decision support, and patient communication. However, ChatGPT is not a replacement for health professionals and has limitations, including a lack of human-like understanding and possibly outdated data input.

How Does ChatGPT Perform on the Medical Licensing Exams? The Implications of Large Language Models for Medical Education and Knowledge Assessment

Aidan Gilson, Conrad Safranek, Thomas Huang, Vimig Socrates, Ling Chi, R. Andrew Taylor, and David Chartash

https://www.medrxiv.org/content/10.1101/2022.12.23.22283901v1

This article, the medRxiv preprint of the JMIR Medical Education study listed above, evaluates ChatGPT’s performance on USMLE Step 1 and Step 2 exam questions, highlighting its potential as a medical education tool with a success rate comparable to a third-year medical student. The article also discusses various medical knowledge platforms and datasets designed to improve medical education and information retrieval for healthcare professionals, such as PubMedQA and Medical Exams. Overall, ChatGPT shows promise in enhancing medical education and training.

ChatGPT in Medical Education and Clinical Management

In this survey, ChatGPT has shown potential in revolutionizing medical education and clinical management. This article explores the various applications of ChatGPT in these domains, emphasizing its benefits and limitations, and highlighting the importance of considering it as an adjunct tool rather than a replacement for human teachers.

A study evaluating the accuracy and reproducibility of ChatGPT in providing information on cirrhosis and hepatocellular carcinoma (HCC) demonstrated its potential in answering common patient questions, offering practical advice, and providing emotional support. However, the model’s limitations include the inability to provide specific cut-off values and guideline recommendations. As ChatGPT continues to improve, it may serve as a valuable tool for patients but should not replace healthcare providers.

In another study, ChatGPT’s performance on USMLE Step 1 and Step 2 questions was assessed. The model performed more accurately on Step 1 questions and outperformed other language models like GPT-3 and InstructGPT. These results suggest that ChatGPT could be a promising medical education tool, with a knowledge level comparable to a third-year medical student. It may prove useful in small group settings, such as problem-based learning or clinical problem-solving, but further research is needed to evaluate its efficacy in these contexts.

The rapid development of chatbots and AI systems like ChatGPT offers the potential to reshape modern medicine and revolutionize education through virtual and augmented reality technologies. By easing physicians’ burdens and creating immersive learning experiences, these technologies can enhance understanding and engagement. However, they also come with limitations, such as propagating false information and biases. To ensure responsible adoption, physicians and educators must understand and drive the conversation around these technologies.

AI technologies are transforming scientific publishing. However, as these advancements continue to grow, it is essential to address ethical concerns and maintain transparency in their use. In medical education and clinical management, AI technologies like ChatGPT can provide significant benefits, but their limitations must be acknowledged.

ChatGPT has shown promise as a medical education tool and can potentially improve clinical management. However, it is crucial to emphasize that it is not a replacement for human teachers or healthcare providers. As AI technologies continue to evolve, responsible implementation and an understanding of their limitations will be essential in harnessing their full potential in medical education and clinical management.

Empathy and AI Chatbots

Most surprising among the articles surveyed is that evaluators preferred ChatGPT’s responses to physicians’ responses in 78.6% of cases in the Ayers study, Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. Note that the raters were licensed healthcare professionals, not the patients themselves. This finding is fascinating and important for our work because a response that feels human is a major driver of patient satisfaction.

A separate study evaluated the accuracy and reproducibility of ChatGPT’s responses to patient questions related to cirrhosis and hepatocellular carcinoma (HCC). The study demonstrated that ChatGPT can provide accurate information, practical advice, and emotional support to patients, albeit with some limitations, such as the inability to provide specific cut-off values and guideline recommendations.

As AI chatbots continue to improve, they may serve as a valuable adjunct tool for patients seeking information and support. However, it is crucial to emphasize that chatbots should not replace healthcare providers but rather complement their services by drafting messages and assisting in patient communication. This approach could improve workflow and patient outcomes and reduce clinical visits.

The importance of empathetic AI chatbot responses is further highlighted by the significant increase in depression and anxiety symptoms during the COVID-19 pandemic. By helping make mental health support more accessible to vulnerable populations, AI chatbots could play a role in addressing the growing mental health crisis.

Despite the promising results, further research is needed to assess the impact of AI chatbots in clinical settings, including evaluating their effectiveness in providing emotional support and addressing the unique needs of patients with different cultural backgrounds, languages, and social circumstances.

AI chatbots like ChatGPT show great potential in generating quality and empathetic responses to patient questions, as well as drafting messages for clinicians. As these technologies continue to advance, it is essential to prioritize research in clinical settings to ensure the responsible adoption and integration of AI chatbots into healthcare systems. By doing so, the healthcare industry can harness the full potential of AI chatbots in enhancing patient care, providing emotional support, and improving overall medical outcomes.

ChatGPT’s Failures in Ophthalmology and Other Specialized Domains

A recent study evaluated the performance of ChatGPT in answering ophthalmology board certification practice questions. With only a 46% success rate, ChatGPT may not yet be prepared to offer substantial assistance in board certification preparation.

AI chatbots like ChatGPT are also changing the publication process by providing personalized content recommendations, streamlining communication, and improving efficiency in screening and ranking submissions. As a result, disseminating valuable medical information may become more accessible and effective for medical professionals and students.

The study evaluating ChatGPT’s performance on cirrhosis and hepatocellular carcinoma (HCC) questions highlights the need for continued development and research to enhance its capabilities. As AI technologies advance, they can offer valuable support to healthcare professionals and students, assisting in tasks such as patient education, communication, and decision-making.

However, it is essential to recognize the current limitations of AI chatbots like ChatGPT and to ensure responsible adoption of these technologies in medical practice. As the potential of AI in medicine continues to grow, it is crucial for physicians and educators to understand and drive the conversation around these advancements, ensuring their benefits are maximized while minimizing potential risks and addressing ethical concerns.

Assessing Cardiovascular Disease Prevention Advice from ChatGPT

One interesting study in the survey evaluated the quality of cardiovascular disease (CVD) prevention advice provided by ChatGPT, focusing on the accuracy and reliability of the information. The chatbot answered 21 of 25 questions appropriately.

As AI technologies continue to advance, they can significantly contribute to patient care and streamline healthcare processes. However, it is essential to ensure robust data security measures are in place to protect sensitive medical information. Furthermore, the responsible implementation of AI in healthcare settings should focus on improving care effectiveness, reliability, and efficiency while mitigating potential risks.

The study highlights the potential of ChatGPT in providing CVD prevention advice and its potential role as a supplementary resource for patients. As the technology continues to improve, it could enhance patient support and understanding of CVD prevention strategies. However, healthcare providers should remain the primary source of medical information, and AI models like ChatGPT should be used as an adjunct tool to complement human expertise.

ChatGPT in Postoperative Patient Instructions

As artificial intelligence continues to evolve, its applications in healthcare, particularly in providing postoperative patient instructions, are becoming increasingly relevant. One of the studies in the survey evaluated ChatGPT as a source of postoperative patient instructions, comparing it to Google Search. The study aimed to assess ChatGPT’s potential for enhancing patient knowledge and offering guidance to populations with low educational or health literacy levels.

The study’s findings suggest that ChatGPT holds promise as a supplementary medical knowledge resource for postoperative patient instructions, even though it scored slightly lower than the Google Search results and the institutional instructions. By providing relevant information, it may help bridge the knowledge gap for patients with low health literacy levels.

New Research Frontiers and Areas for AI in Healthcare IT

As artificial intelligence (AI) technologies like ChatGPT continue to advance, they open new research frontiers in various fields, including healthcare, medical education, and mental health. Ongoing research related to ChatGPT, medical knowledge platforms, and the COVID-19 pandemic emphasizes the importance of continuous exploration and responsible implementation of AI technologies in healthcare. That is what we hope to do at Caduceus for the patients who speak with our chatbot on WhatsApp and SMS.

ChatGPT’s potential in answering common patient questions, offering practical advice, and providing emotional support has been demonstrated in studies evaluating its accuracy and reproducibility in providing information on cirrhosis and hepatocellular carcinoma (HCC). However, limitations such as the inability to provide specific cut-off values and guideline recommendations necessitate further research to improve the model.

These technologies show promise in revolutionizing healthcare by providing insights, optimizing systems, and guiding medical decisions. However, they also pose potential risks, such as harm, inequity, and failure to perform. It is crucial to emphasize that true AI enhances human cognition rather than replacing it. To harness the full potential of AI in learning health systems, responsible implementation is necessary, focusing on improving care effectiveness, reliability, and efficiency while mitigating potential risks.

In conclusion, the ongoing research related to ChatGPT, medical knowledge platforms, and the COVID-19 pandemic highlights the importance of continuous exploration and responsible implementation of AI technologies in healthcare. These advancements have the potential to transform the way we approach medical education, patient care, and mental health, but must be balanced with ethical considerations and transparency to ensure their responsible and effective use.


© 2023 Caduceus. All Rights Reserved.