Published in Vol 7 (2024)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/54919.
Efficacy of ChatGPT in Educating Patients and Clinicians About Skin Toxicities Associated With Cancer Treatment

Department of Dermatology, Icahn School of Medicine at Mount Sinai, 5 East 98th St, 5th Floor, New York, NY, United States

Corresponding Author:

Nicholas Gulati, MD, PhD


This study investigates the application of ChatGPT, an artificial intelligence tool, in providing information on skin toxicities associated with cancer treatments, highlighting that while ChatGPT can serve as a valuable resource for clinicians, its use for patient education requires careful consideration due to the complex nature of the information provided.

JMIR Dermatol 2024;7:e54919

doi:10.2196/54919


Cancer therapy often results in systemic side effects that manifest as skin toxicities [1]. While oncologists regularly interact with patients undergoing treatment, they may not possess specialized dermatological knowledge. Similarly, dermatologists may lack insights into the nuances of cancer treatment–related skin conditions. This underscores the need for a collaborative approach to manage these complications effectively and to educate patients about them. Artificial intelligence (AI) tools such as ChatGPT can enhance this effort by providing comprehensive, accessible medical information [2,3]. This study evaluates ChatGPT’s effectiveness in offering detailed information on cancer treatment–related skin toxicities, aiming to bridge the gap between patient education and medical professionals’ expertise.


Overview

We developed 22 patient-oriented and 18 oncologist-oriented questions regarding the management of cancer treatment–related skin toxicities, based on our clinical experience and research on patients undergoing cancer therapy and designed to mirror common issues observed in clinical practice. Responses to these questions were generated using ChatGPT (OpenAI) version 3.5 (Supplementary Material S1 in Multimedia Appendix 1) [4].

Three board-certified dermatologists (AL, NG, AP) specializing in oncodermatology and affiliated with a tertiary academic institution in New York City evaluated these responses. Accuracy was assessed on a scale of 1 (completely inaccurate) to 5 (completely accurate), while comprehensiveness was rated on a scale of 1 (not at all comprehensive) to 5 (extremely comprehensive). The Flesch Reading Ease Score (FRES) was interpreted on a scale of 0 (extremely difficult to read, professional level) to 100 (extremely easy to read, fifth-grade level) and calculated using an online readability tool [5]. Interrater reliability was calculated to assess the consistency of ratings across reviewers.
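As a rough illustration of how an FRES value is derived, the sketch below applies the standard Flesch formula, 206.835 - 1.015 × (words/sentences) - 84.6 × (syllables/words). The function names and the vowel-group syllable heuristic are assumptions made for illustration only; the online readability tool used in this study may split sentences and count syllables differently, so values from this sketch should not be expected to reproduce those in Table 1.

```python
import re


def count_syllables(word: str) -> int:
    """Rough vowel-group heuristic (an assumption; real tools often use dictionaries)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent "e" as non-syllabic (e.g., "home"), but keep "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Return the Flesch Reading Ease Score for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Longer sentences and longer words both lower the score, which is why responses dense with clinical terminology tend to fall in the college-level range.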

Ethical Considerations

This study did not involve human subjects or patient data and was therefore exempt from institutional review board approval.


Accuracy scores (out of 5) averaged 4.57 (SD 0.71) for patient questions and 4.54 (SD 0.68) for oncologist questions. Comprehensiveness scores (out of 5) averaged 4.43 (SD 0.69) for patient questions and 4.37 (SD 0.80) for oncologist questions. The mean FRES values were 41.9, 47.5, and 36.0 for overall, patient, and oncologist responses, respectively, all indicating college-level comprehension. Most (13/18, 72%) oncologist responses were unanimously deemed suitable for a patient-facing educational platform (Table 1). Interrater reliability analysis for all responses demonstrated a fair level of agreement between reviewers (percent agreement 27.7%; Fleiss κ coefficient of 0.227; P<.001) (Table 2).
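For readers who wish to summarize reviewer ratings in a similar way, the following minimal sketch parses cells in the "reviewer 1/reviewer 2/reviewer 3" format used in Table 1 and computes a mean and SD. Pooling all three reviewers' ratings into one list and using the sample SD are assumptions; the source does not state how ratings were pooled, so this sketch is not presented as the authors' procedure. The helper names are hypothetical.

```python
from statistics import mean, stdev


def parse_triplet(cell: str) -> list[int]:
    """Split a cell such as "5/5/4" into one integer rating per reviewer."""
    return [int(part) for part in cell.split("/")]


def pooled_mean_sd(cells: list[str]) -> tuple[float, float]:
    """Pool every reviewer rating across the given cells (assumed pooling scheme)."""
    ratings = [rating for cell in cells for rating in parse_triplet(cell)]
    # stdev() is the sample SD (n - 1 denominator); this choice is an assumption.
    return mean(ratings), stdev(ratings)


# Accuracy cells for the first three patient questions in Table 1.
example_cells = ["5/5/4", "5/5/5", "5/5/5"]
example_mean, example_sd = pooled_mean_sd(example_cells)
```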

Table 1. Twenty-two patient questions and 18 oncologist questions generated based on prior consultations received by the Dermatology Department at the Icahn School of Medicine at Mount Sinai from oncologists and graded on accuracy, comprehensiveness, and reading level.

| Audience | Question type | Question | Accuracy (for each of the 3 reviewers; 1-5) | Comprehensiveness (for each of the 3 reviewers; 1-5) | Flesch Reading Ease Score (0-100) |
| --- | --- | --- | --- | --- | --- |
| Patient | General | How will my skin change on chemotherapy? | 5/5/4 | 4/5/4 | 49.7 |
| Patient | General | What types of chemotherapy cause my hair to fall out? | 5/5/5 | 3/5/4 | 47.7 |
| Patient | General | What types of chemotherapy cause skin reactions? | 5/5/5 | 3/4/3 | 42.5 |
| Patient | General | How often should my doctor monitor my skin during cancer treatment to stay on top of any changes? | 4/3/5 | 4/3/5 | 41.3 |
| Patient | General | How long might skin reactions last after finishing my cancer treatment? | 4/3/5 | 4/3/5 | 46.4 |
| Patient | General | Why am I seeing skin changes on immunotherapy treatment? | 5/4/3 | 4/5/3 | 33.2 |
| Patient | General | How should I take care of my skin while on Keytruda treatment? | 5/5/4 | 4/5/5 | 54.4 |
| Patient | General | After completing cancer treatment with Taxol, what long-term effects could there be on my skin? | 5/5/5 | 4/5/5 | 42.3 |
| Patient | Evaluation | I am starting treatment with Taxol. Could you explain what skin side effects I should expect during this treatment? | 5/5/4 | 5/5/4 | 50.5 |
| Patient | Evaluation | I am starting treatment with radiation therapy. Could you explain what skin side effects I should expect during this treatment? | 5/5/5 | 4/5/5 | 47.7 |
| Patient | Evaluation | I am getting a bone marrow transplant. Could you explain what skin side effects I should expect during this treatment? | 5/5/5 | 4/5/5 | 45.0 |
| Patient | Evaluation | I developed a rash on my face after starting Keytruda. What could be causing this? | 5/4/3 | 4/5/4 | 36.3 |
| Patient | Evaluation | My nails have started separating from the nail bed after treatment with Tarceva. Is this normal and what should I do? | 5/5/5 | 4/5/5 | 39.4 |
| Patient | Evaluation | I’m feeling depressed about the blisters on my feet from chemotherapy. Do you have any advice for coping, physically and mentally? | 5/5/5 | 4/5/5 | 47.7 |
| Patient | Evaluation | I’m concerned about the changes I’ve noticed in my skin texture since starting Taxol. When should I contact my doctor regarding these changes? | 5/5/4 | 4/5/5 | 55.9 |
| Patient | Management | Treatment with Tagrisso has caused me to have acne. How can I best manage this side effect? | 4/5/5 | 4/5/5 | 56.4 |
| Patient | Management | My skin has become very itchy since starting Keytruda. What should I do? | 4/5/5 | 4/5/5 | 48.0 |
| Patient | Management | I started getting blisters after radiation therapy. How can I best manage this side effect? | 5/5/4 | 3/5/5 | 47.0 |
| Patient | Management | Since starting methotrexate, I have started to lose a lot of hair. What should I do to prevent hair loss? | 4/5/2 | 4/4/3 | 66.1 |
| Patient | Management | I have very dry, cracked skin on my hands since taking Taxol. What moisturizers or creams can help with this? | 4/5/2 | 4/4/3 | 66.1 |
| Patient | Management | What types of moisturizers would you recommend for my skin during radiation therapy? | 5/5/4 | 4/5/5 | 40.8 |
| Patient | Management | My skin is more prone to sunburn since starting treatment with Xeloda. Are there specific sunscreen recommendations I should follow? | 5/5/5 | 5/5/5 | 52.7 |
| Oncologist | General | I am an oncologist. What preventive measures can I take to minimize the risk of skin reactions in patients undergoing radiation therapy? | 5/5/4 | 3/5/5 | 46.1 |
| Oncologist | General | What topical treatments are recommended for skin reactions on chemotherapy treatment? | 5/5/3 | 4/5/4 | 36.9 |
| Oncologist | General | What types of skin reactions are most commonly seen with Tagrisso? | 5/5/4 | 4/5/5 | 48.5 |
| Oncologist | General | What distinguishes between mild, moderate, and severe skin reactions on immunotherapy treatment? | 4/5/5 | 4/5/5 | 27.1 |
| Oncologist | General | Are there certain patients who may be more susceptible to severe skin reactions during cancer treatment? | 4/5/5 | 4/5/5 | 39.2 |
| Oncologist | Evaluation | I am an oncologist treating a patient with Gleevec. What types of rashes warrant holding this therapy? | 4/5/4 | 4/5/5 | 17.9 |
| Oncologist | Evaluation | I am an oncologist treating a patient with Taxol. What types of rashes warrant holding this therapy? | 4/5/4 | 3/5/5 | 17.8 |
| Oncologist | Evaluation | I am an oncologist. If a rash resolves but then recurs for my patient on Herceptin, could it be a sign of allergy? | 4/5/5 | 4/5/5 | 32.4 |
| Oncologist | Evaluation | I am an oncologist. My patient has formed blisters on the hand and feet since starting Xeloda. When should I consider a dermatology consult? | 4/5/5 | 4/5/5 | 27.1 |
| Oncologist | Evaluation | I am an oncologist treating a patient with 5-FU^a. What features help distinguish between rashes needing dermatology consult versus those I can manage? | 4/5/5 | 5/5/5 | 33.1 |
| Oncologist | Management | I am an oncologist and my patient undergoing treatment with hydroxyurea is experiencing hair loss. How should I counsel this patient regarding hair regrowth? | 5/5/5 | 4/5/5 | 49.0 |
| Oncologist | Management | I am an oncologist. My patient is on Gleevec and experiencing blisters on their skin. How should I treat them? | 4/5/3 | 5/5/3 | 44.0 |
| Oncologist | Management | I am an oncologist and my patient is experiencing hand-foot syndrome after starting Xeloda treatment. What are the best approaches to manage this condition? | 5/3/3 | 4/4/4 | 45.5 |
| Oncologist | Management | I am an oncologist treating a patient with 5-FU. How can I guide them in caring for their nails to prevent discoloration and brittleness? | 5/5/4 | 3/5/5 | 53.1 |
| Oncologist | Management | I am an oncologist observing rashes and blisters in my patient taking Padcev. How should I treat them? | 5/5/3 | 4/3/3 | 40.9 |
| Oncologist | Management | I am an oncologist and my patient has a grade 2 maculopapular rash. Do I need to give systemic steroids for the rash? | 5/5/5 | 3/5/5 | 32.8 |
| Oncologist | Management | I am an oncologist and my patient has a grade 3 maculopapular rash. Do I need to give systemic steroids for the rash? | 5/5/5 | 3/5/5 | 26.2 |
| Oncologist | Management | I am an oncologist. When should I consider dose reductions for my patient on radiation therapy? | 5/5/5 | 3/5/5 | 31.2 |

^a 5-FU: 5-fluorouracil.

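Table 2. Assessment of interrater reliability by question type.

| Response set | Question type | Percent agreement (%) | Fleiss κ coefficient | Fleiss κ coefficient P value | Strength of agreement |
| --- | --- | --- | --- | --- | --- |
| All responses | All questions | 27.7 | 0.227 | <.001 | Fair agreement |
| All responses | General questions | 19.4 | 0.103 | .10 | Slight agreement |
| All responses | Evaluation questions | 34.5 | 0.246 | <.001 | Fair agreement |
| All responses | Management questions | 29.3 | 0.290 | <.001 | Fair agreement |
| Patient question responses | All questions | 20.5 | -0.118 | .08 | Poor agreement |
| Patient question responses | General questions | 18.8 | -0.130 | .22 | Poor agreement |
| Patient question responses | Evaluation questions | 28.6 | -0.189 | .18 | Poor agreement |
| Patient question responses | Management questions | 14.3 | -0.124 | .30 | Poor agreement |
| Oncologist question responses | All questions | 33.3 | 0.358 | <.001 | Fair agreement |
| Oncologist question responses | General questions | 20 | 0.243 | <.001 | Fair agreement |
| Oncologist question responses | Evaluation questions | 40 | 0.359 | <.001 | Fair agreement |
| Oncologist question responses | Management questions | 37 | 0.386 | <.001 | Fair agreement |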
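To make the interrater statistics in Table 2 easier to interpret, the sketch below shows one way a Fleiss κ coefficient can be computed from per-item ratings and mapped to a strength-of-agreement label. It is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the label thresholds follow the commonly used Landis and Koch bands (which are consistent with the labels in Table 2, although the source does not name the scale), and the source does not specify how accuracy and comprehensiveness ratings were combined, so this sketch is not expected to reproduce the published values.

```python
from collections import Counter


def fleiss_kappa(ratings: list[list[int]]) -> float:
    """Fleiss kappa for items each rated by the same number of raters.

    ratings[i] holds the category that each rater assigned to item i.
    """
    if not ratings:
        raise ValueError("at least one rated item is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    n_items = len(ratings)
    category_totals = Counter()
    observed_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        observed_sum += agreeing / (n_raters * (n_raters - 1))

    p_observed = observed_sum / n_items
    # Agreement expected by chance, from the overall category proportions.
    p_expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    if p_expected == 1.0:
        return float("nan")  # undefined when only one category was ever used
    return (p_observed - p_expected) / (1.0 - p_expected)


def strength_of_agreement(kappa: float) -> str:
    """Map kappa to a label using Landis and Koch style bands (assumed scale)."""
    if kappa < 0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"
```

Because κ corrects for chance agreement, it can be negative even when reviewers often give similar scores: if nearly all ratings cluster at 4 or 5, chance agreement is already high, and small disagreements pull κ below zero. This may help explain the negative coefficients for patient question responses despite their high mean scores.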

Our findings suggest that ChatGPT holds promise as a resource for both patients and clinicians navigating the complexities of cancer treatment–related skin toxicities, given the relatively high levels of accuracy and comprehensiveness of its responses. However, the college reading level of ChatGPT’s responses poses a potential hurdle to widespread use; ChatGPT may currently be a more appropriate tool for clinicians, who will be able to comprehend its responses more uniformly than patients. In practice, oncologists will likely use it to complement their own clinical judgment and that of dermatologists, particularly as AI-driven tools become increasingly integrated into clinical settings.

Reviewers identified occasional redundancies, irrelevant information, and minor inaccuracies in ChatGPT’s responses. They noted the need for its responses to be more evidence-based and to offer more up-to-date clinical recommendations when addressing oncologist questions. For instance, when responding to a question regarding the treatment of a patient experiencing rashes and blisters while taking enfortumab vedotin, ChatGPT did not recognize Stevens-Johnson syndrome as a potential concern. It only suggested “temporarily” holding the medication, even though current research recommends “permanently” discontinuing enfortumab vedotin in cases of suspected Stevens-Johnson syndrome [6]. These observations underscore that although ChatGPT generally provides useful information and could streamline its dissemination, its responses still require refinement, careful implementation, and regular monitoring to be considered for clinical use.

Integrating AI into dermatology-related patient education raises several technical and ethical considerations, including patient privacy, potential biases in AI responses, and the vital need to keep AI models current with the latest dermatology guidelines. A limitation of our study is the use of a single AI model; a comparison of ChatGPT with other models would provide a more rounded perspective on the capabilities of AI in this context.

Future research should incorporate additional metrics, such as clinical applicability and impact on patient outcomes, to provide a more comprehensive evaluation of ChatGPT’s potential in clinical settings. Studies with larger sample sizes, a broader diversity of questions, and a wider range of evaluators would improve the generalizability of our findings. It would also be valuable to study how variations in prompt formulation affect the accuracy and comprehensiveness of ChatGPT’s responses. Such work would help clarify and strengthen ChatGPT’s ability to support both patient education and clinical decision-making in the context of cancer therapy–related skin toxicities.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Sample responses created by ChatGPT.

DOCX File, 18 KB

  1. Lacouture ME, Sibaud V, Gerber PA, et al. Prevention and management of dermatological toxicities related to anticancer agents: ESMO Clinical Practice Guidelines. Ann Oncol. Feb 2021;32(2):157-170.
  2. Young JN, O’Hagan R, Poplausky D, et al. The utility of ChatGPT in generating patient-facing and clinical responses for melanoma. J Am Acad Dermatol. Sep 2023;89(3):602-604.
  3. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. Mar 1, 2023;7(2):pkad010.
  4. Haupt CE, Marks M. AI-generated medical advice-GPT and beyond. JAMA. Apr 25, 2023;329(16):1349-1350.
  5. Kher A, Johnson S, Griffith R. Readability assessment of online patient education material on congestive heart failure. Adv Prev Med. 2017;2017:9780317.
  6. Nguyen MN, Reyes M, Jones SC. Postmarketing cases of enfortumab vedotin-associated skin reactions reported as Stevens-Johnson syndrome or toxic epidermal necrolysis. JAMA Dermatol. Oct 1, 2021;157(10):1237-1239.


AI: artificial intelligence
FRES: Flesch Reading Ease Score


Edited by Ian Brooks, Robert Dellavalle; submitted 30.11.23; peer-reviewed by Hao Sun, Jaidip Jagtap; final revised version received 30.07.24; accepted 23.08.24; published 20.11.24.

Copyright

© Annie Chang, Jade Young, Andrew Para, Angela Lamb, Nicholas Gulati. Originally published in JMIR Dermatology (http://derma.jmir.org), 20.11.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Dermatology, is properly cited. The complete bibliographic information, a link to the original publication on http://derma.jmir.org, as well as this copyright and license information must be included.