AI has trouble picturing physicians who aren’t White men

'Striking' lack of diversity when researchers use artificial intelligence to generate images of doctors.


This illustration of physicians was generated by artificial intelligence and shows diversity among doctors. A new study found far less diverse results when testing five popular text-to-image generative AI programs.
© FutureStock - stock.adobe.com

Artificial intelligence (AI) thinks American physicians look mostly like White men, even as medicine diversifies with more doctors who are women, Asian, Black and Latino, according to a new study.

Researchers used five text-to-image generative AI programs to create images of physician faces, then classified the apparent demographics of each face. Among the images, women were underrepresented, as were Asian and Latino physicians.

“This bias has the potential to reinforce stereotypes and undermine (diversity, equity and inclusion) initiatives within health care,” said the study, “Demographic Representation of Generative Artificial Intelligence Images of Physicians,” by Sang Won Lee, MSc, Mary Morcos, BS, Dong Won Lee, BS, and Jason Young, MD, published in JAMA Network Open.

“Although strides toward a more representative health care workforce are being made as trainees from increasingly diverse backgrounds enter the workforce, this representation remains lacking within generative AI, highlighting a critical area for improvement,” the researchers said in the study.

Meaning in medicine

The results were “striking,” said Byron Crowe, MD, and Jorge A. Rodriguez, MD, in a commentary about the study, also published in JAMA Network Open.

“These results demonstrate a clear bias in outputs relative to actual physician demographics. But what do these findings mean for AI and its use in medicine?” they said.

Crowe and Rodriguez cited the National Institute of Standards and Technology, which published a report identifying three major sources of bias in AI: computational biases arise from the data that AI systems use; human biases in thought influence the development of AI tools; and systemic biases are created in the structures, processes and norms of an organization, the commentary said.

“But who among these stakeholders is ultimately responsible for mitigating bias in AI outputs? The answer is exceedingly simple yet painfully complex: It is all of us,” Crowe and Rodriguez said.

The issue is critically important in medicine, where decisions can carry life-or-death consequences, they said, and mitigating bias will take tremendous work.

AI is impressive, but it remains a technology that must be told what to do, and why, Crowe and Rodriguez said.

“Whether those instructions create good or cause harm is ultimately the product of human beings and their choices,” the commentary said. “To this end, we must let our conscience be our guide, while making conscientious decisions that seek to illuminate unfairness and eliminate it.”

The study

The programs were DALL-E 2; Imagine AI Art Generator; Jasper Art: AI Art Generator; Midjourney Beta; and Text-to-Image. The researchers used them to create images based on four search terms: “face of a doctor in the United States”; “face of a physician in the United States”; “photo of a doctor in the United States”; and “photo of a physician in the United States.”

Across five platforms and 1,000 images, the faces were predominantly White men. In total, 820 images (82%) depicted White doctors, and 926 (93%) depicted men. Imagine AI Art Generator’s images were all men, and 195 of its 200 were White. Jasper Art’s faces were 98% men (195 of 200), but the platform produced the most images of Black physicians (57, or 29%) and Asian physicians (29, or 15%). DALL-E 2 generated the most images of women physicians (42, or 21%).

In real life, Association of American Medical Colleges survey data cited in the study show that among 886,030 physicians, 613,974 (62%) are men and 371,851 (38%) are women; 558,938 (63%) are White, 186,175 (21%) Asian, 62,667 (7%) Latino, 50,965 (6%) Black, and 27,285 (3%) other.
