Do as AI say: susceptibility in deployment of clinical decision-aids

Abstract: Artificial intelligence (AI) models for decision support have been developed for clinical settings such as radiology, but little work evaluates the potential impact of such systems. In this study, physicians received chest X-rays and diagnostic advice, some of which was inaccurate, and were asked to evaluate advice quality and make diagnoses. All advice was generated by human experts, but some was labeled as coming from an AI system. As a group, radiologists rated advice as lower quality when it appeared to come from an AI system; physicians with less task expertise did not. Diagnostic accuracy was significantly worse when participants received inaccurate advice, regardless of the purported source. This work raises important considerations for how advice, AI and non-AI, should be deployed in clinical environments.

Bibliographic Details
Main Authors: Susanne Gaube, Harini Suresh, Martina Raue, Alexander Merritt, Seth J. Berkowitz, Eva Lermer, Joseph F. Coughlin, John V. Guttag, Errol Colak, Marzyeh Ghassemi
Format: Article
Language: English
Published: Nature Publishing Group, 2021-02-01
Series: npj Digital Medicine
ISSN: 2398-6352
Online Access: https://doi.org/10.1038/s41746-021-00385-9
Author Affiliations:
Susanne Gaube: Department of Psychology, University of Regensburg
Harini Suresh: MIT Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology
Martina Raue: MIT AgeLab, Massachusetts Institute of Technology
Alexander Merritt: Boston Medical Center
Seth J. Berkowitz: Department of Radiology, Beth Israel Deaconess Medical Center
Eva Lermer: LMU Center for Leadership and People Management, LMU Munich
Joseph F. Coughlin: MIT AgeLab, Massachusetts Institute of Technology
John V. Guttag: MIT Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology
Errol Colak: Li Ka Shing Knowledge Institute, St. Michael’s Hospital
Marzyeh Ghassemi: Departments of Computer Science and Medicine, University of Toronto