Student Publications

Document Type

Article

Abstract

IMPORTANCE Neuroimaging-based artificial intelligence (AI) diagnostic models have proliferated in psychiatry. However, their clinical applicability and reporting quality (ie, feasibility) for clinical practice have not been systematically evaluated.

OBJECTIVE To systematically assess the risk of bias (ROB) and reporting quality of neuroimaging-based AI models for psychiatric diagnosis.

EVIDENCE REVIEW PubMed was searched for peer-reviewed, full-length articles published between January 1, 1990, and March 16, 2022. Studies aimed at developing or validating neuroimaging-based AI models for clinical diagnosis of psychiatric disorders were included. Reference lists were further searched for suitable original studies. Data extraction followed the CHARMS (Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies) and PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) guidelines. A closed-loop cross-sequential design was used for quality control. The PROBAST (Prediction Model Risk of Bias Assessment Tool) and modified CLEAR (Checklist for Evaluation of Image-Based Artificial Intelligence Reports) benchmarks were used to systematically evaluate ROB and reporting quality.

FINDINGS A total of 517 studies presenting 555 AI models were included and evaluated. Of these models, 461 (83.1%; 95% CI, 80.0%-86.2%) were rated as having a high overall ROB based on the PROBAST. The ROB was particularly high in the analysis domain, including inadequate sample size (398 of 555 models [71.7%; 95% CI, 68.0%-75.6%]), poor model performance examination (with 100% of models lacking calibration examination), and lack of handling of data complexity (550 of 555 models [99.1%; 95% CI, 98.3%-99.9%]). None of the AI models was perceived to be applicable to clinical practice. Overall reporting completeness (ie, number of reported items/number of total items) for the AI models was 61.2% (95% CI, 60.6%-61.8%), and completeness was poorest for the technical assessment domain at 39.9% (95% CI, 38.8%-41.1%).

CONCLUSIONS AND RELEVANCE This systematic review found that the clinical applicability and feasibility of neuroimaging-based AI models for psychiatric diagnosis were challenged by a high ROB and poor reporting quality. The ROB in AI diagnostic models, particularly in the analysis domain, should be addressed before clinical application.

Publication Title

JAMA Network Open

Publication Date

3-6-2023

Volume

6

Issue

3

ISSN

2574-3805

DOI

10.1001/jamanetworkopen.2023.1671

Keywords

Artificial Intelligence, AI, benchmarking, bias, calibration, diagnosis, humans, neuroimaging

Included in

Psychology Commons
