Article Review: Clinician driven automated data preprocessing in nuclear medicine AI environments

Objectives
- Introduced the rule set table (RST) as an interface for incorporating clinician input into data preprocessing (DP) for AI models in nuclear medicine.
- Evaluated the impact of RST on the predictive performance of machine learning (ML) models in three different cancer cohorts (glioma, prostate, and diffuse large B-cell lymphoma (DLBCL)).
- Demonstrated that RST, when combined with manual DP, improved the balanced accuracy (BACC) of ML models by up to 18% compared to models without RST.
Methodology
- Implemented a rule set table (RST) to translate clinician input, expressed as one of four actions (exp-keep, exp-remove, pref-keep, pref-remove), into machine-readable instructions for DP algorithms.
- Incorporated commonly used algorithms for DP of clinical cohorts in single and multi-center scenarios.
- Used a 100-fold Monte Carlo cross-validation scheme for the single-center cohorts and a dual-center setup for the DLBCL cohort.
- Employed the XGBoost algorithm for classification tasks across all established models.
- Compared the performance of RST across all actions, as well as without RST, in both manual and automated (ML-driven data preparation, MLDP) settings for each cohort.
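
To make the idea of a rule set table concrete, here is a minimal sketch of how the four actions might steer a feature-selection step. It is not the paper's implementation. It assumes that the "exp-" actions act as hard constraints and the "pref-" actions only nudge a feature's ranking score, which is exactly the detail the review finds under-explained. The function `apply_rst`, the `pref_bonus` parameter, and the feature names are hypothetical.

```python
from typing import Dict, List

# The four action names come from the reviewed article. Their semantics below
# are ASSUMED for illustration: "exp-*" is a hard rule, "pref-*" a soft hint.
HARD_KEEP, HARD_REMOVE = "exp-keep", "exp-remove"
SOFT_KEEP, SOFT_REMOVE = "pref-keep", "pref-remove"


def apply_rst(
    scores: Dict[str, float],
    rst: Dict[str, str],
    n_select: int,
    pref_bonus: float = 0.1,  # hypothetical weight for soft preferences
) -> List[str]:
    """Pick up to n_select features from data-driven scores, guided by the RST."""
    # Hard keeps are always selected; hard removes are never candidates.
    forced = [f for f in scores if rst.get(f) == HARD_KEEP]
    candidates = [f for f in scores if rst.get(f) not in (HARD_KEEP, HARD_REMOVE)]

    def adjusted(feature: str) -> float:
        # Soft preferences shift the score instead of overriding it.
        action = rst.get(feature)
        if action == SOFT_KEEP:
            return scores[feature] + pref_bonus
        if action == SOFT_REMOVE:
            return scores[feature] - pref_bonus
        return scores[feature]

    ranked = sorted(candidates, key=adjusted, reverse=True)
    return forced + ranked[: max(0, n_select - len(forced))]


# Hypothetical feature scores (e.g. from a ranking algorithm) and clinician rules.
scores = {"suv_max": 0.50, "volume": 0.50, "age": 0.20, "scanner_id": 0.90, "texture": 0.45}
rst = {"age": "exp-keep", "scanner_id": "exp-remove", "texture": "pref-keep", "volume": "pref-remove"}
selected = apply_rst(scores, rst, n_select=3)
print(selected)
```

In this toy setup, "scanner_id" is excluded despite its high score, "age" is retained despite its low one, and the soft preferences decide which of the remaining features fill the last two slots.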
Results
- Performance increase of ML models with manual preprocessing combined with RST was up to 18% BACC compared to models without RST.
- ML models given "exp-keep" and "pref-keep" instructions showed the largest gains over the other configurations in every dataset: +18% BACC (glioma), +6% BACC (prostate), and +3% BACC (DLBCL).
- Specific BACC values for different scenarios are provided in Table 3, along with p-values and confidence intervals in Supplemental Table S3.
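
For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, and Monte Carlo cross-validation averages it over many random train/test splits. The sketch below illustrates both in plain Python. It is only an illustration under stated assumptions: the 20% test fraction, the toy data, and the `midpoint_classifier` stand-in (used here in place of XGBoost to keep the example dependency-free) are not from the paper.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, computed over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return mean(recalls)


def monte_carlo_cv(X, y, fit_predict, n_folds=100, test_frac=0.2, seed=0):
    """Average BACC over n_folds independent random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, round(len(y) * test_frac))  # test_frac is an assumption
    fold_scores = []
    for _ in range(n_folds):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_pred = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        fold_scores.append(balanced_accuracy([y[i] for i in test], y_pred))
    return mean(fold_scores)


def midpoint_classifier(X_train, y_train, X_test):
    """Toy stand-in model: threshold one feature at the midpoint of class means."""
    m0 = mean(x for x, label in zip(X_train, y_train) if label == 0)
    m1 = mean(x for x, label in zip(X_train, y_train) if label == 1)
    threshold = (m0 + m1) / 2
    return [int((x > threshold) == (m1 > m0)) for x in X_test]


# Toy, perfectly separable data: 20 samples per class, one feature each.
X = [float(i) for i in range(20)] + [float(i) + 100 for i in range(20)]
y = [0] * 20 + [1] * 20
print(monte_carlo_cv(X, y, midpoint_classifier))
```

Because the metric averages recall per class, a model that always predicts the majority class scores only 0.5 on a binary task, however imbalanced the cohort.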
Discussions
- The study presents a novel approach (RST) to incorporate clinical domain knowledge into the DP process, which is a significant contribution. However, the validation relies on previously identified high-ranking features, limiting the assessment of RST's ability to discover new relevant features.
- The study could benefit from a more detailed explanation of how the "pref-keep" and "pref-remove" actions are weighted and prioritized within the DP algorithms; the criteria in Supplemental Table S1 are relatively general.
- While the study compares manual DP and MLDP, it would be valuable to investigate the performance of RST with other automated DP methods besides MLDP.
- The retrospective nature of the study, using pre-identified features, is acknowledged as a limitation. A prospective study involving clinicians actively providing input through RST would strengthen the findings.
Reference: Clinician driven automated data preprocessing in nuclear medicine AI environments
Written by Aldo Yang
