Accurate and cost-effective grading of compost maturity is critical for agronomic safety and process optimization. This study developed an image-driven in situ grading framework for compost maturity using deep feature clustering and supervised prediction, enabling multi-level differentiation beyond conventional binary maturity identification. Compost images were first labeled as mature or immature based on indicator-derived maturity results and then used to train deep learning models for initial classification and feature extraction. Class incremental learning-based P-ResNet-18 achieved the best classification performance, with all key metrics exceeding 0.94 and only a 0.3% training-test gap, indicating strong generalization capability. For refined maturity grading, latent Dirichlet allocation (LDA) achieved the best overall grading performance, with over 80% coverage, more than 90% purity, and the strongest monotonic relationship with composting duration. The LDA-derived labels further enabled supervised multi-level prediction, with all four evaluated models achieving grading accuracy above 90%. Grad-CAM analysis revealed a dispersed-concentrated-extensive evolutionary pattern during composting, with color and texture identified as the dominant discriminative features. The framework also remained robust under noise perturbation, with over 93% consistency, demonstrating its potential to support intelligent composting management through improved end-point determination and reduced unnecessary over-composting.
Accurate classification of colonoscopic images is essential for early detection and characterization of colorectal diseases. Recent advances in deep learning, particularly transformer-based architectures and graph neural networks (GNNs), provide alternative strategies for modeling global contextual information and relational structures in image representations. This study evaluates transformer-based and graph-based frameworks under a unified experimental protocol for endoscopic colon disease classification. Experiments were conducted on the Kvasir V2 dataset using two primary paradigms: (i) a Vision Transformer (ViT) with selective fine-tuning and learning-rate scheduling, and (ii) a CNN-GNN pipeline integrating image embeddings with graph construction strategies (cosine similarity, k-nearest neighbors, and epsilon-radius graphs) and multiple GNN architectures. Performance was evaluated using accuracy, precision, recall, and macro-F1 score, with Grad-CAM used for qualitative interpretability analysis. The selectively fine-tuned Vision Transformer achieved 94.6% accuracy with a macro-F1 score of 0.94. The best graph-based configuration (ViT embeddings with epsilon graph and GIN aggregation) achieved 92% accuracy and 0.92 macro-F1 score. Transformer-based contextual modeling provides strong discriminative capability for image-level colon disease classification, while graph-based relational modeling offers competitive performance when paired with high-quality embeddings.
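The graph construction strategies named above (k-nearest neighbors and epsilon-radius graphs over image embeddings) can be illustrated with a minimal sketch. The abstract does not state the distance metric, thresholds, or embedding dimensions, so the use of cosine distance for the epsilon graph, the helper names `cosine`, `knn_edges`, and `epsilon_edges`, and the toy vectors are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_edges(embeddings, k):
    """Directed edges (i, j) where j is one of the k most cosine-similar nodes to i."""
    edges = []
    for i, u in enumerate(embeddings):
        scored = [(cosine(u, v), j) for j, v in enumerate(embeddings) if j != i]
        scored.sort(key=lambda t: (-t[0], t[1]))  # highest similarity first, ties by index
        edges.extend((i, j) for _, j in scored[:k])
    return edges

def epsilon_edges(embeddings, eps):
    """Undirected edges (i, j), i < j, whose cosine distance (1 - similarity) is <= eps."""
    n = len(embeddings)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if 1.0 - cosine(embeddings[i], embeddings[j]) <= eps]

# Hypothetical 2-D embeddings standing in for CNN or ViT image features.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(knn_edges(toy, 1))
print(epsilon_edges(toy, 0.05))
```

A k-nearest-neighbor graph gives every node the same out-degree, whereas an epsilon graph can leave dissimilar images isolated; which behavior suits a given GNN is an empirical question the sketch does not settle.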
Objective: To construct and validate a predictive model for endotypes in patients with chronic rhinosinusitis with nasal polyps (CRSwNP) using a sinus CT-based multitask learning network (MTLNet). Methods: CRSwNP patients who underwent initial treatment at the Second Affiliated Hospital of Shantou University Medical College from January 1, 2020 to April 30, 2024 were retrospectively enrolled and randomly divided into training and validation sets in an 8:2 ratio. Patients treated at the same center from May 1 to November 30, 2024 were retrospectively enrolled as the external test set. Endotypes were classified into eosinophilic and non-eosinophilic types according to the Guideline for Diagnosis and Treatment of Chronic Rhinosinusitis (2024). The MTLNet model adopted a U-shaped architecture capable of simultaneously performing two tasks: three-dimensional (3D) sinus region segmentation and endotype classification. Model performance was evaluated using the Dice similarity coefficient (DSC), confusion matrices, and the area under the curve (AUC) with 95% confidence intervals (CIs) calculated via bootstrap resampling. 3D image reconstruction technology and gradient-weighted class activation mapping (Grad-CAM) were used for visual explanation of the model's working mechanism. Results: A total of 257 CRSwNP patients were included: 172 in the training set, 41 in the validation set, and 44 in the external test set. In the training and validation sets, the MTLNet model exhibited excellent 3D sinus region segmentation performance (DSC: 0.913 and 0.887, respectively) and endotype classification performance (AUC: 0.871 and 0.770, respectively). In the external test set, the model maintained good predictive performance with a segmentation DSC of 0.898 and an endotype classification AUC of 0.818 (sensitivity 72.7%, specificity 78.8%), indicating favorable generalization ability.
3D image reconstruction technology and Grad-CAM visualization demonstrated good model interpretability. Conclusion: A novel MTLNet model is developed with excellent clinical predictive performance, achieving artificial intelligence-enabled accurate CRSwNP endotype prediction that can assist rhinologists in formulating individualized and precise treatment strategies.
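For readers unfamiliar with the segmentation metric reported above, the Dice similarity coefficient is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The following is a minimal sketch on flattened binary masks; the function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the study.

```python
def dice(pred, truth):
    """Dice similarity coefficient for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy 4-voxel example: prediction marks two voxels, reference marks one of them.
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

A 3D volume can be passed after flattening it to a single list of voxels, since the coefficient depends only on voxel-wise overlap.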
Hyperspectral remote sensing technology is one of the key technical methods for detecting rice blast in the field, but existing hyperspectral dimensionality reduction methods still suffer from information redundancy and insufficient feature interpretability. This study aimed to develop a feature wavelength selection method integrating deep learning and model attribution analysis to extract key spectral features across different disease severity levels. A residual network model Dilated Convolution and Deformable Convolution-Residual Network (DCR-ResNet) combining dilated convolution and deformable convolution was constructed to deeply mine spectral features across varying disease severities. Meanwhile, the Integrated Gradient (IG) and Gradient-weighted Class Activation Mapping (Grad-CAM) methods were combined to enable the selection of spectral wavelengths. The effectiveness of the proposed method was validated using statistical analysis (transformed divergence, within-class scatter) and modeling analysis. Findings reveal that the spectral feature wavelengths identified by DCR-ResNet in conjunction with the IG-GradCAM approach exhibit excellent inter-class separability and intra-class compactness. Furthermore, when benchmarked against conventional dimensionality reduction techniques such as Successive Projections Algorithm, Random Frog, and Competitive Adaptive Reweighted Sampling, the Support Vector Machine, Extreme Learning Machine, and Random Forest models developed using IG-GradCAM-selected feature wavelengths demonstrate superior classification performance. The overall accuracy reaches 85.9%, 85.5% and 86.2%, with kappa values of 81.3%, 80.6% and 81.6%, respectively. The feature wavelength selection method combining DCR-ResNet with IG-GradCAM not only improves the accuracy of hyperspectral feature extraction but also provides an efficient and feasible approach for the precise identification of rice blast. © 2026 Society of Chemical Industry.
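The overall accuracy and kappa values quoted above are both derived from a confusion matrix. The sketch below shows the standard calculation of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `accuracy_and_kappa` and the 2×2 toy matrix are hypothetical and do not reproduce the study's data.

```python
def accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(k)) / n            # observed agreement
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)  # chance agreement
    if p_e == 1.0:
        return p_o, 0.0  # degenerate case: every sample in a single class
    return p_o, (p_o - p_e) / (1.0 - p_e)

# Hypothetical two-class confusion matrix (not taken from the study).
acc, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
print(acc, kappa)
```

Kappa is conventionally a proportion between -1 and 1; the abstract reports it as a percentage, so 81.3% corresponds to 0.813 on this scale.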
Confidence calibration, selective prediction, out-of-distribution scoring, and deep ensembles are mature techniques in machine learning, yet their efficacy under the severe domain shift encountered when plant disease classifiers move from controlled laboratory imagery to heterogeneous field photographs has not been systematically benchmarked. Models trained on PlantVillage were evaluated on PlantDoc leaf-level crop images under a parent-image-aware split protocol, and a suite of standard mitigation techniques was applied to characterize the reliability gap. Analyses included temperature scaling and selective prediction for a fine-tuned ResNet-50, quantitative image-level shift analysis, Grad-CAM visualization, simple target-aware adaptation baselines, frozen-feature backbone comparisons, and ensemble baselines. In the primary case study, a fine-tuned ResNet-50 suffered a 67.7-percentage-point accuracy collapse upon cross-domain transfer, while mean predicted confidence remained at 79.76%. Post-hoc temperature scaling reduced calibrated ECE to 0.3645 but left selective risk at 80% coverage at 64.30%. Quantitative image-level shift analysis confirmed large-effect-size differences in saturation (d = 3.90), border edge density (d = 3.33), and foreground-occupancy proxy (d = 2.48) between the two domains, while Grad-CAM visualizations showed that the model shifts attention from lesion-centered regions in PlantVillage to background-dominated areas in PlantDoc. Simple target-aware mitigations, including adaptive batch normalization and feature moment matching, improved accuracy from 0.321 to 0.343 and 0.366, respectively, whereas DANN-style adversarial adaptation degraded performance to 0.252. 
A frozen-feature backbone comparison across five backbones showed that, within the energy-scoring frozen-backbone comparison, DINOv2-S/14 achieved the highest unknown-detection AUROC (0.764) and the lowest selective risk at 80% coverage (0.520), with paired Wilcoxon tests confirming statistically significant accuracy and macro-F1 differences across backbones. Two ensemble baselines were evaluated: a warm-start end-to-end ResNet-50 ensemble reduced calibrated ECE to 0.063 but achieved only 0.666 AUROC, while a lightweight DINOv2 linear-probe ensemble achieved 0.779 AUROC after calibration but under limited epistemic diversity. Neither ensemble established deployment-grade reliability: the best selective risk at 80% coverage across all configurations remained above 0.51. The principal contribution is a reproducible, deployment-oriented reliability characterization showing that standard post-hoc and lightweight adaptation techniques reduce but do not eliminate the severe reliability gap under controlled-to-field transfer in agricultural computer vision.
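Two of the reliability metrics reported above, expected calibration error (ECE) and selective risk at a fixed coverage, can be sketched in a few lines. This is an illustrative implementation under stated assumptions: equal-width confidence bins (10 by default), confidence-ranked abstention, and hypothetical function names and toy inputs. The study's exact binning and scoring choices are not given in the abstract.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |average confidence - accuracy| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, ok))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(1 for _, ok in b if ok) / len(b)
            ece += len(b) / n * abs(avg_conf - acc)
    return ece

def selective_risk(confidences, correct, coverage):
    """Error rate on the most-confident fraction `coverage` of predictions."""
    keep = max(1, round(coverage * len(confidences)))
    ranked = sorted(zip(confidences, correct), key=lambda t: -t[0])[:keep]
    return sum(1 for _, ok in ranked if not ok) / keep

# Toy predictions (hypothetical): max-softmax confidence and whether each was correct.
conf = [0.95, 0.92, 0.85, 0.62, 0.55]
hit = [True, False, True, False, True]
print(expected_calibration_error(conf, hit))
print(selective_risk(conf, hit, 0.8))
```

A high selective risk at 80% coverage, as reported in the abstract, means that even after discarding the least-confident 20% of predictions, many of the retained predictions are still wrong.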
The molecular and spatial heterogeneity of gliomas severely limits accurate prediction of postoperative adjuvant chemotherapy efficacy, representing a critical bottleneck in achieving personalized treatment decisions. Conventional imaging assessments and single-modality AI models struggle to comprehensively characterize the complex tumor phenotype. Based on preoperative multimodal MRI data from the TCGA-LGG cohort integrated with clinical and survival information from the Genomic Data Commons (GDC), this study extracted 726 radiomics features. Postoperative chemotherapy benefit was operationalized using overall survival (≥24 months vs. <24 months), a pragmatic surrogate endpoint validated in prior low-grade glioma radiomics studies. Two types of deep learning embedding features were generated using segmentation-guided 3D bounding-box and patch-based sampling strategies, combined with a lightweight 3D CNN and a pretrained 3D ResNet-18 (MedicalNet) model. Prediction models were constructed using radiomics alone, deep learning alone, and their fusion, and were evaluated through stratified cross-validation on both a real-world dataset (Dataset 1) and a dataset augmented via PCA-GMM (Dataset 2). Model interpretability was assessed using SHAP attribution analysis, variance analysis, and Grad-CAM visualization. Radiomics and deep learning features exhibited significant information complementarity: the former focused on describing overall tumor volume, morphology, and macro-texture, while the latter excelled at capturing local heterogeneity and subtle spatial infiltration patterns. The fusion model demonstrated optimal performance in predicting postoperative chemotherapy benefit, achieving an AUC of 0.75 on the constrained real-world dataset (Dataset 1) and improving to 0.99 on the more feature-diverse Dataset 2. 
SHAP analysis revealed key radiomics features driving model predictions, whereas Grad-CAM heatmaps localized model attention to tumor core regions and infiltrative margins, areas highly consistent with the pathological microenvironment associated with drug efficacy. The dual-dataset comparison further confirmed that data quality and feature diversity are core drivers for unleashing model predictive potential and enabling precise drug response phenotyping. This study establishes a transparent and interpretable multimodal AI framework that integrates handcrafted and deep learning MRI features, significantly enhancing the prediction of postoperative chemotherapy benefit in gliomas.
This study aims to develop an artificial intelligence system capable of automatically classifying endoscopic images of reflux esophagitis (RE) according to the Los Angeles (LA) classification, thereby improving the accuracy and efficiency of RE diagnosis and providing intelligent support for clinical decision-making. RE images from three centers were collected to construct a dataset for training, validating, and testing a deep learning model. Model performance was evaluated using metrics such as accuracy, sensitivity, specificity, precision, area under the receiver operating characteristic curve (AUC), and F1 score. After model training, Gradient-weighted Class Activation Mapping (Grad-CAM) visualization was applied to enhance model transparency. Finally, a clinical application was developed using PyQt5 for portable use. Among the five models evaluated, YOLOv11l demonstrated the best performance, achieving an accuracy, precision, sensitivity, and F1 score of 97.89%, 94.90%, 93.69%, and 94.28%, respectively, on the validation set, and a weighted average accuracy, precision, specificity, and AUC of 96.26%, 91.58%, 98.04%, and 0.995, respectively, on the test set. The diagnostic accuracy of this model was significantly higher than that of both junior (χ² = 45.93, P < 0.05) and senior endoscopists (χ² = 8.34, P < 0.05). The artificial intelligence model and application developed based on the YOLOv11 network can rapidly and accurately grade the severity of RE according to the LA classification on retrospective external test data, providing a promising proof-of-concept system that warrants further prospective and multi-reader validation before routine clinical deployment.
The diagnosis of grade IV brain tumors, such as de novo glioblastoma, has recently attracted considerable scientific interest in neuroimaging and deep learning. Glioblastoma, a very rare and highly aggressive brain tumor, poses considerable diagnostic challenges due to restricted data availability and substantial intratumoral heterogeneity. This paper introduces VMAM-NET, a hybrid deep meta-learning model that combines VGG-16-based feature extraction with model-agnostic meta-learning (MAML) to enhance glioblastoma diagnosis in data-scarce settings. The VGG model, initially trained on an astrocytoma dataset, acquires domain-specific imaging characteristics that the MAML framework utilizes for rapid adaptation to few-shot learning tasks involving glioblastoma samples. The model is evaluated on four MRI datasets, using comprehensive preprocessing and stringent optimization. Experimental findings indicate that VMAM-NET attains training and testing accuracies of 98.69% and 96.71%, respectively, with an F1-score of 0.9694, surpassing traditional deep learning and meta-learning models. The approach offers interpretability through gradient-weighted class activation mapping (Grad-CAM), which emphasizes tumor-relevant areas in MRI scans. The proposed framework provides a scalable and clinically feasible diagnostic measure, with potential relevance to other rare disorders. VMAM-NET enhances the application of data-efficient artificial intelligence in healthcare under resource-constrained environments.
Background: Rare neurological diseases are challenging to diagnose from brain MRI because of their low prevalence, heterogeneous imaging patterns, and limited annotated datasets. Deep learning may support image-level recognition, but results from curated datasets without complete patient-level identifiers require cautious interpretation. Objectives: This study proposes RareNeuroXNet, a frequency-aware multi-branch attention framework for image-level classification of rare neurological diseases from brain MRI. The objective was to assess whether combining global anatomical, local fine-grained, and frequency-domain representations improves benchmark performance, calibration, and interpretability. Methods: RareNeuroXNet uses three complementary branches: a global branch for whole-image representation, a local branch for regional feature extraction, and an FFT magnitude-based frequency branch. Features are refined using CBAM attention, fused, and classified through a fully connected head. The model was evaluated on a balanced curated dataset with five rare neurological disease classes using five-fold cross-validation, ablation analysis, calibration metrics, internal baseline comparison, paired testing against DenseNet121 local-only, and Grad-CAM visualization. MCND was also used as a complementary cross-dataset neurological MRI benchmark, not as same-task external validation. Results: RareNeuroXNet achieved strong image-level internal benchmark performance, with accuracy of 0.9924±0.0061, macro F1-score of 0.9924±0.0061, macro AUROC of 0.9998±0.0002, and macro AUPR of 0.9992±0.0007. Calibration was favorable, with ECE of 0.0052±0.0029 and NLL of 0.0276±0.0159. Ablation results showed that the local branch was the dominant contributor, while FFT and CBAM provided supportive refinement. Compared with DenseNet121 local-only, RareNeuroXNet showed modest classification gains and clearer calibration improvements. 
Conclusions: RareNeuroXNet demonstrated strong controlled image-level benchmark performance with high discrimination, stable cross-validation behavior, favorable calibration, and Grad-CAM interpretability. However, possible correlated slices, duplicate images, or subject overlap cannot be excluded. Future work should use patient-level, same-task, multi-center external validation and 3D multimodal MRI analysis.
Current deep learning models for early breast cancer lack interpretability and multimodal integration, limiting their clinical acceptance. This study aimed to develop and evaluate a deep learning system that automates breast ultrasound evaluation to support early breast cancer detection in clinical assessment. We developed BrcaDetect, which integrates ultrasound image-based deep learning predictions, Breast Imaging Reporting and Data System (BI-RADS) assessments, and demographic factors. A total of 24,762 ultrasound images from 3048 women across five hospitals were retrospectively collected. The model was trained and internally validated using 19,340 images from 2399 patients at three tertiary hospitals between January 2017 and December 2020, and externally validated using 5422 images from 649 women at two additional hospitals between January 2021 and August 2023. All lesions were confirmed by biopsy or 3-year follow-up. Model performance and its impact on the diagnostic accuracy of five radiologists were evaluated. BrcaDetect outperformed the image-based deep learning model and the demographic model, with areas under the curve (AUCs) of 0.989 (95% confidence interval (CI): 0.979-0.999), 0.851 (95% CI: 0.819-0.884), and 0.826 (95% CI: 0.804-0.848), respectively, and corresponding sensitivities of 98.8%, 93.0%, and 71.8%. In the reader study, radiologists assisted by BrcaDetect achieved significantly higher diagnostic accuracy than unassisted reading (0.977 [95% CI: 0.967-0.986] vs. 0.919 [95% CI: 0.900-0.935]; p < 0.001). As an image-level decision-support model, BrcaDetect was associated with improved radiologist performance and interpretability under controlled reading conditions, reducing false positives and demonstrating proof-of-concept for decision support in clinical assessment workflows.
In a retrospective study, BrcaDetect outperformed single‑modality models across three cohorts and provided strong interpretability via Grad‑CAM and Shapley values. This article addresses the lack of interpretability and multimodal integration in deep learning models for early breast cancer, presenting BrcaDetect: explainable predictions via Grad-CAM and Shapley values may reduce diagnostic uncertainty and support clinical workflow integration as a proof-of-concept.
Immune checkpoint inhibitors, particularly antibodies targeting programmed cell death 1 (PD-1), are increasingly used for advanced hepatocellular carcinoma (Ad-HCC), but treatment responses remain heterogeneous. Hyperprogressive disease (HPD) is an especially concerning pattern of rapid progression after PD-1 therapy, and reliable pre-treatment tools to identify high-risk patients are still lacking. In this multicenter retrospective study of 665 patients with Ad-HCC receiving PD-1 inhibitor-based triple therapy, we developed a transformer-based multimodal model, Hyperprogression Oncological Predictive Enhanced-model (HOPE), integrating arterial- and portal-phase computed tomography with structured clinical factors. HOPE achieved an area under the receiver operating characteristic curve of 0.801 in the internal validation cohort and 0.687 in the external validation cohort, and outperformed clinical-only and imaging-only baseline models. Ablation analyses supported the value of multimodal integration. HOPE was further supported by prespecified subgroup analyses, survival risk stratification, and Gradient-weighted Class Activation Mapping (Grad-CAM) assessments. HOPE may serve as a clinically interpretable pretreatment decision-support tool for HPD risk stratification in patients with Ad-HCC receiving PD-1 inhibitor-based triple therapy, with potential utility for closer monitoring and risk-adapted management of patients predicted to be at high risk.
Brain age estimation provides a noninvasive MRI biomarker of neurodevelopment. In infancy, rapid regionally ordered myelination reflects brain maturation, yet early-life brain age estimation remains underexplored, particularly with myelination-sensitive MRI and biologically informed modeling. This retrospective study aimed to develop and evaluate a biologically informed deep learning framework for infant brain age estimation using T1w/T2w ratio MRI. The internal cohort comprised 629 infants aged 0-24 months: 626 with age-appropriate myelination (train/validation/test = 376/125/125) and 3 with myelin-related developmental abnormalities reserved for qualitative review. The external cohort comprised 10 healthy infants aged 0-15 months (5 females, 5 males). Internal imaging: 3T; 3D gradient-echo or 2D spin-echo T1w, and 2D turbo spin-echo T2w. External imaging: 3T; 3D gradient-echo T1w and 2D turbo spin-echo T2w. 3D convolutional neural networks were trained with T1w, T2w, and T1w/T2w ratio inputs using manually defined biological age labels from visual myelination assessment. The model incorporated multi-task learning for age regression, white matter segmentation, and image reconstruction. Performance was evaluated using five-fold cross-validation with repeated random splits. Metrics included mean absolute error (MAE), root mean squared error (RMSE), R², and Pearson (r) and Spearman (ρ) correlations. Modality differences were tested using one-way ANOVA, t-tests, and Mann-Whitney U tests, with Cohen's d and 95% confidence intervals. In the external cohort, absolute prediction errors were compared using the Wilcoxon signed-rank test. Statistical significance was defined as p < 0.05.
T1w/T2w ratio models achieved the best overall performance (MAE: 1.489 ± 0.302 months; r = 0.966 ± 0.012), compared with T1w (2.055 ± 0.944; 0.933 ± 0.061), T2w (1.794 ± 0.434; 0.947 ± 0.023), T1w+T2w (1.546 ± 0.291; 0.960 ± 0.013), and T1w+T2w+RI (1.498 ± 0.313; 0.963 ± 0.012). Modality effects were significant for MAE, RMSE, R², and r, but not for ρ (p = 0.250). Auxiliary-task and multi-scale modeling numerically improved performance (MAE, 1.203 months; r = 0.979). External validation showed the lowest error for the RI-based model (MAE, 1.16 months), and Grad-CAM highlighted myelination-relevant white matter. T1w/T2w ratio MRI combined with biologically informed deep learning enabled accurate and interpretable infant brain age estimation. This framework showed promising cross-scanner performance and may support MRI-based assessment of early brain maturation. Assessing brain development in infants is critical for early detection of developmental delays. This study developed a deep learning model that estimates infant brain age from MRI by combining two standard scan types into a ratio image highlighting myelin, the insulating coating around nerve fibers that increases as the brain matures. Trained on 629 infants aged 0-24 months, the model predicted developmental age with a mean error of approximately 1.5 months. Attention maps confirmed the model focused on regions known to undergo early myelination. This approach showed consistent performance across two different scanners and may support objective monitoring of infant brain maturation.
Agriculture plays a critical role in ensuring global food security, yet crop yield variability driven by climate change, soil heterogeneity, and environmental fluctuations poses persistent challenges. Accurate yield forecasting and intelligent crop recommendation are essential for sustainable and efficient farm management. This study proposes a novel AI-driven framework for precise crop yield prediction and data-driven crop selection. The framework integrates systematic data preprocessing, hybrid feature selection, deep learning, and attention-based modeling to capture complex nonlinear relationships within agricultural datasets. Raw data, including soil properties, topography, climatic variables, and historical yield records, are processed using median-based imputation, normalization, and Z-score outlier detection to enhance reliability. A multi-stage hybrid feature selection approach combining Minimum Redundancy Maximum Weight, Sequential Forward Subset Selection, and Recursive Fisher Score identifies the most informative features while reducing redundancy. Yield prediction is performed using an attention-enhanced hybrid kernel Extreme Learning Machine (ELM). Crop recommendation is achieved through a Spatio-Temporal Explainable Group-Enhanced Transformer Network (STX-GTNET) optimized with the PantheraCobra metaheuristic. Model interpretability is ensured using Grad-CAM and Integrated Gradients. Experimental results demonstrate strong performance, achieving an RMSE of 281.6 and R² of 0.94 for yield prediction, and 98.4% accuracy with a 0.991 ROC-AUC for crop recommendation.
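The preprocessing steps named above (median-based imputation, Z-score outlier detection, and normalization) can be sketched as follows. The abstract does not give the outlier threshold or the normalization scheme, so the threshold of 2.0 used in the example, the choice of min-max scaling, and all function names are assumptions for illustration.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def zscore_outliers(values, threshold=3.0):
    """Flag entries whose absolute Z-score (population SD) exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [False] * len(values)  # constant column: nothing to flag
    return [abs((v - mean) / sd) > threshold for v in values]

def min_max(values):
    """Scale values linearly to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical soil-property column with one missing value and one extreme reading.
print(impute_median([1.0, None, 3.0]))
print(zscore_outliers([10, 11, 9, 10, 10, 11, 9, 10, 10, 50], threshold=2.0))
print(min_max([2.0, 4.0, 6.0]))
```

With a population standard deviation, the largest possible absolute Z-score in a sample of n values is sqrt(n - 1), so a threshold of 3.0 can never flag anything in a column of ten or fewer readings; this is why the toy example lowers the threshold.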
Pancreatic ductal adenocarcinoma (PDAC) has a poor prognosis, with high early recurrence rates after curative resection. Current prediction methods, based on clinicopathological features or conventional radiomics, often fail to capture intratumoral heterogeneity (ITH), a key driver of recurrence. Computed tomography (CT)-based habitat analysis quantifies ITH by identifying phenotypically distinct tumor subregions, while deep learning (DL) can extract complex imaging patterns. Their integration may improve recurrence risk assessment. This study aimed to develop and validate a fusion model that integrates CT-based habitat analysis, a 2.5D convolutional neural network (CNN)-Transformer DL framework, and clinicopathological features to noninvasively predict early recurrence (within one year) risk after PDAC resection. In this multicenter retrospective study, 346 patients with resected PDAC were included from four institutions. Tumors were segmented into three habitat subregions via unsupervised K‑means clustering. Radiomic features from these subregions constructed the HabitatAll model. In parallel, a DL model was built using a 2.5D CNN-Transformer architecture. Predictive scores from both models were integrated with key clinicopathological variables through ridge regression to develop the fusion model (HADLC). Model interpretability was examined using SHAP (SHapley Additive exPlanations) and Grad‑CAM (Gradient‑weighted Class Activation Mapping). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, calibration curves, and decision curve analysis (DCA). The HADLC model showed superior predictive ability, achieving AUCs of 0.977 (training), 0.916 (internal test), and 0.838-0.866 (external validation), outperforming the standalone HabitatAll, DL, and Clinic models. It demonstrated good calibration and provided higher net clinical benefit across most decision thresholds. 
Interpretability analyses revealed key imaging phenotypes linked to aggressive tumor biology. The HADLC model effectively integrates multimodal information to accurately assess early postoperative recurrence risk in PDAC, providing a robust, non-invasive imaging biomarker to potentially guide personalized treatment.
The diagnosis and surgical prediction of necrotizing enterocolitis (NEC) remain challenging. Our goal is to develop an interpretable multimodal artificial intelligence model to assist these key clinical decisions. This retrospective study included 484 neonates (242 with NEC, 242 without NEC). We developed a dual Swin Transformer integrating abdominal X-rays (2D branch) and laboratory parameters (1D branch) via late fusion. The model was refined using an external data domain adaptation strategy (n = 50) and evaluated on independent internal and external test sets. The interpretability of the model was evaluated by Grad-CAM and SHAP. The optimized multimodal model showed high performance on the internal test set, achieving AUCs of 0.915 for NEC diagnosis and 0.920 for surgical prediction. On the independent external test set, it achieved AUCs of 0.903 (diagnosis) and 0.894 (surgical prediction), significantly outperforming baseline models. Interpretability analyses highlighted clinically relevant features, including intestinal pneumatosis and specific inflammatory markers (such as C-reactive protein) as key predictive factors. The dual Swin Transformer provides an accurate, interpretable, and adaptable multimodal tool that integrates radiographic and laboratory data to support NEC diagnosis and personalized surgical decision-making. This study developed a dual Swin Transformer, which integrates abdominal X-rays and laboratory data to provide a robust multimodal framework for the diagnosis and surgical prediction of necrotizing enterocolitis. By implementing an external data domain adaptation strategy, the study contributes to overcoming the key challenge of clinical heterogeneity and temporal variability in NEC cohorts. Using Grad-CAM and SHAP visualization to identify specific predictive characteristics improves model transparency and clinician trust. 
These findings provide an explainable and adaptable AI tool to support evidence-based and personalized clinical decision-making in neonatal intensive care.
Postoperative cerebrovascular events, including transient ischemic attacks, infarctions, and hemorrhages, remain a significant concern in pediatric patients with Moyamoya disease (MMD) undergoing surgical revascularization. This study aimed to develop an explainable deep learning-based classification model using intraoperative arterial blood pressure (ABP) waveform analysis for postoperative cerebrovascular events in pediatric patients undergoing surgery for MMD, with exploratory analysis of associated waveform-derived physiologic features. This retrospective study included 181 pediatric patients (≤18 years) who underwent revascularization surgery for MMD, with an independent temporal holdout cohort of 79 patients reserved for validation. ABP signals were preprocessed using detrending, pulse segmentation, and normalization, then converted into image representations for deep learning classification. Various convolutional neural network (CNN) models, including ResNet50, ResNet34, DenseNet121, VGG16, and VGG19, were evaluated against Vision Transformer (ViT) architectures. Multiple image transformation methods were tested, and Grad-CAM analysis and statistical comparisons of waveform-derived physiologic features were conducted between patients with and without postoperative cerebrovascular events. The best-performing configuration used raw pulse waveforms with three consecutive pulses per image. CNN-based models outperformed ViT-based models, with the highest internal classification performance observed using raw pulse waveforms (AUROC = 0.772, SD = 0.070). In the independent temporal validation cohort, the model achieved an AUROC of 0.738 ± 0.011 at the patient level. Grad-CAM visualization highlighted the diastolic runoff phase as a region of interest for classification. Four waveform-derived features related to arterial compliance were significantly associated with postoperative cerebrovascular events (p < 0.05).
In this study, CNN-based deep learning models demonstrated the feasibility of predicting postoperative cerebrovascular events from intraoperative ABP waveforms, with diastolic runoff dynamics emerging as a potentially relevant physiologic pattern. These findings are exploratory and require prospective multi-center validation before clinical application.
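The preprocessing chain named above (detrending, pulse segmentation, normalization, three consecutive pulses per sample) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which detrending, foot-detection, or normalization methods were used, so the linear detrend, the local-minimum foot detector, the min-max scaling, the function names, and the synthetic 100 Hz signal are all hypothetical choices made here.

```python
import numpy as np

def detrend_linear(x):
    """Subtract a least-squares straight line (assumed detrending method)."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

def find_pulse_feet(x, min_distance):
    """Indices of local minima at least `min_distance` samples apart.

    A deliberately simple stand-in for a real pulse-onset detector.
    """
    feet = []
    for i in range(1, len(x) - 1):
        if x[i] < x[i - 1] and x[i] <= x[i + 1]:
            if not feet or i - feet[-1] >= min_distance:
                feet.append(i)
    return np.array(feet)

def three_pulse_windows(x, feet):
    """Min-max normalised windows, each spanning three consecutive pulses."""
    windows = []
    for k in range(len(feet) - 3):
        seg = x[feet[k]:feet[k + 3]]
        span = seg.max() - seg.min()
        if span > 0:
            windows.append((seg - seg.min()) / span)
        else:
            windows.append(np.zeros_like(seg))
    return windows

# Synthetic stand-in for an ABP trace: 10 s at 100 Hz, 90 beats/min,
# with a slow upward drift that the detrending step should remove.
fs = 100
t = np.arange(10 * fs) / fs
abp = 80 + 20 * np.sin(2 * np.pi * 1.5 * t) + 0.5 * t

clean = detrend_linear(abp)
feet = find_pulse_feet(clean, min_distance=40)
windows = three_pulse_windows(clean, feet)
print(len(feet), "pulse feet,", len(windows), "three-pulse windows")
```

Each window would then be rendered as an image before being passed to a CNN; that rendering step is omitted here because the abstract does not describe it.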
Acute Lymphoblastic Leukemia (ALL) is one of the most aggressive hematological malignancies, and its early diagnosis remains challenging due to non-specific clinical symptoms and reliance on invasive procedures such as bone marrow biopsies. To address these limitations, we propose Meta-Conformer-XAI, a novel meta-learned hybrid deep learning framework for non-invasive ALL detection using microscopic peripheral blood smear images. Unlike conventional CNN-Transformer pipelines, our approach integrates three key innovations: (1) a Dual Attention Feature Fusion (DAFF) block that adaptively combines local morphological features extracted by a CNN with global contextual dependencies captured by a Vision Transformer (ViT); (2) a Meta-Learning Path Controller, which dynamically optimizes information flow between convolutional and transformer pathways for improved generalization across heterogeneous datasets; and (3) a Reinforcement Learning-based Confidence Estimator, intended to provide robust decision reliability in clinical settings. We validated the framework on two benchmark datasets, the ALL Image Dataset and the C-NMC Leukemia Dataset, using both fixed train/validation/test splits and 5-fold cross-validation. To mitigate class imbalance, a class-aware augmentation strategy was employed, significantly improving minority-class recognition. Meta-Conformer-XAI achieved 0.9924 accuracy on the ALL dataset and 0.9636 accuracy on the C-NMC dataset, with AUC-ROC scores exceeding 0.99 on both, outperforming baseline CNNs, ViTs, and existing hybrid architectures. Furthermore, the framework incorporates a comprehensive explainability module combining Grad-CAM, SHAP, LIME, and Integrated Gradients, providing transparent insights into feature attribution and clinical relevance.
Overall, Meta-Conformer-XAI advances the state of the art in automated leukemia diagnosis by offering a precise, interpretable, and scalable tool that addresses current limitations of diagnostic invasiveness, model generalization, and clinical trustworthiness.
Accurate assessment of lymph node metastasis (LNM) following neoadjuvant chemoradiotherapy (nCRT) presents a significant clinical challenge and is essential for the management of locally advanced rectal cancer (LARC). In this multicenter study, we developed and externally tested a multimodal MRI-based framework that integrates clinical demographics, handcrafted radiomic signatures, and deep learning (DL)-derived features to predict post-nCRT lymph node status. This study enrolled 382 LARC patients who underwent surgery after nCRT at four centers. Post-nCRT T2-weighted (T2WI) and diffusion-weighted (DWI) MRI images were used to extract radiomic and DL-derived features of tumors. After feature harmonization and selection, a predictive model was constructed using a DL fusion network and a random forest algorithm. Model performance was evaluated across the training, validation, internal test, and external test cohorts using the area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA). Shapley Additive Explanation (SHAP) and gradient-weighted Class Activation Mapping (Grad-CAM) were used to enhance the model's interpretability. The combined model, which included clinical, radiomic, and DL-derived features, showed the best predictive performance, with an AUC of 0.771 in the external test dataset. This approach shows promise for noninvasively assessing treatment response and prognosis and for informing surgical management.
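The two evaluation tools named above, AUC and decision curve analysis, can be illustrated with a short sketch. It uses the standard definitions (AUC as the probability that a positive case outranks a negative one, and net benefit as TP/n - FP/n × pt/(1 - pt) at threshold probability pt); the function names and the four-patient toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def auc_rank(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    n = len(y_true)
    treated = y_prob >= threshold
    tp = np.sum(treated & (y_true == 1))
    fp = np.sum(treated & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy data for illustration only (1 = node-positive, 0 = node-negative).
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.1]

print("AUC:", auc_rank(y_true, y_prob))
print("Net benefit at pt = 0.5:", net_benefit(y_true, y_prob, 0.5))
# "Treat all" reference curve: every patient counts as predicted positive.
print("Treat-all at pt = 0.5:", net_benefit(y_true, np.ones(4), 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the treat-all and treat-none (net benefit 0) strategies.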
Snakebite envenoming is a significant global health crisis that has long been neglected as a health priority. It is a major problem for rural communities in low- and middle-income countries, and India accounts for the largest proportion of snakebite deaths globally. Timely identification of venomous snakebite and its syndromic pattern is essential for effective administration of antivenom and supportive treatment. Expert identification of snake species and syndromes is not always available in peripheral healthcare settings, which leads to delays, unnecessary referrals, or improper treatment choices. Additionally, diverse snake species distribution and venom variations across regions pose challenges. AI-powered image classification methods can help overcome these barriers. We propose a clinically oriented deep learning pipeline for binary classification of venomous and non-venomous snake species of India using real-world imagery data. This pipeline would serve as a baseline step towards aiding snakebite management at resource-scarce peripheral healthcare facilities. The selected dataset consisted of 20 medically important Indian species. MobileViT-S, ConvNeXt-Tiny, EfficientNet-V2-S, and ResNeXt-50 (32 × 4d) were trained under the same conditions for comparison. Model interpretability was evaluated using Grad-CAM++ to check that classification was based not on the background but on features such as head shape and body stripes. For reliable deployment, we connected the model to a web interface with human-in-the-loop expert verification, in which experts can confirm or override predictions in real time. Among the evaluated architectures, ResNeXt-50 (32 × 4d) showed the most reliable and consistent performance in classifying venomous and non-venomous snakes. It achieved the highest test accuracy, sensitivity, specificity, and F1-score. The model also had strong discriminative ability, with a ROC-AUC of 0.9950 and PR-AUC of 0.9959.
These results indicate dependable performance in safety-critical screening situations. Grad-CAM++ visualizations confirmed that predictions were based on anatomically relevant features, especially in the head and body contour areas. This supports model interpretability and reduces background bias. Although the dataset size and single-institution source limit how widely the results can be applied, the proposed framework shows that it is possible to create a clinically oriented, ready-to-use deep learning system for snakebite triage support. This system is intended as a scalable tool to help rural healthcare workers, emergency responders, and telemedicine platforms in areas where snakebites are common.
Accurate dental age estimation is a vital component of forensic identification, orthodontic treatment planning, and medico-legal assessments. Conventional approaches, such as Demirjian's and Cameriere's methods, are limited by observer variability and population dependency. Recent advances in artificial intelligence (AI) and deep learning (DL) offer potential for objective, reproducible, and scalable assessment through panoramic radiology. This study aimed to develop and validate a convolutional neural network (CNN)-based model for dental age estimation among adolescents using panoramic radiographs and to compare its accuracy with traditional manual methods. A total of 296 panoramic dental radiographs from adolescents aged 9-19 years were analyzed, with balanced representation across age groups and comparable distribution between male and female participants. Images were divided into training (70%, n = 207), validation (15%, n = 44), and testing (15%, n = 45) subsets. All images were preprocessed, normalized, and augmented prior to model development. A CNN-based regression model was trained using mean squared error as the loss function and evaluated using mean absolute error (MAE), root mean square error (RMSE), and Pearson correlation coefficient (r). Grad-CAM visualization was employed to enhance interpretability of model predictions. The proposed CNN model achieved an MAE of 1.12 years, RMSE of 1.37 years, and a strong correlation (r = 0.96) between predicted and chronological age, outperforming traditional Demirjian and Cameriere methods. Activation maps confirmed that predictions were based on biologically relevant anatomical regions, including developing third molars and mandibular growth zones. AI-driven panoramic radiographic analysis provides a robust, interpretable, and ethically compliant framework for adolescent dental age estimation. 
These findings highlight its translational potential in forensic odontology, pediatric dentistry, and legal age verification, supporting a shift toward standardized, data-driven dental diagnostics.
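The three regression metrics reported for the age-estimation model (MAE, RMSE, and Pearson r) follow standard definitions and can be computed as in the sketch below. The function name and the five example ages are made up for illustration and do not come from the study.

```python
import numpy as np

def age_metrics(true_age, pred_age):
    """Return (MAE, RMSE, Pearson r) for predicted vs. chronological age."""
    true_age = np.asarray(true_age, dtype=float)
    pred_age = np.asarray(pred_age, dtype=float)
    err = pred_age - true_age
    mae = np.mean(np.abs(err))            # mean absolute error, in years
    rmse = np.sqrt(np.mean(err ** 2))     # root mean square error, in years
    r = np.corrcoef(true_age, pred_age)[0, 1]  # Pearson correlation
    return mae, rmse, r

# Illustrative values only (years).
true_age = [10, 12, 14, 16, 18]
pred_age = [11, 11, 15, 15, 19]

mae, rmse, r = age_metrics(true_age, pred_age)
print(f"MAE = {mae:.2f} y, RMSE = {rmse:.2f} y, r = {r:.3f}")
```

RMSE is never smaller than MAE, and the gap between them widens when a few predictions have large errors, which is why the two are usually reported together.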