Secure transmission and storage of medical images is critical: patient information is sensitive, and the trend toward digital healthcare, telemedicine, and cloud-based medical systems is growing rapidly. To address these heightened security and privacy challenges, a novel three-layer medical image encryption algorithm is proposed that provides a high level of security together with lossless reconstruction for accurate diagnosis. The algorithm decomposes the medical image by bit-planes into three semantic layers containing structural, detail, and fine information, so that each layer can be protected according to the sensitivity of the information it contains. A separate encryption process is applied to each semantic layer, each using a different cryptographic key derived from a single master key through an SHA-256-based key derivation process; this eliminates key exchange overhead and improves key sensitivity. The key-dependent substitution, block-level diffusion, keystream encryption, and pixel chaining operations used to encrypt the layers provide confusion and diffusion while remaining computationally efficient. The encrypted layers are then combined and processed with a global plaintext-dependent diffusion operation, providing resistance to statistical and differential attacks. Experimental evaluation across several imaging modalities demonstrates strong performance: an average entropy of 7.9998, indicating high randomness and uniform pixel distribution; near-zero correlation, confirming effective removal of spatial relationships between adjacent pixels; an average Number of Pixel Change Rate (NPCR) of 99.61% and Unified Average Changing Intensity (UACI) of 33.47%, showing that modifying a single pixel in the original image substantially changes the cipher image; and an infinite Peak Signal-to-Noise Ratio (PSNR), confirming perfect lossless reconstruction.
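A minimal sketch of the two ingredients named above, assuming illustrative layer boundaries (bits 7-5 / 4-2 / 1-0) that the paper may define differently: bit-plane decomposition of an 8-bit image into three semantic layers, and per-layer key derivation from one master key via SHA-256.

```python
# Illustrative sketch, not the authors' implementation.
import hashlib
import numpy as np

def split_semantic_layers(img: np.ndarray):
    """Split an 8-bit grayscale image into (structural, detail, fine) layers."""
    structural = img & 0b11100000   # bits 7-5: coarse structure
    detail     = img & 0b00011100   # bits 4-2: mid-level detail
    fine       = img & 0b00000011   # bits 1-0: fine, noise-like content
    return structural, detail, fine

def derive_layer_keys(master_key: bytes, n_layers: int = 3):
    """Derive one 256-bit key per layer from a single master key with SHA-256."""
    return [hashlib.sha256(master_key + bytes([i])).digest()
            for i in range(n_layers)]

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
layers = split_semantic_layers(img)
keys = derive_layer_keys(b"master-secret")
# The split is lossless: OR-ing the layers recovers the original exactly,
# which is what makes infinite PSNR after decryption possible.
assert ((layers[0] | layers[1] | layers[2]) == img).all()
```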
Deep learning models for medical image analysis often rely on large-scale parameterization, which may limit their practical use in resource-constrained settings. This study aims to design a structurally compact multi-source framework capable of delivering competitive diagnostic performance with reduced computational overhead. We propose ML-ConvNet, a lightweight architecture comprising approximately 4.2 K parameters and 924 M FLOPs at 512×512 input resolution. The network incorporates Multi-Branch Re-parameterized Convolutions for scale-aware feature extraction, Hierarchical Dual-Path Attention for feature localization, Feature Self-Transformation for cross-feature interaction, and a Local Variance Weighted optimization strategy to address class imbalance. The framework is evaluated independently on three publicly available benchmark datasets representing heterogeneous imaging modalities: brain MRI, lung CT, and chest X-ray. Ablation studies, precision-recall analysis, cross-modality validation, and computational benchmarking are conducted to assess performance, stability, and efficiency under controlled experimental conditions. Within the evaluated settings, results indicate competitive diagnostic accuracy relative to established lightweight baselines, including EfficientNet and MobileNet variants, while substantially reducing parameter count. Class-wise F1-scores and PR-AUC values suggest relatively stable minority-class performance under repeated cross-validation sampling. Attention visualizations show activations concentrated over regions broadly associated with pathological findings, though these observations are qualitative in nature. Inference latency measurements on CPU and mobile hardware suggest feasibility for low-latency deployment under the tested single-image batch configurations, though real-world throughput may differ depending on hardware and operational conditions. These findings suggest that careful architectural design and domain-informed inductive biases may support competitive medical image classification on public benchmark datasets without extensive parameter scaling. The framework was evaluated exclusively under controlled conditions on publicly available data, and multi-institutional external validation is required before conclusions regarding generalizability or clinical applicability can be drawn.
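The paper's Multi-Branch Re-parameterized Convolution is not specified here, so the sketch below shows the generic re-parameterization idea in the RepVGG style: train with parallel 3×3, 1×1, and identity branches, then fuse them into a single 3×3 convolution for inference, keeping the deployed parameter and FLOP budget small.

```python
# Generic re-parameterized convolution sketch (assumed RepVGG-style branches).
import torch
import torch.nn as nn

class RepConvBlock(nn.Module):
    """Multi-branch at training time; fusable into one 3x3 conv at inference."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv3 = nn.Conv2d(ch, ch, 3, padding=1, bias=True)
        self.conv1 = nn.Conv2d(ch, ch, 1, bias=True)

    def forward(self, x):
        return torch.relu(self.conv3(x) + self.conv1(x) + x)  # + identity branch

    @torch.no_grad()
    def fuse(self) -> nn.Conv2d:
        """Merge all branches into a single 3x3 convolution with the same output."""
        fused = nn.Conv2d(self.conv3.in_channels, self.conv3.out_channels,
                          3, padding=1, bias=True)
        w = self.conv3.weight.clone()
        # embed the 1x1 kernel at the center of the 3x3 kernel
        w[:, :, 1, 1] += self.conv1.weight[:, :, 0, 0]
        # the identity branch is a 3x3 kernel with 1 at the center of its own channel
        for c in range(w.shape[0]):
            w[c, c, 1, 1] += 1.0
        fused.weight.copy_(w)
        fused.bias.copy_(self.conv3.bias + self.conv1.bias)
        return fused

x = torch.randn(1, 8, 32, 32)
block = RepConvBlock(8)
assert torch.allclose(block(x), torch.relu(block.fuse()(x)), atol=1e-5)
```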
Medical imaging is indispensable for diagnosis, with abdominal imaging playing a pivotal role in generating medical reports and informing clinical decision-making. Recent work in artificial intelligence (AI), particularly multimodal approaches such as vision-language models, has demonstrated significant potential to enhance medical image analysis by seamlessly integrating visual and textual data. While 2D imaging has been the main focus of many studies, the enhanced spatial detail and volumetric consistency offered by 3D images, such as CT scans, remain relatively underexplored. This gap underscores the need for innovative approaches to unlock the potential of 3D imaging in clinical workflows. In this study, we used a multimodal AI pipeline, Phi3-V, to address 2 key challenges in abdominal imaging: generating clinically coherent medical reports from 3D CT images and performing visual question answering based on these images. Our optimized model attained an average GREEN score of 0.409 for medical report generation and an accuracy of 79% for multiple-choice visual question answering on the validation cases. These findings demonstrate the potential of multimodal AI to advance the analysis of 3D medical imaging, achieving improvements in both medical report generation and visual question answering and paving the way for more robust and efficient applications in healthcare.
Deep learning (DL) continues to advance cardiac image analysis with increasingly sophisticated methodologies. Although convolutional neural networks laid the foundation for DL, emerging methods, including graph neural networks, transformers, implicit neural representations, generative adversarial networks, and foundation models, enable enhanced anatomical and functional modeling, image generation, and multimodal integration. Graph neural networks enable non-Euclidean data representations that preserve anatomical structure; transformers improve sequence modeling in dynamic imaging; and implicit neural representations introduce continuous spatial representations for more accurate reconstructions. Generative adversarial networks enhance image generation, noise reduction, and cross-modality synthesis, while foundation models introduce a unified, generalizable framework capable of adapting across diverse imaging tasks. This review discusses these key DL innovations in cardiac imaging, their implications, and their challenges, as well as potential future directions for the field, such as clinical validation trials.
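To make the implicit-neural-representation idea mentioned above concrete, here is a generic coordinate-MLP sketch: a network maps continuous (x, y) coordinates to intensity, so a reconstruction can be queried at arbitrary resolution. This is an illustration of the general technique, not any specific cardiac method; the sinusoidal "image" target is a stand-in.

```python
# Generic implicit neural representation (coordinate MLP) sketch.
import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted intensity at (x, y)
        )

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        return self.net(xy)

# Fit to a signal: sample coordinates in [-1, 1]^2 and regress intensity.
model = CoordinateMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
coords = torch.rand(1024, 2) * 2 - 1                     # random query points
target = torch.sin(3 * coords).prod(-1, keepdim=True)    # stand-in "image"
for _ in range(200):
    opt.zero_grad()
    loss = ((model(coords) - target) ** 2).mean()
    loss.backward()
    opt.step()
```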
There is a growing need for user-friendly, bladder-specific image analysis tools that can produce reliable artificial intelligence (AI)-derived quantitative imaging biomarkers (QIBs) from multiparametric (mp)MRI data for clinical applications. To address this need, we developed an AI-powered BLADdEr multiparametric MRI Analysis for Clinical Application (AI-BLADE, current release v1.0) toolbox designed for extracting mpMRI-derived quantitative metrics. AI-BLADE is an advanced tool for bladder-specific mpMRI data analysis with 2 core functionalities: (1) Deep Feature Analysis (MRI-DFA toolkit) and (2) Data-Driven Model-Based Analysis (MRI-MBA toolkit). AI-BLADE offers customizable options and serves as a one-stop solution for bladder cancer (BCa) clinical applications. The models within DFA and MBA were tested separately on 2 patient cohorts. DFA was used to classify BCa histology subtypes (n = 104) from T2-weighted images, while MBA was used to interrogate tumour physiology by deriving mpMRI QIBs, including the apparent diffusion coefficient (ADC) and volume transfer constant (Ktrans), from 34 BCa patients. Of the 17 AI models tested, the VGG19 model with a decision tree classifier and no feature selection on the fully connected layer 7 features performed best, achieving an area under the receiver operating characteristic curve of 0.79 in classifying BCa histology subtypes. The mean ADC and Ktrans values were 1.22 × 10⁻³ mm²/s and 0.27 min⁻¹, respectively, reflecting underlying tumour physiology. AI-BLADE (v1.0), a flexible and user-friendly software toolbox for analysing mpMRI data, shows strong potential for application in BCa oncology, offering capabilities that can enhance diagnostic accuracy and support improved patient outcomes. This is the first study to design, develop, and implement a novel bladder-specific AI toolbox for analysing mpMRI data. AI-BLADE enables an advanced image analysis workflow, facilitating AI-QIB-based clinical decision-making for patients with BCa.
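A hedged sketch of the best-performing configuration as described above: VGG19 deep features from the fully connected layer 7 (fc7) fed to a decision tree classifier. Data loading, preprocessing, and array names are assumptions for illustration only.

```python
# Deep-feature pipeline sketch: VGG19 fc7 features + decision tree classifier.
import torch
import torchvision.models as models
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).eval()
# fc7 is the second linear layer of VGG19's classifier head (4096-d output)
fc7_extractor = torch.nn.Sequential(
    vgg.features, vgg.avgpool, torch.nn.Flatten(),
    *list(vgg.classifier.children())[:5],   # up to and including fc7 + ReLU
)

@torch.no_grad()
def extract_fc7(batch: torch.Tensor) -> torch.Tensor:
    """batch: (N, 3, 224, 224) normalized image slices -> (N, 4096) features."""
    return fc7_extractor(batch)

# Hypothetical arrays: X_train/X_test are image tensors, y_* subtype labels.
# feats_train = extract_fc7(X_train).numpy()
# clf = DecisionTreeClassifier(random_state=0).fit(feats_train, y_train)
# auc = roc_auc_score(y_test, clf.predict_proba(extract_fc7(X_test).numpy())[:, 1])
```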
Gastrointestinal cancers (GCs) are particularly dangerous because they tend to progress silently, with few specific early symptoms, until advanced stages. Their heterogeneous properties demand highly precise and sensitive diagnostic techniques that can integrate in-depth structural information with surface-level data to distinguish cancer severity grades, thereby significantly lowering mortality rates. Deep learning (DL) algorithms are crucial for classifying these GC grades. However, existing algorithms lack interpretability and suffer from high false-alarm rates when detecting the subtle relationships underlying medical images. Additionally, existing systems lack language-level transparency, preventing them from generating narrative diagnostic explanations for users that are consistent with medical standards. To address these challenges, this study introduces a novel explainable-LLM (X-LLM) based DL framework that overcomes the drawbacks of existing DL algorithms. The framework uses ensemble transformer architectures that combine clinical features by integrating endoscopy and computed tomography (CT) scan images to enhance the detection of different GC severity grades. The proposed system comprises several components: (1) heterogeneous image collection; (2) image pre-processing; (3) ensemble networks; (4) interpretability analysis; and (5) a user-interaction module. Extensive experiments are conducted on two datasets: Kvasir endoscopy images and TCIA (TCGA-STAD) CT scans. Severity annotation of both datasets was carried out by experienced medical doctors, including endoscopists. Several evaluation metrics, including accuracy, precision, and recall, are measured and benchmarked against other learning networks. The experimental findings demonstrate the enhanced performance of the proposed framework over existing models, achieving accuracy, precision, recall, and F1-score values of 0.99, 0.997, 0.99, and 0.99, respectively. Furthermore, different LLMs, such as GPT-4, GPT-3.5, LLaMA, and Gemini, are integrated, and their modes of interaction are analyzed with SHAP measurements. The framework demonstrates strong potential by enhancing diagnostic performance and supporting user-interactive clinical treatment decisions.
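A minimal sketch of the kind of SHAP-based interpretability analysis mentioned above, using `shap.GradientExplainer` on a PyTorch image classifier. The stand-in ResNet and random tensors are placeholders, not the paper's ensemble or data.

```python
# SHAP attribution sketch for an image classifier (placeholder model/data).
import shap
import torch
import torchvision.models as models

model = models.resnet18(weights=None, num_classes=4).eval()  # stand-in classifier
background = torch.randn(8, 3, 224, 224)     # reference images for attribution
test_images = torch.randn(2, 3, 224, 224)

explainer = shap.GradientExplainer(model, background)
shap_values = explainer.shap_values(test_images)  # per-class pixel attributions
# shap.image_plot can then visualize which regions drove each severity grade.
```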
This meta-analysis was conducted to systematically evaluate the accuracy of image-based deep learning models for aortic dissection segmentation and diagnosis, aiming to provide an evidence base for developing intelligent detection tools. A comprehensive search was performed across the Cochrane Library, PubMed, Embase, and Web of Science to identify studies on the effectiveness of deep learning in aortic dissection segmentation or diagnosis published up to November 3, 2024. Risk of bias was evaluated with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. A total of 48 studies on deep learning models for aortic dissection segmentation or diagnostic tasks were included: 28 on segmentation and 20 on diagnosis. For segmentation tasks, the mean Dice coefficient was 89.2% ± 4.4% for false lumen segmentation, 90.8% ± 3.4% for true lumen segmentation, and 91.7% ± 6.1% for entire aorta segmentation. For diagnostic tasks, computed tomography (CT)-based deep learning showed pooled sensitivity and specificity of 0.94 [95% confidence interval (CI): 0.89-0.96] and 0.92 (95% CI: 0.88-0.95), respectively. For electrocardiogram-based deep learning, the pooled sensitivity and specificity were 0.85 (95% CI: 0.79-0.89) and 0.90 (95% CI: 0.87-0.92), respectively. For computed tomography angiography (CTA)-based deep learning, the pooled sensitivity and specificity were 0.94 (95% CI: 0.90-0.96) and 0.95 (95% CI: 0.91-0.98), respectively. In studies that compared deep learning with clinicians, the pooled sensitivity and specificity of deep learning were 0.79 (95% CI: 0.65-0.89) and 0.95 (95% CI: 0.88-0.94), respectively. Image-based deep learning models demonstrated high accuracy for aortic dissection segmentation and diagnosis and performed comparably to or better than clinicians. These findings support their potential as clinical assistive tools. Future work should prioritize multicenter validation, seamless integration of these models into clinical workflows, and enhancement of model generalizability to facilitate broader clinical adoption. https://www.crd.york.ac.uk/PROSPERO/view/CRD42024619403, identifier CRD42024619403.
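The Dice coefficient pooled above is a straightforward overlap measure; a minimal computation between a predicted and a reference binary segmentation mask:

```python
# Dice = 2|A ∩ B| / (|A| + |B|), in [0, 1]; 1 means perfect overlap.
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 2.0 * np.logical_and(pred, ref).sum() / denom if denom else 1.0

pred = np.zeros((64, 64), bool); pred[10:40, 10:40] = True
ref  = np.zeros((64, 64), bool); ref[15:45, 15:45] = True
print(f"Dice: {dice(pred, ref):.3f}")   # two offset squares -> about 0.694
```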
With the rapid advancement of Vision-Language Models (VLMs), there is growing interest in adapting these models to the medical domain. However, the majority of VLMs treat medical tasks as straightforward question-answering problems, neglecting the need for step-by-step reasoning, which is essential for handling complex medical information and gaining clinical trust. To address these limitations, we develop a Multi-modal Medical Reasoning Model (MMRM), which augments VLMs with a structured Chain-of-Thought (CoT) reasoning mechanism to better simulate the real clinical diagnostic process. During development, we first propose an Ortho Enhanced Training Framework to optimize the critical visual encoder of the VLM. Second, we leverage a black-box knowledge distillation method to transfer medical CoT reasoning capabilities into the large language model, which serves as another key component of the VLM. Finally, we construct a unique multi-modal medical CoT dataset for training the entire VLM, allowing the model to explicitly learn diagnostic reasoning. Extensive experiments demonstrate that our proposed model achieves state-of-the-art performance on standard medical VQA benchmarks, including SLAKE, outperforming existing methods in both diagnostic accuracy and explanation quality. Critically, the model not only draws diagnostic conclusions but also presents an interpretable reasoning pathway, which strengthens clinical trustworthiness and ultimately paves the way for deploying AI-assisted healthcare solutions in real-world clinical settings.
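A hedged sketch of black-box knowledge distillation for CoT reasoning as the general technique works: query a closed teacher model for step-by-step rationales, then fine-tune a student on the resulting (question, rationale, answer) traces. `query_teacher` and the prompt format are placeholders, not the paper's exact recipe.

```python
# Black-box CoT distillation sketch: the teacher is text-in/text-out only.
def build_cot_dataset(cases, query_teacher):
    """cases: dicts with 'findings' and 'question'; returns training pairs."""
    dataset = []
    for case in cases:
        prompt = (f"Image findings: {case['findings']}\n"
                  f"Question: {case['question']}\n"
                  "Reason step by step, then give the final answer.")
        rationale = query_teacher(prompt)   # no access to teacher weights/logits
        dataset.append({"input": prompt, "target": rationale})
    return dataset

# The student LLM is then fine-tuned with a standard next-token loss on
# `target`, so it imitates the teacher's reasoning traces.
```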
Video-based or image-based human activity recognition (HAR) via machine learning algorithms helps track, detect, and categorize users' daily activities. It is usually formulated as specific research problems, such as fall detection, gait recognition, posture recognition, and gesture recognition. In practice, HAR is extended to the more generic group activity recognition (GAR) problem, focusing on multiple-user (group) activities for purposes such as the recognition of social activities, autonomous driving, surveillance, and pedestrian crossing management. In this research, the formulation of the GAR problem aims to enhance the performance of the GAR model by tackling research limitations in three aspects: (i) image quality may not always be guaranteed, resulting in biased and inaccurate models; (ii) the working environments of the GAR model, both indoor and outdoor, are dynamic, for example with varying light conditions; and (iii) the number of people in the covered area may vary and the number of activity class labels is large, leading to extensive computing time to train a complex GAR model architecture. Therefore, a dynamic-light image enhancement generative adversarial network is proposed to significantly enhance image quality, a multi-input image-enhanced generative adversarial network (MIIEGAN) is proposed to generate high-quality additional training images, and a guided asymmetric depthwise separable convolution (GA-DSC) is proposed to improve the trade-off between model complexity and performance. These algorithms were evaluated in dynamic light environments to investigate their robustness and performance under the varying light conditions typical of GAR in indoor and outdoor settings. Two benchmark datasets (the Volleyball Dataset and the Collective Dataset) were selected for performance evaluation and comparison. Five backbones (AlexNet, VGG-16, Inception-v3, ResNet-50, and EfficientNetV2) were chosen as deep learning models for analysis. Nine existing methods were compared, and five variants of GANs and five variants of convolutional techniques were compared in the ablation experiments. Through these experiments and ablation studies, the proposed work outperforms existing works (by 0.931-9.29% and 1.05-9.83% on the two datasets), five variants of GANs (by 2.71% and 2.62% on average), and five variants of convolutional techniques (by 1.91% and 1.75% on average).
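For reference, a standard depthwise separable convolution in PyTorch; the proposed GA-DSC adds guided and asymmetric components that are not reproduced here.

```python
# Standard depthwise separable convolution: depthwise 3x3 + pointwise 1x1.
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """One 3x3 filter per input channel, then a 1x1 conv to mix channels."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Parameter count: 3*3*in_ch + in_ch*out_ch, versus 3*3*in_ch*out_ch for a
# standard convolution -- roughly an 8-9x reduction for typical widths.
```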
To develop a semi-automated method to segment "black hole" lesions on post-gadolinium 2D T1-weighted images (GdT1) in multiple sclerosis (MS) that follows radiological intensity rules, and to perform multi-center validation. Multi-center spin-echo GdT1 images, accompanying proton-density (PD)/T2-weighted images, and manual T2 lesion masks from the REFLEXION study (NCT00813709) of suspected/early MS were used. Briefly, the proposed method segments cortical gray matter (GM) to derive a T1-weighted intensity threshold, which is applied inside co-registered T2 lesion masks to segment black hole lesion voxels. It was optimized on a training set (N = 40, 57.5% female, mean age 31.4 ± 8.7 (standard deviation) years), and 274 patients formed the test set (61.3% female, age 31.8 ± 8.4 years). Performance was quantified by the Dice similarity coefficient (DSC) and the intraclass correlation coefficient (ICC) for absolute agreement with manual segmentations. Lesion-wise sensitivity and specificity were calculated. Optimization resulted in: (1) GM selected as minimally 0.8 total WM plus GM partial volume, masked by the MNI cortex; (2) normalized mutual information-driven linear co-registration of T2 to GdT1 images, interpolating T2 lesion masks with trilinear interpolation and a 0.6 threshold; (3) the mean intensity inside the GM mask used as the upper intensity threshold. The optimized method had acceptable spatial accuracy (DSC: 0.39 ± 0.26) and good volumetric accuracy (ICC: 0.84, 95% CI [0.72, 0.90]). Lesion-wise sensitivity was 0.91 ± 0.19, and lesion-wise specificity was 0.62 ± 0.22. The proposed method to semi-automatically segment black holes from post-gadolinium T1-weighted images shows acceptable performance. As a potential aid to radiologists, the method is not recommended for use entirely without human intervention. Question: T1-hypointense "black hole" lesions reflect disease severity in multiple sclerosis but are not routinely quantified due to a lack of reliable analysis methods. Findings: A rule-based semi-automated method for GdT1 "black hole" lesion segmentation was developed, optimized, and validated in a large unseen multi-center test set. Clinical relevance: This method adds quantitative information about GdT1 "black hole" lesions to the radiological assessment of multiple sclerosis disease severity, provided false positives are manually removed. This can enhance the characterization of individual patients and advance the understanding of the disease.
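The optimized rule reduces to a single thresholding step; a minimal numpy sketch, assuming the GdT1 volume, cortical GM mask, and co-registered T2 lesion mask are already in the same space (array names are placeholders):

```python
# Step 3 of the pipeline above: the mean GdT1 intensity inside cortical GM
# is the upper threshold, applied only inside the co-registered T2 lesion mask.
import numpy as np

def segment_black_holes(gdt1: np.ndarray,
                        gm_mask: np.ndarray,
                        t2_lesion_mask: np.ndarray) -> np.ndarray:
    """Voxels inside T2 lesions darker than the mean cortical GM intensity."""
    threshold = gdt1[gm_mask > 0].mean()
    return (t2_lesion_mask > 0) & (gdt1 < threshold)
```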
To evaluate the feasibility of cerebral computed tomography angiography (CTA) obtained with reduced iodine and low radiation at 70 kVp, and the effect of deep learning-based augmented contrast enhancement (DL-ACE) and denoising (DL-DN) algorithms on CTA quality. In this prospective study, 47 healthy volunteers (male:female, 31:16; mean age ± standard deviation, 57.8 ± 10.9 years) were randomly assigned to one of three CTA protocols: Group A (n = 16; 100 kVp, 40 mL of 350 mgI/mL), Group B (n = 16; 70 kVp, 40 mL of 270 mgI/mL), and Group C (n = 15; 70 kVp, 28 mL of 270 mgI/mL [ultralow iodine]), with an injection rate of 2.5 mL/s for all. Images were reconstructed using filtered back projection (FBP), and images in Groups B and C were additionally reconstructed using DL-ACE and DL-DN. Arterial attenuation, image noise, contrast-to-noise ratio (CNR), and subjective image quality were compared among the five image sets. Compared with Group A, Groups B and C received 23.7% lower radiation doses. With FBP, arterial attenuation was significantly higher in Groups B (435.8 ± 50.2 Hounsfield units [HU]) and C (391.8 ± 52.1 HU) than in Group A (321.1 ± 47.4 HU) (P < 0.001), while CNR did not differ significantly (Group A, 19.9 ± 4.7; Group B, 20.3 ± 3.8; Group C, 18.4 ± 4.6) owing to higher image noise in Groups B and C. After applying DL-ACE and DL-DN in Groups B and C, arterial attenuation increased by 45.4% and image noise decreased by 34.5%, resulting in significantly higher arterial attenuation, CNR, and subjective image quality compared with Group A (P < 0.001). Cerebral CTA at 70 kVp with ultralow iodine enhanced arterial attenuation but increased image noise compared with the 100-kVp CTA protocol. DL-ACE and DL-DN significantly increased arterial attenuation and reduced image noise, resulting in higher CNR and better subjective image quality.
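CNR as typically computed in such studies is (arterial attenuation minus background attenuation) divided by image noise, the SD in a homogeneous ROI. The worked example uses Group B's reported FBP arterial attenuation together with assumed background and noise values (the abstract does not report them), purely for illustration:

```python
# CNR sketch: contrast over noise from ROI statistics (HU).
def cnr(roi_mean_hu: float, background_mean_hu: float, noise_sd_hu: float) -> float:
    return (roi_mean_hu - background_mean_hu) / noise_sd_hu

# Arterial 435.8 HU (Group B, FBP); assumed parenchyma ~35 HU, noise ~20 HU.
print(cnr(435.8, 35.0, 20.0))  # ~20, in the range of the CNR values above
```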
Breast cancer surgery and the corresponding treatments have significant residual effects on survivors of breast cancer in China. Body image distress and stigma are persistent challenges that negatively affect their quality of life, yet accessible, sustainable, and cost-effective support remains scarce. This study aimed to evaluate the effectiveness and cost-effectiveness of an app-based mindfulness breast care (MBC) program in addressing body image distress and stigma for survivors of breast cancer. We carried out a randomized controlled trial in 2 university-affiliated hospitals in China. Survivors of breast cancer who had completed primary treatments and had mobile phone internet access were recruited and randomly assigned at a 1:1 ratio to the intervention (3-month MBC program plus routine care) or the control group (routine care alone). The MBC program was developed under the framework of mindfulness-based cognitive therapy and comprised three modules: (1) Library, (2) Mindfulness Yoga, and (3) Mindfulness Practices. The primary outcomes were body image distress and stigma; secondary outcomes included sleep quality, social support, and quality of life (physical and mental well-being). Assessments were conducted at baseline, 3 months (T1), and 6 months (T2). Multiple imputation was used to handle missing data, and generalized estimating equations were fitted to evaluate effectiveness. The incremental cost per quality-adjusted life year (QALY) gained was used to measure cost-effectiveness. A total of 192 survivors of breast cancer participated in the baseline assessment, with 155 completing the 2 follow-up surveys. The median total usage duration was 199.60 minutes (IQR 70.90-451.31; mean 360.59, SD 511.72), and the median total login frequency was 39.50 times (IQR 19.00-86.50; mean 57.02, SD 50.26). The reduction in body image distress at T2 (adjusted mean difference -1.91; 95% CI -3.40 to -0.42; P=.01; d=-0.31), the reduction in stigma at T1 (adjusted mean difference -5.83; 95% CI -8.46 to -3.20; P<.001; d=-0.61) and T2 (adjusted mean difference -7.79; 95% CI -10.62 to -4.97; P<.001; d=-0.82), and the improvement in mental well-being at T1 (adjusted mean difference 4.44; 95% CI 1.70 to 7.18; P=.002; d=0.43) were statistically significantly greater in the intervention group than in the control group. No statistically significant group differences were observed for sleep quality, social support, or physical well-being. The cost-effectiveness analysis showed that the intervention group gained more QALYs than the control group at T2 (adjusted mean difference 0.008; 95% CI 0.004 to 0.016; P=.01). The incremental cost per QALY gained at T2 was US $19,431.25, indicating a 57% probability that the MBC program is cost-effective at a threshold of US $37,530, three times the 2023 gross domestic product per capita of China. The app-based MBC program was effective, potentially cost-effective, and shows promise for scaling to clinical practice. Chinese Clinical Trial Registry ChiCTR2200059952; https://www.chictr.org.cn/showproj.html?proj=167247.
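The incremental cost per QALY used above is simply incremental cost over incremental QALYs. Back-calculating from the reported figures (ICER US $19,431.25 per QALY, incremental QALY gain 0.008) implies an incremental cost of roughly US $155 per participant; this derivation is illustrative only.

```python
# ICER = delta cost / delta QALYs; compared against a willingness-to-pay threshold.
def icer(delta_cost_usd: float, delta_qaly: float) -> float:
    return delta_cost_usd / delta_qaly

print(icer(155.45, 0.008))          # 19431.25 USD per QALY, as reported
print(icer(155.45, 0.008) < 37530)  # True: under 3x China's 2023 GDP per capita
```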
Early detection of right ventricular (RV) dysfunction is essential in pulmonary arterial hypertension (PAH) but remains challenging with conventional echocardiography. This study investigates the feasibility of a noninvasive, physics-based framework using three-dimensional (3D) echocardiography that integrates myocardial strain and volumetric flow analysis to characterize RV mechanical performance across stages of PAH. A prospective pilot study (N = 15) enrolled healthy controls, PAH patients with preserved RV size, and PAH patients with RV dysfunction. Deformation was evaluated by principal strain analysis and by conventional (longitudinal, circumferential) components. Hemodynamic metrics included hemodynamic forces and energetic properties, derived using a physics-informed volumetric echocardiographic particle image velocimetry (V-Echo-PIV) method applied to contrast-enhanced acquisitions. Deformation analysis revealed that longitudinal strain was significantly reduced even in PAH patients with preserved RV dimensions, while the second principal (secondary) strain showed a distinctive sign reversal early in the disease, indicating paradoxical systolic lengthening. The analysis of hemodynamic forces showed a marked reduction in systolic propulsion across all PAH stages. In contrast, energetic abnormalities were predominantly observed at later stages of the disease. The integration of 3D myocardial strain with fluid dynamics provides a comprehensive physiological assessment of RV remodeling. While strain and systolic propulsion appear to be sensitive markers of early dysfunction, diastolic energetics may support disease staging. This noninvasive framework shows promise for early detection and longitudinal monitoring of PAH patients.
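Principal strain analysis, in a nutshell, diagonalizes the symmetric strain tensor: the eigenvalues are the principal strains and the eigenvectors their directions. A 2D numpy sketch with an arbitrary illustrative tensor, not patient data:

```python
# Principal strains via eigendecomposition of a symmetric strain tensor.
import numpy as np

E = np.array([[-0.18, 0.05],    # illustrative 2D strain tensor: negative
              [ 0.05, 0.02]])   # diagonal = shortening, positive = lengthening

eigvals, eigvecs = np.linalg.eigh(E)               # eigenvalues sorted ascending
first_principal, second_principal = eigvals[0], eigvals[1]
# A sign reversal of the second principal strain (e.g., turning positive in
# systole) is the "paradoxical lengthening" pattern discussed above.
print(first_principal, second_principal)           # ~-0.192, ~+0.032
```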
Medical images acquired using different scanners and protocols can differ substantially in their appearance. This phenomenon, scanner domain shift, can degrade the performance of deep neural networks that are trained on data acquired by one scanner and tested on another. This significant practical issue is well acknowledged; however, no systematic study of it is available across different modalities and diagnostic tasks. In this paper, we present a broad experimental study evaluating the impact of scanner domain shift on convolutional neural network performance for different automated diagnostic tasks. We evaluate this phenomenon in common radiological modalities, including X-ray, CT, and MRI. We find that network performance on data from a different scanner is almost always worse than on same-scanner data, and we quantify the degree of performance drop across different datasets. Notably, we find that this drop is most severe for MRI and X-ray yet small for CT on average, which we attribute to the standardized nature of CT acquisition systems, a standardization not present in MRI or X-ray. We also show that injecting varying amounts of target-domain data into the training set, as well as adding noise to the training data, helps only partially with generalization, indicating a need for more powerful domain adaptation methods. Our results provide extensive experimental evidence and quantification of the extent of the performance drop caused by scanner domain shift in deep learning across different modalities, with the goal of guiding the future development of robust deep learning models for medical image analysis.
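A sketch of the target-domain injection experiment described above: train on all source-scanner data plus a varying fraction of target-scanner data, then test on held-out target data to trace the generalization gap. Dataset objects and fractions are placeholders.

```python
# Target-domain injection sketch for a scanner domain shift experiment.
import random

def mixed_training_set(source_data, target_data, target_fraction: float):
    """Combine all source samples with a random fraction of target samples."""
    n_target = int(target_fraction * len(target_data))
    return list(source_data) + random.sample(list(target_data), n_target)

# for frac in (0.0, 0.1, 0.25, 0.5):
#     train = mixed_training_set(src, tgt_train, frac)
#     ...train the CNN, then evaluate on tgt_test to measure the remaining gap
```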
The distal-to-proximal pressure ratio (dpPR) has emerged as a superior indicator compared to the diameter stenosis rate (DSR) for assessing the functional severity of carotid artery stenosis (CAS). However, unlike DSR, dpPR cannot be directly determined by vascular imaging. In this study, we developed a hemodynamic modeling method to predict dpPR based on medical images available in clinical settings. A multiscale modeling method was employed to integrate a three-dimensional (3D) hemodynamic model of CAS into a lumped-parameter model of systemic hemodynamics, while incorporating patient-specific geometric information of large cerebral arteries derived from computed tomography angiography (CTA) images. The 3D modeling method was validated through in vitro fluid dynamics experiments, while the accuracy of the resulting multiscale model in predicting dpPR was evaluated by comparing model predictions with invasive pressure wire measurements. The model-predicted dpPR values for 27 carotid artery stenoses demonstrated strong agreement with invasive measurements, with a mean relative error of -0.8% and a standard deviation of 2.5%. dpPR was only moderately correlated with DSR (r = -0.55, p = 0.003). Further analysis revealed that the anatomical structure of the circle of Willis (CoW) is a major factor influencing the relationship between dpPR and DSR. Constructing a multiscale model based on CTA images provides a practical approach for assessing the hemodynamic impact of CAS. The significant influence of CoW's anatomical structure on the relationship between dpPR and DSR underscores the importance of considering systemic cerebral hemodynamics when evaluating the functional severity of CAS.
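The lumped-parameter side of such multiscale models, reduced to its simplest form, is a Windkessel circuit. A minimal two-element Windkessel sketch, C dP/dt = Q(t) − P/R integrated with forward Euler; the parameter values are generic illustrations, not the paper's calibrated model:

```python
# Two-element Windkessel ODE integrated with forward Euler (generic values).
import numpy as np

R, C = 1.0, 1.5                       # peripheral resistance, compliance
dt, T = 1e-3, 5.0                     # time step and duration [s]
t = np.arange(0.0, T, dt)
Q = np.maximum(np.sin(2 * np.pi * t), 0.0) * 5.0   # pulsatile inflow

P = np.empty_like(t)
P[0] = 80.0                           # initial pressure
for i in range(1, len(t)):
    dPdt = (Q[i - 1] - P[i - 1] / R) / C   # C dP/dt = Q - P/R
    P[i] = P[i - 1] + dt * dPdt            # forward Euler step
```

In the full multiscale model, a 3D CFD domain reconstructed from CTA replaces one segment of such a circuit, with the lumped network supplying its boundary conditions.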
Integration of Artificial Intelligence (AI), particularly deep learning, into medical imaging represents a profound shift in diagnostic medicine, moving from purely descriptive analysis to advanced predictive and prescriptive analytics. This Collection explores the rapid advancement of AI-driven tools across specific fields such as oncology, cardiology, and ophthalmology, highlighting their potential to improve diagnostic accuracy, workflow efficiency, and personalized treatment planning. However, significant challenges remain, including the heterogeneity of medical image data, the "black box" nature of some intelligent models, and the critical hurdles of clinical integration and validation. The research presented here addresses these frontiers, showcasing innovations in algorithm development, explainable AI, and translational application. This Editorial synthesizes the contributions and outlines the essential collaborative pathway, uniting computer scientists, clinicians, and regulatory bodies, that is required to translate algorithmic promise into robust, trustworthy, and equitable clinical tools that genuinely improve patient care.
The goal of this study is to empirically evaluate the Decoupled Momentum Optimizer (DeMo) in medical image segmentation while demonstrating its extensibility to applications outside LLMs. We aim to characterize the behavior of each parameter group and their adherence to the conjectures underlying DeMo's function. DeMo leverages spatial redundancy in gradients through a spatially partitioned frequency-decomposition compression algorithm, reducing network traffic and smoothing gradient noise. DeMo provides up to a 150x traffic reduction and a 1.6x wall-time speedup on lung segmentation of COPDGene CTs. Analysis of the gradients supports the conjectures that the primary components of the gradient exhibit higher spatial autocorrelation and lower temporal variance. We find that these conjectures are not uniformly true across all parameters but rather hold predominantly in a small subset of them. We also introduce DeMoDropout, a modification to the algorithm that selectively compresses only the largest gradients to significantly reduce computational overhead while maintaining effective overall compression. Using the Beyond the Cranial Vault dataset, we demonstrate potential speed-ups at bandwidths of 1 Gb/s and 100 Mb/s (1.6x vs 1.5x and 6.15x vs 6.31x for DeMoDropout and DeMo, respectively).
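A simplified 1D sketch of the compression idea described above, not the reference DeMo code: partition the gradient into chunks, transform each with a DCT, and transmit only the top-k frequency components per chunk, leaving the residual to local momentum.

```python
# Spatially partitioned frequency-decomposition gradient compression sketch.
import numpy as np
from scipy.fft import dct, idct

def compress_gradient(grad: np.ndarray, chunk: int = 64, k: int = 4):
    """Return (indices, values) of the top-k DCT coefficients per chunk."""
    g = grad.reshape(-1, chunk)                        # spatial partition
    coeffs = dct(g, axis=1, norm="ortho")
    idx = np.argsort(np.abs(coeffs), axis=1)[:, -k:]   # top-k per chunk
    vals = np.take_along_axis(coeffs, idx, axis=1)
    return idx, vals                                   # ~chunk/k traffic reduction

def decompress_gradient(idx, vals, chunk: int = 64):
    coeffs = np.zeros((idx.shape[0], chunk))
    np.put_along_axis(coeffs, idx, vals, axis=1)
    return idct(coeffs, axis=1, norm="ortho").ravel()

g = np.random.randn(4096)                 # length must be divisible by chunk
approx = decompress_gradient(*compress_gradient(g))
```

DeMoDropout, as described, would apply this compression only to the largest gradients, skipping the transform for the rest.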
Accurate identification of irreducible intussusception during air enema is crucial for optimizing enema strategies. Current methods are limited by subjective interpretation and inconsistent clinical criteria. We developed a deep learning (DL) framework to objectively predict irreducibility from air enema fluoroscopic images. In this retrospective study, a hybrid ensemble DL model was developed using fluoroscopic images acquired during air enema, comprising 770 irreducible and 1214 reducible cases. Model performance was evaluated on a real-world test set (46 irreducible vs. 802 reducible cases) and an external test set (9 irreducible vs. 101 reducible cases), with benchmarking against state-of-the-art techniques. The model's performance was further compared with radiologists' interpretations, and its ability to improve their diagnostic accuracy was assessed. Performance was evaluated using receiver operating characteristic (ROC) analysis and confusion matrix-derived metrics. The proposed model achieved areas under the ROC curves (AUCs) of 0.89 (95% CI: 0.836-0.944) and 0.883 (95% CI: 0.78-0.968) on the real-world and external test sets, respectively, outperforming comparative methods (AUC ranges: 0.823-0.877 and 0.634-0.826). The model demonstrated superior performance compared with the intermediate radiologist (AUC: 0.89 vs. 0.804; P < 0.001) and performance comparable to that of a senior radiologist (AUC: 0.89 vs. 0.842; P = 0.108). When used as an assistive tool, the model significantly improved radiologists' diagnostic performance (all P < 0.01), with AUC improvements of 0.072-0.095, balanced accuracy gains of 8.6-11.7%, and specificity increases of 18.7-22.6%. The proposed model demonstrated promising diagnostic performance in identifying irreducible intussusception and may serve as an effective decision-support tool to improve radiologists' diagnostic accuracy during air enema procedures.
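The evaluation metrics used above, computed with scikit-learn on placeholder predictions: ROC AUC from predicted probabilities, plus balanced accuracy and specificity from the thresholded confusion matrix.

```python
# ROC AUC, balanced accuracy, and specificity on toy predictions.
import numpy as np
from sklearn.metrics import roc_auc_score, balanced_accuracy_score, confusion_matrix

y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0])                  # 1 = irreducible
y_prob = np.array([0.1, 0.3, 0.2, 0.8, 0.6, 0.65, 0.7, 0.2])  # model probabilities

auc = roc_auc_score(y_true, y_prob)
y_pred = (y_prob >= 0.5).astype(int)                          # threshold at 0.5
bal_acc = balanced_accuracy_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)
print(f"AUC={auc:.3f}, balanced acc={bal_acc:.3f}, specificity={specificity:.3f}")
```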
Quantitative and qualitative histopathological assessments may yield complementary and novel insights into myocardial tissue alterations in dogs with myxomatous mitral valve disease (MMVD) and dilated cardiomyopathy (DCM). This study aimed to compare myocardial histopathological features in multiple, region-matched cardiac specimens from dogs with MMVD, DCM, and cardiac-healthy controls using quantitative and qualitative histopathological methods. Dogs with MMVD (n=27), DCM (n=16), and cardiac-healthy controls (n=32) were enrolled. Tissue proportions of cardiomyocytes, fibrosis, and fat, as well as the arterial lumen-to-area ratio (LAR), were quantitatively evaluated in each dog using digital image analysis software, whereas the presence of endocardial thickening, fibrosis, fat, and attenuated wavy fibers was qualitatively assessed. Clinical MMVD dogs had higher proportions of fibrosis in the left ventricular (LV) lateral wall, the LV posterior papillary muscle, and the left atrium (LA) compared to controls, and more MMVD dogs had thickened endocardium in the LA compared to controls (all P<0.05). Clinical DCM dogs had higher proportions of fibrosis in both atria compared to controls (P<0.05), and higher fibrosis in the interventricular septum compared to clinical MMVD dogs (P<0.05). Dogs with DCM and atrial fibrillation had higher proportions of atrial fibrosis compared to DCM dogs without atrial fibrillation (P<0.05). Neither the LAR nor the presence of attenuated wavy fibers differed between the groups in any region. In conclusion, myocardial histopathological alterations were predominantly left-sided in MMVD dogs and bilateral in DCM dogs. Histopathological characteristics in certain cardiac regions varied with disease type and severity, underscoring their potential role in disease pathogenesis.
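The quantitative readouts above amount to class-wise area fractions over a labeled segmentation of each section. A minimal numpy sketch with placeholder label codes (0 background, 1 cardiomyocyte, 2 fibrosis, 3 fat); the actual software and labeling scheme are not specified in the abstract.

```python
# Tissue-proportion sketch: area fraction per tissue class, background excluded.
import numpy as np

def tissue_proportions(label_map: np.ndarray) -> dict:
    tissue = label_map[label_map > 0]                 # drop background pixels
    names = {1: "cardiomyocytes", 2: "fibrosis", 3: "fat"}
    return {name: float((tissue == code).sum()) / tissue.size
            for code, name in names.items()}
```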
To systematically evaluate the effectiveness of AI-assisted teaching in computed tomography angiography (CTA) training, focusing on theoretical knowledge, practical skills, image judgment time, and teaching satisfaction. Eight databases were searched for randomized controlled trials (RCTs) published from 1990 to 2025. Two reviewers independently screened studies, extracted data, and assessed risk of bias using the Cochrane RoB 2.0 tool. Frequentist and Bayesian meta-analyses were performed, along with subgroup analyses by CTA anatomical type, learner group, and sample size. Seven RCTs with 515 participants were included. All studies were conducted in China with Chinese-language AI interaction. Frequentist analyses showed that AI significantly improved practical skills and reduced image judgment time. Benefits were also observed for theoretical scores and teaching satisfaction; however, Bayesian analyses revealed statistical uncertainty for these two outcomes due to high heterogeneity. Subgroup analyses identified CTA type, learner background, and sample size as key sources of heterogeneity. All included studies were rated as having a high risk of bias or some concerns. AI-assisted teaching effectively improves practical skills and shortens image judgment time in CTA training, offering clear educational benefits for radiology trainees. However, evidence for theoretical scores and teaching satisfaction remains uncertain owing to high heterogeneity and methodological limitations. The findings support targeted integration of AI into CTA curricula while highlighting the need for large-scale, standardized RCTs to confirm long-term efficacy and generalizability.
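For reference, the frequentist pooling in such meta-analyses most commonly follows the DerSimonian-Laird random-effects model; a minimal implementation with illustrative inputs, not the seven included RCTs:

```python
# DerSimonian-Laird random-effects pooling of per-study effect sizes.
import numpy as np

def dersimonian_laird(effects: np.ndarray, variances: np.ndarray):
    """Return (pooled effect, pooled variance, tau^2) under random effects."""
    w = 1.0 / variances                          # fixed-effect weights
    fixed = (w * effects).sum() / w.sum()
    q = (w * (effects - fixed) ** 2).sum()       # Cochran's Q statistic
    df = len(effects) - 1
    tau2 = max(0.0, (q - df) / (w.sum() - (w ** 2).sum() / w.sum()))
    w_star = 1.0 / (variances + tau2)            # random-effects weights
    pooled = (w_star * effects).sum() / w_star.sum()
    return pooled, 1.0 / w_star.sum(), tau2

effects = np.array([0.8, 1.1, 0.5, 0.9])         # e.g., standardized mean diffs
variances = np.array([0.04, 0.06, 0.05, 0.03])
print(dersimonian_laird(effects, variances))
```

A large tau² here signals exactly the kind of between-study heterogeneity that made the theoretical-score and satisfaction estimates uncertain in the Bayesian analyses.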