In this Viewpoint, we advocate for direct tokenisation of medical data by breaking them into discrete units, such as laboratory results, medications, and vital signs, similar to word tokenisation in language models. This approach enables transformer-based models to learn from the temporal structure of patient health timelines without relying on textual translation, potentially leading to more accurate and personalised care. Enhanced Transformer for Health Outcome Simulation, an example of a model that uses tokenisation, forecasts health timelines and supports clinical decision making using tokenised medical records. We outline a privacy-preserving model-sharing framework, in which models are trained locally and only trained models, not sensitive data, are shared, allowing collaborative development across institutions. We also emphasise that access to large, diverse datasets enhances fairness, generalisability, and equity in health-care generative artificial intelligence. Although challenges such as data complexity and interpretability remain, this Viewpoint underscores that embracing tokenised representations opens a path towards scalable, multimodal, and equitable artificial intelligence in medicine.
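As a rough illustration of what direct tokenisation of a patient timeline could look like, the sketch below turns time-stamped events into a flat token sequence. It is a minimal sketch under stated assumptions, not the scheme used by any particular model: the `Event` fields, the bin edges, the gap buckets, and the token spellings are all hypothetical choices made for this example.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One timeline entry (hypothetical schema for illustration)."""
    time_hours: float              # hours since the start of the record
    kind: str                      # e.g. "LAB", "MED", "VITAL"
    name: str                      # e.g. "lactate"
    value: Optional[float] = None  # numeric result, if any


# Hypothetical bucket edges for the time elapsed between events.
GAP_EDGES_HOURS = [1.0, 6.0, 24.0]


def gap_token(gap_hours: float) -> str:
    """Map an elapsed time to a coarse time-gap token."""
    return f"GAP_{bisect.bisect_right(GAP_EDGES_HOURS, gap_hours)}"


def tokenise(events: List[Event], bin_edges: Dict[str, List[float]]) -> List[str]:
    """Convert events into discrete tokens, in chronological order."""
    tokens: List[str] = []
    prev_time: Optional[float] = None
    for ev in sorted(events, key=lambda e: e.time_hours):
        # Encode elapsed time so the sequence keeps its temporal structure.
        if prev_time is not None and ev.time_hours > prev_time:
            tokens.append(gap_token(ev.time_hours - prev_time))
        tokens.append(f"{ev.kind}:{ev.name}")
        # Numeric values become a bin index rather than free text.
        if ev.value is not None:
            tokens.append(f"BIN_{bisect.bisect_right(bin_edges[ev.name], ev.value)}")
        prev_time = ev.time_hours
    return tokens


example_events = [
    Event(0.0, "VITAL", "heart_rate", 112.0),
    Event(0.5, "LAB", "lactate", 3.1),
    Event(8.0, "MED", "ceftriaxone"),
]
example_edges = {"heart_rate": [60.0, 100.0], "lactate": [1.0, 2.0, 4.0]}
example_tokens = tokenise(example_events, example_edges)
```

Binning numeric values keeps the vocabulary finite, which is what lets a transformer treat a laboratory result the same way a language model treats a word.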
Enteric infectious diseases claim more than 1 million lives annually and are among the top ten causes of death in children younger than 5 years. Remarkable global investment has been dedicated to enteric infectious disease prevention and control; however, the shifting global health landscape is testing the continuance of progress. To evaluate the current status and guide future interventions, we present the latest epidemiological estimates of enteric infectious diseases from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2023 and assess progress towards the Global Action Plan for the Prevention and Control of Pneumonia and Diarrhoea (GAPPD) mortality target of fewer than 20 deaths per 100 000 children younger than 5 years by 2025. We quantified the incidence, mortality, and disability-adjusted life-years (DALYs) of enteric infectious diseases by age, sex, and year across 204 countries and territories from 1990 to 2023. In GBD 2023, the following were considered under the category of enteric infectious diseases: diarrhoeal diseases, enteric fever (typhoid and paratyphoid), invasive non-typhoidal Salmonella spp (iNTS) infections, and other intestinal infectious diseases. We also examined 15 aetiologies contributing to diarrhoeal diseases. Incidence and prevalence were estimated with DisMod-MR (version 2.1), a Bayesian meta-regression tool, drawing on data from systematic reviews, population-based surveys, claims data, and hospital sources. Cause-specific mortality was modelled with Cause of Death Ensemble Modelling based on data from sources including vital registration, mortality surveillance, verbal autopsy, and minimally invasive tissue sampling. Years of life lost and years lived with disability were computed and combined to derive DALYs. For aetiology-specific estimation, population-attributable fractions (PAFs) for 15 pathogens were derived with a counterfactual framework. 
Point estimates and 95% uncertainty intervals (UIs) were generated from 250 draws from the posterior distribution. In 2023, enteric infectious diseases resulted in an estimated 1·27 million (95% UI 0·963-1·68) deaths globally, declining from 3·69 million (3·04-4·56) in 1990. The global age-standardised mortality rate (ASMR) decreased from 74·1 (62·0-92·9) per 100 000 population to 16·4 (12·6-21·3) per 100 000 population during the same period. Diarrhoeal diseases accounted for most deaths in 2023 (1·11 million [0·811-1·54]), followed by enteric fever and iNTS. South Asia and sub-Saharan Africa remained the most affected regions in 2023, with 599 000 (441 000-882 000) and 501 000 (373 000-648 000) deaths due to enteric infectious diseases, respectively, predominantly from diarrhoeal disease. Rotavirus was the leading cause of all-age diarrhoeal disease deaths (PAF 16·3% [12·0-21·5]), followed by norovirus (10·2% [2·4-17·0]) and Shigella spp (9·3% [5·4-15·2]). Among children younger than 5 years, PAFs of deaths due to diarrhoeal diseases were 40·2% (32·5-48·5) for rotavirus, 24·0% (15·1-36·7) for Shigella spp, and 23·4% (13·7-34·3) for adenovirus. Across 204 countries and territories, 141 met the GAPPD mortality target in 2023. The driving aetiologies among countries that did not meet the target in 2023 varied slightly by GBD super-region, but the highest or second-highest number of deaths in children younger than 5 years were consistently attributed to rotavirus. Astrovirus and sapovirus, newly included in GBD 2023, were responsible for 24 600 (6290-49 000) and 18 800 (4650-44 400) deaths, respectively, in 2023, mainly in children younger than 5 years. Our findings show that mortality and ASMRs of enteric infectious diseases declined substantially between 1990 and 2023. This decline is consistent with the expansion of public health measures and broader socioeconomic development. 
However, the burden in 2023 remains considerably high, with the highest mortality concentrated in sub-Saharan Africa and south Asia. Considering that more than a quarter of all countries had yet to meet the GAPPD mortality target in 2023, sustained efforts are needed to address the persistent burden in affected countries and to adapt to the changing global health landscape. Funding: Gates Foundation.
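The draw-based summaries described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the point estimate is the mean of the draws and the 95% UI spans the 2.5th to 97.5th percentiles, it combines years of life lost and years lived with disability draw by draw, and all function names are hypothetical rather than taken from the GBD tooling.

```python
import statistics
from typing import List, Sequence, Tuple


def percentile(sorted_vals: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile for q in [0, 1] on sorted data."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarise(draws: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) with an assumed 2.5-97.5 percentile UI."""
    s = sorted(draws)
    return statistics.fmean(s), percentile(s, 0.025), percentile(s, 0.975)


def dalys(yll_draws: Sequence[float], yld_draws: Sequence[float]) -> List[float]:
    """DALYs = years of life lost + years lived with disability, per draw."""
    return [yll + yld for yll, yld in zip(yll_draws, yld_draws)]


def meets_gappd_target(u5_deaths: float, u5_population: float,
                       target_per_100k: float = 20.0) -> bool:
    """Check the target of fewer than 20 deaths per 100 000 children under 5."""
    rate = u5_deaths / u5_population * 100_000
    return rate < target_per_100k
```

Summing draw by draw, rather than summing the summary statistics, lets uncertainty in both components carry through to the DALY interval.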
In recent years, artificial intelligence in medicine has evolved from single recognition tasks toward structural understanding, spatial reasoning, and clinical interpretability. High-quality anatomical data have become a key factor in further development. Driven by digital tomography, three-dimensional reconstruction, and multimodal technologies, body-donor-derived specimens and digital anatomical datasets, characterized by clear structural boundaries, stable spatial relationships, and fine-grained detail, are being transformed into computable, annotatable, and reusable digital anatomical resources. These resources are playing an increasingly important role in medical artificial intelligence. This narrative review summarizes the multiple roles of body-donor-derived data in medical AI. They serve as foundational resources that provide high-fidelity training data and fine-grained annotation systems. They also serve as validation references for improving algorithm credibility. In addition, they act as a substrate for AI-driven transformation in data processing, three-dimensional modeling, and intelligent applications in education, clinical practice, and forensic medicine. Their main strengths lie in anatomical authenticity, fine-grained annotatability, and structural validation utility, while their limitations include sample size, the postmortem-in vivo domain gap, annotation cost, and data governance. In the future, body-donor-derived data should become a core foundation for anatomical priors and structural gold standards, and should be deeply integrated with large-scale clinical imaging, multimodal intelligent analysis, and cross-domain learning to support the development of medical AI from high performance toward higher credibility and translational value.
Artificial intelligence (AI) is reshaping clinical practice and redefining the competencies future physicians will need. International bodies, such as the Association of American Medical Colleges, have called for structured AI training in medical curricula. Despite growing international consensus, no systematic nationwide evaluation had been conducted in Spain prior to this study. This study aimed to characterize the presence, type, and curricular features of AI-related training across all Spanish universities offering an official medical degree and to assess differences by institutional ownership and geographic region. This cross-sectional study was conducted from July to September 2025. Universities were the unit of analysis. A census of all institutions offering an officially recognized medical degree was obtained from the Register of Universities, Centers and Degrees; all 52 eligible institutions were included. Publicly available curricula and course guides for the 2025-2026 academic year were reviewed by 2 independent researchers and validated by an external evaluator. Courses were classified as (1) a specific AI course (AI as primary topic, accounting for >50% of syllabus), (2) an AI-similar course (a digital health or biomedical informatics course referencing AI as secondary content), or (3) not AI-related training. Course-level variables included ownership (public or private), region, status (compulsory or elective), European Credit Transfer and Accumulation System (ECTS) credits, academic year, and department. All analyses were descriptive. Potential sources of bias were addressed through predefined classification criteria, duplicate independent extraction, and external dataset verification. Of 52 universities, 36 (69.2%) were public and 16 (30.8%) were private. A total of 10 (19.2%) institutions offered at least one specific AI course; 6 (11.5%) included an AI-similar course. 
Overall, 16 (30.8%) universities had incorporated AI in some form; 36 (69.2%) institutions had not incorporated AI. Rates were similar for public (7/36, 19.4%) and private institutions (3/16, 18.8%). Identified courses ranged from 3 to 6 ECTS credits, representing an average of 1.17% of the 360-credit degree; most were elective. Only the University of Jaén offered a compulsory course with AI content. Marked regional disparities were observed: Andalusia led with 5 of 9 (55.6%) universities offering a specific AI course, while 10 autonomous communities had no universities with any AI-related training. This study delivers the first census-based, reproducible, national assessment of AI integration in Spanish undergraduate medical education. Unlike prior work focused on individual programs or nonstandardized definitions, we applied a consistent taxonomic framework reusable for longitudinal monitoring and international benchmarking. Findings reveal a heterogeneous, predominantly elective, and low-weight curricular landscape with striking interregional inequities. These results inform curriculum reform, accreditation standards, and faculty development priorities and support the establishment of minimum national competency standards and systematic monitoring to ensure equitable AI literacy among future physicians in Spain.
Artificial intelligence (AI) is increasingly integrated into clinical medicine, with foundation models emerging as an alternative to task-specific models for forecasting longitudinal healthcare data. These models, pre-trained on large datasets, promise broad applicability across clinical domains, yet their real-world performance and generalizability remain underexplored. To address this gap, we evaluated foundation and task-specific models across diverse clinical use cases, focusing on zero-shot performance, cross-hospital transportability, the impact of fine-tuning, and potential clinical implications. We used data from University Hospital Essen, Germany, two nearby regional hospitals, and the MIMIC-IV database to define six clinical time series use cases, including forecasting of vital signs, laboratory values, and hospital capacity. Transformer-based foundation models were compared in zero-shot and fine-tuned settings to task-specific approaches, including neural networks, gradient boosting, AutoML ensembles, and statistical models. We also assessed predictive value for guiding treatment decisions by dichotomizing forecasts. Zero-shot foundation models frequently approached the performance of optimized task-specific models. Fine-tuning further improved performance, with Chronos and TimesFM ranking among the best-performing models 19 and 18 times, respectively, compared to 21 times for AutoML ensembles. Foundation models showed superior transportability across hospital settings and patient populations. However, variations in forecasting strategies influenced their positive and negative predictive values in clinical decision-making contexts. These results suggest that foundation models are viable for clinical time series forecasting, particularly where generalizability is crucial. 
Their flexibility and zero-shot capabilities reduce the need for retraining, potentially lowering barriers to adoption and challenging the role of domain-specific models in clinical practice.
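The dichotomization step mentioned above can be illustrated as follows: a continuous forecast and the later observed value are each compared with a clinical threshold, and positive and negative predictive values are computed from the resulting 2x2 table. This is a minimal sketch; the threshold, the variable names, and the "at or above threshold" convention are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def dichotomise(values: Sequence[float], threshold: float) -> List[bool]:
    """Flag values at or above the (hypothetical) clinical threshold."""
    return [v >= threshold for v in values]


def predictive_values(forecast: Sequence[float], observed: Sequence[float],
                      threshold: float) -> Tuple[Optional[float], Optional[float]]:
    """Return (PPV, NPV); either is None when its denominator is zero."""
    pred = dichotomise(forecast, threshold)
    true = dichotomise(observed, threshold)
    tp = sum(p and t for p, t in zip(pred, true))
    fp = sum(p and not t for p, t in zip(pred, true))
    tn = sum(not p and not t for p, t in zip(pred, true))
    fn = sum(not p and t for p, t in zip(pred, true))
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Illustrative numbers only (e.g. a laboratory value with threshold 5.0).
example_ppv, example_npv = predictive_values(
    forecast=[5.6, 4.1, 5.2, 3.9, 5.8],
    observed=[5.7, 4.0, 4.8, 5.3, 5.9],
    threshold=5.0,
)
```

Because PPV and NPV depend on where forecasts fall relative to the threshold, two models with similar forecast error can still give different predictive values.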
The integration of large language models (LLMs) into medicine has reshaped health care delivery, education, and research. Although proprietary models face challenges such as data privacy, regulation, and adaptability, DeepSeek, an open-source LLM, has emerged as a customizable and cost-effective alternative with significant potential for clinical and operational applications. However, the rapid expansion of research in this area necessitates a systematic mapping of its landscape, applications, and challenges. This study combines bibliometric analysis with a scoping review to systematically map and characterize the literature on DeepSeek's medical applications. The aims were to (1) analyze publication trends, leading contributors, and research themes and (2) identify primary application domains, strengths, limitations, and future directions. Following the framework by Arksey and O'Malley and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, a systematic search was conducted using PubMed, Web of Science, and Scopus from January 20, 2025, to November 30, 2025. Bibliometric analysis was then used to quantify publication trends, productivity, and research themes across 371 papers. The scoping review thematically synthesized the applications, strengths, and limitations of 353 original articles. The publication output showed a progressive increase, with China (n=163), Turkey (n=52), and the United States (n=48) as leading contributors. Keyword co-occurrence analysis formed 7 clusters; the 3 most frequent keywords were "large language model," "artificial intelligence," and "patient education." DeepSeek has shown promising yet preliminary performance across multiple domains, including patient education, clinical decision support, medical education, workflow optimization, and medical research. 
The evidence base remains predominantly low in quality, with 66.6% (235/353) of original articles classified as low-quality evidence, consisting largely of unvalidated benchmarking, simulated cases, and single-center retrospective analyses. Only 6.8% (24/353) of studies met the criteria to be considered high quality, and prospective randomized trials assessing patient-relevant outcomes were notably absent. Publications on DeepSeek's medical applications increased progressively from January 2025 through November 2025, with China, Turkey, and the United States as the leading contributors. The scoping review found that DeepSeek has been evaluated across 5 domains (patient education, clinical decision support, medical education, workflow optimization, and research), with variable but often competitive performance relative to proprietary models. Strengths included readability, diagnostic accuracy in select specialties, cost-efficiency, and local deployability. Limitations included inconsistent cross-specialty performance, hallucinations, ethical concerns, data privacy issues, and regulatory gaps. The evidence base is predominantly low-quality and simulation-based, with few prospective trials or randomized controlled trials. These findings indicate that DeepSeek's clinical readiness varies, and future research should address prospective validation, multimodal capabilities, bias mitigation, human oversight, and equitable access.
Reliable prediction is critical for prognosis communication and informed decision-making, yet remains challenging for cancer patients, especially older adults with non-small cell lung cancer (NSCLC), because of the uncertainty inherent in model-based predictions. This study presents a framework that integrates uncertainty quantification (UQ) into individual survival prediction using electronic health records from 4243 older NSCLC patients in Korea. We applied four Cox proportional hazard-based survival models and four artificial intelligence (AI)-based survival models, including DeepSurv, to predict 2-year survival probabilities. We introduce two novel UQ metrics: certainty score, capturing the relative model confidence in predicted mortality risk, and predictive multiplicity, quantifying model disagreement in risk stratification. Although the survival models achieved high mean areas under the receiver operating characteristic curve ranging from 0.840 to 0.851, 26% of patients in the test set were assigned to conflicting risk groups depending on the model used, indicating considerable variability in model-predicted prognosis. DeepSurv demonstrated the highest average certainty. All models showed substantial degrees of predictive ambiguity and discrepancy. We also developed a visual informatics tool that presents personalized best-, worst-, and most likely-case scenarios, risk group stratification, and interpretable feature importance to improve transparency and facilitate shared decision-making. This framework offers a practical approach for integrating uncertainty into AI-based prognosis, addressing the challenge of enhancing confidence in cancer prognosis communication by quantifying and visualizing model uncertainty. It can support clinicians in tailoring prognostic discussions based on the level of model consensus and confidence, helping guide when to communicate prognosis cautiously or emphasize shared decision-making. 
The proposed framework is model-agnostic and readily applicable to real-world clinical settings.
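One simple way to read the predictive multiplicity idea is as the share of patients whose risk group changes depending on which model is consulted. The sketch below computes that share. It is an assumption-laden illustration, not the study's definition: the survival-probability cutoffs, the three group labels, and the model names are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def risk_group(p_survival: float,
               cutoffs: Tuple[float, float] = (0.33, 0.66)) -> str:
    """Map a predicted 2-year survival probability to a risk group.

    The cutoffs are hypothetical; lower survival means higher risk.
    """
    if p_survival < cutoffs[0]:
        return "high"
    if p_survival < cutoffs[1]:
        return "intermediate"
    return "low"


def predictive_multiplicity(predictions: Dict[str, Sequence[float]]) -> float:
    """Fraction of patients assigned to more than one risk group across models."""
    per_patient = list(zip(*predictions.values()))
    conflicts = [len({risk_group(p) for p in probs}) > 1 for probs in per_patient]
    return sum(conflicts) / len(conflicts)


# Illustrative predictions for four patients from three hypothetical models.
example_predictions = {
    "model_a": [0.80, 0.50, 0.20, 0.70],
    "model_b": [0.85, 0.70, 0.25, 0.60],
    "model_c": [0.75, 0.55, 0.30, 0.68],
}
example_multiplicity = predictive_multiplicity(example_predictions)
```

In this toy example the second and fourth patients sit near a cutoff, so models that agree closely on the probability still disagree on the risk group.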
Oral cancer is a significant global health burden, with early detection crucial for improving outcomes. Traditional screening methods are often subjective and time-consuming, relying heavily on clinician expertise. Artificial Intelligence (AI) and Machine Learning (ML) have emerged as promising tools, leveraging deep learning and image analysis to enhance diagnostic accuracy and automate screening processes. This scoping review evaluates AI/ML technologies for oral cancer detection, focusing on classification, segmentation, and early diagnosis. It also assesses AI's role in risk prediction, treatment planning, and prognostic modeling. A systematic literature search was conducted across multiple databases, prioritizing peer-reviewed studies on AI/ML techniques for oral cancer screening. Key models analyzed include Convolutional Neural Networks (CNNs), Fully Convolutional Networks (FCNNs), and ensemble learning, applied to intraoral images, histopathology slides, and fluorescence visualization. AI-driven fluorescence visualization demonstrates 98.0% sensitivity and 92.7% specificity, surpassing conventional methods. Deep learning-based histopathology analysis improves risk stratification in oral leukoplakia, while cloud-based AI models enhance accessibility. Hybrid approaches and explainable AI frameworks show promise in reducing diagnostic variability. Notably, AI matches or exceeds pathologist accuracy in lesion classification, with immunohistochemical AI tools improving tumor marker detection. Challenges include limited datasets and algorithmic bias, but ongoing advancements in multi-institutional collaborations are addressing these gaps. AI/ML technologies significantly improve oral cancer screening, yet further validation is needed for clinical integration. Future efforts should prioritize interpretability, ethical considerations, and real-world implementation to maximize impact on patient outcomes.
Vaccination is one of the most effective public health interventions. Large language models (LLMs) show great capability for providing health information. This study compared the performance of ChatGPT 3.5 (G3E), ChatGPT 4o in English (G4E), Claude 3.0 (CDE), Gemini 1.5 (GME), and ChatGPT 4o in Italian (G4I) in delivering information on vaccination and preventive medicine. The study used a mixed-method design. Twenty-six expert-designed healthcare scenarios were used to evaluate each model through an adapted DISCERN-based instrument. Four experts independently rated outputs across six domains: Information Reliability, Information Quality, Medical Appropriateness, Impact on Vaccine Hesitancy, Potential for Behavioral Influence, and Overall Rating. Medians and interquartile ranges were calculated, and mixed-effects ordinal logistic regression models were applied to account for inter-rater variability. G4E showed the highest performance, with significant advantages in Overall Rating (OR = 2.17, 95% CI: 1.20-3.92, p = 0.010) and Medical Appropriateness (OR = 1.86, 95% CI: 1.04-3.33, p = 0.036). G4I outperformed others in Information Quality (OR = 1.76, 95% CI: 1.06-2.93, p = 0.030) but scored lower for vaccine hesitancy and behavioral influence. GME performed weaker across qualitative domains, with occasional generation issues, while CDE and G3E yielded intermediate, consistent results. Differences among LLMs reflect model architecture, training data, and language adaptation, influencing clarity, accuracy, and persuasive tone. These disparities highlight the need for domain-specific fine-tuning and language-sensitive optimization to enhance public health communication. LLMs show uneven performance in providing accurate and behaviorally effective vaccine information, underscoring the importance of evaluation and cautious integration into health communication strategies.
Brain tumors are formed when abnormal cells grow within the brain or its surrounding tissues. Approximately 400 people in Ireland receive a primary brain tumor diagnosis each year. In the US, this number increases to almost 90,000 individuals diagnosed each year. Timely diagnosis of brain tumors is essential to saving lives and significantly reducing treatment costs. To automate this process, different Artificial Intelligence (AI) techniques have been adopted to identify brain tumors in humans. Specifically, various deep learning algorithms have been used to segment and classify brain tumors. In this paper, a systematic review is conducted based on the Kitchenham & Charters methodology. We selected seven research questions to identify commonly used methods, datasets, features, metrics, and Explainable AI (XAI) approaches for AI-based analysis of brain tumors. This process starts by sourcing papers that address these techniques via the IEEE Xplore and ACM bibliographic databases between January 2013 and December 2024. The papers are then filtered using specifically designed inclusion and exclusion criteria. Out of 3950 papers sourced from two electronic databases, only 101 papers were selected for this review. In summary, despite a focus on segmentation and classification, our findings indicate that no AI methods have been fully adopted in clinical practice. Furthermore, none of the reviewed papers address the specific problem of weakly-supervised brain tumor segmentation, highlighting a clear research gap in the existing literature that warrants further investigation. Also, only four articles on XAI were identified. Given the importance of transparency in network predictions for brain tumor analyses, this fact supports the need for more research in this domain.
Skin cancer diagnosis, particularly the differentiation of melanoma from benign nevi, is a vital yet challenging task due to the visual similarity between lesions. Although deep learning models such as convolutional neural networks (CNNs) and vision transformers (ViTs) have demonstrated promising performance, their effectiveness often deteriorates when applied to data from heterogeneous clinical sources. While conventional domain adaptation methods address domain shift, they require access to source data during adaptation, which is often infeasible due to privacy regulations. Multi-source-free unsupervised domain adaptation (MSFDA) addresses this limitation by leveraging multiple labeled source domains to generalize to an unlabeled target domain without requiring access to source data, making it suitable for privacy-sensitive medical settings. However, existing MSFDA methods rely on full backbone fine-tuning, leading to catastrophic forgetting and overfitting on small clinical datasets, and address domain shift at the aggregation stage without establishing a shared domain-invariant feature space. Furthermore, their reliance on hard pseudo-labels or confidence-weighted aggregation introduces noisy supervision signals under domain shift. To address these limitations, we propose CAT-CKD, consisting of two components: (1) contrastive adapter training (CAT), which trains lightweight ConvPass adapters within a frozen ViT backbone using supervised contrastive learning (SCL) to establish a shared domain-invariant feature space before source-specific model training, and (2) consensus knowledge distillation (CKD), which aggregates logits from multiple source models into a consensus supervisory signal and adapts a student model on unlabeled target data using KL divergence. Experiments on five publicly available skin lesion datasets show that CAT-CKD achieves an average AUROC of 86.1%, outperforming existing MSFDA methods while requiring only 4.3M trainable parameters. 
The code for this paper is available at https://github.com/A-Abedi/CAT_CKD.
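The consensus knowledge distillation step described above can be sketched for a single target sample as follows. This is a simplified illustration rather than the released implementation: it assumes the consensus is a plain average of the source models' logits and that the loss is KL(consensus || student) over softmax probabilities, and the function names are hypothetical. The linked repository is the reference for the actual aggregation rule.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consensus_logits(source_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class logits across source models (assumed rule)."""
    n = len(source_logits)
    return [sum(col) / n for col in zip(*source_logits)]


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ckd_loss(source_logits: Sequence[Sequence[float]],
             student_logits: Sequence[float]) -> float:
    """Distillation loss of the student against the consensus teacher."""
    teacher = softmax(consensus_logits(source_logits))
    student = softmax(student_logits)
    return kl_divergence(teacher, student)
```

Using the soft consensus distribution as the target avoids committing to a hard pseudo-label when the source models disagree, which is the noise problem the abstract attributes to earlier methods.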
Accurate prediction of outcomes after cardiac procedures is critical for personalised decision-making and risk stratification. While machine learning (ML) has shown promise in this domain, most prior studies rely on traditional ML methods that require structured data and manual feature engineering, limiting scalability. Many deep learning (DL) architectures offer an alternative by enabling automated feature extraction, particularly from unstructured data such as text, images, and signals. This scoping review summarises recent advances in DL-based prediction of outcomes for four major cardiovascular procedures: percutaneous coronary intervention (PCI), coronary artery bypass grafting (CABG), aortic valve replacement (AVR), and mitral valvuloplasty. Following PRISMA-ScR guidelines, we searched PubMed and IEEE Xplore for studies published between 2020 and March 2025. In total, 457 studies were retrieved, and nine eligible studies were included after screening. DL models demonstrated varying performance across data types, with particularly strong results for text and imaging tasks. Multimodal approaches combining clinical, imaging, and signal data showed added predictive value. Compared with traditional ML, DL models often reduce the need for manual feature engineering, though they still require preprocessing and validation to mitigate overfitting. Overall, these findings suggest that DL has potential to support preoperative risk stratification, although evidence for clinical utility remains preliminary. Moreover, all included studies lacked external validation, and challenges remain regarding generalisability, explainability, and integration into clinical workflows. Future research should prioritise large, diverse cohorts, multimodal data fusion, and interpretable DL models to enable safe and effective clinical implementation.
Impairment of axonal transport may contribute to the degeneration of dopaminergic (DAergic) neurons in the substantia nigra (SN), a key event in Parkinson's disease (PD) pathogenesis. Due to the lack of early diagnosis, changes in axonal transport at the preclinical stage can only be studied in PD models. We assessed gene expression (RT-PCR after cell sorting) and protein levels (semiquantitative immunohistochemistry) of axonal transport-related proteins in SN DAergic neurons from mice in subchronic MPTP models of PD (preclinical and clinical stages) and controls. The proteins studied included α-tubulin (Tuba1a), β-tubulin (Tubb3), kinesin (Kif5b, Klc1), dynein (Dynll1, Dync1i1), dynactin (Dctn1), microtubule affinity-regulating kinase 1 (Mark1), and tau (Mapt). In the preclinical stage, Kif5b expression and Kif5B protein levels were increased, possibly as a compensatory response that preserves anterograde transport. Dynll1 and Tuba1a were upregulated, whereas Dync1i1 and Mapt were downregulated, with no change in tubulin or tau protein levels. In the clinical stage, Klc1, Dync1i1, Dctn1, Mark1, and Mapt expression and Kif5B protein levels decreased. These data indicate that transcriptional alterations in axonal transport proteins precede protein-level changes in DAergic neurons. The upregulation of Kif5B in the preclinical stage suggests that axonal transport proteins may serve as potential early therapeutic targets in PD.
Diabetic lower limb ischemia (DLLI) is a serious complication of diabetes with limited therapeutic options. Heterophyllin B (HET-B), a bioactive cyclopeptide from Pseudostellaria heterophylla, possesses antioxidant properties. However, its therapeutic mechanism in DLLI remains unclear. This study aims to investigate the protective effects of HET-B against DLLI and elucidate the underlying metabolic and molecular mechanisms. A murine DLLI model was established in streptozotocin-induced diabetic mice via femoral artery ligation. Therapeutic efficacy was assessed by laser Doppler imaging, histopathology, and immunofluorescence. A high glucose-induced endothelial injury model was established using HUVECs. Endothelial function was evaluated by tube formation, migration, and wound healing assays. Single-cell RNA sequencing data (GSE165816) were analyzed to identify key metabolic targets. Mechanistic validation was performed using Spermine oxidase (SMOX) gene silencing and pharmacological activation with spermine. HET-B significantly improved hindlimb blood flow recovery, promoted angiogenesis with increased CD31, α-SMA, VEGF, and eNOS, and attenuated inflammation with reduced TNF-α, IL-6, IL-1β, and TGF-β in ischemic muscles. In HUVECs, HET-B restored high glucose-impaired endothelial function, promoted Nrf2 nuclear translocation with NQO1 upregulation, suppressed TNF-α expression, and subsequently reduced Caspase-1/3 activation. Bioinformatic analysis identified SMOX as a key dysregulated gene in diabetic endothelium, and HET-B reversed its overexpression both in vitro and in vivo. Immunofluorescence co-staining confirmed SMOX and Nrf2 localization in CD31-positive endothelial cells, with HET-B reversing SMOX upregulation while restoring Nrf2 activation. SMOX knockdown mimicked HET-B effects, whereas SMOX activation with spermine abrogated HET-B-mediated protection, Nrf2 activation, and NF-κB suppression, confirming that HET-B acts through functional inhibition of SMOX. 
HET-B alleviates DLLI by inhibiting SMOX to activate Nrf2-mediated antioxidant defense and suppress inflammatory signaling, suggesting that the SMOX-Nrf2 axis may represent a potential therapeutic target for DLLI.
Echocardiography is the cornerstone of cardiovascular diagnosis, yet its manual interpretation is labor-intensive and prone to inter-observer variability. While Deep Learning (DL) offers expert-level potential, existing models struggle with clinical generalization due to domain shifts and are often limited to single-view analysis, failing to provide the comprehensive assessment required in real-world practice. To overcome these limitations, this study aimed to design, develop, and clinically evaluate EchoAI, a secure, browser-based Clinical Decision Support System to bridge the gap between high-performance DL algorithms and routine echocardiography analysis. We developed a secure web-based framework that integrates our previously validated, multi-task UDA-VAE engine capable of simultaneous quantification of Left Ventricular Ejection Fraction (LVEF) and Wall Thickness across multiple standard acoustic windows (A4C, A2C, PLAX). Uniquely, the platform employs a User-Centered Design with a "Human-in-the-loop" workflow, transforming the AI from a "black box" into a transparent assistant that allows physicians to visualize and verify segmentation masks in real-time. A multicenter clinical validation involving 18 cardiologists and residents across an academic hospital and a private cardiac center demonstrated real-time performance with an average processing time of 1.15 s per cycle across diverse ultrasound vendors. The system achieved a strong correlation with expert measurements (r = 0.98, P < 0.001) and a negligible bias of 0.12%. Usability assessment yielded a high overall satisfaction score (6.20/7). Notably, physicians accepted 86% of the AI-generated outputs without modification (84% in the academic setting and 88% in the private sector), confirming the system's robust reliability and cross-domain adaptability. 
EchoAI demonstrates that integrating vendor-agnostic, domain-adaptive AI into an intuitive, interactive web interface effectively bridges the gap between algorithmic capability and clinical adoption. This multicenter approach significantly reduces manual workload while fostering the high level of clinical trust necessary for routine deployment across diverse healthcare settings.
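The agreement statistics quoted above (a correlation coefficient and a bias) can be illustrated with a minimal sketch. The abstract does not state how bias was computed; the sketch assumes a Bland-Altman mean difference between AI and expert LVEF. The function names and the LVEF values are hypothetical illustrations, not study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def bland_altman(ai, expert):
    """Return (bias, (lower, upper)) where bias is the mean AI-minus-expert
    difference and the limits of agreement are bias +/- 1.96 sample SDs."""
    diffs = [a - e for a, e in zip(ai, expert)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Made-up LVEF values in percent, for illustration only.
expert_lvef = [55.0, 60.0, 35.0, 45.0, 65.0, 30.0]
ai_lvef = [56.0, 59.0, 36.0, 44.0, 66.0, 29.5]

r = pearson_r(ai_lvef, expert_lvef)
bias, limits = bland_altman(ai_lvef, expert_lvef)
print(f"r = {r:.3f}, bias = {bias:.2f} percentage points, limits = {limits}")
```

A high correlation alone does not rule out a systematic offset, which is why a bias estimate is usually reported alongside it.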
Scientific publishing is changing - and it's changing fast. Digital platforms have made it easier than ever to share research across borders, open-access models have pulled down paywalls that once limited who could read or contribute to scientific discourse, and global collaboration has become the norm rather than the exception. Into this already shifting landscape, artificial intelligence (AI) has arrived - quietly at first, and now with considerable force - touching nearly every stage of how research gets done, analyzed, and communicated. The promise here is real. But so is the tension it creates. The central question facing the scientific community isn't whether to embrace these changes - that ship has largely sailed - but whether we can move this quickly without eroding the credibility that makes science worth doing in the first place. For journals, this isn't a theoretical problem. Editorial standards are the backbone of the scientific record, and right now, those standards are being stress-tested. As the Editor-in-Chief of Cureus, I think we're at a moment that calls for clarity, not hedging - a moment to say plainly what principles must hold even as everything else shifts. Cureus was built around a straightforward idea that medical publishing was too slow, too exclusive, and too gatekept to serve science well. The journal set out to change that by reducing barriers to dissemination while keeping editorial rigor intact. That core mission hasn't changed. What has changed is the environment in which we pursue it. Two issues now sit at the center of that effort: the responsible use of AI in scholarly communication and the ongoing fight to protect research integrity.
Large language models (LLMs) have shown promising results in medical decision support; however, their efficacy in managing complex hepatobiliary conditions remains insufficiently examined. We developed a genetic neuro-symbolic LLM system that integrates multiple AI agents with neural-symbolic reasoning for the management of cholangitis and compared its performance with that of conventional LLMs and human experts. This multi-center cross-sectional study included 30 case-based questions from American Board of Internal Medicine (ABIM) gastroenterology subspecialty examinations covering acute cholangitis. Questions were categorized into diagnosis (n = 10), treatment (n = 10), and complications/prognosis (n = 10). Performance of a genetic neuro-symbolic LLM system orchestrated via LangGraph was compared against Claude 4.5 Sonnet, ChatGPT 5.2, Gemini 2.0 Flash, 10 gastroenterology specialists, and 4 emergency medicine physicians from four tertiary centers in Turkey. The genetic neuro-symbolic system achieved the highest overall accuracy (100%, 30/30), significantly outperforming Claude 4.5 Sonnet (90.0%), ChatGPT 5.2 (60.0%), Gemini 2.0 Flash (63.3%), gastroenterology experts (mean 95.7% ± 3.2%), and emergency medicine physicians (mean 84.2% ± 8.8%). The neuro-symbolic system demonstrated superior performance across all categories and cholangitis subtypes. Among human participants, gastroenterologists outperformed emergency physicians in treatment decisions (p = 0.012) and showed non-inferior performance to Gemini 2.0 Flash overall (p = 0.034). 
The genetic neuro-symbolic LLM system demonstrated superior accuracy in cholangitis management compared to all conventional AI models and human experts. This proof-of-concept study suggests that multi-agent architectures with neural-symbolic reasoning may offer a promising direction for AI-assisted clinical decision support in complex hepatobiliary conditions, although prospective clinical validation is required before claims about broader implementation are warranted.
Fluidized bed drying (FBD) is widely used to dry granular pharmaceuticals, but control of particle moisture content in the fluidized bed currently relies heavily on manual adjustment of air intake and drying temperature. This manual intervention is cumbersome, labor-intensive, and limited by the thermal inertia of industrial heating systems, which hinders precise control. In this study, a control system for FBD was developed to overcome this inherent challenge. The system uses a proportional controller to make the moisture content of the granules follow a prescribed decline trajectory, with air intake flow as the manipulated variable. The system's robustness was evaluated using two model materials with distinct physical properties: Cordyceps Fungus Powder (high moisture) and Xin Huang Tablet powder (low moisture). Results demonstrated that the system precisely achieved the target moisture content (e.g., 6.5%) within prescribed drying durations (standard, shortened, and extended). Crucially, despite dynamic airflow adjustments, sieving analysis revealed essentially consistent particle size distributions, indicating that the strategy maintains granule integrity without causing significant attrition. The system also exhibited strong adaptability to variations in drying temperature. This approach effectively resolves the technical challenge of precise endpoint control by overcoming thermal inertia while preserving critical quality attributes (CQAs). By enabling flexible trajectory adjustments and consistent drying rates, the proposed strategy offers a practical solution for enhancing efficiency and automation in pharmaceutical FBD processes.
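The control idea described above, proportional control of a moisture decline trajectory with air intake flow as the manipulated variable, can be sketched as follows. This is a toy simulation under stated assumptions, not the authors' implementation. The linear reference trajectory, the first-order drying model, and every numeric value (gain, flow limits, rate coefficient, equilibrium moisture) are hypothetical; only the 6.5% target is taken from the abstract.

```python
def reference_moisture(t, m0, target, duration):
    """Assumed reference: linear decline from m0 to target over `duration` minutes."""
    if t >= duration:
        return target
    return m0 + (target - m0) * t / duration

def p_controller(measured, reference, kp, base_flow, flow_min, flow_max):
    """Proportional law: supply more air when granules are wetter than planned,
    less when they are drier, clipped to the blower's limits."""
    flow = base_flow + kp * (measured - reference)
    return max(flow_min, min(flow_max, flow))

def simulate(m0=20.0, target=6.5, duration=40.0, dt=0.1, kp=1000.0,
             base_flow=100.0, flow_min=20.0, flow_max=300.0,
             rate_coeff=3e-4, m_eq=2.0):
    """Hypothetical plant: dM/dt = -rate_coeff * flow * (M - m_eq).
    Returns a list of (time, reference, moisture, flow) tuples."""
    m, t = m0, 0.0
    history = []
    while t < duration:
        ref = reference_moisture(t, m0, target, duration)
        flow = p_controller(m, ref, kp, base_flow, flow_min, flow_max)
        m += -rate_coeff * flow * (m - m_eq) * dt  # explicit Euler step
        t += dt
        history.append((t, ref, m, flow))
    return history

history = simulate()
final_time, final_ref, final_moisture, final_flow = history[-1]
print(f"final moisture = {final_moisture:.2f}% (target 6.5%)")
```

Changing `duration` mimics the shortened and extended drying schedules. In this toy model a small residual offset from the reference remains at the endpoint, which is typical of proportional-only control; how the published system handles that offset is not stated in the abstract.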
Increased right ventricular (RV) radiotracer uptake on perfusion imaging has been recognized as a marker of increased cardiovascular risk. However, this uptake is challenging to quantify because of the variable intensity of uptake in a thin structure. We used a validated artificial intelligence-enhanced method for segmenting the right ventricle from CT attenuation correction (CTAC) imaging to automatically quantify RV activity and then evaluated its prognostic significance. Methods: We evaluated consecutive patients from 11 sites who underwent PET myocardial perfusion imaging with available CTAC. We segmented the RV and left ventricular myocardium from CTAC images using deep learning and then quantified RV activity measures on coregistered PET images. We evaluated associations between RV activity measures and the incidence of death or myocardial infarction (MI). Results: In total, 25,444 patients were included in our analysis (median age, 67 y). During a median follow-up of 4.1 y, 6009 patients (23.6%) experienced death or MI. Most RV activity measures were associated with the risk of death or MI. Higher maximum RV rest activity was associated with an increased risk of death or MI (unadjusted hazard ratio, 1.17 per SD for 13N-ammonia and 1.19 per SD for 82Rb). These associations persisted after adjusting for age, sex, medical history, perfusion, function, and myocardial flow reserve. Conclusion: Deep learning can extract RV activity from hybrid PET/CT myocardial perfusion imaging. These measures are associated with myocardial flow reserve and provide complementary information regarding cardiovascular risk.
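A hazard ratio reported per SD can be rescaled to other increments under the usual Cox model assumption that the log hazard is linear in the covariate. The sketch below is illustrative arithmetic only: the function name is hypothetical, and the 2-SD figure is a worked example rather than a result reported by the study.

```python
import math

def hr_for_shift(hr_per_sd, n_sd):
    """Hazard ratio implied for a shift of `n_sd` standard deviations,
    assuming the log hazard is linear in the covariate."""
    return math.exp(n_sd * math.log(hr_per_sd))

# Per-SD hazard ratio quoted in the abstract for 13N-ammonia.
hr_two_sd = hr_for_shift(1.17, 2)
print(f"implied HR for a 2-SD difference: {hr_two_sd:.2f}")
```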
Artificial intelligence (AI), and specifically deep learning (DL) models, are rapidly gaining traction in healthcare to analyze complex medical images and support clinical decision-making. However, DL models are often considered black boxes due to the lack of a clear explanation when providing predictions. Explainable artificial intelligence (XAI) methods are emerging as an effective way to make models explainable for developers and provide interpretable outputs for clinicians. This review presents a taxonomy of the most widely used XAI methods for image classification, with related benefits and drawbacks. Furthermore, it examines whether the type of classifier affects the choice of an explainability technique and investigates the impact of black boxes on the healthcare environment. The analysis considered papers published between January 2020 and July 2025 in Scopus and Google Scholar, utilizing the PRISMA guidelines to enhance reporting. Sixty-nine papers were identified as suitable for classifying XAI methods in four categories based on backpropagation, perturbation, attention, and concept. The results show increased use of backpropagation-based techniques, which offer simple and intuitive heatmaps. Perturbation-based methods are frequently employed to validate model robustness, but they are computationally expensive. Finally, concept-based and attention-based approaches are less widespread but represent a promising solution towards explanations that align with human semantics and reflect the intrinsic model behavior. Future research should focus on combined approaches and concept methods that generate explanations in the same semantic field as clinicians and are computationally suitable for healthcare environments, paving the way for transparent and clinically reliable DL systems.
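To make the cost of perturbation-based methods concrete, the following minimal sketch implements occlusion sensitivity, one common perturbation technique, in plain Python. It is an illustration rather than a method taken from the reviewed papers: `toy_classifier` is a hypothetical stand-in for a trained model, and the patch size and baseline value are arbitrary choices.

```python
def occlusion_map(image, predict, patch=2, baseline=0.0):
    """Replace each patch with `baseline` and record the drop in the model
    score; a larger drop marks a more important region.
    Returns (heatmap, number_of_model_calls)."""
    h, w = len(image), len(image[0])
    base_score = predict(image)
    heat = [[0.0] * w for _ in range(h)]
    n_calls = 1  # the unperturbed forward pass
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            occluded = [row[:] for row in image]  # copy before perturbing
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline
            drop = base_score - predict(occluded)
            n_calls += 1  # one extra forward pass per patch
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat, n_calls

def toy_classifier(image):
    """Hypothetical model: score is the mean intensity of the top-left 2x2 region."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0

image = [[1.0] * 4 for _ in range(4)]
heat, n_calls = occlusion_map(image, toy_classifier, patch=2)
print(heat[0][0], heat[3][3], n_calls)
```

The model is called once per patch in addition to the unperturbed pass, so the cost grows with image area divided by patch area. This is the source of the computational expense noted above, in contrast to backpropagation-based heatmaps, which typically need a single forward and backward pass.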