Artificial intelligence (AI) is increasingly used to enhance diagnostic accuracy, clinical decision-making, and health system efficiency. However, its sustainable and equitable deployment in low-resource settings (LRS) remains limited. In many low- and middle-income countries (LMICs), digital health efforts are still held back by weak infrastructure, fragmented health data, limited local skills, and gaps in governance. Bringing together lessons from existing evidence and practical, real-world solutions is essential for supporting digital health approaches that are fair, workable, and sustainable over time. Following the PRISMA-ScR framework, a scoping review was conducted of peer-reviewed literature published between January 2015 and January 2026. Searches were performed across PubMed, Scopus, Web of Science, IEEE Xplore, and Google Scholar. Eligible studies examined medical AI deployment, implementation barriers, or enabling strategies within LMIC healthcare settings. Data were extracted and analyzed thematically across four domains: digital infrastructure and connectivity, data quality and local capacity, ethics and governance, and policy and sustainability, guided by a human-centered implementation perspective and JBI methodological guidance. A total of 44 studies met the inclusion criteria. The analysis showed that making AI work in low-resource settings is less about advanced technology and more about having the right systems in place. Common problems included unreliable electricity and internet access, messy or incomplete data, limited familiarity with AI among healthcare workers, and a lack of clear rules to guide its use. 
Reported enabling strategies focused on investments in resilient digital infrastructure, adoption of interoperable data standards (e.g., HL7/FHIR), continuous capacity-building programs, fairness and bias auditing mechanisms, and integration of AI governance within national digital health and e-health policies supported by sustainable financing models. Sustainable and equitable deployment of medical AI in LMICs requires embedding human-centered values (transparency, accountability, privacy, and equity) throughout the AI lifecycle. Aligned with the WHO (2021) and UNESCO (2021) AI ethics frameworks, this review underscores that meaningful innovation in digital health depends on augmenting, rather than replacing, human judgment through context-aware and trustworthy AI systems. However, this scoping review is limited by the inclusion of only English-language studies and by the heterogeneity of the included studies, which precluded quantitative synthesis.
Large, diverse datasets are essential for reliable deep learning in mammography, yet clinical data remain siloed due to privacy and governance constraints. Federated learning enables collaborative training without sharing raw data, but its robustness under strong imaging-domain heterogeneity, such as film-digital shifts, remains uncertain. We conducted a comparative evaluation of centralized learning and cross-silo federated learning for benign-malignant lesion classification across two heterogeneous public datasets: CBIS-DDSM (scanned film) and VinDr-Mammo (full-field digital). Using ResNet-50 and Swin V2-T backbones, we evaluated FedAvg, FedProx, SCAFFOLD, and FedBN across multi-seed experiments with bootstrap confidence intervals. The study design included local-only baselines, homogeneous FL controls, size-balancing ablations, and a resolution ablation (224→324 px). Performance was assessed using AUROC, AP, Accuracy, Precision, Recall, F1, and Precision@Recall = 0.90. Federated models matched centralized learning in homogeneous settings for both domains. Under film-digital heterogeneity, FL retained strong performance on the digital VinDr domain (AUROC ≈ 0.91-0.95) but showed reduced performance on the film-based CBIS domain (AUROC ≈ 0.53-0.62), exhibiting a shift toward high-recall/low-precision behavior. None of FedProx, SCAFFOLD, or FedBN consistently mitigated this degradation. Size-balancing improved CBIS performance modestly but did not close the gap to centralized learning, indicating that feature and quality shift dominated over dataset-size imbalance. Higher input resolution improved CBIS calcification detection (e.g., F1 0.49 → 0.54). These findings show that FL performs reliably within homogeneous domains but remains vulnerable to strong feature and quality shifts between film and digital mammography. The observed asymmetric performance suggests that domain shift, rather than data quantity or optimizer instability, is the dominant limiting factor.
Federated learning enables high-performing mammography classification without data centralization in homogeneous settings but requires domain-aware or personalized FL strategies, site-specific thresholding, and resolution-sensitive preprocessing to ensure reliable deployment under film-digital heterogeneity.
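As a point of reference for the aggregation strategies compared above, the FedAvg baseline reduces to a dataset-size-weighted average of client model parameters at each round. The sketch below is illustrative only (toy single-layer "models" and hypothetical site sizes, not the authors' implementation):

```python
import numpy as np

def fedavg_round(client_weights, client_sizes):
    """One FedAvg aggregation step: average client model parameters,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    # Each client's weights: a list of numpy arrays (one per layer).
    avg = [np.zeros_like(layer) for layer in client_weights[0]]
    for weights, size in zip(client_weights, client_sizes):
        for i, layer in enumerate(weights):
            avg[i] += (size / total) * layer
    return avg

# Toy example: two silos (e.g., a film and a digital site) with
# unequal dataset sizes and single-layer models (hypothetical values).
site_a = [np.array([1.0, 1.0])]   # smaller site, 100 samples
site_b = [np.array([3.0, 3.0])]   # larger site, 300 samples
global_model = fedavg_round([site_a, site_b], [100, 300])
print(global_model[0])  # weighted average: 0.25*1 + 0.75*3 = 2.5 per entry
```

FedProx, SCAFFOLD, and FedBN modify the local update or exclude certain layers (e.g., batch-norm statistics) from aggregation, but share this server-side averaging core.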
Peer-to-peer sharing of personal health data on social media is increasingly used as a strategy to support public health goals. Such sharing is often assumed to motivate individuals to adopt or maintain healthy behaviors. However, the social and ethical implications of sharing-based interventions remain insufficiently examined. This paper offers an empirical and theoretical contribution by foregrounding the socio-technical contexts of sharing and analyzing how sharing-based interventions may drive social change. Building on these insights, it also outlines ethical considerations for researchers and stakeholders. We conducted 22 semi-structured interviews with participants in a regional public health intervention in Sweden. Interviews focused on participants' experiences of receiving personal health data and their reflections on sharing such data on social media. Analysis was guided by reflexive thematic analysis and informed by theoretical perspectives on the socio-technical embeddedness of health data and sharing practices. Participants understood health data as both personal and communal. Although many expressed discomfort with disclosing sensitive health information online, the peer-to-peer sharing model fostered a perceived moral obligation to share data for collective benefit. The tension between personal boundaries and perceived communal obligations raises important ethical concerns, particularly when individuals feel pressured to share data they would prefer to keep private. Our findings underscore the need for ethical frameworks that address social pressures, consent, and the emotional dimensions of data sharing. To support sustainable and ethical public health practices, further qualitative research is essential, particularly to understand how individuals navigate obligations and risks in technology-mediated care, and how these dynamics shape values such as autonomy, well-being, and collective responsibility.
The increasing use of electronic health records (EHRs) for real-world evidence (RWE) studies is hindered by substantial heterogeneity in data collection practices and local coding schemes across healthcare providers. Data standardization, particularly the mapping of locally defined medical concepts to standardized vocabularies, is therefore a critical but labour-intensive step, traditionally relying on extensive manual review by clinical experts. While a range of machine-learning (ML) approaches have been proposed to support medical concept mapping, their integration into practical, end-to-end workflows and their performance under real-world conditions remain insufficiently studied. In this work, we present ArcMAP, an end-to-end application that integrates a state-of-the-art biomedical representation model (BioLORD) into a human-in-the-loop workflow designed to streamline and accelerate medical concept mapping. ArcMAP provides a graphical user interface that enables clinical experts to efficiently review, validate, and correct automated mapping suggestions. A core component of the system is a continuous learning pipeline, in which expert feedback is systematically captured and used to update the underlying model, allowing ArcMAP to adapt to evolving coding practices and newly onboarded data sources. We conduct a comprehensive evaluation of ArcMAP across multiple deployment scenarios, including the impact of continuous fine-tuning, the onboarding of a new hospital, and a longitudinal real-world evaluation conducted over a two-month period using medication and laboratory test data from five UK-based NHS hospitals. Our results demonstrate the importance of domain-specific fine-tuning, with top-1 accuracy for laboratory test names increasing from 37.0% to 91.6%. However, when simulating the onboarding of a new hospital, the system achieves a weighted average top-1 accuracy of only 73.5%, indicating substantial variability across NHS hospital systems.
In real-world use, ArcMAP showed increased mapping efficiency compared with manual workflows, while also revealing considerable variation across individual data-mapping sessions.
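The retrieval step at the heart of such embedding-based concept mapping, ranking standardized codes by similarity to a local term's embedding and surfacing the top candidates for expert review, can be sketched as follows. The embedding vectors, helper names, and toy vocabulary below are stand-ins for illustration, not BioLORD's or ArcMAP's actual interfaces:

```python
import numpy as np

def top_k_candidates(term_vec, code_vecs, code_ids, k=3):
    """Rank standardized codes by cosine similarity to a local term's
    embedding; a human expert then validates or corrects the top hit."""
    sims = code_vecs @ term_vec / (
        np.linalg.norm(code_vecs, axis=1) * np.linalg.norm(term_vec))
    order = np.argsort(-sims)[:k]
    return [(code_ids[i], float(sims[i])) for i in order]

# Toy vocabulary: made-up 3-d embeddings for three lab-test concepts.
code_ids = ["LOINC:718-7", "LOINC:2345-7", "LOINC:2951-2"]  # Hb, glucose, sodium
code_vecs = np.array([[1.0, 0.1, 0.0],
                      [0.0, 1.0, 0.1],
                      [0.1, 0.0, 1.0]])
term_vec = np.array([0.9, 0.2, 0.0])   # local term, e.g. "Haemoglobin (g/dL)"
suggestions = top_k_candidates(term_vec, code_vecs, code_ids)
print(suggestions[0][0])  # best match: "LOINC:718-7"
```

In a continuous-learning setup like the one described, expert corrections would be logged as (term, validated code) pairs and periodically used to fine-tune the embedding model.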
Large Language Models (LLMs) are transforming back-office quality management processes in European healthcare systems through automation of compliance monitoring, quality assurance, and process optimization without direct patient interaction. This narrative review synthesizes evidence from recent systematic reviews and implementation studies (2023-2025) examining LLM deployment within the European regulatory framework encompassing the Medical Device Regulation (MDR), General Data Protection Regulation (GDPR), and the EU Artificial Intelligence Act (Regulation EU 2024/1689). Current research demonstrates meaningful efficiency gains: individual studies of AI-assisted documentation tools report improvements ranging from modest increases in documentation speed to reductions in processing time approaching 50%, while broader policy analyses estimate administrative workload reductions of up to 30% through digital health and AI solutions. Clinical trial applications show particular maturity, with LLM-generated informed consent forms demonstrating improved readability (76% vs. 67%) without compromising accuracy. However, critical gaps persist between research achievements and practical deployment. Analysis of 519 evaluation studies reveals that only 5% utilized real patient care data, while 95% focused on accuracy metrics, with far less attention to fairness (16%), deployment readiness (5%), and calibration (1%). No LLM-based quality management system has yet received regulatory clearance, and implementation science frameworks remain underdeveloped. We propose a risk-stratified implementation framework emphasizing process-oriented applications (standard operating procedure automation, audit documentation, deviation management, and compliance monitoring) that avoid medical device classification while capturing substantial operational benefits.
Advanced methodological approaches including retrieval-augmented generation (RAG) architectures, digital twin integration, and natural language processing-based pattern recognition offer pathways toward comprehensive quality intelligence platforms. The convergence of LLMs with emerging technologies such as knowledge graphs, digital twin architectures, and multimodal analysis creates opportunities for predictive quality management that anticipates rather than merely documents quality-relevant events. Evidence supports deployment in administrative quality processes, with particular potential for applications that redirect human expertise from documentation toward quality improvement activities, though current evidence derives predominantly from non-European healthcare contexts and simulated or limited-scope settings. Success requires adapted validation methodologies addressing LLM non-determinism, robust governance structures, and comprehensive change management that maintains the high standards European healthcare systems demand.
Connected drug delivery devices such as combination products that integrate traditional drug delivery systems with digital connectivity features represent an opportunity to improve treatment outcomes and disease management. This online survey study was conducted to explore the evolving landscape of digitally connected subcutaneous (SC) drug delivery devices, including the perspectives of pharmaceutical stakeholders regarding the promise of these technologies, particularly in relation to the expansion of traditional mobile companion applications and their integration in drug delivery systems. A total of 80 employees of pharmaceutical, biotechnology, or digital health companies with primary roles in medical affairs, commercial, combination product development, or digital health who had experience working on SC drug-device combination products completed the survey. Survey questions explored the value propositions of connected SC drug delivery devices for patients, providers, and payers; barriers to the adoption of these technologies; and strategies for gaining internal support for connected healthcare initiatives. Responses demonstrated that industry professionals recognize the potential value of connected SC drug delivery devices and associated companion mobile applications and are investing in bringing them to market. Nearly all respondents (97.5%) reported that connectivity is at least moderately important to achieving key objectives, including acquiring real-world data, improving medication adherence, and enhancing ease-of-use for patients. Equal potential value was noted for using connectivity in clinical trial and commercial settings. Indications in oncology and endocrinology were considered to be the most likely to benefit from connected SC drug delivery devices.
Key barriers to the adoption of connected SC drug delivery devices were development cost, data security, and patient and payer acceptance, while generating evidence of internal and external value was noted as a significant barrier to gaining company endorsement. These results should guide strategies for the effective integration of connected healthcare solutions within the pharmaceutical sector.
Virtual Reality (VR) has evolved from entertainment to a versatile platform for clinical and public health innovation. In medicine, VR supports pain management, rehabilitation, and cognitive training, and shows growing promise for addressing chronic diseases linked to modifiable risk factors. To support this expansion, we introduce the Transcend Framework (Translational Engineering of Behavioral Interventions into Immersive Contexts for Engagement and Design), a systematic model for adapting evidence-based behavioral interventions into VR platforms, illustrated by Joviality™, a positive psychological intervention designed for use during hemodialysis. The aim of this paper is to outline a clear, reproducible process for translating behavioral interventions into immersive digital formats that supports broader research, clinical, and implementation applications. The framework comprises five stages: (1) identifying the target population; (2) assessing feasibility and adapting the curriculum for VR; (3) pre-production planning, including storyboarding and design specification; (4) previsualization and asset creation of immersive environments; and (5) iterative VR development and testing to refine usability, accessibility, and engagement. Each stage emphasizes user-centered design and attention to physical limitations, cognitive load, and accessibility to ensure feasibility and effectiveness. Interactive, visually rich, modular environments foster engagement, while gamified activities enhance experiential learning and skill acquisition, and culturally attuned content ensures inclusivity. Continuous, data-informed refinement guided by end-user feedback ensures usability and sustained engagement.
This methodological framework provides a practical roadmap for developing and optimizing VR-based behavioral health interventions and demonstrates how immersive technology can advance health education, promote behavior change, and enable scalable, equitable implementation across clinical contexts.
Artificial intelligence (AI) has the potential to transform rural healthcare delivery through automated monitoring, personalised care, and virtual support. Yet the future pathways for AI in rural contexts remain underexplored. Most AI applications are developed in urban-centric environments with limited consideration for infrastructure constraints, workforce realities, and sociocultural dynamics that shape rural healthcare delivery. This study examined stakeholder perspectives on the future role of AI in rural healthcare, identifying key priorities, facilitators, and barriers to adoption. Using a participatory research approach incorporating horizon scanning and foresight methods, data were collected during a structured workshop at the South Australian Rural Health Research and Education Conference. Forty participants, including general practitioners, clinicians, medical students, researchers, and healthcare administrators, engaged in four sequential activities: historical events mapping, future event possibilities, experiential future scenarios, and priority setting using the MoSCoW framework. Written responses were systematically transcribed and analysed using reflexive thematic analysis. Four prominent themes emerged capturing stakeholder priorities and the guardrails they considered essential for future technological integration. These themes related to opportunities from AI and technology deployment for rural and remote equity, people at the centre of care, ethical challenges, and funding and systems issues. Participants acknowledged AI's potential to reduce geographical barriers and improve access to healthcare services, while also raising concerns about data privacy, governance, cultural appropriateness, and the risk of technology exacerbating existing health disparities. 
Across activities, participants expressed a strong preference for AI that supports rather than replaces human clinicians, and emphasised the importance of maintaining person-centred care, human connection, and local knowledge. This study shows how futures-oriented, participatory methods can surface both the promise and the constraints of AI in rural healthcare. Successful implementation requires co-design with rural communities, equity-driven approaches, transparent governance frameworks, and investment in infrastructure and workforce capacity so that future technology adoption supports, rather than exacerbates, existing health disparities.
This exploratory, two-arm, randomized, unblinded, controlled, multicentre study assessed the health benefits of the INKA app, an MDR class I CE-marked digital therapy companion for patients with overactive bladder (OAB) and mixed incontinence (MI). INKA offers self-guided educational, behavioural, and motivational content, along with physiotherapy modules, and supports daily self-management in accordance with current clinical guidelines. A total of 251 patients under first-line stable pharmacological treatment were recruited at 35 study sites in Germany and randomized to receive access to the INKA app or standard of care alone (the control group). Self-assessed OAB-related endpoints were investigated at baseline and after 4 and 12 weeks. The end-of-study visit was conducted on site. Among 111 evaluable patients (43 INKA, 68 control), baseline characteristics were comparable (mean age 52.7 years, SD 14.6; 27% male, 73% female). Overall, 55% of INKA users engaged with the app daily. At 12 weeks, the INKA group showed a mean reduction of -1.02 (SD 3.36) micturitions per 24 h compared to +0.08 (SD 2.97) in the control group. Significant and clinically meaningful improvements were observed in female INKA users and those with heightened symptom severity. A significant mean increase in urine volume per micturition was noted in the INKA group (+15.75 mL, SD 49.74) vs. the control group (-8.84 mL, SD 52.14), in "OAB wet" and in the female subgroup. The ICIQ-OAB questionnaire results indicated favourable outcomes for all groups, with all INKA patients and the female subgroup showing clinically relevant symptom relief. Additionally, greater improvement on the ICIQ-OABqol questionnaire was reported for the INKA group (-12.5, SD 20.17) vs. the control group (-7.89, SD 20.15). No INKA-related adverse events or device deficiencies were reported.
This proof-of-concept study highlights the potential of the INKA mobile app to reduce micturition frequency and increase micturition volume in therapy-refractory OAB patients, both recognized as key factors of OAB symptom burden. A forthcoming trial will evaluate an optimized and more user-friendly version of the app in patients with higher baseline symptom severity. German Clinical Trials Register (DRKS ID 00029329).
Clinical simulation with digital patients represents an emerging educational technology delivered through software-based platforms, accessible via computers or head-mounted displays. It is characterized as a partially immersive, screen-mediated experience in which learners are placed in simulated roles that require psychomotor actions, clinical decision-making, and interpersonal communication skills. This scoping review protocol follows the methodological guidance of the Joanna Briggs Institute and adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews (PRISMA-ScR). Comprehensive searches will be conducted across the following electronic databases: MEDLINE/PubMed, Embase, Web of Science, Education Resources Information Center (ERIC), and the Cumulative Index to Nursing and Allied Health Literature (CINAHL). In addition, gray literature sources will be explored through national and international repositories, including the Catalogue of Theses and Dissertations of the Coordination for the Improvement of Higher Education Personnel (CAPES), the Electronic Theses Online Service (EthOS), the Open Access Scientific Repository of Portugal (RCAAP), the National ETD Portal, Theses Canada, the Portal de Tesis Latinoamericanas, and WorldCat Dissertations and Theses. The review seeks to address the question: "What evidence exists regarding the use of clinical simulation with digital patients in the teaching and learning process of nursing students?" Eligible sources will include studies with full-text availability, encompassing peer-reviewed research articles, theses, dissertations, and other relevant documents, without restrictions related to geographic location, publication date, or language. Data will be charted using a customized extraction form based on Joanna Briggs Institute recommendations.
Quantitative findings will be summarized using descriptive statistical methods, while qualitative evidence will be examined through thematic analysis. Given the methodological nature of this study, formal ethical approval is not required. The findings are intended for dissemination through publication in a peer-reviewed journal and presentation at scientific conferences. To promote transparency and reinforce the originality of the review, this protocol has been prospectively registered on the Open Science Framework (OSF; DOI: 10.17605/OSF.IO/GAXR6).
Intracranial hemorrhage (ICH) is a life-threatening medical emergency requiring rapid and accurate diagnosis. Non-contrast computed tomography (CT) remains the primary imaging modality for detecting acute hemorrhage. In recent years, machine learning (ML) and deep learning (DL) approaches have gained increasing attention for automated detection and classification of ICH and its subtypes. This systematic review aims to consolidate and critically analyze contemporary machine learning and deep learning methodologies applied to ICH detection and classification from non-contrast CT scans. A comprehensive review of published studies was conducted focusing on ML and DL models developed for identifying ICH and its subtypes, including epidural, subdural, intraparenchymal, intraventricular, and subarachnoid hemorrhages. The reviewed techniques encompass conventional convolutional neural networks (CNNs), three-dimensional CNNs, hybrid and ensemble frameworks, and emerging transformer-based architectures. Preprocessing strategies such as Hounsfield Unit windowing, skull stripping, and data augmentation were examined. Additionally, explainable artificial intelligence (XAI) approaches, including Grad-CAM, were evaluated for enhancing model interpretability. Recent studies demonstrate promising diagnostic performance across multiple deep learning architectures, with improved sensitivity and specificity for subtype classification. Hybrid and transformer-based models show enhanced feature representation capabilities. Preprocessing techniques and explainability methods contribute significantly to model robustness and clinical interpretability. Machine learning and deep learning models exhibit substantial potential in automated ICH detection and classification from non-contrast CT scans. However, challenges remain regarding generalizability, dataset heterogeneity, and clinical validation. 
Future research should emphasize large-scale multi-center validation, model interpretability, and integration into real-world clinical workflows to enable effective translation into routine neuroimaging practice.
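Among the preprocessing strategies surveyed, Hounsfield Unit windowing is simple enough to sketch concretely. The example below uses a common brain window (level 40 HU, width 80 HU); the function name and toy slice values are illustrative, not drawn from any reviewed study:

```python
import numpy as np

def hu_window(ct_slice, level=40, width=80):
    """Clip a CT slice to a Hounsfield Unit window and rescale to [0, 1].
    Level 40 / width 80 is a common brain window in which acute
    hemorrhage (roughly 50-100 HU) stands out against parenchyma."""
    lo, hi = level - width / 2, level + width / 2
    windowed = np.clip(ct_slice.astype(np.float32), lo, hi)
    return (windowed - lo) / (hi - lo)

# Toy 2x2 slice: air (-1000), white matter (~25), acute blood (~70), bone (1000)
slice_hu = np.array([[-1000.0, 25.0], [70.0, 1000.0]])
out = hu_window(slice_hu)
# Air and bone saturate at 0 and 1; tissue and blood map inside (0, 1),
# e.g. 25 HU -> 25/80 = 0.3125 and 70 HU -> 70/80 = 0.875.
```

Multiple windows (e.g., brain, subdural, bone) are often stacked as input channels so the network sees complementary contrast ranges.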
Robotics and technological interventions are increasingly being explored as solutions to improve rehabilitation outcomes, but their implementation in clinical practice remains very limited. Understanding patient needs is crucial for effective integration of these technologies, ensuring they align with and address the actual requirements of individuals in clinical settings. The primary aim of this study is to explore the rehabilitation needs of adults with motor, sensory, and/or cognitive disabilities in order to more effectively guide the practice of technological and robotic interventions in clinical settings. To this end, as part of the Fit for Medical Robotics Initiative, we conducted a survey targeting adult patients recruited from clinical centers participating in the Initiative. It aimed to provide a clear understanding of the patients' rehabilitation priorities, as well as perceived efficacy and satisfaction levels with robotic and traditional rehabilitation, in order to better inform trials on the use of robots and technologies in individuals with disabilities from a patient-centered perspective. The survey was structured on the basis of the International Classification of Functioning, Disability, and Health framework. There were 424 respondents representing a range of conditions, including stroke, Parkinson's disease, multiple sclerosis, neuromuscular disorders, and other motor and cognitive impairments. Notably, 86% of respondents reported undergoing traditional rehabilitation, while 39% had (also) experienced robotic interventions, highlighting limited accessibility to advanced rehabilitation technologies. Additionally, respondents expressed a significant need for multidomain rehabilitation, with movement being the most prioritized domain. The degree of satisfaction was higher among respondents receiving technological interventions, particularly in addressing mobility.
Furthermore, a substantial proportion of respondents indicated a strong need for home-based care. The patient needs identified through the survey were fundamental for designing pragmatic clinical trials, whose results will help shape rehabilitation services based on new and innovative models.
In India, untreated depression among women contributes significantly to morbidity and mortality, underscoring an urgent need for accessible and ethically grounded mental health interventions. Mobile health (mHealth) tools offer scalable solutions; however, their implementation in low- and middle-income country (LMIC) settings raises important bioethical considerations. This study was conducted at the conclusion of a pilot randomized controlled trial evaluating the MITHRA app (Multiuser Interactive Health Response Application), designed for depression screening and treatment among women participating in self-help groups (SHGs) in rural Karnataka, India. Two focus-group discussions were conducted with intervention participants to explore ethical dimensions of app use, including technological proficiency, privacy, informed consent, connectivity, accessibility, and gender-specific interactions. Transcripts were analyzed using thematic coding to identify recurring patterns. Participants preferred a hybrid care model combining mobile app use with human interaction. Technological proficiency varied, and participants demonstrated uncertainty regarding mental health app functionality and limited understanding of privacy policies. In the collectivist cultural context of rural India, autonomy and informed consent were often expressed relationally, shaped by family and community dynamics rather than individual decision-making alone. These findings highlight the need to tailor digital mental health interventions to user preferences and local sociocultural contexts. Ethical implementation in rural LMIC settings requires enhanced transparency around data use, culturally aligned consent processes, and integration of ethical, technological, and relational considerations to improve accessibility, trust, and acceptability.
Artificial intelligence (AI) has the potential to revolutionize healthcare delivery in low- and middle-income countries (LMICs), yet its rapid adoption raises complex ethical, regulatory, and implementation challenges. This review investigates these barriers and identifies emerging strategies that support equitable and inclusive AI deployment in resource-limited settings. Following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) guidelines, a systematic mapping of literature was conducted using PubMed, Scopus, and Cochrane Library (2000-2025) alongside global health policy reports. The search was framed using the Population, Concept, and Context (PCC) framework to identify studies addressing AI governance in LMICs. A total of 60 sources addressing ethical, regulatory, or implementation issues were analyzed across three domains derived from the WHO and OECD frameworks: governance, privacy, and AI applications. This study reveals that 7.4% of LMICs have adopted national AI strategies. Evidence indicates that over 60% of AI models in LMICs rely on non-representative datasets, increasing contextual bias. Of the 60 included studies, 25 focused on ethics, 17 on regulatory gaps, and 18 on implementation. Findings highlight workforce readiness gaps, with fewer than 10% of institutions offering structured AI training. Case studies from Brazil and India illustrate how these barriers are addressed through context-sensitive design. Successful AI integration requires context-sensitive design, participatory governance, and capacity building. This scoping review identifies critical gaps in empirical research on operationalization and recommends a transition from digital dependency to local innovation ecosystems.
Biomedical data integration requires term-to-identifier normalization, the process of linking natural-language biomedical terms to standardized ontology codes so that extracted concepts become computable and interoperable. Although large language models perform well on clinical text summarization and concept extraction, they remain markedly less accurate at mapping ontology terms to their corresponding identifiers. We examined the roles of memorization and generalization in term-to-code mapping across the Human Phenotype Ontology (HPO), the Gene Ontology (GO), and the HGNC gene naming system, including mappings between gene names, lexicalized gene symbols, and arbitrary gene identifiers. Performance was assessed across multiple base models and after task-specific fine-tuning. Accuracy scaled with model size, with GPT-4o outperforming Llama 3.1 70B and Llama 3.1 8B. Fine-tuning improved forward mappings from term to identifier, with larger gains for GO than for HPO and minimal improvement for gene name-to-HGNC identifier mappings. Generalization to withheld mappings occurred primarily for HGNC gene name-to-gene symbol tasks, whereas fine-tuning on HPO and GO identifiers produced little generalization. Embedding analyses revealed strong semantic alignment between gene names and HGNC gene symbols but no comparable alignment between concept names and identifiers in GO, HPO, or HGNC. These results suggest that fine-tuning success depends on two interacting factors: popularity and lexicalization. Popularity, a proxy for pretraining exposure to term-identifier pairs, predicted baseline accuracy and the magnitude of memorization gains during fine-tuning, whereas long-tail identifiers remained difficult to consolidate. Lexicalization, the extent to which a symbol functions as a meaningful token in embedding space, enabled generalization and explains why generalization emerged for HGNC gene symbols but not for the arbitrary identifiers used in GO and HPO. 
Together, these findings provide a predictive framework for identifying when fine-tuning can improve factual term normalization, when gains primarily reflect memorization, and when normalization is likely to fail.
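The embedding analysis described above boils down to comparing vector similarity for lexicalized symbols versus opaque identifiers. The sketch below illustrates that comparison with cosine similarity; the vectors and term pairs are invented toy values, not outputs of any model used in the study.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy embeddings: a lexicalized gene symbol ("BRCA1") sits near its
# gene name in embedding space, while an arbitrary ontology identifier
# ("HP:0001250") does not sit near its concept name ("Seizure").
emb = {
    "breast cancer gene 1": [0.9, 0.1, 0.2],
    "BRCA1":                [0.85, 0.15, 0.25],  # lexicalized symbol
    "Seizure":              [0.1, 0.9, 0.3],
    "HP:0001250":           [0.5, 0.2, 0.8],     # opaque identifier
}

lexicalized = cosine(emb["breast cancer gene 1"], emb["BRCA1"])
opaque = cosine(emb["Seizure"], emb["HP:0001250"])
print(f"name vs symbol:     {lexicalized:.3f}")
print(f"name vs identifier: {opaque:.3f}")
```

Under this toy setup the name-to-symbol pair scores far higher than the name-to-identifier pair, mirroring the reported alignment for HGNC symbols but not for GO/HPO codes.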
Generative artificial intelligence (GenAI) is becoming an important tool in medical product development. A main component of this development includes annotating, summarizing, and extracting key insights from expert interviews to identify clinical pain points and curate device requirements. These tasks are time- and labor-intensive, resulting in increased administrative burden and reduced efficiency. As a result, researchers have developed large language models (LLMs) that can distill research and interview findings with reduced workload and improved productivity. This study explores the use of GenAI, specifically GPT-4o, to extract user functional and design requirements from medical professional interviews for the iterative development of an infant heart rate detector for neonatal resuscitation. A total of 29 healthcare practitioners were interviewed using a semistructured interview format. The interviews were recorded and transcribed. GPT-4o was used to extract user insights from the transcripts, and the results were compared with manual interviewer notes. A total of 26 h of interview data were collected. All interviewees validated the clinical need for a modality that enables quick and accurate heart rate (HR) measurement during neonatal resuscitation. A set of user requirements was extracted from the interviews and curated under the themes of ease of use, fast and accurate HR measurement, reusability, display, battery life, start-up time, and cost. In addition, quantitative analyses of the interviewees' years of experience, clinical settings, and specialties were conducted. These analyses were conducted using GPT-4o and compared with ground-truth manual annotations to determine the accuracy and reliability of GenAI in content extraction and summarization. Overall, this study explored the user requirements identified through in-depth interviews for the development of a pediatric medical device.
It also aimed to demonstrate the potential of GenAI in curating these design requirements, offering a framework for researchers and product designers to explore the use of LLMs in curating user requirements and design specifications for medical devices.
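One simple way to score the AI-versus-manual comparison described above is set overlap between extracted requirement labels. The sketch below is a minimal illustration; the requirement names and the precision/recall framing are assumptions for the example, not the study's actual evaluation protocol.

```python
def precision_recall_f1(extracted, reference):
    """Compare AI-extracted requirement labels against manual notes."""
    extracted, reference = set(extracted), set(reference)
    tp = len(extracted & reference)  # labels both sources agree on
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical labels for a single interview transcript.
ai_labels = {"ease of use", "fast HR measurement", "reusability", "cost"}
manual_labels = {"ease of use", "fast HR measurement", "display", "cost"}

p, r, f1 = precision_recall_f1(ai_labels, manual_labels)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Averaging such per-transcript scores over all 29 interviews would give one overall measure of how reliably the model reproduces the manual annotations.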
Effective cancer care increasingly depends on digital decision support tools (DSTs) to interpret complex clinical, molecular, and genomic data and guide personalised treatment decisions. However, the oncology DST (oncDST) landscape remains fragmented, with limited interoperability, inconsistent standards, and uneven clinical adoption across healthcare systems. This fragmentation hinders routine clinical use and impedes the demonstration of robust clinical benefit. To address these challenges, the CAN.HEAL consortium proposes the EU-oncDST digital framework, a conceptual, harmonised, interoperable, and modular architecture designed to integrate existing oncDSTs across Europe. Developed through consortium-wide consultations, an EU-level survey and comprehensive mapping of both public and private solutions, the framework provides a practical pathway for implementing interoperable oncDSTs while fostering stakeholder collaboration and innovation. It also promotes the improvement of data-driven precision oncology, highlighting the integration of artificial intelligence, enabling continuous patient follow-up, and supporting the development of a learning cancer system. At its core, the framework empowers Molecular Tumour Boards (MTBs) to operate efficiently at institutional, national, and European levels. By offering a harmonised, interoperable, and modular architecture designed to integrate clinical, molecular and genomic data, the framework strengthens evidence-based and personalised treatment recommendations. A phased action plan links MTB deployment to the implementation of oncDSTs. Early phases focus on piloting and validating oncDST use within MTBs, optimising patient-centred consultations, harmonising variant annotation, and enhancing clinical trial matching. 
Overall, the EU-oncDST digital framework aims to provide a practical and collaborative pathway to strengthen oncology decision-making and accelerate the translation of precision medicine into clinical benefit across Europe.
Automated documentation tools are being rapidly adopted in healthcare and clinical workflows. Among these are AI-enabled ambient scribing products, which transcribe conversations between patients and healthcare providers, then produce clinical records using automatic speech recognition (ASR) and generative AI such as Large Language Models (LLMs). While research suggests these technologies can reduce clinical burden, safe and responsible deployment requires that these tools determine what captured information is appropriate to record and under which circumstances. This presents a contextual privacy challenge distinct from PII leakage or data memorization and remains largely untested. We address this gap by operationalizing privacy leakage as the inappropriate inclusion of third-party personal information in LLM-generated clinical notes. We construct a benchmark of transcripts containing private information with gold standard clinical notes by enriching patient metadata from the aci-bench corpus and injecting third-party personal information across six relationship types and seven information topics. We evaluate open-weight Llama 3.1 8B and 70B, Mixtral 8×7B and 8×22B, and proprietary Claude 3.5 Haiku and Sonnet models on note generation using prompts with varied privacy and structural requirements. All examined models leaked third-party information, and privacy instructions helped reduce leakage but proved neither complete nor robust as a solution. Models could generate privacy-infringing notes despite correctly identifying such information as inappropriate to share. Decomposing generation and privacy editing into separate steps could further reduce leakage, but only when privacy was defined with contextual specificity. No single mitigation eliminated leakage entirely, but combining approaches yielded the greatest reductions.
Results emphasize the need to build privacy-by-design systems and develop evaluation strategies that reflect emerging information synthesis and sharing practices.
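Because the benchmark above injects known third-party facts into transcripts, leakage can be scored by checking whether those facts resurface in the generated note. The sketch below uses crude whole-word string matching as a stand-in; the injected terms and the note text are hypothetical, and the study's actual scoring may differ.

```python
import re

def find_leaks(note: str, third_party_terms: list[str]) -> list[str]:
    """Return injected third-party terms that reappear in a generated note.
    Whole-word, case-insensitive matching is a crude proxy for leakage."""
    leaks = []
    for term in third_party_terms:
        if re.search(rf"\b{re.escape(term)}\b", note, flags=re.IGNORECASE):
            leaks.append(term)
    return leaks

# Hypothetical facts injected about a third party (the patient's sister).
injected = ["Maria Alvarez", "bipolar disorder"]

note = ("Patient reports chest pain for two days. Family history: sister "
        "Maria Alvarez has bipolar disorder. Plan: ECG and troponin.")

leaked = find_leaks(note, injected)
leakage_rate = len(leaked) / len(injected)
print(leaked, leakage_rate)
```

A real evaluation would also need paraphrase-aware matching, since a model can leak "her sibling's mood disorder" without reusing any injected string verbatim.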
The increasing integration of connected medical devices and internet of things (IoT) technologies in healthcare has significantly improved patient care and operational efficiency. However, this rapid digital transformation has also introduced serious cybersecurity vulnerabilities in medical devices, posing risks to patient safety and sensitive health data. Cybersecurity threats can allow unauthorized remote access to devices, cause device malfunctions, and lead to data breaches. As medical devices become more interconnected within healthcare systems, ensuring their security has become a critical priority for regulators, manufacturers, and healthcare providers. This study examines the cybersecurity safety communications issued by the U.S. Food and Drug Administration (FDA) between 2013 and 2025, using a systematic qualitative content analysis approach. The analysis focuses on identifying the frequency of alerts, the severity of vulnerabilities, and the potential risks posed to healthcare infrastructure and patient safety. The study also reviews regulatory actions and policy frameworks introduced by the FDA to address cybersecurity risks in medical devices. The analysis found that the FDA issued 18 safety communications related to cybersecurity breaches in medical devices. Among the reported vulnerabilities, 94% were classified as high-risk, indicating severe potential consequences, including unauthorized remote access to medical devices, possible device malfunctions, and exposure of sensitive patient data. Additionally, the results demonstrate a notable increase in FDA cybersecurity safety communications over time, reflecting the growing severity and prevalence of cybersecurity threats in healthcare technologies. The findings emphasize the need for stronger cybersecurity strategies in healthcare.
Collaboration among medical device manufacturers, healthcare providers, and regulatory agencies, along with continuous monitoring and regulatory compliance, is necessary to protect patient safety and sensitive health data in an increasingly interconnected healthcare environment.
Academic mentoring plays a critical role in monitoring student progress, maintaining academic integrity, identifying early signs of risk, and delivering personalized guidance to improve learning outcomes. Traditionally, this has relied on face-to-face interactions; however, advancements in artificial intelligence (AI) have introduced new opportunities for AI-assisted mentoring. While promising, many existing AI models for student monitoring and risk identification are complex and difficult to implement in real-world academic settings. To address this challenge, the present study validates a simplified AI comentor model designed to efficiently identify at-risk students and support continuous academic monitoring focused on pedagogy. This study employed a prospective mixed-methods pilot design to evaluate the feasibility, acceptability, and analytic agreement of an AI-assisted assessment framework in medical education. Participants included approximately 40 undergraduate medical students and faculty assessors. Primary outcomes focused on implementation feasibility and acceptability, assessed using structured student and faculty surveys, system-usage metrics, and qualitative feedback. Secondary outcomes evaluated the analytic agreement between AI-derived competency profiles and faculty assessments. The AI component used unsupervised machine learning-based clustering to group students according to multidimensional performance indicators, without prior labels. Agreement was examined using confusion matrices, percentage agreement, and Cohen's Kappa, reported with confidence intervals to account for the exploratory sample size. Given the pilot nature of the study, resampling-based validation (repeated stratified k-fold cross-validation) was used to assess stability rather than definitive diagnostic accuracy. Ethical approval was obtained, and all data were deidentified before analysis. 
This study will be conducted on a cohort of 40 students from a reputed Health Sciences College, UAE, to evaluate the integrity of a proposed AI comentoring model for monitoring academic performance throughout a semester. The AI models (supervised and segmentation engines) will be tested at two time points: the 5th and 10th weeks. At each time point, categorized student performance data will be uploaded to the AI platform, based on pedagogical parameters, and used to generate a personalized text draft automatically (local pseudonymization + institutional mail merge workflow). To assess the integrity of the AI system, the investigator will perform a manual evaluation of each student's risk status at both checkpoints, alongside statistical analyses. If successful, the system may alleviate the workload of human mentors, enable timely interventions for at-risk students, and enhance overall student performance and retention.
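The agreement statistics named above (percentage agreement and Cohen's kappa over a confusion matrix) can be computed directly from cross-tabulated labels. The sketch below shows the standard calculation; the 3×3 matrix of AI-versus-faculty risk labels is invented for illustration and does not come from the study.

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix
    (rows = rater A's labels, columns = rater B's labels)."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement expected from each rater's marginal label rates.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical AI-vs-faculty labels for 40 students
# (classes: at-risk, borderline, on-track).
cm = [[8, 2, 0],
      [1, 10, 3],
      [0, 2, 14]]

kappa = cohens_kappa(cm)
pct_agree = sum(cm[i][i] for i in range(3)) / 40
print(f"percentage agreement={pct_agree:.2f}, kappa={kappa:.3f}")
```

Kappa discounts the agreement both raters would reach by chance alone, which is why it is reported alongside raw percentage agreement in pilot studies like this one.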