There are challenges that must be overcome to make recommender systems useful in healthcare settings. The reasons are varied: the lack of publicly available clinical data, the difficulty that users may have in understanding the reasons why a recommendation was made, the risks that may be involved in following that recommendation, and the uncertainty about its effectiveness. In this work, we address these challenges with a recommendation model that leverages the structure of psychometric data to provide visual explanations that are faithful to the model and interpretable by care professionals. We focus on a narrow healthcare niche, gerontological primary care, to show that the proposed recommendation model can assist the attending professional in the creation of personalised care plans. We report results of a comparative offline performance evaluation of the proposed model on healthcare datasets that were collected by research partners in Brazil, as well as the results of a user study that evaluates the interpretability of the visual explanations the model generates. The results suggest that the proposed model can advance the application of recommender systems in this healthcare nic
Demand for health care is constantly increasing due to the ongoing demographic change, while at the same time health service providers face difficulties in finding skilled personnel. This creates pressure on health care systems around the world, such that the efficient, nationwide provision of primary health care has become one of society's greatest challenges. Due to the complexity of health care systems, unforeseen future events, and a frequent lack of data, analyzing and optimizing the performance of health care systems means tackling a wicked problem. To support this task for primary care, this paper introduces the hybrid agent-based simulation model SiM-Care. SiM-Care models the interactions of patients and primary care physicians on an individual level. By tracking agent interactions, it enables modelers to assess multiple key indicators such as patient waiting times and physician utilization. Based on these indicators, primary care systems can be assessed and compared. Moreover, changes in the infrastructure, patient behavior, and service design can be directly evaluated. To showcase the opportunities offered by SiM-Care and aid model validation, we present a case study for
In the much-celebrated book Deep Medicine, Eric Topol argues that the development of artificial intelligence for health care will lead to a dramatic shift in the culture and practice of medicine. In the next several decades, he suggests, AI will become sophisticated enough that many of the everyday tasks of physicians could be delegated to it. Topol is perhaps the most articulate advocate of the benefits of AI in medicine, but he is hardly alone in spruiking its potential to allow physicians to dedicate more of their time and attention to providing empathetic care for their patients in the future. Unfortunately, several factors suggest a radically different picture for the future of health care. Far from facilitating a return to a time of closer doctor-patient relationships, the use of medical AI seems likely to further erode therapeutic relationships and threaten professional and patient satisfaction.
Depression is underdiagnosed in primary care, yet timely identification remains critical. Recorded clinical encounters, increasingly common with digital scribing technologies, present an opportunity to detect depression from naturalistic dialogue. We investigated automated depression detection from 1,108 audio-recorded primary care encounters in the Establishing Focus study, with depression defined by PHQ-9 (n=253 depressed, n=855 non-depressed). We compared three supervised approaches, Sentence-BERT + Logistic Regression (LR), LIWC+LR and ModernBERT, against a zero-shot GPT-OSS. GPT-OSS achieved the strongest performance (AUPRC=0.510, AUROC=0.774), with LIWC+LR competitive among supervised models (AUPRC=0.500, AUROC=0.742). Combined dyadic transcripts outperformed single-speaker configurations, with providers linguistically mirroring patients in depression encounters, an additive signal not captured by either speaker alone. Meaningful detection is achievable from the first 128 patient tokens (AUPRC=0.356, AUROC=0.675), supporting in-the-moment clinical decision support. These findings argue for passively collected clinical audio as a low-burden complement to existing screening wor
The medical ecosystem consists of the training of new clinicians and researchers, the practice of clinical medicine, and areas of adjacent research. There are many aspects of these domains that could benefit from the application of task automation and programmatic assistance. Machine learning and artificial intelligence techniques, including large language models (LLMs), have been promised to deliver on healthcare innovation, improving care speed and accuracy, and reducing the burden on staff for manual interventions. However, LLMs have no understanding of objective truth that is based in reality. They also represent real risks to the disclosure of protected information when used by clinicians and researchers. The use of AI in medicine in general, and the deployment of LLMs in particular, therefore requires careful consideration and thoughtful application to reap the benefits of these technologies while avoiding the dangers in each context.
Referral workflow inefficiencies, including misaligned referrals and delays, contribute to suboptimal patient outcomes and higher healthcare costs. In this study, we investigated the possibility of predicting procedural needs based on primary care diagnostic entries, thereby improving referral accuracy, streamlining workflows, and providing better care to patients. A de-identified dataset of 2,086 orthopedic referrals from the University of Texas Health at Tyler was analyzed using machine learning models built on Base General Embeddings (BGE) for semantic extraction. To ensure real-world applicability, noise tolerance experiments were conducted, and oversampling techniques were employed to mitigate class imbalance. The selected optimum and parsimonious embedding model demonstrated high predictive accuracy (ROC-AUC: 0.874, Matthews Correlation Coefficient (MCC): 0.540), effectively distinguishing patients requiring surgical intervention. Dimensionality reduction techniques confirmed the model's ability to capture meaningful clinical relationships. A threshold sensitivity analysis identified an optimal decision threshold (0.30) to balance precision and recall, maximizing referral eff
This review underscores the vital role of interoperability in digital health, advocating for a standardized framework. It focuses on implementing a Fast Healthcare Interoperability Resources (FHIR) server, addressing technical, semantic, and process challenges. FHIR's adaptability ensures uniformity within Primary Care Health Information Systems, fostering interoperability. Patient data management complexities highlight the pivotal role of semantic interoperability in seamless patient care. FHIR standards enhance these efforts, offering multiple pathways for data search. The ADR-guided FHIR server implementation systematically addresses challenges related to patient identity, biometrics, and data security. The detailed development phases emphasize architecture, API integration, and security. The concluding stages incorporate forward-looking approaches, including HHIMS Synthetic Dataset testing. Envisioning FHIR integration as transformative, it anticipates a responsive healthcare environment aligned with the evolving digital health landscape, ensuring comprehensive, dynamic, and interconnected systems for efficient data exchange and access.
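As an illustration of the search pathways mentioned above, the following is a minimal sketch of how two FHIR Patient search URLs could be assembled. `family`, `birthdate`, and `identifier` are standard FHIR Patient search parameters; the server address, the identifier system, the sample values, and the helper name `patient_search_url` are assumptions made for illustration and do not come from the review.

```python
from urllib.parse import urlencode

# Hypothetical server address, for illustration only.
BASE_URL = "https://fhir.example.org/baseR4"

def patient_search_url(base_url, **params):
    """Build a FHIR Patient search URL from keyword search parameters."""
    return f"{base_url}/Patient?{urlencode(params)}"

# Pathway 1: demographic search by family name and birth date.
by_name = patient_search_url(BASE_URL, family="Silva", birthdate="1950-04-12")

# Pathway 2: search by a business identifier written as "system|value".
# The identifier system below is a made-up placeholder.
by_identifier = patient_search_url(BASE_URL, identifier="urn:example:mrn|12345")
```

Percent-encoding the `|` separator, as `urlencode` does here, keeps the token well-formed inside a URL query string.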
Relationship-centred care (RCC) recognises that healthcare quality depends not only on outcomes, but on how voice, responsibility, and emotional labour are negotiated among patients, caregivers, and providers. As AI systems enter sensitive care contexts, they introduce a new participant into these negotiations. Drawing on empirical work in Advance Care Planning (ACP) and peer support, we argue that AI's primary impact in high-subjectivity domains is not optimisation but redistribution: it reorganises who speaks, who decides, and who bears moral responsibility. Across both settings, participants were less concerned with technical accuracy than with relational consequences: whether AI would appropriately represent their decision, reduce burden, or blur accountability, scaffold connection, or subtly displace it. We identify three relational dimensions: authority, temporality, and visibility, through which AI reshapes care relationships, and propose design provocations centred on relational legibility, bounded agency, responsibility traceability, and non-substitutive scaffolding.
Managing patients with respiratory failure increasingly involves noninvasive respiratory support (NIRS) strategies to support respiration, often preventing the need for invasive mechanical ventilation. However, despite the rapidly expanding use of NIRS, there remains a significant challenge to its optimal use across all medical circumstances. The field lacks a unified ontological structure, complicating guidance on NIRS modalities across healthcare systems. This study introduced the NIRS ontology to support knowledge representation in acute care settings by providing a unified framework that enhances data clarity and interoperability, laying the groundwork for future clinical decision-making. We developed the NIRS ontology using the Web Ontology Language (OWL) and Protégé to organize clinical concepts and relationships. To enable rule-based clinical reasoning beyond hierarchical structures, we added Semantic Web Rule Language (SWRL) rules. We evaluated logical reasoning by adding a sample of 6 patient scenarios and used SPARQL queries to retrieve and test targeted inferences. The ontology has 145 classes, 11 object properties, and 18 data properties across 949 axioms that establish concept relati
With the rapid development of artificial intelligence (AI), large language models (LLMs) have shown strong capabilities in natural language understanding, reasoning, and generation, attracting considerable research interest in applying LLMs to health and medicine. Critical care medicine (CCM) provides diagnosis and treatment for critically ill patients who often require intensive monitoring and interventions in intensive care units (ICUs). Can LLMs be applied to CCM? Are LLMs just like stochastic parrots or ICU experts in assisting clinical decision-making? This scoping review aims to provide a panoramic portrait of the application of LLMs in CCM. Literature in seven databases, including PubMed, Embase, Scopus, Web of Science, CINAHL, IEEE Xplore, and ACM Digital Library, was searched from January 1, 2019, to June 10, 2024. Peer-reviewed journal and conference articles that discussed the application of LLMs in critical care settings were included. From an initial 619 articles, 24 were selected for final review. This review grouped applications of LLMs in CCM into three categories: clinical decision support, medical documentation and reporting, and medical education and doctor-patien
Sepsis-induced acute respiratory failure (ARF) is a serious complication with a poor prognosis. This paper presents a deep representation learning-based phenotyping method to identify distinct groups of clinical trajectories of septic patients with ARF. For this retrospective study, we created a dataset from electronic medical records (EMR) consisting of data from sepsis patients admitted to medical intensive care units who required at least 24 hours of invasive mechanical ventilation at a quaternary care academic hospital in southeast USA for the years 2016-2021. A total of N=3349 patient encounters were included in this study. The Clustering Representation Learning on Incomplete Time Series Data (CRLI) algorithm was applied to a parsimonious set of EMR variables in this dataset. To validate the optimal number of clusters, the K-means algorithm was used in conjunction with dynamic time warping. Our model yielded four distinct patient phenotypes that were characterized as liver dysfunction/heterogeneous, hypercapnia, hypoxemia, and multiple organ dysfunction syndrome by a critical care expert. A Kaplan-Meier analysis to compare the 28-day mortality trends exhibited significant differe
Large Language Models have been tested on medical student-level questions, but their performance in specialized fields like Critical Care Medicine (CCM) is less explored. This study evaluated Meta-Llama 3.1 models (8B and 70B parameters) on 871 CCM questions. Llama3.1:70B outperformed 8B by 30%, with 60% average accuracy. Performance varied across domains, highest in Research (68.4%) and lowest in Renal (47.9%), highlighting the need for broader future work to improve models across various subspecialty domains.
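As a minimal sketch of how per-domain accuracy figures such as those above could be tallied, assume the graded answers are available as (domain, is_correct) pairs. The function name `accuracy_by_domain` and the sample records are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def accuracy_by_domain(records):
    """Return {domain: fraction correct} from (domain, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for domain, is_correct in records:
        totals[domain] += 1
        correct[domain] += int(is_correct)
    return {domain: correct[domain] / totals[domain] for domain in totals}

# Hypothetical graded answers, for illustration only (not the study's data).
sample = [
    ("Research", True), ("Research", True), ("Research", False),
    ("Renal", True), ("Renal", False),
]

per_domain = accuracy_by_domain(sample)

# Overall accuracy is pooled over all questions, so it is not the simple
# mean of the per-domain values when domains differ in size.
overall = sum(is_correct for _, is_correct in sample) / len(sample)
```

Reporting both the pooled figure and the per-domain breakdown makes it visible when a respectable average hides a weak subspecialty.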
Background: Although the cardio-respiratory (CR) system is generally controlled by the autonomic nervous system, interactions between the cortex and these primary functions are receiving increasing interest in the neurosciences. New method: In general, the timing of such internally paced events (e.g. heartbeats or respiratory cycles) may display a large variability. For the analysis of such CR event-related EEG potentials, a baseline must be correctly associated with each cycle of detected events. The open-source toolbox CARE-rCortex provides an easy-to-use interface to detect CR events, define baselines, and analyse the CR-based EEG potentials in the time-frequency (TF) domain. Results: CARE-rCortex provides some practical tools to detect and validate these CR events. Users can define baselines time-locked to a phase of the respiratory or heart cycle. A statistical test has also been integrated to highlight significant points of the TF maps with respect to the baseline. We illustrate the use of CARE-rCortex with the analysis of two real cardio-respiratory datasets. Comparison with existing methods: Compared to other open-source toolboxes, CARE-rCortex allows users to automatically detect CR even
What does Artificial Intelligence (AI) have to contribute to health care? And what should we be looking out for if we are worried about its risks? In this paper we offer a survey, and initial evaluation, of hopes and fears about the applications of artificial intelligence in medicine. AI clearly has enormous potential as a research tool, in genomics and public health especially, as well as a diagnostic aid. It's also highly likely to impact on the organisational and business practices of healthcare systems in ways that are perhaps under-appreciated. Enthusiasts for AI have held out the prospect that it will free physicians up to spend more time attending to what really matters to them and their patients. We will argue that this claim depends upon implausible assumptions about the institutional and economic imperatives operating in contemporary healthcare settings. We will also highlight important concerns about privacy, surveillance, and bias in big data, as well as the risks of over-trust in machines, the challenges of transparency, the deskilling of healthcare practitioners, the way AI reframes healthcare, and the implications of AI for the distribution of power in healthcare ins
We investigate different natural language processing (NLP) approaches based on contextualised word representations for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians. Because lung cancer has a low prevalence in primary care, we also address the problem of classification under highly imbalanced classes. Specifically, we use large Transformer-based pretrained language models (PLMs) and investigate: 1) how soft prompt-tuning (an NLP technique used to adapt PLMs using small amounts of training data) compares to standard model fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more robust compared to PLMs in highly imbalanced settings; and 3) how models fare when trained on notes from a small number of patients. We find that 1) soft prompt-tuning is an efficient alternative to standard model fine-tuning; 2) PLMs show better discrimination but worse calibration compared to simpler static word embedding models as the classification problem becomes more imbalanced; and 3) results when training models on a small number of patients are mixed and show no clear differences between PLMs and
The development of respiratory failure is common among patients in intensive care units (ICU). Large data quantities from ICU patient monitoring systems make timely and comprehensive analysis by clinicians difficult but are ideal for automatic processing by machine learning algorithms. Early prediction of respiratory system failure could alert clinicians to patients at risk of respiratory failure and allow for early patient reassessment and treatment adjustment. We propose an early warning system that predicts moderate/severe respiratory failure up to 8 hours in advance. Our system was trained on HiRID-II, a data-set containing more than 60,000 admissions to a tertiary care ICU. An alarm is typically triggered several hours before the beginning of respiratory failure. Our system outperforms a clinical baseline mimicking traditional clinical decision-making based on pulse-oximetric oxygen saturation and the fraction of inspired oxygen. To provide model introspection and diagnostics, we developed an easy-to-use web browser-based system to explore model input data and predictions visually.
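To make the idea of a baseline built on pulse-oximetric oxygen saturation (SpO2) and the fraction of inspired oxygen (FiO2) concrete, the following is a minimal sketch of a threshold rule on the SpO2/FiO2 ratio. The function names, the threshold value, and the sample readings are assumptions for illustration; the abstract does not specify how the clinical baseline is actually computed.

```python
def sf_ratio(spo2_percent, fio2_fraction):
    """Return the SpO2/FiO2 ratio (SpO2 in percent, FiO2 as a fraction in (0, 1])."""
    if not 0.0 < fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction in (0, 1]")
    return spo2_percent / fio2_fraction

def baseline_alarm(readings, threshold=250.0):
    """Flag each (SpO2, FiO2) reading whose ratio falls below the threshold.

    The default threshold is a hypothetical placeholder, not a value from the paper
    and not clinical guidance.
    """
    return [sf_ratio(spo2, fio2) < threshold for spo2, fio2 in readings]

# Hypothetical readings: room air, then increasing oxygen supplementation.
readings = [(98.0, 0.21), (92.0, 0.40), (88.0, 0.60)]
alarms = baseline_alarm(readings)
```

A rule of this kind reacts only to the current reading, which is one plausible reason a learned model that uses trends over time could raise alarms earlier.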
We find ourselves on the ever-shifting cusp of an AI revolution -- with potentially metamorphic implications for the future practice of healthcare. For many, such innovations cannot come quickly enough; as healthcare systems worldwide struggle to keep up with the ever-changing needs of our populations. And yet, the potential of AI tools and systems to shape healthcare is as often approached with great trepidation as celebrated by health professionals and patients alike. These fears alight not only in the form of privacy and security concerns but for the potential of AI tools to reduce patients to datapoints and professionals to aggregators -- to make healthcare, in short, less caring. This infixated concern, we - as designers, developers and researchers of AI systems - believe it essential we tackle head on; if we are not only to overcome the AI implementation gap, but realise the potential of AI systems to truly augment human-centred practices of care. This, we argue we might yet achieve by realising newly-accessible practices of AI healthcare innovation, engaging providers, recipients and affected communities of care in the inclusive design of AI tools we may yet enthusiastically
Model Medicine is the science of understanding, diagnosing, treating, and preventing disorders in AI models, grounded in the principle that AI models -- like biological organisms -- have internal structures, dynamic processes, heritable traits, observable symptoms, classifiable conditions, and treatable states. This paper introduces Model Medicine as a research program, bridging the gap between current AI interpretability research (anatomical observation) and the systematic clinical practice that complex AI systems increasingly require. We present five contributions: (1) a discipline taxonomy organizing 15 subdisciplines across four divisions -- Basic Model Sciences, Clinical Model Sciences, Model Public Health, and Model Architectural Medicine; (2) the Four Shell Model (v3.3), a behavioral genetics framework empirically grounded in 720 agents and 24,923 decisions from the Agora-12 program, explaining how model behavior emerges from Core--Shell interaction; (3) Neural MRI (Model Resonance Imaging), a working open-source diagnostic tool mapping five medical neuroimaging modalities to AI interpretability techniques, validated through four clinical cases demonstrating imaging, compari
As respiratory illnesses become more common, it is crucial to detect them quickly and accurately to improve patient care. Improved diagnostic methods are needed so that immediate medical assessments can lead to optimal patient outcomes. This paper introduces VoxMed, a UI-assisted one-step classifier that uses digital stethoscope recordings to diagnose respiratory diseases. It employs an Audio Spectrogram Transformer (AST) for feature extraction and a 1-D CNN-based architecture to classify respiratory diseases, offering professionals information regarding their patients' respiratory health in seconds. We use the ICBHI dataset, which includes stethoscope recordings collected from patients in Greece and Portugal, to classify respiratory diseases. GitHub repository: https://github.com/Sample-User131001/VoxMed
I study households' primary health care usage in India, which presents a paradox. I examine why most households use fee-charging private health care services even though (1) most private providers have no formal medical qualifications and (2) qualified doctors offer free care through public hospitals in the same markets. I present evidence that this puzzling practice has deep historical roots. I examine India's coercive forced sterilization policy implemented between 1976 and 1977. Utilizing the unexpected timing of the policy, multiple measures of forced sterilization, including at a granular level, and an instrumental variable approach, I document that places heavily affected by the policy have lower public health care usage today. I also show that the instrument I use is unrelated to a battery of demographic, economic, or political aspects before the forced sterilization period. Finally, I explore the mechanism and document that supply-side factors do not explain these differences. Instead, I demonstrate that places with greater exposure to forced sterilization have higher confidence in private hospitals and doctors to provide good treatment.