Nursing documentation in intensive care units (ICUs) provides essential clinical intelligence but often suffers from inconsistent terminology, informal styles, and lack of standardization, challenges that are particularly critical in heart failure care. This study applies Direct Preference Optimization (DPO) to adapt Mistral-7B, a locally deployable language model, using 8,838 heart failure nursing notes from the MIMIC-III database and 21,210 preference pairs derived from expert-verified GPT outputs, model generations, and original notes. Evaluation across BLEU, ROUGE, BERTScore, Perplexity, and expert qualitative assessments demonstrates that DPO markedly enhances documentation quality. Specifically, BLEU increased by 84% (0.173 to 0.318), BERTScore improved by 7.6% (0.828 to 0.891), and expert ratings rose across accuracy (+14.4 points), completeness (+14.5 points), logical consistency (+14.1 points), readability (+11.1 points), and structural clarity (+6.0 points). These results indicate that DPO can align lightweight clinical language models with expert standards, supporting privacy-preserving, AI-assisted documentation within electronic health record systems to reduce administ
As the aging population grows and the shortage of healthcare workers worsens, the need to examine other means of caring for older adults increases. One such means is the use of humanoid robots to support the social, emotional, and physical well-being of people above 65. Understanding skilled and long-term care nursing home administrators' perspectives on humanoid robots in caregiving is crucial, as their insights shape the implementation of robots and their potential impact on resident well-being and quality of life. The authors surveyed 269 nursing home executives to understand their perspectives on the use of humanoid robots in their facilities. The data were coded, and the results revealed that the executives were keen on exploring other avenues for care, such as robotics, that would enhance their nursing homes' ability to care for their residents. Qualitative analysis reveals diverse perspectives on integrating humanoid robots in nursing homes. While respondents acknowledge benefits like improved engagement and staff support, concerns persist about costs, impacts on human interaction, and robot effectiveness. This highlights compl
Domain-specific foundation models for healthcare have expanded rapidly in recent years, yet foundation models for critical care time series remain relatively underexplored due to the limited size and availability of datasets. In this work, we introduce an early-stage pre-trained foundation model for critical care time series based on the Bi-Axial Transformer (BAT), trained on pooled electronic health record datasets. We demonstrate effective transfer learning by fine-tuning the model for mortality prediction on a dataset distinct from the training sources, where it outperforms supervised baselines, particularly for small datasets ($<5,000$). These contributions highlight the potential of self-supervised foundation models for critical care time series to support generalizable and robust clinical applications in resource-limited settings.
For patients experiencing cancer, nurse navigation can ease the burden of complex care by enhancing coordination of health services and patient outcomes. However, in under-resourced areas, trained nurse navigators may be limited or non-existent. In the United States, artificial intelligence (AI)-enabled digital health tools are increasingly available and may help address gaps in care coordination; however, most are not designed to specifically support nursing. This perspective piece discusses a human-centered AI framework that integrates empathic and agentic approaches grounded in the American Nurses Association's code of ethics to support nurses in the United States in cancer care navigation. The framework could augment, not replace, human empathy and agency while improving nurse workflow, patient-clinician relationships, and care coordination services in under-resourced areas.
End-stage renal disease patients face a complicated sociomedical situation and rely on various forms of infrastructure for life-sustaining treatment. Disruption of these infrastructures during disasters poses a major threat to their lives. To improve patient access to dialysis treatment, there is a need to assess the potential threat to critical care facilities from hazardous events. In this study, we propose optimization models to solve critical care system resilience problems including patient and medical resource allocation. We use human mobility data in the context of Harris County (Texas) to assess patient access to critical care facilities, dialysis centers in this study, under the simulated hazard impacts, and we propose models for patient re-allocation and temporary medical facility placement to improve critical care system resilience in an equitable manner. The results show (1) the capability of the optimization model in efficient patient re-allocation to alleviate disrupted access to dialysis facilities; (2) the importance of large facilities in maintaining the functioning of the system. The critical care system, particularly the network of dialysis centers, is heavily re
An emergent challenge in geriatric care is improving the quality of care, which requires insight from stakeholders. Qualitative methods offer detailed insights, but they can be biased and have limited generalizability, while quantitative methods may miss nuances. Network-based approaches, such as quantitative ethnography (QE), can bridge this methodological gap. By leveraging the strengths of both methods, QE provides profound insights into need-finding interviews. In this paper, to better understand geriatric care attitudes, we interviewed ten nursing assistants, used QE to analyze the data, and compared their daily activities in real life with training experiences. A two-sample t-test with a large effect size (Cohen's d=1.63) indicated a significant difference between real-life and training activities. The findings suggested incorporating more empathetic training scenarios into the future design of our geriatric care simulation. The results have implications for human-computer interaction and human factors. This is illustrated by presenting an example of using QE to analyze expert interviews with nursing assistants as caregivers to inform subsequent design processes.
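As a rough illustration of the effect size reported above, Cohen's d for two independent samples can be computed as the difference in means divided by the pooled standard deviation. The abstract does not state which variant of Cohen's d was used, so the pooled-variance form below is an assumption, the function name `cohens_d` is hypothetical, and the sample values are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Invented scores for illustration only (not data from the study).
real_life = [5.0, 6.0, 7.0, 8.0, 9.0]
training = [3.0, 4.0, 5.0, 6.0, 7.0]
d = cohens_d(real_life, training)
```

By the commonly cited convention, values of d above roughly 0.8 are described as large effects, which is why a value such as 1.63 is reported as a large effect size.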
This paper explores the application of large language models (LLMs) in nursing and elderly care, focusing on AI-driven patient monitoring and interaction. We introduce a novel Chinese nursing dataset and implement incremental pre-training (IPT) and supervised fine-tuning (SFT) techniques to enhance LLM performance in specialized tasks. Using LangChain, we develop a dynamic nursing assistant capable of real-time care and personalized interventions. Experimental results demonstrate significant improvements, paving the way for AI-driven solutions to meet the growing demands of healthcare in aging populations.
Excessive caregiver workload in hospital nurses has been implicated in poorer patient care and increased worker burnout. Measurement of this workload in the Intensive Care Unit (ICU) is often done using the Nursing Activities Score (NAS), but this is usually recorded manually and sporadically. Previous work has made use of Ambient Intelligence (AmI) by using computer vision to passively derive caregiver-patient interaction times to monitor staff workload. In this letter, we propose using a Multiscale Vision Transformer (MViT) to passively predict the NAS from low-resolution thermal videos recorded in an ICU. A total of 458 videos were obtained from an ICU in Melbourne, Australia and used to train an MViTv2 model using an indirect prediction method and a direct prediction method. The indirect method predicted 1 of 8 potentially identifiable NAS activities from the video before inferring the NAS. The direct method predicted the NAS directly from the video. The indirect method yielded an average 5-fold accuracy of 57.21%, an area under the receiver operating characteristic curve (ROC AUC) of 0.865, an F1 score of 0.570 and a mean squared error (MSE) of 28.16. The direct method yielded an MSE of 1
Background: Telephone nursing is the first line of contact for many care-seekers and aims at optimizing the performance of the healthcare system by supporting and guiding patients to the correct level of care and reducing the number of unscheduled visits. Good statistical models that describe the effects of telephone nursing are important in order to study its impact on healthcare resources and evaluate changes in telephone nursing procedures. Objective: To develop a valid model that captures the complex relationships between the nurse's recommendations, the patients' intended actions and the patients' health-seeking behavior, and to use the model to estimate the effects of telephone nursing on patient behavior and healthcare utilization and to infer potential cost savings. Methods: Bayesian ordinal regression modeling of data from randomly selected patients who received telephone nursing. Inference is based on Markov Chain Monte Carlo methods, model selection on the Watanabe-Akaike Information Criterion, and model validation on posterior predictive checks of standard discrepancy measures. Results and Conclusions: We present a robust Bayesian ordinal regression model that predicts 76% of
Relationship-centred care (RCC) recognises that healthcare quality depends not only on outcomes, but on how voice, responsibility, and emotional labour are negotiated among patients, caregivers, and providers. As AI systems enter sensitive care contexts, they introduce a new participant into these negotiations. Drawing on empirical work in Advance Care Planning (ACP) and peer support, we argue that AI's primary impact in high-subjectivity domains is not optimisation but redistribution: it reorganises who speaks, who decides, and who bears moral responsibility. Across both settings, participants were less concerned with technical accuracy than with relational consequences: whether AI would appropriately represent their decision, reduce burden, or blur accountability, scaffold connection, or subtly displace it. We identify three relational dimensions: authority, temporality, and visibility, through which AI reshapes care relationships, and propose design provocations centred on relational legibility, bounded agency, responsibility traceability, and non-substitutive scaffolding.
The flexibility level allowed in nursing care delivery and uncertainty in infusion durations are very important factors to be considered during the chemotherapy schedule generation task. The nursing care delivery scheme employed in an outpatient chemotherapy clinic (OCC) determines the strictness of the patient-to-nurse assignment policies, while the estimation of infusion durations affects the trade-off between patient waiting time and nurse overtime. We study the problem of daily scheduling of patients, assignment of patients to nurses and chairs under uncertainty in infusion durations for an OCC that functions according to any of the three commonly used nursing care delivery models representing fully flexible, partially flexible, and inflexible care models, respectively. We develop a two-stage stochastic mixed-integer programming model that is valid for the three care delivery models to minimize expected weighted cost of patient waiting time and nurse overtime. We propose multiple variants of a scenario grouping-based decomposition algorithm to solve the model using data of a major university oncology hospital. The variants of the algorithm differ from each other according to th
Long-term care services for older people are in great demand in most aging societies. The number of nursing home residents is increasing while the number of care providers is limited. Due to the care worker shortage, care for vulnerable older residents cannot be fully tailored to the unique needs and preferences of each individual. This may negatively affect health outcomes and quality of life among institutionalized older people. To improve care quality through personalized care planning and delivery with a limited care workforce, we propose a new care planning model assisted by artificial intelligence. We apply bandit algorithms, which optimize clinical decisions for care planning by adapting to sequential feedback from past decisions. We evaluate the proposed model on empirical data acquired from the Systems for Person-centered Elder Care (SPEC) study, an ICT-enhanced care management program.
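The idea of adapting to sequential feedback from past decisions can be sketched with a simple bandit loop. The abstract does not say which bandit algorithm is applied, so the epsilon-greedy strategy below is an assumption chosen for brevity. The class name `EpsilonGreedyBandit`, the `simulate` helper, and the success probabilities are all hypothetical and unrelated to the SPEC data.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: mostly pick the best-looking arm, sometimes explore."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm (care option) was chosen
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(success_probs, steps=2000, seed=0):
    """Run the bandit against simulated Bernoulli feedback (hypothetical values)."""
    bandit = EpsilonGreedyBandit(len(success_probs), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < success_probs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


bandit = simulate([0.2, 0.5, 0.8])
```

In a care planning setting, each arm would stand for a candidate care option and the reward for an observed outcome. A contextual bandit that conditions on resident characteristics would be a natural extension, though whether the study uses one is not stated here.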
This paper addresses the methodology for the quarterly estimation of Compensation of Employees paid by the General Government (GG) sector, in accordance with the European System of Accounts (ESA 2010). Due to the limited high-frequency data availability and the need to guarantee the consistency with annual constraints, quarterly estimation relies on indirect temporal disaggregation techniques. These methods use specific infra-annual indicators as proxies for the variables being estimated. The specific case of the quarterly estimation of Compensation of employees presents several additional challenges. Firstly, the information provided by the sources, based on cash or legal-accrual data, is elaborated to define indicators which respect the accrual ESA 2010 principle as the annual estimates, based on more compliant data sources such as final budgets of public entities. Secondly, at a quarterly level the extraordinary events - such as the recording of delayed collective bargaining agreements which result in arrears - have a strong impact on quarterly indicators, whereas their effect is mitigated at annual level. To attribute these flows to the period when the work is performed, multi-
Recent advances in the field of natural language processing (NLP) enable new use cases in different domains, including the medical sector. In particular, transcription can be used to support automation in the nursing documentation process and give nurses more time to interact with patients. However, several challenges need to be addressed, including (a) data privacy, (b) local languages and dialects, and (c) domain-specific vocabulary. In this case study, we investigate home care nursing documentation in Switzerland. We assessed different transcription tools and models, and conducted several experiments with OpenAI Whisper, involving different variations of German (i.e., dialects, foreign accent) and example texts manually curated by a domain expert in home care nursing. Our results indicate that even the out-of-the-box model performs sufficiently well to be a good starting point for future research in the field.
Recent advancements in large language models (LLMs) have significantly transformed medical systems. However, their potential within specialized domains such as nursing remains largely underexplored. In this work, we introduce NurseLLM, the first nursing-specialized LLM tailored for multiple choice question-answering (MCQ) tasks. We develop a multi-stage data generation pipeline to build the first large scale nursing MCQ dataset to train LLMs on a broad spectrum of nursing topics. We further introduce multiple nursing benchmarks to enable rigorous evaluation. Our extensive experiments demonstrate that NurseLLM outperforms SoTA general-purpose and medical-specialized LLMs of comparable size on different benchmarks, underscoring the importance of a specialized LLM for the nursing domain. Finally, we explore the role of reasoning and multi-agent collaboration systems in nursing, highlighting their promise for future research and applications.
Consistent high-quality nursing care is essential for patient safety, yet current nursing education depends on subjective, time-intensive instructor feedback. This limits the scalability and efficiency of training and thus hampers the competency of nurses when they enter the workforce. In this paper, we introduce a video-language model (VLM) based framework for automated procedural assessment and feedback in nursing skills training, with the potential to be integrated into existing training programs. Mimicking human skill acquisition, the framework follows a curriculum-inspired progression, advancing from high-level action recognition to fine-grained subaction decomposition and ultimately to procedural reasoning. This design supports scalable evaluation by reducing instructor workload while preserving assessment quality. The system provides three core capabilities: 1) diagnosing errors by identifying missing or incorrect subactions in nursing skill instruction videos, 2) generating explainable feedback by clarifying why a step is out of order or omitted, and 3) enabling objective, consistent formative evaluation of procedures.
Autonomous interaction is crucial for the effective use of elderly care robots. However, developing universal AI architectures is extremely challenging due to the diversity in robot configurations and a lack of datasets. We propose a universal architecture for the AI-ization of elderly care robots, called AoECR. Specifically, based on a nursing bed, we developed a patient-nurse interaction dataset tailored for elderly care scenarios and fine-tuned a large language model to enable it to perform nursing manipulations. Additionally, the inference process included a self-check chain to ensure the security of control commands. An expert optimization process further enhanced the humanization and personalization of the interactive responses. The physical experiment demonstrated that AoECR exhibited zero-shot generalization capabilities across diverse scenarios, understood patients' instructions, implemented secure control commands, and delivered humanized and personalized interactive responses. Overall, our research provides a valuable dataset reference and AI-ization solutions for elderly care robots.
Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as computer vision and natural language processing) have established various competitions and public benchmarks. The recent availability of large clinical datasets has made it possible to establish public benchmarks. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models as well as baseline and deep learning models using the eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of the clinical gold standard with our predictive model. We also investigate the impact of numerical variables as well as the handling of categorical variables on each of the defined tasks. The source code, detailing our methods and experiments, is publicly available so that anyone can replicate our results and build upon our work.
We find ourselves on the ever-shifting cusp of an AI revolution -- with potentially metamorphic implications for the future practice of healthcare. For many, such innovations cannot come quickly enough; as healthcare systems worldwide struggle to keep up with the ever-changing needs of our populations. And yet, the potential of AI tools and systems to shape healthcare is as often approached with great trepidation as celebrated by health professionals and patients alike. These fears alight not only in the form of privacy and security concerns but for the potential of AI tools to reduce patients to datapoints and professionals to aggregators -- to make healthcare, in short, less caring. This infixated concern, we - as designers, developers and researchers of AI systems - believe it essential we tackle head on; if we are not only to overcome the AI implementation gap, but realise the potential of AI systems to truly augment human-centred practices of care. This, we argue we might yet achieve by realising newly-accessible practices of AI healthcare innovation, engaging providers, recipients and affected communities of care in the inclusive design of AI tools we may yet enthusiastically
Nursing notes, an important part of Electronic Health Records (EHRs), track a patient's health during a care episode. Summarizing key information in nursing notes can help clinicians quickly understand patients' conditions. However, existing summarization methods in the clinical setting, especially abstractive methods, have overlooked nursing notes and require reference summaries for training. We introduce QGSumm, a novel query-guided self-supervised domain adaptation approach for abstractive nursing note summarization. The method uses patient-related clinical queries for guidance, and hence does not need reference summaries for training. Through automatic experiments and manual evaluation by an expert clinician, we study our approach and other state-of-the-art Large Language Models (LLMs) for nursing note summarization. Our experiments show: 1) GPT-4 is competitive in maintaining information in the original nursing notes, 2) QGSumm can generate high-quality summaries with a good balance between recall of the original content and hallucination rate lower than other top methods. Ultimately, our work offers a new perspective on conditional text summarization, tailored to clinical app