Chronic diseases are long-lasting conditions that require lifelong medical attention. Using big EMR data, we have developed early disease risk prediction models for five common chronic diseases: diabetes, hypertension, chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), and chronic ischemic heart disease. In this study, we present a novel approach to disease risk modeling that integrates survival analysis with classification techniques. Traditional models for predicting the risk of chronic diseases predominantly rely on either survival analysis or classification alone. We show that survival analysis methods can be re-engineered to perform classification efficiently and effectively, making them a comprehensive tool for developing disease risk surveillance models. Our experiments on real-world big EMR data show that the performance of survival models in terms of accuracy, F1 score, and AUROC is comparable to or better than that of prior state-of-the-art models such as LightGBM and XGBoost. Lastly, the proposed survival models use a novel methodology to generate explanations, which have been clinically validated by a panel of three expert physicians.
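As a rough illustration of how a survival model can double as a classifier, the sketch below converts a survival curve into a binary label at a fixed prediction horizon. It is not the method from the abstract above: the constant-hazard (exponential) form, the 12-month horizon, the 0.5 threshold, and all function names are assumptions chosen only for illustration.

```python
import math


def survival_probability(hazard_rate: float, horizon_months: float) -> float:
    """S(t) = exp(-lambda * t), assuming a constant hazard (exponential model)."""
    return math.exp(-hazard_rate * horizon_months)


def classify_at_horizon(hazard_rate: float, horizon_months: float,
                        threshold: float = 0.5):
    """Turn a survival estimate into (risk, label) at a fixed horizon.

    risk  = 1 - S(horizon): probability of diagnosis before the horizon.
    label = 1 if risk meets the (assumed) decision threshold, else 0.
    """
    risk = 1.0 - survival_probability(hazard_rate, horizon_months)
    return risk, int(risk >= threshold)


# Hypothetical per-patient monthly hazard rates, e.g. from a fitted model.
patients = {"patient_a": 0.01, "patient_b": 0.10}
for patient_id, rate in patients.items():
    risk, label = classify_at_horizon(rate, horizon_months=12)
    print(f"{patient_id}: 12-month risk={risk:.3f}, label={label}")
```

Once the survival output is reduced to a risk score and a label in this way, the usual classification metrics (accuracy, F1 score, AUROC) can be computed on it directly.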
In pediatric chronic care, the triadic relationship among patients, caregivers, and healthcare providers introduces unique challenges for youth in managing their conditions. Diverging values, roles, and asymmetrical situational awareness across decision-maker groups often hinder collaboration and affect health outcomes, highlighting the need to support collaborative decision-making. We conducted co-design workshops with 6 youth with chronic kidney disease, 6 caregivers, and 7 healthcare providers to explore how digital technologies can be designed to support collaborative decision-making. Findings identify barriers across all levels of situational awareness, ranging from individual cognitive and emotional constraints and misaligned mental models to relational conflicts regarding care goals. We propose design implications that support continuous decision-making practice, align mental models, balance caregiver support and youth autonomy development, and surface potential care challenges. This work advances the design of collaborative decision-making technologies that promote shared understanding and empower families in pediatric chronic care.
Wearable sensor technologies and deep learning are transforming healthcare management. Yet, most health sensing studies focus narrowly on physical chronic diseases. This overlooks the critical need for joint assessment of comorbid physical chronic diseases and depression, which is essential for collaborative chronic care. We conceptualize multi-disease assessment, including both physical diseases and depression, as a multi-task learning (MTL) problem, where each disease assessment is modeled as a task. This joint formulation leverages inter-disease relationships to improve accuracy, but it also introduces the challenge of double heterogeneity: chronic diseases differ in their manifestation (disease heterogeneity), and patients with the same disease show varied patterns (patient heterogeneity). To address these issues, we first adopt existing techniques and propose a base method. Given the limitations of the base method, we further propose an Advanced Double Heterogeneity-based Multi-Task Learning (ADH-MTL) method that improves the base method through three innovations: (1) group-level modeling to support new patient predictions, (2) a decomposition strategy to reduce model complexi
Background: Hemiparesis after subcortical stroke is classically described as distal upper-extremity (UE) predominant, but prevalence data in chronic stroke are limited. Objective: Determine the prevalence of distal predominant UE weakness in exclusively subcortical chronic stroke versus other stroke distributions, characterize cohort differences, and describe UE weakness patterns in chronic stroke overall. Methods: Outpatient records were retrospectively reviewed to identify chronic stroke subjects. Lesion locations were classified from radiographic reports as exclusively subcortical or not (using both a whole-brain and a supratentorial definition). UE weakness was categorized as distal predominant or not. Prevalence was compared with $\chi$-squared testing and odds ratios (OR). Results: 250 subjects were included (mean 861 days post-stroke). Using the whole-brain definition, distal predominant weakness occurred in 30.6% of exclusively subcortical versus 17.4% of non-exclusively subcortical strokes (OR 2.09, 95% CI 1.15-3.81; p=0.014). Using the supratentorial definition, distal predominant weakness occurred in 27.9% versus 17.9%, respectively (OR 2.16, 95% CI 1.17-3.96; p=0.012). Across all
This paper presents an explainable artificial intelligence (XAI)-based framework for the spectral analysis of cough sounds associated with chronic respiratory diseases, with a particular focus on Chronic Obstructive Pulmonary Disease (COPD). A Convolutional Neural Network (CNN) is trained on time-frequency representations of cough signals, and occlusion maps are used to identify diagnostically relevant regions within the spectrograms. These highlighted areas are subsequently decomposed into five frequency subbands, enabling targeted spectral feature extraction and analysis. The results reveal that spectral patterns differ across subbands and disease groups, uncovering complementary and compensatory trends across the frequency spectrum. Notably, the approach distinguishes COPD from other respiratory conditions, and chronic from non-chronic patient groups, based on interpretable spectral markers. These findings provide insight into the underlying pathophysiological characteristics of cough acoustics and demonstrate the value of frequency-resolved, XAI-enhanced analysis for biomedical signal interpretation and translational respiratory disease diagnostics.
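The occlusion-map idea mentioned above can be sketched in a few lines: mask one patch of the spectrogram at a time, record how much the model score drops, and then sum that relevance within each frequency subband. This is only a toy under stated assumptions: `toy_score` is a hypothetical stand-in for a trained CNN's class score, and the 10 x 4 spectrogram, the 2 x 2 patch, and the equal-width subbands are illustrative choices, not details taken from the abstract.

```python
def toy_score(spec):
    """Hypothetical stand-in for a CNN class score: mean energy in rows 2-3."""
    rows = spec[2:4]
    return sum(sum(row) for row in rows) / sum(len(row) for row in rows)


def occlusion_map(spec, score_fn, patch=2):
    """Relevance of each cell = score drop when its patch is zeroed out."""
    n_rows, n_cols = len(spec), len(spec[0])
    base = score_fn(spec)
    heat = [[0.0] * n_cols for _ in range(n_rows)]
    for r0 in range(0, n_rows, patch):
        for c0 in range(0, n_cols, patch):
            occluded = [row[:] for row in spec]  # copy, then mask one patch
            cells = [(r, c)
                     for r in range(r0, min(r0 + patch, n_rows))
                     for c in range(c0, min(c0 + patch, n_cols))]
            for r, c in cells:
                occluded[r][c] = 0.0
            drop = base - score_fn(occluded)
            for r, c in cells:
                heat[r][c] = drop
    return heat


def band_relevance(heat, n_bands=5):
    """Sum relevance over equal-width groups of frequency rows (subbands)."""
    rows_per_band = len(heat) // n_bands
    return [sum(sum(row) for row in heat[b * rows_per_band:(b + 1) * rows_per_band])
            for b in range(n_bands)]


# Toy spectrogram: 10 frequency bins x 4 time frames, uniform energy.
spectrogram = [[1.0] * 4 for _ in range(10)]
heat = occlusion_map(spectrogram, toy_score)
print(band_relevance(heat))
```

Because the toy score only looks at rows 2 and 3, all relevance lands in the second subband; with a real model the per-band totals would indicate which frequency ranges drive the prediction.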
Chronic diseases are long-term, manageable, yet typically incurable conditions, highlighting the need for effective preventive strategies. Machine learning has been widely used to assess individual risk for chronic diseases. However, many models rely on medical test data (e.g. blood results, glucose levels), which limits their utility for proactive self-assessment. Additionally, to gain public trust, machine learning models should be explainable and transparent. Although some research on self-assessment machine learning models includes explainability, their explanations are not validated against established medical literature, reducing confidence in their reliability. To address these issues, we develop deep learning models that predict the risk of developing 13 chronic diseases using only personal and lifestyle factors, enabling accessible, self-directed preventive care. Importantly, we use SHAP-based explainability to identify the most influential model features and validate them against established medical literature. Our results show a strong alignment between the models' most influential features and established medical literature, reinforcing the models' trustworthiness. Crit
Chronic pain is a significant global health issue, with many patients experiencing persistent pain despite no identifiable organic cause, classified as nociplastic pain. Increasing evidence highlights the role of danger signal processing in the maintenance of chronic pain. In response, we developed Personal Danger Signals Reprocessing (PDSR), an online, group-based intervention designed to modify these mechanisms using coaching techniques to enhance accessibility and affordability. This study evaluated the efficacy of PDSR in reducing pain and mental health comorbidities. A cohort of women (N=19, mean age 43) participated in an 8-week online program, receiving weekly sessions on chronic pain mechanisms within a systemic framework. Outcomes were assessed at three time points: pre-intervention, mid-intervention, and post-intervention. A waiting list group (N=20, mean age 43.5) completed assessments at the same intervals. Participants in the PDSR group showed significant pain reduction (p < .001), with moderate to large effects observed at mid-intervention (Cohen's d = 0.7) and post-intervention (Cohen's d = 1.5) compared to controls. Pain interference significantly decreased (p <
Chronic diseases, such as cardiovascular disease, diabetes, chronic kidney disease, and thyroid disorders, are the leading causes of premature mortality worldwide. Early detection and intervention are crucial for improving patient outcomes, yet traditional diagnostic methods often fail due to the complex nature of these conditions. This study explores the application of machine learning (ML) and deep learning (DL) techniques to predict chronic disease and thyroid disorders. We used a variety of models, including Logistic Regression (LR), Random Forest (RF), Gradient Boosted Trees (GBT), Neural Networks (NN), Decision Trees (DT), and Naive Bayes (NB), to analyze and predict disease outcomes. Our methodology involved comprehensive data pre-processing, including handling missing values, categorical encoding, and feature aggregation, followed by model training and evaluation. Performance metrics such as precision, recall, accuracy, F1-score, and Area Under the Curve (AUC) were used to assess the effectiveness of each model. The results demonstrated that ensemble methods like Random Forest and Gradient Boosted Trees consistently outperformed the other models. Neural Networks also showed superior perfor
Assessing chronic pain behavior in mice is critical for preclinical studies. However, existing methods mostly rely on manual labeling of behavioral features, and humans lack a clear understanding of which behaviors best represent chronic pain. For this reason, existing methods struggle to accurately capture the insidious and persistent behavioral changes in chronic pain. This study proposes a framework to automatically discover features related to chronic pain without relying on human-defined action labels. Our method uses a universal action space projector to automatically extract mouse action features, and avoids the potential bias of human labeling by retaining the rich behavioral information in the original video. We also collected a mouse pain behavior dataset that captures the disease progression of both neuropathic and inflammatory pain across multiple time points. Our method achieves 48.41\% accuracy in a 15-class pain classification task, significantly outperforming human experts (21.33\%) and the widely used method B-SOiD (30.52\%). Furthermore, when the classification is simplified to only three categories, i.e., neuropathic pain, inflammatory pain, and no p
This study addresses a critical gap in the healthcare system by developing a clinically meaningful, practical, and explainable disease surveillance system for multiple chronic diseases, utilizing routine EHR data from multiple U.S. practices integrated with CureMD's EMR/EHR system. Unlike traditional systems, whose AI models rely on features from patients' labs, our approach focuses on routinely available data, such as medical history, vitals, diagnoses, and medications, to preemptively assess the risks of chronic diseases in the next year. We trained three distinct models for each chronic disease: prediction models that forecast the risk of a disease 3, 6, and 12 months before a potential diagnosis. We developed Random Forest models, which were internally validated using F1 scores and AUROC as performance metrics and further evaluated by a panel of expert physicians for clinical relevance based on inferences grounded in medical knowledge. Additionally, we discuss our implementation of integrating these models into a practical EMR system. Beyond using Shapley attributes and surrogate models for explainability, we also introduce a new rule-engineering framework to enhance the i
Chronic stress has been implicated in cancer occurrence, but a direct causal connection has not been consistently established. Machine learning and causal modeling offer opportunities to explore complex causal interactions between psychological chronic stress and cancer occurrence. We developed predictive models employing variables from stress indicators, cancer history, and demographic data from self-reported surveys, revealing both a direct and an immune-suppression-mediated connection between chronic stress and cancer occurrence. The models were corroborated by traditional statistical methods. Our findings indicated significant causal associations between stress frequency, stress level, and perceived health impact on the one hand and cancer incidence on the other. Although stress alone showed limited predictive power, integrating socio-demographic and familial cancer history data significantly enhanced model accuracy. These results highlight the multidimensional nature of cancer risk, with stress emerging as a notable factor alongside genetic predisposition. These findings strengthen the case for addressing chronic stress as a modifiable cancer risk factor, supporting its integration into personalized prevention str
A teenager's experience of chronic pain reverberates through multiple interacting aspects of their lives. To self-manage their symptoms, they need to understand how factors such as their sleep, social interactions, emotions and pain intersect; supporting this capability must underlie an effective personalized healthcare solution. While adult use of personal informatics for self-management of various health factors has been studied, solutions intended for adults are rarely workable for teens, who face this complex and confusing situation with unique perspectives, skills and contexts. In this design study, we explore a means of facilitating self-reflection by youth living with chronic pain, through visualization of their personal health data. In collaboration with pediatric chronic pain clinicians and a health-tech industry partner, we designed and deployed MyWeekInSight, a visualization-based self-reflection tool for youth with chronic pain. We discuss our staged design approach with this intersectionally vulnerable population, in which we balanced reliance on proxy users and data with feedback from youth viewing their own data. We report on extensive formative and in-situ evaluatio
This study presents a mathematical model describing clonal hematopoiesis in chronic myeloid leukemia (CML) through a nonlinear system of differential equations. The primary objective is to understand the progression from healthy hematopoiesis to the chronic and accelerated-acute phases in myeloid leukemia. The model incorporates intrinsic cellular division events in hematopoiesis and delineates the evolution of chronic myeloid leukemia into five compartments: cycling stem cells, quiescent stem cells, progenitor cells, differentiated cells, and terminally differentiated cells. Our analysis reveals the existence of three distinct non-zero steady states within the dynamical system, representing healthy hematopoiesis, the chronic phase, and the accelerated-acute stage of the disease. We investigate the local and global stability of these steady states and provide a characterization of the hematopoietic states based on this analysis. Additionally, numerical simulations are included to illustrate the theoretical results.
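To make the five-compartment structure concrete, the sketch below integrates a generic compartment cascade with forward Euler steps. It is explicitly not the model from the abstract above, whose equations are not given here: the logistic self-renewal term, the linear transfer rates, every parameter value, and the function names are assumptions, and this toy system has a single positive steady state rather than three.

```python
def derivatives(state, p):
    """Right-hand side of a toy five-compartment cascade (assumed form)."""
    cycling, quiescent, progenitor, differentiated, terminal = state
    d_cycling = (p["r"] * cycling * (1.0 - cycling / p["K"])  # logistic self-renewal
                 - p["a"] * cycling + p["b"] * quiescent      # exchange with quiescence
                 - p["k1"] * cycling)                         # commitment to progenitors
    d_quiescent = p["a"] * cycling - p["b"] * quiescent
    d_progenitor = p["k1"] * cycling - p["k2"] * progenitor
    d_differentiated = p["k2"] * progenitor - p["k3"] * differentiated
    d_terminal = p["k3"] * differentiated - p["d"] * terminal
    return (d_cycling, d_quiescent, d_progenitor, d_differentiated, d_terminal)


def simulate(state, p, dt=0.01, steps=40000):
    """Forward Euler integration over steps * dt time units."""
    for _ in range(steps):
        rates = derivatives(state, p)
        state = tuple(x + dt * dx for x, dx in zip(state, rates))
    return state


# Illustrative parameters only; none come from the abstract.
params = {"r": 1.0, "K": 100.0, "a": 0.2, "b": 0.1,
          "k1": 0.5, "k2": 0.5, "k3": 0.25, "d": 0.125}

# Start from a single cycling stem cell and empty downstream compartments.
final_state = simulate((1.0, 0.0, 0.0, 0.0, 0.0), params)
print([round(x, 3) for x in final_state])
```

Setting the derivatives to zero gives the positive steady state of this toy system in closed form: cycling cells at K(1 - k1/r), with each downstream compartment fixed by the ratio of its inflow and outflow rates, which is what the simulation settles to.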
In recent years, the intersection of Natural Language Processing (NLP) and public health has opened innovative pathways for investigating various domains, including chronic pain in textual datasets. Despite the promise of NLP in chronic pain, the literature is dispersed across various disciplines, and there is a need to consolidate existing knowledge, identify knowledge gaps in the literature, and inform future research directions in this emerging field. This review aims to investigate the state of the research on NLP-based interventions designed for chronic pain research. A search strategy was formulated and executed across PubMed, Web of Science, IEEE Xplore, Scopus, and ACL Anthology to find studies published in English between 2014 and 2024. After screening 132 papers, 26 studies were included in the final review. Key findings from this review underscore the significant potential of NLP techniques to address pressing challenges in chronic pain research. The past 10 years in this field have showcased the utilization of advanced methods (transformers like RoBERTa and BERT) achieving high-performance metrics (e.g., F1>0.8) in classification tasks, while unsupervised approaches
Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient's health, which is crucial for informed therapeutic decision-making. Yet, most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across multiple data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet, a multimodal model (Combining Unified Representations for Efficient chronic disease prediction) that integrates unstructured clinical notes, lab tests, and patients' time-series data by utilizing large language models (LLMs) for clinical text processing and textual lab tests, as well as transformer encoders for longitudinal sequential visits. CURENet is capable of capturing the intricate interactions between different forms of clinical data and of creating a more reliable predictive model for chronic illnesses. We evaluated CURENet using the public MIMIC-III and private FEMH datasets, where it achi
The best way to treat chronic hepatitis B is with pegylated interferon alone or with oral antiviral drugs. There is limited research comparing the renal safety of entecavir and tenofovir when used with pegylated interferon. This study compares changes in renal function in chronic hepatitis B patients treated with pegylated interferon and either entecavir or tenofovir. The study included a cohort of 836 patients with chronic hepatitis B (CHB) who received treatment with pegylated interferon (IFN) either alone or in combination with entecavir (ETV) or tenofovir (TDF) between the years 2018 and 2021. Of these patients, 713 were included in a matched analysis comparing outcomes between those who were cured and those who were uncured, while 123 patients received IFN alone as a control group for comparison with the ETV and TDF treatment groups. The primary outcome measured was the change in renal function, specifically estimated glomerular filtration rate (eGFR), cystatin C (CysC), and inorganic phosphorus (IPHOS). Patients were categorized into stage 1 or stage 2 based on a baseline eGFR of less than 90 ml/min/m^2. Results: 125 CHB patients were matched 1:1 in both the combined trea
Clinical data informs the personalization of health care with a potential for more effective disease management. In practice, this is achieved by subgrouping, whereby clusters with similar patient characteristics are identified and then receive customized treatment plans with the goal of targeting subgroup-specific disease dynamics. In this paper, we propose a novel mixture hidden Markov model for subgrouping patient trajectories from chronic diseases. Our model is probabilistic and carefully designed to capture different trajectory phases of chronic diseases (i.e., "severe", "moderate", and "mild") through tailored latent states. We demonstrate our subgrouping framework based on a longitudinal study across 847 patients with non-specific low back pain. Here, our subgrouping framework identifies 8 subgroups. Further, we show that our subgrouping framework outperforms common baselines in terms of cluster validity indices. Finally, we discuss the applicability of the model to other chronic and long-lasting diseases.
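The subgroup-assignment step of a mixture hidden Markov model can be sketched as follows: each subgroup has its own HMM, the forward algorithm gives the likelihood of a patient trajectory under each one, and Bayes' rule turns those likelihoods into subgroup probabilities. This is a minimal sketch rather than the model from the abstract above: the two subgroups, the three discrete pain levels, the shared emission matrix, and all numeric values are assumptions, and parameter fitting (for example by EM) is omitted.

```python
def forward_likelihood(obs, initial, transition, emission):
    """P(obs) under one HMM via the forward algorithm (fine for short sequences)."""
    n_states = len(initial)
    alpha = [initial[s] * emission[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states)) * emission[j][o]
            for j in range(n_states)
        ]
    return sum(alpha)


def subgroup_posterior(obs, weights, components, emission):
    """Posterior probability of each subgroup given one trajectory."""
    joint = [w * forward_likelihood(obs, initial, transition, emission)
             for w, (initial, transition) in zip(weights, components)]
    total = sum(joint)
    return [j / total for j in joint]


# Latent states: 0 = severe, 1 = moderate, 2 = mild.
# Observations: 0 = high, 1 = medium, 2 = low reported pain (assumed coding).
emission = [[0.8, 0.1, 0.1],
            [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8]]
initial = [0.6, 0.3, 0.1]
persistent = [[0.90, 0.05, 0.05],   # subgroup 0: patients tend to stay put
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]]
recovering = [[0.30, 0.60, 0.10],   # subgroup 1: patients tend to improve
              [0.10, 0.30, 0.60],
              [0.05, 0.05, 0.90]]
components = [(initial, persistent), (initial, recovering)]
weights = [0.5, 0.5]

improving = subgroup_posterior([0, 1, 2, 2], weights, components, emission)
flat = subgroup_posterior([0, 0, 0, 0], weights, components, emission)
print("improving trajectory:", improving)
print("flat trajectory:", flat)
```

Here a trajectory that moves from high to low pain is attributed mostly to the "recovering" subgroup, while a trajectory that stays at high pain is attributed mostly to the "persistent" one.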
Human physical function is governed by self-efficacy, the belief in one's motor capacity. In chronic pain patients, this capacity may remain reduced long after the damage causing the pain has been cured. Chronic pain alters body schema, affecting how patients perceive the dimension and pose of their bodies. We exploit this deficit using robotic manipulation technology and augmented sensory stimuli through virtual reality technology. We propose a sensory stimuli manipulation method aimed at modifying body schema to restore lost self-efficacy.
\textit{Objective:} Diagnosing pain in research and clinical practice still relies on self-report. This study aims to develop an automatic approach that works on resting-state raw EEG data for chronic knee pain prediction. \textit{Method:} A new feature selection algorithm called ``modified Sequential Floating Forward Selection'' (mSFFS) is proposed. The improved feature selection scheme can better avoid local minima and explore alternative search routes. \textit{Results:} The feature selection obtained by mSFFS displays better class separability as indicated by the Bhattacharyya distance measures and better visualization results. It also outperforms selections generated by other benchmark methods, boosting the test accuracy to 97.5\%. \textit{Conclusion:} The improved feature selection searches out a compact, effective subset of connectivity features that produces competitive performance on chronic knee pain prediction. \textit{Significance:} We have shown that an automatic approach can be employed to find a compact connectivity feature set that effectively predicts chronic knee pain from EEG. It may shed light on the research of chronic pain and lead to future clinical solution
The RECONNECT project addresses the fragmentation of Ireland's public healthcare systems, aiming to enhance service planning and delivery for chronic disease management. By integrating complex systems within the Health Service Executive (HSE), it prioritizes data privacy while supporting future digital resource integration. The methodology encompasses structural integration through a Federated Database design to maintain system autonomy and privacy, semantic integration using a Record Linkage module to facilitate integration without individual identifiers, and the adoption of the HL7-FHIR framework for high interoperability with the national electronic health record (EHR) and the Integrated Information Service (IIS). This innovative approach features a unique architecture for loosely coupled systems and a robust privacy layer. A demonstration system has been implemented to utilize synthetic data from the Hospital Inpatient Enquiry (HIPE), Chronic Disease Management (CDM), Primary Care Reimbursement Service (PCRS) and Retina Screen systems for healthcare queries. Overall, RECONNECT aims to provide timely and effective care, enhance clinical decision-making, and empower policymakers