Global rates of mental health concerns are rising, and there is increasing realization that existing models of mental health care will not adequately expand to meet the demand. With the emergence of large language models (LLMs) has come great optimism regarding their promise to create novel, large-scale solutions to support mental health. Despite their nascence, LLMs have already been applied to mental health related tasks. In this paper, we summarize the extant literature on efforts to use LLMs to provide mental health education, assessment, and intervention and highlight key opportunities for positive impact in each area. We then highlight risks associated with LLMs' application to mental health and encourage the adoption of strategies to mitigate these risks. The urgent need for mental health support must be balanced with responsible development, testing, and deployment of mental health LLMs. It is especially critical to ensure that mental health LLMs are fine-tuned for mental health, enhance mental health equity, and adhere to ethical standards and that people, including those with lived experience with mental health concerns, are involved in all stages from development through
Mobile health has the potential to revolutionize health care delivery and patient engagement. In this work, we discuss how integrating Artificial Intelligence into digital health applications-focused on supply chain, patient management, and capacity building, among other use cases-can improve the health system and public health performance. We present an Artificial Intelligence and Reinforcement Learning platform that allows the delivery of adaptive interventions whose impact can be optimized through experimentation and real-time monitoring. The system can integrate multiple data sources and digital health applications. The flexibility of this platform to connect to various mobile health applications and digital devices and send personalized recommendations based on past data and predictions can significantly improve the impact of digital tools on health system outcomes. The potential for resource-poor settings, where the impact of this approach on health outcomes could be more decisive, is discussed specifically. This framework is, however, similarly applicable to improving efficiency in health systems where scarcity is not an issue.
Selecting the right monitoring level in Remote Patient Monitoring (RPM) systems for e-healthcare is crucial for balancing patient outcomes, various resources, and patient's quality of life. A prior work has used one-dimensional health representations, but patient health is inherently multidimensional and typically consists of many measurable physiological factors. In this paper, we introduce a multidimensional health state model within the RPM framework and use dynamic programming to study optimal monitoring strategies. Our analysis reveals that the optimal control is characterized by switching curves (for two-dimensional health states) or switching hyper-surfaces (in general): patients switch to intensive monitoring when health measurements cross a specific multidimensional surface. We further study how the optimal switching curve varies for different medical conditions and model parameters. This finding of the optimal control structure provides actionable insights for clinicians and aids in resource planning. The tunable modeling framework enhances the applicability and effectiveness of RPM services across various medical conditions.
YouTube has rapidly emerged as a predominant platform for content consumption, effectively displacing conventional media such as television and news outlets. A part of the enormous video stream uploaded to this platform includes health-related content, both from official public health organizations, and from any individual or group that can make an account. The quality of information available on YouTube is a critical point of public health safety, especially when concerning major interventions, such as vaccination. This study differentiates itself from previous efforts of auditing YouTube videos on this topic by conducting a systematic daily collection of posted videos mentioning vaccination for the duration of 3 months. We show that the competition for the public's attention is between public health messaging by institutions and individual educators on one side, and commentators on society and politics on the other, the latest contributing the most to the videos expressing stances against vaccination. Videos opposing vaccination are more likely to mention politicians and publication media such as podcasts, reports, and news analysis, on the other hand, videos in favor are more li
The growing demand for home healthcare calls for tools that can support care delivery. In this study, we explore automatic health assessment from voice using real-world home care visit data, leveraging the diverse patient information it contains. First, we utilize Large Language Models (LLMs) to integrate Subjective, Objective, Assessment, and Plan (SOAP) notes derived from unstructured audio transcripts and structured vital signs into a holistic illness score that reflects a patient's overall health. This compact representation facilitates cross-visit health status comparisons and downstream analysis. Next, we design a multi-stage preprocessing pipeline to extract short speech segments from target speakers in home care recordings for acoustic analysis. We then employ an Audio Language Model (ALM) to produce plain-language descriptions of vocal biomarkers and examine their association with individuals' health status. Our experimental results benchmark both commercial and open-source LLMs in estimating illness scores, demonstrating their alignment with actual clinical outcomes, and revealing that SOAP notes are substantially more informative than vital signs. Building on the illness
Electronic Health Record (EHR) has become an essential tool in the healthcare ecosystem, providing authorized clinicians with patients' health-related information for better treatment. While most developed countries are taking advantage of EHRs to improve their healthcare system, it remains challenging in developing countries to support clinical decision-making and public health using a computerized patient healthcare information system. This paper proposes a novel EHR architecture suitable for developing countries--an architecture that fosters inclusion and provides solutions tailored to all social classes and socioeconomic statuses. Our architecture foresees an internet-free (offline) solution to allow medical transactions between healthcare organizations, and the storage of EHRs in geographically underserved and rural areas. Moreover, we discuss how artificial intelligence can leverage anonymous health-related information to enable better public health policy and surveillance.
The rapid spread of health misinformation on online social networks (OSNs) during global crises such as the COVID-19 pandemic poses challenges to public health, social stability, and institutional trust. Centrality metrics have long been pivotal in understanding the dynamics of information flow, particularly in the context of health misinformation. However, the increasing complexity and dynamism of online networks, especially during crises, highlight the limitations of these traditional approaches. This study introduces and compares three novel centrality metrics: dynamic influence centrality (DIC), health misinformation vulnerability centrality (MVC), and propagation centrality (PC). These metrics incorporate temporal dynamics, susceptibility, and multilayered network interactions. Using the FibVID dataset, we compared traditional and novel metrics to identify influential nodes, propagation pathways, and misinformation influencers. Traditional metrics identified 29 influential nodes, while the new metrics uncovered 24 unique nodes, resulting in 42 combined nodes, an increase of 44.83%. Baseline interventions reduced health misinformation by 50%, while incorporating the new metrics
Large Language Models (LLMs) are advancing rapidly, yet the benchmarks used to measure this progress are becoming increasingly unreliable. Score inflation and selective reporting have eroded the authority of standard benchmarks, leaving the community uncertain about which evaluation results remain trustworthy. We introduce the Benchmark Health Index (BHI), a pure data-driven framework for auditing evaluation sets along three orthogonal and complementary axes: (1) Capability Discrimination, measuring how sharply a benchmark separates model performance beyond noise; (2) Anti-Saturation, estimating remaining headroom before ceiling effects erode resolution and thus the benchmark's expected longevity; and (3) Impact, quantifying influence across academic and industrial ecosystems via adoption breadth and practice-shaping power. By distilling 106 validated benchmarks from the technical reports of 91 representative models in 2025, we systematically characterize the evaluation landscape. BHI is the first framework to quantify benchmark health at a macro level, providing a principled basis for benchmark selection and enabling dynamic lifecycle management for next-generation evaluation prot
The Oregon Health Insurance Experiment (OHIE) offers a unique opportunity to examine the causal relationship between Medicaid coverage and happiness among low-income adults, using an experimental design. This study leverages data from comprehensive surveys conducted at 0 and 12 months post-treatment. Previous studies based on OHIE have shown that individuals receiving Medicaid exhibited a significant improvement in mental health compared to those who did not receive coverage. The primary objective is to explore how Medicaid coverage impacts happiness, specifically analyzing in which direction variations in healthcare spending significantly improve mental health: higher spending or lower spending after Medicaid. Utilizing instrumental variable (IV) regression, I conducted six separate regressions across subgroups categorized by expenditure levels and happiness ratings, and the results reveal distinct patterns. Enrolling in OHP has significantly decreased the probability of experiencing unhappiness, regardless of whether individuals had high or low medical spending. Additionally, it decreased the probability of being pretty happy and having high medical expenses, while increasing the
This research paper presents a meta-analysis of the multifaceted role of technology in mental health. The pervasive influence of technology on daily lives necessitates a deep understanding of its impact on mental health services. This study synthesizes literature covering Behavioral Intervention Technologies (BITs), digital mental health interventions during COVID-19, young men's attitudes toward mental health technologies, technology-based interventions for university students, and the applicability of mobile health technologies for individuals with serious mental illnesses. BITs are recognized for their potential to provide evidence-based interventions for mental health conditions, especially anxiety disorders. The COVID-19 pandemic acted as a catalyst for the adoption of digital mental health services, underscoring their crucial role in providing accessible and quality care; however, their efficacy needs to be reinforced by workforce training, high-quality evidence, and digital equity. A nuanced understanding of young men's attitudes toward mental health is imperative for devising effective online services. Technology-based interventions for university students are promising, al
Objective: To enhance health literacy and accessibility of health information for a diverse patient population by developing a patient-centered artificial intelligence (AI) solution using large language models (LLMs) and Fast Healthcare Interoperability Resources (FHIR) application programming interfaces (APIs). Materials and Methods: The research involved developing LLM on FHIR, an open-source mobile application allowing users to interact with their health records using LLMs. The app is built on Stanford's Spezi ecosystem and uses OpenAI's GPT-4. A pilot study was conducted with the SyntheticMass patient dataset and evaluated by medical experts to assess the app's effectiveness in increasing health literacy. The evaluation focused on the accuracy, relevance, and understandability of the LLM's responses to common patient questions. Results: LLM on FHIR demonstrated varying but generally high degrees of accuracy and relevance in providing understandable health information to patients. The app effectively translated medical data into patient-friendly language and was able to adapt its responses to different patient profiles. However, challenges included variability in LLM responses a
Health messages on social media are typically constructed through combinations of source cues, appeals, frames, and evidence, which jointly shape communication and persuasive effects. However, prior research has largely focused on single elements or simple pairwise interactions, offering insufficient insight into how multiple elements operate together in real-world digital environments. To address this gap, this study adopts a systems perspective to examine multi-element message combinations. Using 1.8 million health-related Weibo posts, we apply clustering analysis to identify recurring combinations and assess their relationships with communication effects. First, four recurring element combinations are identified: Institutional Authority, Narrative, Assertive Appeal, and Contextual Expression. These combinations function as core structures organized around two key elements. Second, stronger communication effects depend not only on core structures but also on peripheral elements aligned with these structures, with combinations of two to four peripheral elements generally showing greater advantages. Third, the optimal level of peripheral complexity varies with source influence, ind
Global public health surveillance relies on reporting structures and transmission of trustworthy health reports. But in practice, these processes may not always be fast enough, or are hindered by procedural, technical, or political barriers. GPHIN, the Global Public Health Intelligence Network, was designed in the late 1990s to scour mainstream news for health events, as that travels faster and more freely. This paper outlines the next generation of GPHIN, which went live in 2017, and reports on design decisions underpinning its new functions and innovations.
Abundant evidence has tracked the labour market and health assimilation of immigrants, including static analyses of differences in how foreign-born and native-born residents consume health care services. However, we know much less about how migrants' patterns of health care usage evolve with time of residence, especially in countries providing universal or quasi-universal coverage. We investigate this process in Spain by combining all the available waves of the local health survey, which allows us to separately identify period, cohort, and assimilation effects. We find that the evidence of health assimilation is limited and solely applies to migrant females' visits to general practitioners. Nevertheless, the differential effects of ageing on health care use between foreign-born and native-born populations contributes to the convergence of utilisation patterns in most health services after 20 years in Spain. Substantial heterogeneity over time and by region of origin both suggest that studies modelling future welfare state finances would benefit from a more thorough assessment of migration.
Data quality (DQ) and transparency of secondary data are critical factors that delay the adoption of clinical AI models and affect clinician trust in them. Many DQ studies fail to clarify where, along the lifecycle, quality checks occur, leading to uncertainty about provenance and fitness for reuse. This study develops a framework for transparent reporting of DQ assessments across the clinical electronic health record (EHR) data lifecycle. The reporting framework was developed through iterative analysis to identify actors and phases of the clinical data lifecycle. The framework distinguishes between data-generating organizations and data-receiving organizations to allow users to map DQ parameters to stages across the data lifecycle. The framework defines 5 key lifecycle phases and multiple actors. When applied to the real-world dataset, the framework demonstrated applicability in revealing where DQ issues may originate. The framework provides a structured approach for reporting DQ assessments, which can enhance transparency regarding data fitness for reuse, supporting reliable clinical research, AI model development, and internal organisational governance. This work provides practi
Although serious games have been increasingly used for mental health applications, few explicitly address coping with grief as a core mechanic and narrative experience for patients. Existing grief-related digital games often focus on clinical training for medical professionals rather than immersive storytelling and agency in emotional processing for the patient. In response, we designed Road to Acceptance, a VR game that presents grief through first-person narrative and gameplay. As the next phase of evaluation, we propose a workshop-based study with 12 licensed mental health professionals to assess the therapeutic impacts of the game and the alignment with best practices in grief education and interventions. This will inform iterative game design and patient evaluation methods, ensuring that the experience is clinically appropriate. Potential findings can contribute to the design principles of grief-related virtual reality experiences, bridging the gap between interactive media, mental health interventions, and immersive storytelling.
Large Language Models (LLMs) hold promise in addressing complex medical problems. However, while most prior studies focus on improving accuracy and reasoning abilities, a significant bottleneck in developing effective healthcare agents lies in the readability of LLM-generated responses, specifically, their ability to answer public health problems clearly and simply to people without medical backgrounds. In this work, we introduce RephQA, a benchmark for evaluating the readability of LLMs in public health question answering (QA). It contains 533 expert-reviewed QA pairs from 27 sources across 13 topics, and includes a proxy multiple-choice task to assess informativeness, along with two readability metrics: Flesch-Kincaid grade level and professional score. Evaluation of 25 LLMs reveals that most fail to meet readability standards, highlighting a gap between reasoning and effective communication. To address this, we explore four readability-enhancing strategies-standard prompting, chain-of-thought prompting, Group Relative Policy Optimization (GRPO), and a token-adapted variant. Token-adapted GRPO achieves the best results, advancing the development of more practical and user-friendl
Social Determinants of Health correlate with patient outcomes but are rarely captured in structured data. Recent attention has been given to automatically extracting these markers from clinical text to supplement diagnostic systems with knowledge of patients' social circumstances. Large language models demonstrate strong performance in identifying Social Determinants of Health labels from sentences. However, prediction in large admissions or longitudinal notes is challenging given long distance dependencies. In this paper, we explore hospital admission multi-label Social Determinants of Health ICD-9 code classification on the MIMIC-III dataset using reasoning models and traditional large language models. We exploit existing ICD-9 codes for prediction on admissions, which achieved an 89% F1. Our contributions include our findings, missing SDoH codes in 139 admissions, and code to reproduce the results.
Promoting healthy lifestyle behaviors remains a major public health concern, particularly due to their crucial role in preventing chronic conditions such as cancer, heart disease, and type 2 diabetes. Mobile health applications present a promising avenue for low-cost, scalable health behavior change promotion. Researchers are increasingly exploring adaptive algorithms that personalize interventions to each person's unique context. However, in empirical studies, mobile health applications often suffer from small effect sizes and low adherence rates, particularly in comparison to human coaching. Tailoring advice to a person's unique goals, preferences, and life circumstances is a critical component of health coaching that has been underutilized in adaptive algorithms for mobile health interventions. To address this, we introduce a new Thompson sampling algorithm that can accommodate personalized reward functions (i.e., goals, preferences, and constraints), while also leveraging data sharing across individuals to more quickly be able to provide effective recommendations. We prove that our modification incurs only a constant penalty on cumulative regret while preserving the sample comp
Mobile health apps are revolutionizing the healthcare ecosystem by improving communication, efficiency, and quality of service. In low- and middle-income countries, they also play a unique role as a source of information about health outcomes and behaviors of patients and healthcare workers, while providing a suitable channel to deliver both personalized and collective policy interventions. We propose a framework to study user engagement with mobile health, focusing on healthcare workers and digital health apps designed to support them in resource-poor settings. The behavioral logs produced by these apps can be transformed into daily time series characterizing each user's activity. We use probabilistic and survival analysis to build multiple personalized measures of meaningful engagement, which could serve to tailor content and digital interventions suiting each health worker's specific needs. Special attention is given to the problem of detecting churn, understood as a marker of complete disengagement. We discuss the application of our methods to the Indian and Ethiopian users of the Safe Delivery App, a capacity-building tool for skilled birth attendants. This work represents an