The integration of artificial intelligence-generated content (AIGC) tools into academic research offers transformative potential for enhancing productivity and innovation. However, within the highly regulated and ethically sensitive medical context, the use of AIGC is accompanied by significant challenges. Medical postgraduates, as the future vanguard of medical science, play a crucial role in the advancement of digital health, and their intention to use AIGC tools will significantly influence the use of these emerging technologies in medical research. Despite the growing popularity of AIGC tools, there remains a paucity of in-depth understanding of the factors driving or hindering medical postgraduates' intention to use these tools in academic research. A clear comprehension of these influencing factors is essential to foster the responsible, effective, and sustainable integration of AIGC into medical research. This study aimed to systematically explore the key factors influencing medical postgraduates' intention to use AIGC tools in academic research, with the goal of informing strategies to promote their ethical use and enhance scholarly research capabilities. We used a qualitative research design based on grounded theory. Semistructured interviews were conducted with 30 medical postgraduates across diverse specialties, all of whom had prior research experience and familiarity with AIGC tools. Participants were recruited purposively to ensure diverse perspectives. Data analysis followed a systematic coding process to inductively develop a conceptual model, which was further structured and interpreted through the theoretical lens of the Unified Theory of Acceptance and Use of Technology. Our analysis identified 7 core factors directly shaping usage intention: performance expectancy, effort expectancy, social influence, facilitating conditions, individual characteristics, task characteristics, and technology characteristics. 
Further analysis revealed that performance expectancy mediated the relationships of both task characteristics and technology characteristics with usage intention. Additionally, social influence moderated the relationship between task characteristics and performance expectancy. The findings underscore that, while AIGC tools are valued for assisting with daily research tasks, medical postgraduates' intention to use them in academic research is constrained by technical deficiencies, high cognitive load, and the ethical risks and strict data governance requirements of the medical field. This study constructs a conceptual model of the factors influencing medical postgraduates' intention to use AIGC tools in academic research. Recommendations derived from the findings include (1) fostering artificial intelligence literacy and critical competency among medical postgraduates; (2) optimizing AIGC tools to better address domain-specific needs, accuracy, and security concerns prevalent in health research; and (3) establishing clear academic supervision and ethical governance mechanisms to ensure responsible use. These measures are essential to harness the potential of AIGC while safeguarding the rigor and integrity of medical academic research.
In the Internet of Medical Things (IoMT), data are primarily time-series and streaming in nature and are characterized by large volume, high transmission rates, and significant heterogeneity. Given these data properties and the application requirements of medical scenarios, developing specialized data platforms tailored to these needs has considerable research significance and practical value. This study proposes an IoMT data platform based on a cloud-edge-end architecture and describes its architecture, functions, and implementation results. The edge side is responsible for streaming data access, storage, and computation; the cloud side provides three layers of services (resources, data, and applications) and builds a data lake to deliver data analysis services. The platform was deployed at PLA General Hospital for validation. From 2021 to 2024, a cumulative total of 263 medical devices were connected, generating 24.07 TB of data, and the system ran stably throughout the 4 years. In performance stress testing, the platform achieved a data access throughput of 23.91 MB/s and a data storage efficiency of 30.98 MB/s. These results demonstrate the feasibility of the platform architecture. This study engineered and successfully applied the cloud-edge-end architecture in complex IoMT scenarios, addressing challenges such as compatibility with the heterogeneous protocols of medical devices, real-time response to clinical operations, and large-scale storage and use of Internet of Things data. The resulting data platform provides a solid data foundation for smart medical applications and is of significant value for medical artificial intelligence research and the construction of future smart hospitals.
Digital oral health builds on the broader framework of eHealth, leveraging digital technologies to improve patient care, increase access to dental services, and enhance oral health outcomes. However, health care organizations and institutions encounter challenges in implementing digital oral health interventions across various levels. Addressing these challenges requires a comprehensive understanding of the barriers and facilitators that influence its successful adoption. This study aimed to explore the facilitators of and barriers to the implementation of digital oral health programs from the perspective of chief dental officers from countries across the World Health Organization (WHO) regions. This study is part of a broader investigation into global readiness for digital oral health. Participants were the 144 chief dental officers or designated oral health officials within ministries of health across the 6 WHO regions. An explanatory sequential mixed methods design was used across 2 phases. In the quantitative phase, an online survey was administered using the WHO's global survey on eHealth instrument. Some items were modified slightly to be applied to the field of dentistry. Descriptive statistics were used to present the quantitative data. In the qualitative phase, data were collected through virtual interviews, using an interview guide developed based on preliminary findings from the quantitative phase, the technology acceptance model, and the eHealth readiness assessment tool. The qualitative data were analyzed using thematic analysis. The survey response rate was 70.1% (101/144). The qualitative phase involved in-depth interviews with 15 participants. The findings were integrated under 2 broad themes of facilitators and barriers. Perceived facilitators included the existence of national policies and guidelines on eHealth. Approximately 63.9% (53/83) of the respondents indicated the presence of a national oral health policy in their countries. 
Capacity building, motivation of health care providers and academic leadership, digital health training for students or professionals, and WHO support to implement the mOral Health program were the other facilitators. The strongest barriers were a lack of funding to develop and support digital health programs, lack of norms and standards to guarantee application interoperability, and lack of equipment and/or connectivity. Approximately 45.1% (37/82) of the participants reported having government-sponsored mobile health programs, while 31.7% (26/82) reported having no financial support for the implementation of national digital oral health programs. Furthermore, lack of evidence on the effectiveness and cost-effectiveness of programs was highlighted as a barrier by 73.8% (59/80) and 73% (57/78) of the participants, respectively. The results of this study enabled the identification of key barriers to and enablers of the implementation of digital oral health programs in WHO member countries. Supportive governmental policies and adequate funding and investment in digital infrastructure and technologies are essential to mitigate digital oral health-related challenges.
Medical data sharing initiatives are crucial for advancing research, improving patient outcomes, and fostering innovation in health care. With the advent of blockchain technology, there has been significant interest in exploring its potential to enhance the security, transparency, and efficiency of medical data sharing. This study aimed to examine a selected set of blockchain-based medical data sharing initiatives, focusing on their governance, incentive structures, ownership models, business approaches, transaction mechanisms, and sustainability strategies. The analysis explored patterns in operational status and longevity, providing insight into the factors shaping these initiatives. The objective was to identify common characteristics and contextual factors that may influence their development and persistence. The study used snowball sampling to identify a selection of primarily blockchain-based medical data sharing initiatives, drawing from academic literature, web searches, and expert consultations. To examine structural and operational patterns, initiatives were selected based on the availability of sufficient public documentation for systematic classification. Each initiative was categorized by governance, incentives, ownership, business models, transaction mechanisms, and sustainability strategies. A follow-up assessment examined operational status over time. The analysis applied qualitative comparative analysis to identify common structural features and relationships between governance, incentives, and sustainability. The survey identified 42 initiatives, categorizing them based on ownership, governance, business, incentive, transaction, and sustainability models. These categories were systematically identified and assigned numerical values to facilitate fuzzy-set qualitative comparative analysis. 
The base model, run at an inclusion threshold of 0.65, identified multiple configurations associated with sustained initiative activity, highlighting the role of governance mechanisms and transaction structures in supporting long-term viability. The sensitivity analysis, conducted across multiple thresholds, demonstrated that while several configurations remained stable, higher thresholds led to more restrictive solutions. At 0.80, only two configurations remained, representing the most consistent pathways to sustained activity, reinforcing the importance of governance and transaction models in initiative sustainability. The analysis revealed a range of governance, ownership, business, and sustainability models, with no single structural configuration guaranteeing long-term viability. The findings suggest that governance and transaction mechanisms are particularly influential in sustaining initiatives, often compensating for the absence of strong business or sustainability models. The scope was limited to initiatives identified through available documentation and snowball sampling, and the results underscore the need for further research into the interplay between governance structures, financial models, and long-term sustainability in medical data sharing.
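The effect of raising the inclusion threshold from 0.65 to 0.80 can be illustrated with a minimal sketch. It assumes the standard fuzzy-set consistency measure for sufficiency, sum(min(x, y)) / sum(x), which the abstract does not spell out; the configuration names and membership scores below are hypothetical and are not taken from the 42 initiatives.

```python
from typing import Dict, List


def consistency(config: List[float], outcome: List[float]) -> float:
    """Sufficiency consistency of a configuration: sum(min(x, y)) / sum(x)."""
    if len(config) != len(outcome):
        raise ValueError("membership vectors must have the same length")
    total = sum(config)
    if total == 0:
        return 0.0  # no case is a member of the configuration
    overlap = sum(min(x, y) for x, y in zip(config, outcome))
    return overlap / total


def retained(configs: Dict[str, List[float]],
             outcome: List[float],
             threshold: float) -> List[str]:
    """Names of configurations whose consistency meets the threshold."""
    return [name for name, scores in configs.items()
            if consistency(scores, outcome) >= threshold]


# Hypothetical fuzzy membership scores for five initiatives (illustration only).
OUTCOME = [0.9, 0.8, 0.2, 0.7, 0.1]  # membership in "sustained activity"
CONFIGS = {
    "governance*transaction": [0.8, 0.7, 0.1, 0.6, 0.1],
    "business*sustainability": [0.6, 0.9, 0.6, 0.4, 0.5],
}

if __name__ == "__main__":
    for threshold in (0.65, 0.80):
        print(threshold, retained(CONFIGS, OUTCOME, threshold))
```

With these made-up scores, both configurations pass at 0.65 but only one survives at 0.80, mirroring the pattern of more restrictive solutions at higher thresholds.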
Parent training interventions effectively reduce disruptive behavior in children. However, research on how participant characteristics and program factors influence outcomes in real-world settings remains scarce. This study aimed to identify factors predicting outcomes of the internet-based, telephone-assisted Strongest Families Parent training program. This prospective cohort implementation study was conducted within population-based screening embedded in routine health checkups targeting all children aged 4 years in Finland; the screening was used to identify children with high levels of conduct problems and functional impairment. From a study population of 49,504, a total of 3911 participants completed baseline measures; 707 of them did not complete the 6-month follow-up, resulting in a sample of 3204 (1158/3186, 36.3% girls; 2028/3186, 63.7% boys). Reported duration of difficulties was <6 months in 29.57% (934/3159) of participants, 6-12 months in 27% (853/3159) of participants, and >12 months in 43.43% (1372/3159) of participants. Most children lived with 2 biological parents (2721/3194, 85.19%). A total of 35.24% (1121/3181) of mothers and 26.18% (797/3044) of fathers had a university degree. Data were collected via parent report. Multinomial logistic regression analyses were conducted to identify which child-, family-, and program-related factors predicted changes in the Child Behavior Checklist 1.5-5 (CBCL) externalizing subscale from baseline to 6-month follow-up. The standardized change in CBCL externalizing score was calculated by subtracting the baseline mean from the individual 6-month measurement and dividing the result by the baseline SD. The standardized change was categorized as ±0.5 SD (no change), +0.5 to +1.5 SD (moderate improvement), >+1.5 SD (large improvement), and more than -0.5 SD (deterioration). A P value of <.05 was considered significant. In 77% (2468/3204) of participants, symptoms improved at 6-month follow-up. 
Multinomial logistic regression analyses with α-level of <0.05 showed that >12 months duration of initial problems, callous-unemotional traits, and CBCL internalizing symptoms were linked to lower likelihood of large improvement (odds ratio [OR] 0.43, 95% CI 0.33-0.56; P<.001; OR 0.64, 95% CI 0.57-0.73; P<.001; OR 0.54, 95% CI 0.47-0.63; P<.001, respectively). Definite and severe problems at baseline were linked to deterioration (OR 2.29, 95% CI 1.62-3.24; P<.001; OR 4.38, 95% CI 2.80-6.85; P<.001, respectively). Parental stress was linked to a lower likelihood of large improvement (OR 0.78, 95% CI 0.67-0.91; P=.002), and anxiety to a higher likelihood of deterioration (OR 1.20, 95% CI 1.04-1.39; P=.02). Children with longer-term and more severe behavioral symptoms may require tailored intervention. Support for parents with stress may be recommended. Much of the current literature on parent training is based on randomized controlled trials, while the literature on the implementation of parenting programs and studies examining change is limited. Our study informs about predictors of treatment outcomes when interventions are implemented. These results are important clinically as they allow personalization of interventions.
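The outcome categorization described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the sign is chosen so that a lower 6-month CBCL externalizing score gives a positive value (improvement), which the abstract does not state explicitly, and the treatment of values falling exactly on a cutoff is also an assumption. The numbers in the usage lines are hypothetical.

```python
def standardized_change(score_6m: float,
                        baseline_mean: float,
                        baseline_sd: float) -> float:
    """Change in baseline SD units; positive means fewer symptoms (assumed sign)."""
    if baseline_sd <= 0:
        raise ValueError("baseline SD must be positive")
    return (baseline_mean - score_6m) / baseline_sd


def categorize(change: float) -> str:
    """Map a standardized change onto the four outcome categories.

    Assumed boundary handling: values exactly at +0.5 or -0.5 count as
    "no change", and exactly +1.5 counts as "moderate improvement".
    """
    if change > 1.5:
        return "large improvement"
    if change > 0.5:
        return "moderate improvement"
    if change >= -0.5:
        return "no change"
    return "deterioration"


if __name__ == "__main__":
    # Hypothetical baseline mean 20 and SD 8 (not study values).
    for score in (4, 10, 24, 28):
        z = standardized_change(score, baseline_mean=20, baseline_sd=8)
        print(score, z, categorize(z))
```

The four categories produced by `categorize` correspond to the outcome levels of the multinomial logistic regression.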
The COVID-19 pandemic had an unprecedented impact on the delivery of health care, with digital interventions accelerating more than ever before. However, evidence of how hybrid care models, combining digital health interventions with in-person care, were implemented during the pandemic remains scattered. Understanding hybrid care models is imperative to build resilient health systems that can ensure access to care during crisis situations. The study aimed to examine the implementation of hybrid care modifications to support the delivery of nonpandemic health care services in Europe during the COVID-19 pandemic. A scoping review was conducted following PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines. Systematic searches were conducted in PubMed or MEDLINE, Embase, CINAHL, Web of Science, and PsycINFO on May 22, 2024, and updated on January 14, 2026. Studies were eligible if they included primary data on the use of digital care modifications implemented or scaled up during the COVID-19 pandemic for the delivery of nonpandemic health care services in Europe. Non-peer-reviewed publications and studies with a primary focus on mental health or pediatric care were excluded. Quality appraisal was conducted using the Mixed Methods Appraisal Tool. Descriptions of digital care modifications were inductively analyzed and used to create digital flows, combining telehealth systems, digital interventions, and care functions. Digital care modifications were categorized according to their hybrid care implementation (digital-only or hybrid). Study evaluations were extracted using the Kirkpatrick model. A total of 189 studies were included for analysis. Studies covered evidence from 2020 to 2024, a total of 23 countries, and 37 health care disciplines. Hybrid care implementation was reported in over 60% (115/189) of the studies, describing various forms of digital and in-person care. 
Care modifications incorporating in-person and digital care components were more commonly described in specialty care contexts. A total of 68 distinct digital flows were identified, with a limited number of telehealth systems allowing substantial variety in both interventions and care functions. Prominent digital flows included the use of online platforms to support video and messaging for follow-up care. Over half of the studies did not describe any kind of evaluation. This review has shown that a small number of telehealth systems were able to support a wide variety of care functions in the delivery of nonpandemic care throughout the COVID-19 pandemic, underscoring their practical versatility. Integrating digital health as part of hybrid care models is essential in designing care pathways that can adapt to different contexts, including future health crises. Although a comprehensive search was conducted, the heterogeneous reporting of care modifications may have influenced the interpretation of the findings. Future research may extend the application of hybrid care models to innovative strategies for effective crisis management.
Animated messages designed to promote preventive health behaviors (health animations) are a prevalent form of digital health communication globally and are used across a variety of health behaviors. Health animations are visual and can be less reliant on language, helping to reduce health literacy barriers and inequities. Being easily and inexpensively shared, they are a potentially powerful tool for disease prevention. Evidence suggests that animations can be effective in health and non-health care settings. Despite a plethora of health animations in existence and their potential reach and scale, evidence underpinning their design and application, including the potential causal mechanisms at work, is often limited or unknown. This realist review aimed to understand why, how, for whom, to what extent, and when health animations designed to promote preventive health behaviors work. The review was conducted in accordance with the Realist and Meta-narrative Evidence Syntheses: Evolving Standards (RAMESES). Peer-reviewed publications identified through database searching from inception until April 27, 2025, as well as gray literature, were considered for inclusion. Animations designed to promote preventive health behaviors in any population and evaluations using any design were included. Animations that could not be viewed, were designed to treat illness or disease, or were part of multicomponent interventions were excluded. Data were appraised for their relevance, rigor, and richness. Data syntheses sought to produce context-mechanism-outcome configurations, contributing to a program theory of health animations. International stakeholder workshops with professionals and members of the public were used to sense check findings and refine the program theory. This review synthesized data from evaluations of 48 health animations. 
Within the data, design, content, and delivery constructs were identified, including audience or challenge representation, using storytelling, evoking emotion, accessibility, and other contexts. These contexts enabled the triggering of key mechanisms, such as identification, transportation into a story, attention, building self-efficacy, and cognitive processes. The evidence available for synthesis in the building of our program theory of health animations was limited by a lack of data on behavioral outcomes, meaning that the theory is largely derived from evidence of the contexts and mechanisms influencing key determinants known to affect behavior. These key determinants include behavioral intentions, skills and attitudes, and knowledge as a key factor in raising awareness of the need to change behavior. This realist review advances the understanding of the impact of health animations designed to promote preventive health behaviors by providing insight into the design, content, and delivery features at work. Our program theory describes the specific contexts and mechanisms that influence evidence-based determinants of behavior and behavior change. These contexts and mechanisms, therefore, should be considered during health animation design and development processes. A set of 10 recommendations is provided to this end.
Older adults managing chronic illnesses, such as cancer and Alzheimer disease and related dementias (ADRD), often experience significant physical or cognitive impairments that hinder daily activities and increase caregiver burden. Smart Internet of Things (IoT) technologies offer promising solutions by enabling passive monitoring, timely reminders, and personalized support at home. However, these technologies must be carefully tailored to accommodate users' individualized needs and preferences. This formative qualitative study aimed to explore the perspectives of stakeholders, including patients, caregivers, health care providers, and technical experts, on the use of smart home-based IoT systems to support chronic illness management. The goal was to inform the early development of the audio and radio connected (AURA) system, an IoT prototype integrating Wi-Fi sensing, wearable trackers, and voice-assistive features. Semistructured interviews were conducted with 6 patients who had undergone ostomy creation for colorectal or bladder cancer treatment and 5 patients with ADRD and their caregivers. Input from additional stakeholders, including 2 health care providers, 2 community health workers, and 2 computer scientists, was also included in the report. Stakeholders reviewed a demonstration video depicting the conceptual features of the AURA system. Interviews explored stakeholders' needs and preferences for using such systems. Thematic analysis was guided by the extended Unified Theory of Acceptance and Use of Technology 2 (UTAUT2) framework, with 5 adapted constructs: performance expectancy, effort expectancy, social influence, facilitating conditions, and hedonic motivation and habit. Stakeholders identified distinct yet complementary needs across populations. 
Patients with cancer emphasized physical health monitoring, integration with health care systems, and customization; ADRD stakeholders prioritized routine support, emotional engagement, and simplicity; caregivers and clinicians emerged as key influencers of adoption. Barriers included privacy concerns, technology literacy, and fatigue, while facilitators included perceived caregiving support, streamlined interfaces, and electronic health record integration. Patients with cancer focused on motivational cues for physical activity, while emotional engagement and habit were more prominent for ADRD users. Stakeholder insights underscore the importance of designing adaptable, user-centered IoT systems that reflect the varied capabilities and care needs of older adults with chronic illnesses. These findings informed the design of the AURA prototype and highlighted theoretical considerations for technology acceptance in health care. Future work will test AURA in real-world settings to evaluate usability, acceptability, and clinical relevance.
Generative artificial intelligence (GenAI) tools are increasingly used in scientific research to support literature searches, evidence synthesis, and manuscript preparation. While these systems promise substantial efficiency gains, concerns have emerged regarding their reliability, particularly their tendency to cite inaccurate, fabricated, or retracted literature. The unrecognized inclusion of retracted studies poses a serious risk to research integrity and evidence-based decision-making. Whether commonly used GenAI tools can reliably detect, exclude, or transparently communicate the retraction status of scientific publications remains unclear. This study aimed to evaluate the ability of freely available GenAI tools to correctly handle retracted scientific articles during literature searches. Primary and secondary outcomes focused on accuracy, reliability, and consistency in recognizing retracted literature. In this pragmatic trial, nine widely used free-access GenAI tools (ChatGPT 4, ChatGPT 5, Claude, Gemini, Perplexity, Microsoft Copilot, SciSpace, ScienceOS, and Consensus) were evaluated. Each tool was asked five predefined, standardized questions addressing topic overview, article identification, article summarization, and explicit assessment of retraction status. Overall, 15 retracted articles (the 10 most cited and 5 most recently retracted as of May 23, 2025) were selected from the Retraction Watch database. All questions were repeated twice to assess intratool consistency. Responses were independently rated as correct or incorrect by 2 researchers. Descriptive statistics were used to summarize performance and to compare general-purpose with research-focused AI tools. Interreviewer agreement was assessed using the Cohen kappa coefficient. None of the evaluated AI tools consistently handled retracted articles correctly. No model achieved perfect accuracy across all question sets. 
ChatGPT 5 performed best, defined by the primary outcome of achieving fully correct responses to all five predefined tasks (5/5) for the highest number of retracted articles, correctly answering all five questions for 8 of 15 articles (53.3%). Research-focused tools (SciSpace, ScienceOS, and Consensus) failed to produce a single fully correct response set. Retracted articles were frequently included in topic overviews without warning, with error rates exceeding 40% in several tools. When specifically asked about retraction status, most systems failed to provide correct or complete information. OpenEvidence only reported data for a subset of our retracted articles as it is only used in health care literature. It demonstrated strong performance in topic overviews but low accuracy in identifying retracted articles. Freely available GenAI tools are currently not able to detect, exclude, or appropriately flag retracted scientific literature. The widespread and confident reproduction of retracted studies represents a substantial threat to research integrity, particularly in medical and evidence-based fields. Until retraction-aware verification mechanisms are systematically integrated, independent source checking remains essential when using AI-assisted literature tools.
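The interreviewer agreement statistic named in this abstract, the Cohen kappa coefficient, can be computed from two raters' correct/incorrect judgments as in the minimal sketch below. It uses the standard definition kappa = (p_o - p_e) / (1 - p_e); the ratings shown are hypothetical and are not the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same nonempty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 10 responses (illustration only).
RATER_A = ["correct", "correct", "correct", "incorrect", "incorrect",
           "correct", "incorrect", "correct", "correct", "incorrect"]
RATER_B = ["correct", "correct", "incorrect", "incorrect", "incorrect",
           "correct", "incorrect", "correct", "correct", "correct"]

if __name__ == "__main__":
    print(round(cohen_kappa(RATER_A, RATER_B), 3))
```

In this made-up example the raters agree on 8 of 10 items, but kappa is lower than 0.8 because part of that agreement is expected by chance.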
Close follow-up of stable patients with axial spondyloarthritis (axSpA) presents a financial burden and inconvenience to patients. A remote monitoring patient-reported outcome measures (PROMs)-based model of care (PROMise) was designed to reduce the frequency of in-person consultations for stable patients with axSpA. However, little is known about the facilitators of and barriers to implementing PROMise. This study aims to understand the facilitators of and barriers to implementing PROMise in the Singapore context, as well as strategies to mitigate the barriers. We conducted a qualitative study involving in-depth interviews with 19 patients with axSpA (15/19, 78.9% male; mean age 39.4, SD 11.7 years) and 13 health care professionals (HCPs; 3/13, 23.1% male; mean age 37.9, SD 7.2 years) in a tertiary hospital in Singapore until data saturation was reached. Participants were purposively recruited based on sex, age, and ethnicity. Patients were additionally recruited based on the number of years since their axSpA diagnosis, while HCPs were recruited based on seniority and their role in the care of patients with axSpA. Interviews were transcribed, deductively analyzed, and mapped to the Consolidated Framework for Implementation Research (CFIR) to identify facilitators and barriers from both the patients' and HCPs' perspectives. The CFIR-Expert Recommendations for Implementing Change (ERIC) match tool was used to produce implementation strategies to overcome the CFIR barriers identified. All five domains of the CFIR were elicited. Facilitators included (1) reduced inconvenience and costs for patients and reduced patient load in the clinic, (2) need for PROMise, (3) similarity to current workflows, and (4) suitable patient selection. Barriers included concerns for (1) financial sustainability of PROMise, (2) cultural conditions, (3) patient safety, and (4) increased workload for HCPs. 
In total, 35 ERIC strategies were matched to the corresponding CFIR barriers. We identified ERIC strategies that will facilitate the implementation of the PROMise model. In particular, focus should be placed on developing an implementation blueprint and obtaining continuous feedback from affected patients with axSpA and HCPs involved in the care of the affected patients. These implementation strategies cross-cut the CFIR barriers identified and thus may overcome the barriers to implementation.
In an era of widespread mobile phone usage, digital public health interventions offer a new cost-effective way of improving public health. In the context of smoking cessation, studies indicate that mobile technologies have the potential to support individuals to quit smoking. However, there is no systematic synthesis of how often they are used by smokers and former smokers. The aim of this study is to assess the prevalence of mobile technology use for smoking cessation among smokers and former smokers and to examine their intention to use. MEDLINE via PubMed, Embase, and PsycInfo were searched from inception to February 13, 2025. Studies were eligible if they reported how often smokers and former smokers in high-income countries used mobile technologies for smoking or vaping cessation. Study quality was assessed using the Joanna Briggs Institute tool for prevalence studies. Data synthesis was conducted narratively. Twenty-seven cross-sectional studies were included, 25 on smoking and 2 on vaping cessation. The 25 studies on smoking cessation collected data between 2005 and 2024 and comprised 117 to 27,323 participants (mean age 19.9-50.3 years; n=8). Lifetime prevalences of mobile technology use for smoking cessation ranged between 2.5% and 35.9% (n=8), depending on technology type and population. Period prevalences (0%-12%; n=4) and point prevalences (1.1%-10.9%; n=11) were generally lower. Regardless of the prevalence type, the internet was the most frequently used technology (0.8%-35.9%; n=14). Intention to use mobile technologies for smoking cessation ranged from 19.5% for Twitter to 46.7% for websites (n=2). Of the 2 studies on vaping cessation, 1 presented lifetime prevalence (1.1%-17.3%), while the other presented period prevalence (5.5%-6.3%). The intention to use mobile technologies for vaping cessation ranged from 9.7% for web-based programs to 34.6% for apps (n=1). 
Based on the risk of bias assessment, study quality was heterogeneous, with frequent limitations in sampling procedures, reporting, and reliance on self-reported measures. This review provides novel insights into the role of mobile technologies in smoking cessation. Evidence indicates that the prevalence of mobile technology use for smoking cessation is low and that disparities in access and engagement exist. However, there is a high intention to use such tools. Therefore, efforts should focus on delivering existing evidence-based tools rather than developing new ones. Included studies were characterized by high methodological variability and poor reporting, so the results must be interpreted with caution. Overall, despite the widespread availability of mobile technologies to support smoking cessation, research on their utilization remains limited.
Social determinants of health (SDOH) are the social, economic, and environmental conditions that influence health outcomes. SDOH information is often embedded in unstructured text, such as notes in electronic health records and social media posts. Advances in natural language processing (NLP), including emergent large language models (LLMs), offer opportunities to extract, analyze, and interpret SDOH expressions from free text for inclusion in downstream analyses. Existing literature on NLP applications for SDOH is dispersed across disciplines and characterized by methodological heterogeneity and variability in study quality and scope, complicating synthesis and cross-study comparison. This study aimed to examine the use of NLP, including LLMs, in SDOH research, and highlight gaps and future research directions. We conducted a systematic review following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, searching 7 major databases for publications between 2014 and November 2025. We included journal and conference proceedings papers that applied NLP methods to identify, classify, extract, or predict SDOH from text. Three reviewers independently screened studies and extracted data; conflicts were resolved by two senior reviewers. We abstracted study metadata, dataset characteristics, NLP approaches, SDOH domains addressed, and NLP performance metrics. We also conducted risk-of-bias analyses and identified influential studies based on relative citation counts. A total of 142 studies met the inclusion criteria. Nearly two-thirds (89/142, 62.7%) were published between 2023 and 2025, reflecting rapid recent growth. Most studies relied on electronic health records (93/142, 65.5%) and private datasets (81/142, 57.0%), while only 20.4% (29/142) used publicly available data. 
Commonly studied SDOH domains were housing instability (72/142, 50.7%), employment (65/142, 45.8%), and financial conditions (63/142, 44.4%); structural factors, such as immigration status (5/142, 3.5%), were rarely examined. Of studies that reported evaluation metrics, most focused on classification (26/83, 31.3%) or extraction (38/83, 45.7%), and used cross-sectional designs. Reported model performances were typically strong, with median F1-scores ranging roughly from 0.75 to 0.85 across model categories. Only 49 studies shared code, and fewer than half clearly described model interpretability or reproducibility practices. LLMs (including encoder-decoder models) appeared in 19.7% (28/142) of studies, highlighting emerging interest but also raising new concerns around transparency and governance. This review provides a timely synthesis of NLP and LLM applications across the SDOH research spectrum, addressing an important gap in a topic receiving increasing research attention. By comparing task formulations, data sources, and performance patterns, the review clarifies the research readiness of current approaches and reveals critical gaps. Our findings advance the field by highlighting the absence of a unified SDOH framework, uneven availability of public benchmarks, and limited evaluation of real-world deployment. Addressing these gaps through transparent, inclusive dataset development and implementation-focused evaluation is essential for translating NLP advances into equitable, real-world health impact.
Acute gout attacks cause severe pain, and short-video platforms have become patients' primary source of information. However, the quality and reliability of this information are increasingly concerning. This study aimed to systematically evaluate the information quality of gouty arthritis-related content on the Bilibili and TikTok video-sharing platforms, along with factors influencing video quality. We evaluated the quality and reliability of 100 popular gout-related videos each from Bilibili and TikTok. Video quality and reliability were assessed using the global quality score, Modified DISCERN (mDISCERN), JAMA Benchmark Standard, and Hexagonal Radar Schema (HRS) tools. Correlations between video quality and metrics such as likes, comments, saves, and shares were also analyzed. Median scores on Bilibili across the 4 tools were as follows: global quality score 3.0 (IQR 2.00-4.00), mDISCERN 3.0 (IQR 3.00-4.00), JAMA 3.0 (IQR 2.00-3.00), and HRS 5.0 (IQR 4.00-6.00); TikTok's corresponding scores were 3.0 (IQR 3.00-4.00), 3.0 (IQR 3.00-4.00), 3.0 (IQR 3.00-3.75), and 3.0 (IQR 2.00-4.50). Although Bilibili's HRS scores were higher than TikTok's, video quality was generally poor across both platforms. Furthermore, the study found a positive correlation between video length and quality. Increased likes and shares may not always reflect improved video quality, as these metrics can be influenced by the entertainment nature of online videos. Our findings indicate that short health information videos on gouty arthritis on Bilibili and TikTok are of poor quality overall, although videos uploaded by medical professionals are considered reliable in terms of comprehensiveness and content quality. Health information seekers must carefully evaluate the scientific accuracy and reliability of short videos providing medical information on Bilibili and TikTok before making healthcare decisions.
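The abstract above does not state which correlation coefficient was used. As a minimal sketch, assuming a Spearman rank correlation (a common choice for ordinal quality scores and skewed engagement counts), the association between a quality score and an engagement metric could be computed as follows. The function names and the example values are hypothetical and are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical quality scores and like counts for 5 videos (illustration only).
quality_scores = [2, 3, 3, 4, 5]
like_counts = [120, 90, 400, 300, 1500]
rho = spearman_rho(quality_scores, like_counts)
```

Because the coefficient depends only on ranks, a single viral video with an extreme like count does not dominate the estimate.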
The rapid rise of artificial intelligence-based contactless sensors (AI-CS) is expected to significantly transform how patients are measured, monitored, and understood through a versatile, noninvasive approach to data collection and health assessment. However, there is a lack of empirical research specifically focusing on AI-CS in health. Moreover, existing studies tend to focus on medical or patient perspectives, while neglecting other stakeholders such as researchers, political actors, or the general public. The study aims to provide an in-depth empirical ethical analysis and, through a multistakeholder approach, a uniquely comprehensive overview by addressing the research question: what are the attitudes of different stakeholders (patients, health care professionals, researchers, political stakeholders, and the general public) toward AI-CS and their applications in health? We conducted a cross-sectional study with 104 participants using a semistructured interview guide. Interviews were analyzed using qualitative content analysis with ATLAS.ti software (ATLAS.ti Scientific Software Development GmbH), following a 3-component model of feelings, thoughts, and behavioral aspects. The results of the study provide an in-depth analysis of attitudes toward AI-CS in health among different stakeholders. Overall, the results show a high level of openness to AI-CS in health across all stakeholder groups. In terms of feelings and their correlation with behavioral aspects, 2 key trends emerged: first, greater experience and knowledge correlated with a reduced tendency to react emotionally. Second, participants with positive experiences with technologies were generally more open and positive toward contactless sensors. 
The combined findings on thoughts and behavioral aspects highlighted 3 key tensions: around contact(lessness) and the importance and ambivalence of touch; between protection and surveillance (particularly regarding path- and context-dependency); and between the benefits and challenges of unobtrusiveness (especially in relation to control and governance implications). In addition, the analysis revealed the need for information and consent about AI-CS and clarified possible technical implementations and fields of application. This study provides a comprehensive and empirically grounded ethical analysis of stakeholder attitudes toward AI-CS in health. The findings offer valuable guidance for the responsible development, implementation, and governance of AI-CS in health care contexts.
The high mortality and recurrence rates associated with coronary heart disease (CHD) impose substantial health care costs and economic burdens globally. Identifying effective interventions to improve patient outcomes is paramount. Digital health technologies (DHTs) offer novel solutions to overcome the challenge of low participation rates in traditional cardiac rehabilitation (CR). This review aims to systematically map the scope of application, intervention objectives, and evaluation metrics of DHTs in CR for patients with CHD, thereby providing a structured evidence base for future research and practice. This scoping review adheres to the Joanna Briggs Institute's methodology and is reported according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. A systematic search was conducted across 5 major databases, PubMed, Web of Science, Embase, Cochrane Library, and EBSCO, covering the period from inception to February 2026. Inclusion criteria were developed based on the participants, concept, and context framework. Studies focused on the application of various DHTs within CR settings for patients with CHD. Eligible literature comprised randomized controlled trials, quasi-randomized controlled trials, and longitudinal before-and-after studies published in peer-reviewed journals. Two researchers (XZ and ZL) independently conducted literature screening and data extraction. Findings were presented through a comprehensive narrative synthesis and evidence gap maps. A total of 43 studies were included, predominantly randomized controlled trials (n=40). Findings revealed (1) diverse technological formats, categorized into 3 main types: digital health tools, real-time remote support, and asynchronous communication. Multitechnology combined interventions have become the mainstream model (36/43, 83.7%). 
(2) Intervention objectives were multifaceted, consolidating into 4 dimensions: motivation and guidance, knowledge and skills, monitoring and security, and social and group dynamics. (3) Evaluation metrics were multidimensional, encompassing clinical physiological indicators, health behaviors, patient-reported outcomes, service use rates, and technological feasibility. DHTs demonstrated positive effects in improving short-term physiological function and health behaviors; however, evidence remains insufficient regarding their impact on long-term clinical outcomes such as reducing adverse events. The innovation of this scoping review lies in integrating highly heterogeneous evidence to reveal the field's evolution from isolated tools toward systematic, integrated solutions. Research confirms that DHTs effectively overcome temporal and spatial constraints, enhancing rehabilitation accessibility and engagement. They serve as crucial strategic tools for bridging geographical disparities in health care resources and advancing equity in cardiovascular health services. However, the evidence base remains limited, including insufficient long-term efficacy data and inadequate exploration of vulnerable populations such as older people and those with low digital literacy. Future research urgently requires large-scale, long-term follow-up clinical trials, alongside enhanced studies on adaptability for specific populations and considerations of health equity. This will propel digital CR toward greater scientific rigor, universal applicability, and precision.
Balance and gait disorders in Parkinson disease (PD) impair motor function and quality of life. Evidence on soft exoskeleton robots (SERs) for PD rehabilitation is limited. This study evaluated the impact of SERs on motor dysfunction in PD. A total of 56 people with PD (July 2023 to May 2024) were randomized to 2 groups: the control group (n=25, 44.6%) received conventional rehabilitation, and the experimental group (n=31, 55.4%) received conventional rehabilitation combined with SER training (ChiCTR2500111990). Training occurred 5 times per week for 20 minutes each session over 4 weeks. Primary outcomes included gait speed and stride length, while secondary outcomes assessed the percentage of swing phase, ankle joint range of motion, Unified Parkinson Disease Rating Scale total and motor scores, and Montreal Cognitive Assessment. Paired sample t tests (2-tailed) were used for within-group pre- and postintervention comparisons, and independent sample t tests (2-tailed) were used for between-group comparisons. Correlation analyses were conducted between gait parameters and improvements in ankle mobility. After 4 weeks, the experimental group showed significant improvements in gait and balance. Specifically, left stride length increased by a mean of 0.15 (SD 0.16; 95% CI 0.09-0.21) m (P<.001), right stride length by a mean of 0.15 (SD 0.15; 95% CI 0.10-0.21) m (P<.001), left ankle dorsiflexion by a mean of 2.84 (SD 1.46; 95% CI 2.32-3.36) degrees (P<.001), left swing phase percentage by a mean of 1.56% (SD 3.05%; 95% CI 0.44-2.68; P=.01), and right swing phase percentage by a mean of 1.6% (SD 2.72%; 95% CI 0.62-2.62; P=.002). The Unified Parkinson Disease Rating Scale Part III total score decreased by a mean of 2.80 (SD 3.98) points, and balance subscale scores decreased by a mean of 0.40 (SD 0.58) points (P<.001). 
Montreal Cognitive Assessment scores increased by a mean of 1.23 (SD 1.23; 95% CI 0.77-1.68) points (P<.01), and Barthel Index scores increased by a mean of 6.84 (SD 7.14; 95% CI 4.22-9.46) points (P<.001). Other measures such as balance reaction time, reaction speed, maximum movement distance, and movement direction control showed significant improvement (P<.01). Compared to the control group, the experimental group demonstrated greater improvements in gait speed (P=.04), balance reaction time (P=.04), and maximum movement distance (P=.048). Correlation analysis revealed that improvements in left ankle dorsiflexion were positively correlated with improvements in gait speed, stride length, and swing phase duration (P<.05). SER-assisted training significantly improves gait, balance, and PD symptoms. Our work integrates multidimensional assessments (gait analysis, balance metrics, and clinical scales) and reveals that gains in ankle mobility directly correlate with gait improvements, suggesting a key mechanism. This study contributes by establishing SER as an effective adjunct to conventional therapy, supported by comprehensive quantitative data.
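The within-group intervals reported above are consistent with a standard t-based 95% CI for a mean paired change. As a minimal sketch, assuming all 31 experimental-group participants contributed to the paired analysis (the abstract does not state the analyzed n), the interval for the left stride length change (mean 0.15 m, SD 0.16 m) can be reproduced from the summary statistics. The function name is hypothetical, and the critical value is hardcoded for 30 degrees of freedom to keep the example dependency-free.

```python
from math import sqrt

# Two-sided 95% critical value of Student's t for 30 degrees of freedom.
# Hardcoded assumption: valid only for n = 31 paired observations.
T_CRIT_DF30 = 2.042


def mean_change_ci(mean_change, sd_change, n, t_crit):
    """Return (lower, upper) bounds of a t-based CI for a mean paired change."""
    standard_error = sd_change / sqrt(n)
    margin = t_crit * standard_error
    return mean_change - margin, mean_change + margin


# Left stride length change in the experimental group, as reported above.
low, high = mean_change_ci(0.15, 0.16, 31, T_CRIT_DF30)
print(f"{low:.2f} to {high:.2f}")  # prints: 0.09 to 0.21
```

For other sample sizes, the critical value would need to be looked up or computed for n - 1 degrees of freedom.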
Electronic health records (EHRs) have been widely adopted, but most nursing records remain in unstructured free-text format, which limits the secondary use of nursing data. Standardized terminologies improve semantic interoperability; however, manual annotation is labor intensive and yields inconsistent results. Advances in large language models (LLMs) and retrieval-augmented generation (RAG) have created new possibilities for automating the mapping of nursing records to standardized terminologies, thereby enhancing the utility of nursing data. This study aimed to develop and evaluate Clinical Care Classification nursing terminology with retrieval-augmented mapping (CNTRAM), a 2-stage RAG framework incorporating an LLM, for the automated mapping of nursing diagnoses and interventions from free-text intensive care unit nursing records to standardized Clinical Care Classification (CCC) terms. CNTRAM is a 2-stage retrieval-augmented framework that integrates dense embedding retrieval, retrieval-enhanced prompting, and few-shot LLM guidance to map free-text nursing records to standardized CCC terminology. Free-text records and their segments were embedded as subqueries to retrieve the most relevant CCC reference entries and annotated examples, which were merged to construct context windows. Each subquery was combined with its retrieved context using a predefined RAG prompt template that enforces CCC coding rules and a structured JSON schema and was then processed by an LLM to generate CCC outputs. A gold standard dataset of 100 intensive care unit nursing records was annotated by 3 senior nurses and finalized via consensus, with interrater reliability quantified using the Fleiss κ. 
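As a minimal sketch of the interrater statistic named above, Fleiss κ can be computed from a table of per-item category counts. How the study converted multilabel CCC annotations into categories is not stated, so the layout below (one row per item, one column per category, each cell holding the number of raters who chose that category) is an assumption, and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss kappa for counts[i][j] = raters assigning item i to category j.

    Every row must sum to the same number of raters (at least 2).
    Undefined (division by zero) when all ratings fall in a single category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each item must be rated by the same number of raters")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 3 raters, 4 records, 2 categories (term assigned or not).
example_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa = fleiss_kappa(example_counts)
```

Values between 0.61 and 0.80 are conventionally labeled substantial agreement on the Landis and Koch scale.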
Model performance was compared with traditional baselines (term frequency-inverse document frequency, Bidirectional Encoder Representations from Transformers, and a fine-tuned Bidirectional Encoder Representations from Transformers model) and 4 LLMs (Mistral-7B, Qwen3-14B, Llama3.3-70B, and DeepSeek-R1) across no-RAG, zero-shot, and few-shot settings, using precision, recall, F1-score, and intersection over union (IoU) as metrics. Interrater agreement was substantial, with Fleiss κ=0.6449 for diagnoses and κ=0.6180 for interventions. CNTRAM achieved substantial performance gains over all baseline approaches. For nursing diagnoses, DeepSeek-R1 with RAG+few-shot prompting achieved the best performance, with a precision of 0.7909, a recall of 0.7901, an F1-score of 0.7836, and an IoU of 0.7614. These results were significantly higher than those of traditional baselines (F1-score 0.0268-0.2027), no-RAG LLMs (F1-score 0.0299-0.0588), and RAG+zero-shot LLMs (F1-score 0.0716-0.2160). For nursing interventions, the same configuration achieved a precision of 0.8453, a recall of 0.8504, an F1-score of 0.8413, and an IoU of 0.8097, outperforming traditional baselines (F1-score 0.1200-0.2323), no-RAG LLMs (F1-score 0.0077-0.0189), and RAG+zero-shot LLMs (F1-score 0.2744-0.4461). This study developed CNTRAM, an LLM-based 2-stage RAG framework that combines dense embedding retrieval and few-shot prompting for CCC terminology mapping. Using DeepSeek-R1, CNTRAM outperformed baseline models, improved mapping accuracy, and provided a feasible solution for standardizing unstructured nursing data.
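The four metrics named above can be illustrated for a single record by treating the predicted and gold standard terms as sets. This is a minimal sketch under that assumption; the abstract does not specify whether scores were computed per record and then averaged, or pooled across records. The function name and the term labels are hypothetical placeholders, not real CCC codes.

```python
def mapping_metrics(predicted, gold):
    """Set-based precision, recall, F1-score, and IoU for one record."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    union = len(predicted | gold)

    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    iou = overlap / union if union else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Hypothetical record: the model proposes 3 terms, and the annotators agreed on 3.
scores = mapping_metrics(
    predicted={"term_a", "term_b", "term_c"},
    gold={"term_b", "term_c", "term_d"},
)
```

IoU penalizes both missed and spurious terms in one number, so it is never higher than the F1-score for the same record.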
Inadequate health literacy and low engagement challenge public health education. Digital serious games show potential to enhance health knowledge and attitudes. However, the comparative effectiveness of different game formats is unclear. This study aimed to evaluate and compare the effectiveness of different digital serious game formats in improving public health knowledge and attitudes. This systematic review and Bayesian network meta-analysis followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 guidelines. Seven databases (PubMed, CINAHL, Embase, PsycINFO, Cochrane Library, Scopus, and Web of Science) were searched from January 2000 to October 2025. An updated search in February 2026 identified no additional studies. Eligible studies were randomized controlled trials (RCTs) involving nonprofessional participants comparing digital serious games with traditional or noninteractive education. Standardized mean differences and 95% credible intervals were pooled using Bayesian network models with random effects. Subgroup analyses examined population characteristics, intervention duration, health topic, and delivery format. Risk of bias was assessed using the Cochrane risk-of-bias tool, and evidence certainty was rated using the Grading of Recommendations Assessment, Development and Evaluation. Forty randomized controlled trials from 19 countries (N=8764 participants) were included. Digital serious games significantly improved knowledge (standardized mean difference 0.66, 95% CI 0.32-0.99; I²=89.1%) and attitudes (standardized mean difference 0.50, 95% CI 0.27-0.76; I²=80.7%) compared with traditional education. Multisession interventions showed larger effects than single-session interventions for knowledge (0.76 vs 0.43) and attitudes (0.53 vs 0.30), with greater improvements among adolescents, nonpatient populations, and Asian studies. Network meta-analysis showed low heterogeneity (I²=8% for knowledge; 3% for attitudes). 
Mobile app-based, computer-offline, and web-based games ranked highest for knowledge; computer-offline, web-based, and virtual reality games ranked highest for attitudes. Evidence certainty was moderate for knowledge and low-to-moderate for attitudes. Digital serious games improve public health knowledge and attitudes across diverse contexts. Using a Bayesian network meta-analysis of randomized controlled trials, this review compares the relative effectiveness of different game formats. Mobile app-based, computer-offline, and web-based games most improved knowledge; computer-offline, web-based, and virtual reality formats most improved attitudes. Multisession interventions were more effective than single-session ones, particularly for adolescents and nonpatient populations. These findings guide scalable digital health education strategies. Future research requires adequately powered trials, longer follow-up, and standardized frameworks.
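The pooled effects above are standardized mean differences. As a minimal sketch of how one trial's effect size might be derived before pooling, the example below computes Hedges g from group summary statistics. The review does not state which estimator was used, so Hedges g is an assumption here, and the input values are hypothetical rather than taken from any included trial.

```python
from math import sqrt


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with the small-sample (Hedges) correction."""
    df = n_t + n_c - 2
    # Pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    cohens_d = (mean_t - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)
    return correction * cohens_d


# Hypothetical trial: game group vs traditional education, knowledge test scores.
g = hedges_g(mean_t=10.0, sd_t=4.0, n_t=50, mean_c=8.0, sd_c=4.0, n_c=50)
```

Expressing every trial on this common scale is what allows knowledge tests with different scoring ranges to be combined in one network.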
Digital health solutions and personalized medicine are increasingly promoted as pathways to improve health care delivery in low-resource settings, including Ghana. Drawing on insights from our examination of the published literature and our engagement with digital health research in this context, we present a scholarly viewpoint on how digital health has been positioned in relation to personalized medicine in Ghana, where progress has been uneven and largely oriented toward population-level interventions. We observe that most digital health initiatives in Ghana focus on mobile health apps and health information systems that support service delivery and access, with limited translation toward truly personalized models of care. Although personalized medicine is frequently discussed as a future goal, it remains weakly operationalized in practice, and approaches such as N-of-1 trials, often cited as exemplars of individualized care, are notably absent from the existing literature. Importantly, the limited uptake of personalized approaches does not reflect a lack of relevance in Ghana, but rather the constraints of population-level digital health strategies that, while essential, have shown limited capacity to address individual heterogeneity in treatment responses, adherence, and long-term outcomes. We argue that this absence reflects structural, methodological, and policy-related challenges. At the same time, emerging digital health infrastructure, policy interest, and research capacity present opportunities to reposition digital health as an enabler of personalized medicine. Helping many single individuals through scalable digital personalized approaches may be a valuable innovative approach to public health. This viewpoint articulates key gaps, contextual constraints, and future directions, with the aim of informing researchers, policymakers, and implementers seeking to advance personalized, data-driven care in Ghana and comparable settings.
Patients with obstructive sleep apnea (OSA) frequently seek information online, yet the comparative quality of content delivered by web search engines versus generative AI systems is unclear. This study evaluated how different digital information sources perform in answering common patient questions about OSA. Thirty high-volume, patient-facing OSA questions were identified using Google Trends. Each question was submitted verbatim to four general-purpose large language models (GPT-4, GPT-5, DeepSeek, Mistral), a medically specialized retrieval-augmented model (OpenEvidence), and Google Search. Seven otolaryngologists with clinical experience in OSA independently rated each response for accuracy, clarity, completeness, relevance, and usefulness using a five-point rubric. Composite and domain scores were analyzed using one-way analysis of variance with multiple-comparison correction; inter-rater reliability was assessed with two-way random-effects intraclass correlation coefficients. A total of 180 question-system pairs received 6295 domain-level ratings. OpenEvidence achieved the highest mean composite score (4.33), followed by a tightly clustered group of LLMs (means 4.00-4.04). Google Search scored significantly lower (3.15). Differences among systems were statistically significant across all domains (p < 0.001), with large effect sizes for comparisons of OpenEvidence and general LLMs versus Google. Composite average-rater reliability was good (ICC = 0.70). For common OSA questions, generative AI systems, particularly a retrieval-augmented medical model, produced higher-quality patient-facing information than standard web search. These findings support cautious consideration of GenAI tools to supplement patient education in OSA, while underscoring the need for ongoing evaluation across diseases, disciplines, and patient populations.
Patients with obstructive sleep apnea (OSA) frequently rely on online sources such as Google Search to understand symptoms, testing, and treatment, yet the quality of patient-facing information varies widely. As generative artificial intelligence tools are increasingly used for health questions, their comparative performance for OSA education has not been systematically evaluated using blinded expert review. In this blinded comparative study, generative AI systems, particularly a retrieval-augmented medical model, provided more accurate, clear, complete, and useful answers to common OSA questions than standard web search. These findings highlight that the choice of digital information source can meaningfully influence the quality of patient education in sleep medicine and support further evaluation of AI tools within clinical practice.
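The average-rater reliability reported for this study comes from a two-way random-effects intraclass correlation. As a minimal sketch, assuming the absolute-agreement, average-measures form, often written ICC(2,k), and a complete rating matrix with no missing cells, the coefficient can be derived from the two-way ANOVA mean squares. The exact ICC variant is not stated in the text, the function name is hypothetical, and the example ratings are illustrative only.

```python
def icc_2k(ratings):
    """ICC(2,k): two-way random effects, absolute agreement, average of k raters.

    ratings[i][j] is the score given to subject i by rater j (no missing cells).
    """
    n = len(ratings)       # number of rated subjects (e.g., responses)
    k = len(ratings[0])    # number of raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subjects, raters, and residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical example: 3 responses scored by 2 raters; rater 2 is 1 point higher.
icc = icc_2k([[1, 2], [2, 3], [3, 4]])
```

In this example the raters rank the responses identically, yet the coefficient is below 1 because the absolute-agreement form also penalizes the systematic offset between raters.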