Listening to heart and lung sounds - auscultation - is one of the first and most fundamental steps in a clinical examination. Despite being fast and non-invasive, it demands years of experience to interpret subtle audio cues. Recent deep learning methods have made progress in automating cardiopulmonary sound analysis, yet most are restricted to simple classification and offer little clinical interpretability or decision support. We present StethoLM, the first audio-language model specialized for cardiopulmonary auscultation, capable of performing instruction-driven clinical tasks across the full spectrum of auscultation analysis. StethoLM integrates audio encoding with a medical language model backbone and is trained on StethoBench, a comprehensive benchmark comprising 77,027 instruction-response pairs synthesized from 16,125 labeled cardiopulmonary recordings spanning seven clinical task categories: binary classification, detection, reporting, reasoning, differential diagnosis, comparison, and location-based analysis. Through multi-stage training that combines supervised fine-tuning and direct preference optimization, StethoLM achieves substantial gains in performance and robustne
Cardiac arrest is very common today. Sudden cardiac arrest is a condition in which the heart suddenly stops beating, causing a significant decrease in blood flow to the brain. From a medical point of view, the first step for a patient experiencing sudden cardiac arrest is cardiopulmonary resuscitation (CPR). Moreover, the compression rate needed for CPR is far beyond what humans can provide manually, so there is a pressing need for a mechanical device that can perform resuscitation. A cardiopulmonary resuscitation device is used to augment blood flow and maintain the hemodynamic cycle of the human body. A CPR device is proposed that provides an effective and unique blood flow mechanism and a feedback system. For the blood flow mechanism, a low-cost cardiopulmonary resuscitation device is designed and fabricated based on the principle of CPR and two concepts, combined as Sterno-Thoracic Cardiopulmonary Resuscitation. The "cardiac pump" generates blood flow by squeezing blood out of the heart as the sternum is depressed. The "thoracic pump" increases intrathoracic pressure due to elastic recoil of the ribs. To meet the American Heart Association standard guidelines, a feedback system has
Embolic stroke during cardiopulmonary bypass (CPB) is strongly influenced by cannula-induced flow disturbances that govern emboli transport and aortic wall loading. This study quantifies how aortic cannula orientation affects embolic distribution and atherosclerotic plaque disruption risk across patient-specific, age-dependent aortic anatomies under clinical CPB conditions. A validated computational fluid dynamics and Lagrangian particle tracking (CFD-LPT) framework was applied to four patient-specific aortic models representing pediatric, adolescent, adult, and geriatric anatomies. Two clinically relevant cannula orientations, perpendicular (90 deg) and angled (30 deg), were evaluated under varying blood viscosities (1.5 to 3.5 cP) and embolus sizes (0.5 to 2.5 mm). Aortic branch exit percentage, wall pressure, and wall shear stress (WSS) were quantified. The 30 deg angled cannula reduced embolic transport into the aortic branches by 18 to 50 percent compared with perpendicular cannulation, with the largest reduction observed in the geriatric model. Perpendicular cannulation produced concentrated jet impingement, resulting in significantly elevated posterior wall pressure (24 perc
High-quality cardiopulmonary resuscitation (CPR) requires stable control of compression rhythm and depth, yet most training systems presuppose instructor mediation, repeated practice, and explanatory guidance, assumptions that do not hold in the Tibet Autonomous Region, where instruction is fragmented and learners' linguistic and educational backgrounds are heterogeneous. We present TibetCPR, a low-cost, self-guided CPR training system that pairs depth-driven electrotactile feedback with rhythm-driven visual cues within a Tibetan-language narrative. In a randomised study with 40 lay community members aged 19-56, the experimental group showed progressive minute-by-minute stabilisation of rhythm and depth across a 10-minute intervention, substantially exceeding an unguided-practice control, with gains transferring to an unscaffolded one-minute post-test. Qualitative accounts described the feedback as legible through participants' bodily action, and usability was high (SUS = 84.3). We synthesise three transferable design principles for self-guided embodied training: feedback as a calibration reference, not an immediate corrector; modality temporal granularity matched to behaviour's te
Cardiopulmonary exercise testing (CPET) provides a comprehensive assessment of functional capacity by measuring key physiological variables including oxygen consumption ($VO_2$), carbon dioxide production ($VCO_2$), and pulmonary ventilation ($VE$) during exercise. Previous research has established that parameters such as peak $VO_2$ and $VE/VCO_2$ ratio serve as robust predictors of mortality risk in chronic heart failure patients. In this study, we leverage CPET variables as surrogate mortality endpoints for patients with Congenital Heart Disease (CHD). To our knowledge, this represents the first successful implementation of an advanced machine learning approach that predicts CPET outcomes by integrating electrocardiograms (ECGs) with information derived from clinical letters. Our methodology began with extracting unstructured patient information (including intervention history, diagnoses, and medication regimens) from clinical letters using natural language processing techniques, organizing this data into a structured database. We then digitized ECGs to obtain quantifiable waveforms and established comprehensive data linkages. The core innovation of our approach lies in exploiting
Non-intrusive monitoring of vital signs has become increasingly important in a variety of healthcare settings. In this paper, we present PulseFi, a novel low-cost non-intrusive system that uses Wi-Fi sensing and artificial intelligence to accurately and continuously monitor heart rate and breathing rate, as well as detect apnea events. PulseFi operates using low-cost commodity devices, making it more accessible and cost-effective. It uses a signal processing pipeline to process Wi-Fi telemetry data, specifically Channel State Information (CSI), that is fed into a custom low-compute Long Short-Term Memory (LSTM) neural network model. We evaluate PulseFi using two datasets: one that we collected locally using ESP32 devices and another that contains recordings of 118 participants collected using the Raspberry Pi 4B, making the latter the most comprehensive dataset of its kind. Our results show that PulseFi can effectively estimate heart rate and breathing rate in a seamless, non-intrusive way with comparable or better accuracy than multiple-antenna systems that can be expensive and less accessible.
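To make the idea of recovering a periodic vital-sign rate from a sensed signal concrete, here is a minimal sketch. It is not PulseFi's pipeline or its LSTM model: it swaps in a plain spectral-peak baseline, and the function name `dominant_rate_per_minute`, the band limits, the 20 Hz sampling rate, and the synthetic signal standing in for a CSI amplitude stream are all assumptions chosen for illustration.

```python
import math


def dominant_rate_per_minute(signal, fs_hz, lo_hz, hi_hz, step_hz=0.01):
    """Return the frequency with the most spectral power in [lo_hz, hi_hz],
    expressed in cycles per minute. Hypothetical helper, not from the paper."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset

    best_f, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        f = lo_hz + k * step_hz
        # Correlate the signal with a cosine and a sine at frequency f.
        re = sum(x * math.cos(2 * math.pi * f * i / fs_hz)
                 for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * f * i / fs_hz)
                 for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_f, best_power = f, power
    return best_f * 60.0


# Synthetic stand-in for a CSI amplitude stream (assumed values):
# breathing at 0.25 Hz plus a weaker heartbeat component at 1.2 Hz.
fs = 20.0
t = [i / fs for i in range(int(30 * fs))]
csi = [math.sin(2 * math.pi * 0.25 * ti) + 0.2 * math.sin(2 * math.pi * 1.2 * ti)
       for ti in t]

# Search each rate in its own assumed physiological band.
breathing_bpm = dominant_rate_per_minute(csi, fs, 0.1, 0.5)
heart_bpm = dominant_rate_per_minute(csi, fs, 0.8, 2.0)
print(breathing_bpm, heart_bpm)
```

The synthetic signal is built from a 15 breaths-per-minute component and a 72 beats-per-minute component, so the two estimates should land close to those values. Real CSI data is far noisier and non-stationary, which is presumably part of the motivation for a learned model over a fixed spectral peak.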
The increase in cardiac and pulmonary diseases presents an alarming and pervasive global health challenge, responsible for unexpected and premature mortality. Although these conditions are serious, existing methods of detection and treatment encounter challenges, particularly in achieving timely diagnosis for effective medical intervention. Manual screening processes commonly used for primary detection of cardiac and respiratory problems face inherent limitations, compounded by a scarcity of skilled medical practitioners in remote or under-resourced areas. To address this, our study introduces an innovative yet efficient model that integrates AI to diagnose lung and heart conditions concurrently from auscultation sounds. Unlike high-priced digital stethoscopes, our proposed model is designed for deployment on low-cost embedded devices, ensuring applicability in under-developed regions with limited access to medical care. Our proposed model incorporates MFCC feature extraction and engineering techniques to ensure that the signal is well analyzed for accurate diagnostics through the hybrid model combining Gat
Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespiratory sounds. The recordings were collected from a clinical manikin, a patient simulator designed to replicate human physiological conditions, generating clean heart and lung sounds at different body locations. This dataset includes both normal sounds and various abnormalities (i.e., murmur, atrial fibrillation, tachycardia, atrioventricular block, third and fourth heart sound, wheezing, crackles, rhonchi, pleural rub, and gurgling sounds). The dataset includes audio recordings of chest examinations performed at different anatomical locations, as determined by specialist nurses. Each recording has been enhanced using frequency filters to highlight specific sound types. This dataset is useful for applications in artificial intelligence, such as automated cardiopulmonary disease detection, sou
The vast majority of people who suffer unexpected cardiac arrest receive cardiopulmonary resuscitation (CPR) from passersby in a desperate attempt to restore life, but these efforts turn out to be fruitless because the rescuers are unqualified. Fortunately, much research shows that disciplined training helps raise the success rate of resuscitation, and such training stands to benefit from the integration of novel techniques. To this end, we collect a custom CPR video dataset in which trainees independently perform resuscitation on mannequins in adherence to approved guidelines, and we devise an auxiliary toolbox that assists supervision and the correction of potential intermediate issues using modern deep learning methods. We frame this problem as a temporal action segmentation (TAS) task in computer vision, which aims to segment an untrimmed video at the frame level. Here, we propose a Prompt-enhanced hierarchical Transformer (PhiTrans) that integrates three indispensable modules, including a textual prompt-based Video Features Extractor (VFE), a transformer-based Action Segmentation Executor (ASE), and
The analysis of trail-running performance appears to be complex, and cardio-respiratory and muscular factors could vary in importance depending on the inclination. Our study aims to determine the role of these parameters in performance. Thirteen subjects with heterogeneous levels participated in the study. They carried out 7 visits, including 3 maximal aerobic speed (MAS) tests at 1, 10 and 25% slope on a treadmill, 3 endurance tests at 100% of the MAS reached at 1, 10 and 25%, and an evaluation on an isokinetic ergometer at different speeds (60-180-240 °/s). Gas exchange was measured during the incremental tests. We were able to identify 2 groups, a performance group and a recreational group. We observe a difference in VO2max, MAS at 1 and 10%, and maximal aerobic ascensional speed (MAaS) at 25% between the 2 groups, but no difference in VO2max and exhaustion time at 100% MAS between the different conditions (1-10-25%). Interestingly, at ventilatory thresholds the metabolic parameters, expressed as absolute or relative values, are similar between conditions (10-25%) while the ascensional speeds are different. This study suggests that the measurement of ascensional speed is not as releva
VO2max is a critical indicator of cardiopulmonary fitness, reflecting the maximum amount of oxygen the body can utilize during intense exercise. Accurately measuring VO2max is essential for assessing cardiovascular health and predicting outcomes in clinical settings. However, current methods for VO2max estimation, such as Cardiopulmonary Exercise Testing (CPET), require expensive equipment and the supervision of trained personnel, limiting accessibility for large-scale screening. Preliminary efforts have been made to create a more accessible method, such as the Cardiopulmonary Spot Jog Test (CPSJT). Unfortunately, these early attempts yielded high error margins, rendering them unsuitable for widespread use. In our study, we address these shortcomings by refining the CPSJT protocol to improve prediction accuracy. A crucial contribution is improved feature extraction, which includes gender, body mass index, aerobic duration, and anaerobic duration. This targeted approach helps streamline the model to enhance prediction precision while minimizing the risk of overfitting. In a cohort of 44 participants from the Indian population, we assessed the performance of various machine learni
Objective: The percentage of long-term survival in out of hospital cardiac arrest cases is remarkably low. One approach would be to increase the effectiveness of cardiopulmonary resuscitation (CPR), which is currently not measurable in a quantifiable way. The most significant challenge in providing a mobile solution for CPR evaluation is a mobile, hazard free sensor attachment with high usability. Methods: We present a sensor attachment solution usable for semiautomatic ultrasonic (US) Doppler measurements. Components are attached to a Stifneck Select Collar$^{\mathrm{TM}}$ (Laerdal). An inflatable cushion (TR-Band$^{\mathrm{TM}}$, Terumo) allows adjustable contact pressure. A clinical study was conducted in which the system was evaluated based on comfort, pain, sensor support, the viability of Doppler signals, and the absence of skin irritations. Results: The system was utilized in a prospective study involving 102 healthy probands. On a scale between 1 (Low) and 10 (Intense), ratings were 1.19 (SD 0.46), 6.52 (SD 1.78), and 9.95 (SD 0.32) for pain, comfort, and support, respectively. The average duration of application was 31.19 minutes (SD 16.75 minutes). Audible Doppler signals
Wrist-worn photoplethysmography (PPG) enables continuous monitoring of cardiopulmonary physiology, but reliable heart rate (HR) and respiratory rate (RR) estimation in free-living conditions remains challenging due to non-stationary motion artifacts that spectrally overlap with physiological dynamics. Existing signal-processing methods degrade under strong motion, while unconstrained deep learning approaches often lack physiological interpretability and identifiable structure. We propose a Physically-Constrained Harmonic Separation (PCHS) framework that formulates HR and RR estimation from wrist PPG as an analysis-by-synthesis problem, where accelerometer measurements condition artifact separation rather than directly regressing vital signs. A physics-guided harmonic generator decomposes the observed signal into quasi-periodic physiological components and a motion-related residual, enabling HR recovery from the fundamental frequency and RR prediction from respiratory-driven modulations of the harmonic parameters. Robust reconstruction objectives, separation constraints, and uncertainty-aware weighting stabilize the decomposition under motion. Experiments on the motion-intensive PPG
Current video benchmarks for multimodal large language models (MLLMs) focus on event recognition, temporal ordering, and long-context recall, but overlook a harder capability required for expert procedural judgment: tracking how ongoing interactions update the procedural state and thereby determine the correctness of later actions. We introduce SiMing-Bench, the first benchmark for evaluating this capability from full-length clinical skill videos. It targets rubric-grounded process-level judgment of whether interaction-driven state updates preserve procedural correctness across an entire workflow. SiMing-Bench is instantiated with SiMing-Score, a physician-annotated dataset of real clinical skill examination videos spanning cardiopulmonary resuscitation, automated external defibrillator operation, and bag-mask ventilation, each paired with a standardized step-wise rubric and dual-expert labels. Across diverse open- and closed-source MLLMs, we observe consistently weak agreement with physician judgments. Moreover, weak performance on rubric-defined intermediate steps persists even when overall procedure-level correlation appears acceptable, suggesting that coarse global assessment s
Question-asking is one of the key indicators of cognitive engagement. However, understanding how the distinct psychological affordances of presentation media shape learners' spoken inquiries with embodied Intelligent Virtual Agents (IVAs) remains limited. To systematically examine this process, we propose a 5W1H-based framework for analyzing learner questions. Using this framework, we conducted a user study comparing an Augmented Reality-based IVA (AR-IVA) deployed in the physical environment with a screen-based IVA (Video-IVA) during cardiopulmonary resuscitation (CPR) instruction. Results showed that the AR-IVA elicited higher spatial and social presence and promoted more frequent and longer questions focused on clarification and understanding. In contrast, the Video-IVA encouraged questions regarding procedural refinement. Presence acted as a selective filter, shaping the timing and topic of questions rather than as a universal mediator. These effects were significantly moderated by learners' motivational and strategic characteristics toward learning. Based on these findings, we propose design implications for IVA-supported learning systems.
Cardiopulmonary resuscitation (CPR) is a critical life-saving procedure, and effective training benefits from self-directed practice beyond instructor-led sessions. In this paper, we propose a closed-loop CPR training glove that integrates a high-resolution tactile sensing array and vibrotactile actuators for self-directed practice. The tactile sensing array measures distributed pressures across the palm and dorsum to enable real-time estimation of compression rate, force, and hand pose. Based on these estimations, the glove delivers immediate haptic feedback to guide the user toward proper CPR, reducing reliance on external audio-visual displays. We quantified the tactile sensor performance by measuring wide-range sensitivity (~0.85 over 0-600 N), computing hysteresis (56.04%), testing stability (11.05% drift over 300 cycles), and estimating global signal-to-noise ratio (18.90 +/- 2.41 dB at 600 N). Our closed-loop pipeline provides continuous modeling and feedback of key performance metrics essential for high-quality CPR. Our lightweight statistical models achieve >92% accuracy for force estimation and hand pose classification within sub-millisecond inference time. Our user stud
Pulmonary embolism (PE) is a high-risk cardiopulmonary condition whose management requires both timely diagnosis and reliable assessment of future clinical risk. Because PE care routinely combines computed tomography pulmonary angiography (CTPA), radiology interpretation, and longitudinal electronic health record (EHR) evidence, it provides a clinically meaningful setting for evaluating compact multimodal language models. In this work, we build a benchmark using efficient multimodal large language models (MLLMs) on INSPECT, a multimodal PE dataset containing 23,248 CTPA studies from 19,402 patients. We formulate eight diagnostic and prognostic tasks as structured clinical question answering problems and evaluate typical efficient MLLMs under CTPA-Only, EHR-Only, and CTPA+EHR settings with zero-shot and few-shot prompting. Results show that Gemma4 E4B and Gemma4 E2B perform more strongly when EHR evidence is available, especially under CTPA+EHR input. Task-level analysis further shows that PE diagnosis achieves higher performance than prognostic tasks, particularly readmission prediction. These observations suggest that compact multimodal models have great potential in early
Background: Mechanical ventilation is life-saving for preterm infants with respiratory distress syndrome but can also contribute to lung injury and long-term morbidity. Protective ventilation strategies are recommended, yet implementation in neonatal intensive care units remains inconsistent, and infants continue to be exposed to injurious ventilator settings. Objective: To develop and validate a cohort of neonatal digital twins, based on mechanistic models of cardiopulmonary physiology calibrated to individual patient data, as a tool for simulating and optimising protective ventilation strategies. Methods: A high-fidelity computational simulator of human cardiopulmonary physiology was adapted to neonatal-specific parameters, including lung compliance, dead space, pulmonary vascular resistance, oxygen consumption, and fetal haemoglobin oxygen affinity. Digital twins were generated using data at 65 time points from 11 preterm neonates receiving volume-controlled ventilation. Model parameters were calibrated to minimise the error between simulated and observed PaO2, PaCO2, and peak inspiratory pressure (PIP). Results: Digital twins reproduced measured data with mean absolute percenta
With the rising prevalence of cardiovascular and respiratory disorders and an aging global population, healthcare systems face increasing pressure to adopt efficient, non-contact vital sign monitoring (NCVSM) solutions. This study introduces a robust framework for multi-person localization and vital signs monitoring, using multiple-input-multiple-output frequency-modulated continuous wave radar, addressing challenges in real-world, cluttered environments. Two key contributions are presented. First, a custom hardware phantom was developed to simulate multi-person NCVSM scenarios, utilizing recorded thoracic impedance signals to replicate realistic cardiopulmonary dynamics. The phantom's design facilitates repeatable and rapid validation of radar systems and algorithms under diverse conditions to accelerate deployment in human monitoring. Second, aided by the phantom, we designed a robust algorithm for multi-person localization utilizing joint sparsity and cardiopulmonary properties, alongside harmonics-resilient dictionary-based vital signs estimation, to mitigate interfering respiration harmonics. Additionally, an adaptive signal refinement procedure is introduced to enhance the ac
This paper presents a novel micro-Doppler energy-based framework for robust multi-target vital signs monitoring using 77-GHz Frequency-Modulated Continuous-Wave (FMCW) radar. Unlike conventional phase-based methods that are susceptible to environmental noise, random body movements, and stringent calibration requirements, our approach exploits the energy variations in radar returns induced by cardiopulmonary activities. The proposed system integrates a comprehensive processing pipeline including space-time adaptive processing (STAP) for target detection and tracking, MUSIC algorithm for high-resolution angle estimation, and an innovative adaptive spectral filtering technique for vital signs extraction. We establish a rigorous mathematical framework that formalizes the relationship between micro-Doppler energy variations and physiological activities, enabling robust separation of closely spaced targets. The key innovation lies in the micro-Doppler energy extraction methodology that provides inherent robustness to phase noise and motion artifacts. Experimental results using millimeter-wave radar datasets demonstrate that the system can accurately detect and separate vital signs of up