Tracking the movements of animals provides fundamental insights into behavioral strategies and ecological adaptations. Recent advances in automated video analysis have enabled the extraction of movement trajectories from video recordings. However, effective automated tracking often requires careful adjustment of recording environments and iterative tuning of software parameters, imposing substantial costs in user proficiency and trial-and-error processes before analyzable data can be obtained. In exploratory analyses of pilot or opportunistically recorded videos, rapid extraction of trajectories from a limited number of videos is often more important than throughput, making such setup costs a practical barrier. Here, we introduce ManuTrace, an HTML-based interface for manual extraction of trajectory data from videos. In this tool, users interactively play, pause, and seek videos and record object positions by mouse clicks, with positions in intermediate frames automatically interpolated and the resulting trajectories exported as CSV files for downstream analysis. As a demonstration, we extracted trajectories from videos of termite groups recorded under both natural and tracking-optimized conditions. Automated tools showed highly variable tracking accuracy depending on recording conditions and individual behavior, often requiring substantial manual correction under unfavorable conditions. In contrast, ManuTrace enabled consistent trajectory extraction across recording conditions, processing an 1800-frame video with 12 individuals (21,600 coordinates in total) within 20 min. The software consists of a single HTML file and runs in any modern web browser without installation, allowing use across diverse environments. Owing to its simplicity and portability, ManuTrace complements automated tracking tools as an entry point to trajectory analysis and also provides a useful resource for educational purposes.
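The click-and-interpolate workflow described above can be illustrated with a minimal sketch. It assumes simple linear interpolation between clicked frames and a three-column CSV layout (`frame,x,y`); neither detail is confirmed by the abstract, and the function names `interpolate_track` and `to_csv` are hypothetical, not part of ManuTrace.

```python
import csv
import io


def interpolate_track(keyframes):
    """Fill in every frame between clicked keyframes.

    keyframes: dict mapping frame index -> (x, y) of a clicked position.
    Returns a list of (frame, x, y) rows from the first to the last click.
    Linear interpolation is an assumption made for this sketch.
    """
    frames = sorted(keyframes)
    if not frames:
        return []
    rows = []
    for f0, f1 in zip(frames, frames[1:]):
        x0, y0 = keyframes[f0]
        x1, y1 = keyframes[f1]
        for f in range(f0, f1):
            t = (f - f0) / (f1 - f0)  # fraction of the way from f0 to f1
            rows.append((f, x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    last = frames[-1]
    rows.append((last, *keyframes[last]))  # the final click is kept as-is
    return rows


def to_csv(rows):
    """Serialise rows to CSV text with a hypothetical frame,x,y header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["frame", "x", "y"])
    writer.writerows(rows)
    return buf.getvalue()


# Three clicks on frames 0, 4 and 6; frames 1-3 and 5 are interpolated.
clicks = {0: (10.0, 20.0), 4: (18.0, 28.0), 6: (18.0, 20.0)}
track = interpolate_track(clicks)
csv_text = to_csv(track)
```

With one such track per individual, a 12-individual, 1800-frame video yields 12 × 1800 = 21,600 coordinate rows, matching the total quoted in the abstract.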
Generating high-quality character animation videos is a fascinating yet challenging task. Existing methods use geometry guidance signals such as skeletons, normal maps, or depth maps in a diffusion model to generate character videos from a single reference image. Although these approaches have shown encouraging results, they rely solely on cross-attention layers to extract geometry guidance, which inevitably leads to temporal inconsistencies and reduced quality. In this paper, we present AniFeats, a novel framework for generating high-quality character animation videos. In contrast to existing methods, our key insight is to incorporate explicit features on 3D character meshes during video generation to achieve significantly improved temporal consistency. Specifically, AniFeats extracts detailed features from the reference image, projects them onto 3D feature meshes based on SMPL-X, and uses rendered feature maps from the animated 3D feature meshes as guidance throughout the generation process. This approach directly links local patterns in the input image to those in the output video, effectively strengthening temporal coherence. Extensive experiments demonstrate that AniFeats generates high-quality, temporally consistent character animations with remarkably enhanced realism.
Minimally invasive thymectomy has largely replaced the traditional trans-sternal approach. Among the minimally invasive strategies, video-assisted thoracoscopic and robot-assisted thoracoscopic procedures are the most frequently used. The introduction of new robotic platforms has renewed interest in comparing different systems. This study aimed to evaluate short-term outcomes, learning curves, and surgeon workload in thymectomy performed with a video-assisted thoracoscopic technique, the Da Vinci robotic system, and the Versius robotic system. A retrospective, comparative study was conducted on 90 consecutive patients who underwent minimally invasive thymectomy between 2016 and 2025 at a single tertiary center. Thirty procedures were performed by a video-assisted thoracoscopic approach (2016-2020), and sixty with robotic assistance (2021-2025), equally divided between Da Vinci and Versius systems. The primary endpoint was postoperative hospital stay. Secondary endpoints included operative time, postoperative analgesic use, complications, conversion rate, and chest tube duration. Surgeon workload was evaluated using a validated multidimensional questionnaire assessing mental, physical, and temporal demands, performance, effort, and frustration. Hospital stay was significantly shorter after robotic thymectomy compared with the video-assisted approach (3.2 ± 1.0 days for Da Vinci and 3.0 ± 1.2 days for Versius vs. 4.5 ± 1.9 days; p < 0.001). Secondary outcomes were comparable among the three techniques. Cumulative sum analysis demonstrated faster procedural stabilization with robotic systems. Workload assessment indicated higher perceived demand with the Versius system, mainly in mental and physical domains. In this single-center experience, robotic thymectomy was associated with shorter hospital stay and earlier procedural stabilization compared with the video-assisted thoracoscopic approach, with similar perioperative safety.
Differences between robotic platforms were mainly observed in surgeon-reported workload. These findings should be interpreted within the context of surgeon experience and institutional evolution.
Video class-incremental learning (VCIL) aims to progressively recognize novel action categories while preserving spatial-temporal knowledge of previous tasks. Unlike image class-incremental learning (CIL), VCIL requires simultaneously capturing spatial semantics and temporal dynamics, which makes catastrophic forgetting more severe. Prompt-based learning has recently made remarkable advancements in VCIL. However, existing prompt-based methods primarily optimize a predefined static prompt for all action sequences in a group-level manner, which overlooks the diverse spatial-temporal characteristics across frames and exhibits limited generalization capability for future action categories. To address the above limitations, we propose a novel exemplar-free VCIL framework termed STCP-low-rank dynamic routing (LRDR) that consists of two crucial innovations, i.e., the spatiotemporal context-aware prompting (SCAP) and LRDR. The former dynamically generates instance-level prompts based on the input video. Specifically, the frame-level prompt is developed to adaptively emphasize fine-grained details in each frame by leveraging the attention-guided spatial activation module. Meanwhile, we also design the cross-frame prompt to capture the differential importance of frames within a sequence, allowing the model to focus on key frames and improving its ability to explore temporal dependencies. Furthermore, because the trainable SCAP also suffers from catastrophic forgetting when learning from incremental tasks, we introduce a parameter-efficient LRDR that achieves dynamic scalability by incorporating mixture-of-experts low-rank spatial-temporal adapters, which can maintain old prompt knowledge and cross-task collaboration. Finally, we propose a novel prompt correction mechanism (PCM) that prevents the proposed SCAP from acquiring ineffective class-wise spatiotemporal representations via the discriminative class-level prompt knowledge.
Extensive experiments are conducted on four video benchmarks, and our approach consistently achieved substantial gains over state-of-the-art methods in VCIL.
Kaposi sarcoma (KS) is an angioproliferative neoplasm driven by human herpesvirus-8 that primarily affects immunocompromised patients, particularly patients with advanced HIV infection or AIDS. Pulmonary and pleural involvement is uncommon and may mimic other infections or malignant diseases. We report a 49-year-old man with AIDS and recurrent pleural effusions in whom repeated thoracenteses were nondiagnostic. Diagnosis was ultimately achieved through video-assisted thoracoscopic surgery, revealing intrapleural lesions consistent with KS on human herpesvirus-8 immunostaining. This case emphasizes the importance of maintaining suspicion for KS in immunocompromised patients with unexplained effusions and underscores the dual diagnostic and therapeutic value of video-assisted thoracoscopic surgery.
To evaluate how a custom web application integrating clinician-edited large language model (LLM)-generated summaries, clickable definitions, and artificial intelligence (AI)-generated videos affects radiology report comprehension, feature preferences, and overall sentiment toward AI-assisted report summaries. This prospective study recruited participants between May and July 2025 at a hospital-based outpatient imaging floor before their scheduled examinations at a tertiary university hospital. Following exam completion and report publication, patient-friendly AI summaries were generated and reviewed by a radiologist for accuracy. Participants were then shown a web application containing their own de-identified, AI-augmented reports featuring clinician-edited LLM-generated summaries with clickable terms and AI videos. Participants were surveyed on comprehension, feature usefulness, and attitudes toward LLM summaries. Participants (n=101, 40 male/61 female, racially diverse) ranged from 20 to 82 (mean 58±15) years old. Overall comprehension improved significantly (median pre: 4.00, post: 5.00, p<0.001), with 47.52% (n=48) identifying LLM-summaries as most helpful. However, LLM-summaries required manual clinician edits (average per summary: 24.75 words removed; 0.13 words added, lexical similarity = 84.63%; semantic similarity = 98.25%). When asked if they were comfortable with LLM-summaries without clinician edits, most participants reported being only Somewhat comfortable (27.72%) or Very uncomfortable (25.74%). This prospective study demonstrates that interactive, LLM-driven applications can significantly improve self-reported patient comprehension of complex radiology reports, emphasizing their potential to enhance patient-centered communication. 
However, patients had reservations about LLM-generated summaries that had not been edited by a clinician, indicating that successful integration is contingent on professional oversight, an added workload that may limit scalable real-world implementation.
Recent debates on violent video game (VVG) effects highlight the importance of analytical robustness and transparency in longitudinal research. Teng and Bushman reanalyzed our prior study, identifying several methodological concerns and arguing that our conclusions were unwarranted. In this response, we examine their reanalysis and assess whether their conclusions hold across alternative analytical specifications and operational decisions. Through comprehensive robustness checks available on OSF, we address concerns regarding model specification, missing data, measurement decisions, and time constraints. We further examine the unexpected empathy-aggression association and alternative VVG operationalizations. While some inconsistencies emerge across different specifications, the results do not consistently support General Aggression Model (GAM) interpretations in this dataset. We further identify inconsistencies and incomplete reporting in Teng and Bushman's presentation of results, data processing procedures, and interpretation of effect sizes. Finally, we argue that the current debate reveals a fundamental theoretical problem: without precise specifications of mechanisms, developmental patterns, and falsifiable predictions, GAM risks unfalsifiability when contradictory findings can all be accommodated as supporting evidence. We advocate for formal modeling approaches to enhance theoretical rigor and predictive specificity in violent media research.
Toothbrushing remains the most widely practiced method for dental plaque removal. However, evidence-based guidance on optimal brushing method, duration, toothbrush bristle wear, and dentifrice quantity is still insufficient. This study aimed to identify behavioural factors associated with post-brushing dental plaque levels using video-based analysis of habitual toothbrushing. Eighty-eight adults were recruited and recorded while brushing under habitual conditions. Toothbrushing behaviours (duration, motion, surface coverage, and multi-area brushing) were coded using behavioural analysis software (Mangold Interact). Toothbrush bristle wear was classified by the Conforti index and dentifrice quantity was recorded photographically. Dental plaque was assessed using the Turesky Modification of the Quigley-Hein Index (TMQHI). Data were analysed using independent t-tests, one-way ANOVA, and a linear mixed model (LMM) with SPSS 26.0. Statistical significance was set at p < 0.05. Sites brushed for ≥ 5 s showed significantly lower plaque index scores than those brushed for < 5 s. In the LMM adjusting for all measured covariates, only brushing duration and brushing area remained significant. Among all measured variables, area-specific brushing duration was the factor most strongly associated with lower post-brushing plaque levels. To achieve effective plaque control, it is essential to distribute brushing time equitably across all tooth surfaces and to allocate a minimum of 5 s per area.
Left cardiac sympathetic denervation (LCSD) via video-assisted thoracoscopy (VATS) is an effective therapy for drug-refractory malignant arrhythmias in congenital long QT syndromes and requires meticulous perioperative planning in children with automatic implantable cardioverter-defibrillators (AICDs). We describe what is likely the first paediatric VATS-LCSD performed in Pakistan. A boy in early childhood with Jervell and Lange-Nielsen syndrome, severe QT prolongation and recurrent ventricular arrhythmias despite beta-blockade and mexiletine had received multiple AICD shocks. Intraoperative management focused on preventing electrocautery-induced AICD activation by applying a magnet to suspend antitachycardia therapies, using external defibrillation pads and ensuring continuous electrophysiology support. Anaesthesia incorporated sevoflurane, dexmedetomidine, cisatracurium and lidocaine, with one-lung ventilation achieved by intentional endobronchial tracheal tube placement. Thoracoscopic excision of the left sympathetic chain (T5-T1), including the lower stellate, was completed uneventfully. The child was extubated in the operating room and discharged the next day without complications.
This case report describes a 59-year-old woman with a residual cervicomediastinal thyroid mass after right thyroid lobectomy for papillary carcinoma. Because of its deep intrathoracic extension and proximity to major vessels, a purely cervical approach was not feasible. The operation was planned using high-resolution, contrast-enhanced computed tomography and preoperative 3-dimensional reconstruction to delineate vascular and anatomic relationships. Video-assisted thoracoscopic surgery enabled complete resection with uneventful recovery. This case highlights how 3-dimensional imaging can enhance surgical planning, precision, and safety in complex mediastinal procedures.
This study examined 10-year trends in potential head injury situations across five consecutive FIFA Beach Soccer World Cups (2015-2024). Footage from 160 matches (974 match-hours) was analysed to identify potential head injury situations and associated mechanisms, player actions, outcomes, and visible signs of possible concussion. A total of 463 potential head injury situations were identified, corresponding to an incidence rate (IR) of 475.3 potential head injury situations/1000 match-hours (2.9/match). The overall IR increased significantly by 6.6% biennially, resulting in a 1.95-fold increase over the 10-year period. Increases were most pronounced in incidents involving upper-extremity-to-head contact (9.0%), those occurring during other-duels (17.5%), and those where players were competing for the ball with their feet (14.4%). The IR of medically assessed incidents and those presenting visible signs of possible concussion did not significantly change over time (p>0.05). Findings suggest that increasing potential head injury situations are primarily driven by contested situations involving greater upper-extremity use, warranting targeted prevention strategies.
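The incidence figures in the abstract above follow from simple exposure arithmetic, sketched here as a check. The function name `incidence_per_1000_hours` is hypothetical, and the inputs are the rounded values quoted in the abstract, so the result agrees with the reported rate only approximately.

```python
def incidence_per_1000_hours(events, exposure_hours):
    """Events per 1000 hours of exposure."""
    return 1000.0 * events / exposure_hours


# Rounded figures quoted in the abstract: 463 situations,
# 974 match-hours, 160 matches.
ir = incidence_per_1000_hours(463, 974)
per_match = 463 / 160
```

Here `ir` lands within about 0.1 of the reported 475.3 per 1000 match-hours (the small gap presumably reflects rounding of the exposure time), and `per_match` rounds to the reported 2.9 situations per match.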
Cancer pain remains a major public health concern and a leading cause of suffering in patients with cancer. With the rapid expansion of short video platforms, such as TikTok, an increasing number of users are turning to these platforms for information on cancer pain management. This study evaluated the quality, reliability, and content characteristics of cancer pain-related short videos on TikTok. A total of 241 videos were included in the final analysis. Data on video characteristics, uploader type, engagement metrics, and medical content were extracted. Two independent reviewers assessed video quality using the Global Quality Score (GQS), the modified DISCERN tool (mDISCERN), and the JAMA benchmark criteria. The included videos received substantial user engagement; however, overall quality was moderate, with median scores of 3.00 for GQS, 3.00 for mDISCERN, and 2.00 for JAMA. Healthcare professionals (HCPs) uploaded the majority of videos (77.18%) and provided significantly higher-quality and more reliable content than non-healthcare professionals (p<0.001). HCP videos more frequently covered diagnosis, prognosis, and clinical manifestations, whereas videos from non-healthcare professionals received higher comment engagement despite lower reliability. Spearman correlation analysis showed that user engagement metrics were strongly correlated with each other but had negligible associations with video quality. These findings indicate that although TikTok serves as an important platform for disseminating cancer pain information, substantial gaps remain in content accuracy, particularly among non-professional creators. Increased involvement of healthcare professionals and enhanced platform-level oversight may help improve the quality of cancer pain-related educational content shared on short-video platforms.
The inability to adopt a hands-on approach can be viewed as a limitation to provide veterinary care virtually. The objective of this study was to establish guidance to support the remote delivery of a companion animal physical examination by video. A modified approach to the Delphi method was followed. Five interviews with experienced veterinarians and existing literature were used to identify potential components for the companion animal physical examination conducted by video. The results informed an initial questionnaire, which was distributed to veterinarians with telemedicine experience using purposive and snowball sampling. Three rounds of online questionnaires were distributed, with consensus set a priori at 90%. Fourteen participants were recruited and completed round 1. Of these, 10 participants completed round 2, and seven completed round 3. A total of 42 physical examination components were evaluated throughout the three rounds. Of these, consensus was reached that 19 components were possible to perform via video. Another 15 components were deemed not possible to perform by video. Eight components did not reach consensus either way. Loss to follow-up was experienced, which may weaken the strength of the consensus. The findings offer guidance and support for veterinarians in conducting components of the companion animal physical examination by video when the clinical context is appropriate.
Evaluation of cleanliness in capsule endoscopy is essential to validate the exam. However, reliable evaluation of bowel cleanliness remains challenging. The validated KODA score provides robust assessment but is time-consuming and impractical in routine practice. We aimed to evaluate the performance of an AI-based tool (AXAROlite, Augmented Endoscopy) compared with the KODA score for assessing small bowel cleanliness in patients with Crohn's disease (CD). This was a post hoc analysis of a multicenter randomized controlled trial (NCT05117996) including 142 CD patients undergoing small bowel capsule endoscopy (SBCE) after either a standard polyethylene glycol (PEG)-based or simplified clear liquid preparation. All videos were evaluated for cleanliness using the KODA score by trained readers. The same videos were analyzed using the AI-based AXAROlite tool. Correlation, agreement, and diagnostic performance of AXAROlite were compared with the KODA score. Clinical factors influencing AI-assessed cleanliness were also investigated. AXAROlite and KODA showed strong correlation for the whole small bowel (ρ = 0.61; p < 0.001). AXAROlite demonstrated excellent diagnostic accuracy for detecting adequate cleanliness defined as KODA > 2.25 (AUROC = 0.85, 95% CI 0.79-0.92), with an optimal threshold of 72% clean frames (sensitivity 86%, specificity 72%). Mean cleanliness decreased progressively from proximal to distal small bowel with both methods (p = 0.001). The AXAROlite score was independently reduced in patients with active disease and a longer small bowel transit time. AXAROlite provides a rapid, fully automated, and accurate evaluation of small bowel cleanliness in CD, comparable to the validated KODA score. Its integration into routine workflows may streamline reporting and reduce inter-observer variability.
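The threshold-based evaluation reported above (percentage of clean frames compared against a KODA-defined reference) can be sketched as follows. The data are invented toy values, not study data, the function name `sens_spec` is hypothetical, and treating a score at or above the threshold as "adequate" is an assumption about how the cut-off is applied.

```python
def sens_spec(scores, adequate, threshold):
    """Sensitivity and specificity of `score >= threshold` as a test
    for the reference label `adequate` (here: KODA > 2.25)."""
    pairs = list(zip(scores, adequate))
    tp = sum(1 for s, a in pairs if a and s >= threshold)
    fn = sum(1 for s, a in pairs if a and s < threshold)
    tn = sum(1 for s, a in pairs if not a and s < threshold)
    fp = sum(1 for s, a in pairs if not a and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example only: percentage of clean frames per video and whether
# the reference reading judged cleanliness adequate.
clean_frame_pct = [90, 80, 75, 60, 85, 50, 65, 74]
koda_adequate = [True, True, True, True, False, False, False, False]

sensitivity, specificity = sens_spec(clean_frame_pct, koda_adequate, 72)
```

On these toy values the 72% cut-off gives a sensitivity of 0.75 and a specificity of 0.5; the study's own figures (86% and 72%) come from its 142-patient dataset, which is not reproduced here.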
RGB camera-based surveillance systems enable human action recognition for public safety and healthcare, yet raise serious privacy concerns. Existing methods rely on post-capture algorithms, which fail to protect privacy during data acquisition. We propose Lens Privacy Sealing (LPS), a simple hardware solution that physically obscures camera lenses with adjustable laminating film, providing pre-sensor privacy protection at minimal cost. Unlike software methods or expensive engineered optics, LPS achieves strong privacy through stochastic multi-layer scattering that is physically irreversible. We introduce the P3AR dataset for privacy-preserving action recognition, featuring both large-scale replay-captured (P3AR-NTU, 114K videos) and real-world collected (P3AR-PKU) subsets with privacy attribute annotations. To handle video degradation from LPS, we propose MSPNet, a single-stage framework incorporating Inter-Frame Noise Suppressor (IFNS) and Cross-Frame Semantic Aggregator (CFSA), enhanced by contrastive language-image pre-training for robust semantic extraction. Extensive experiments demonstrate that MSPNet with IFNS and CFSA nearly doubles action recognition accuracy compared to baseline methods while suppressing identity recognition to low levels. Comprehensive validation shows LPS achieves a superior privacy-utility trade-off compared to state-of-the-art hardware methods, resists reconstruction attacks including PSF inversion and data-driven recovery, and generalizes robustly across optical configurations and challenging environments. Code is available at https://github.com/wangzy01/MSPNet.
The Annamite striped rabbit (Nesolagus timminsi) is a forest-dwelling lagomorph endemic to the Annamite Mountains of Vietnam and Laos. Herein, we report the first detailed observations of its burrowing and reproductive behavior from information gathered via camera traps. From May to October 2024, six camera traps were deployed at suspected burrow sites identified through semi-structured interviews and field surveys. Active burrows were recorded at multiple sites within broadleaf wet evergreen forests between 359 m and 775 m asl. The camera traps were active for 864 camera trap nights, recording 1293 videos of the species. The videos provided the first observations of the Annamite striped rabbit performing a sequence of burrow opening, closing and concealment, including placing compacted soil and small rocks over the entrance. Furthermore, a single behavioral event lasting 190 min was recorded and interpreted as being suggestive of parturition. Forty-six days after this event, the female was recorded leaving the burrow accompanied by two juveniles. These findings represent the first documentation of such behavior in the species. Our results provide new behavioral information on this little-known and endangered animal.
We evaluate stereo-differentiable rendering-based pose estimation for marker-free, real-time tracking of surgical robots, mitigating the occlusion-prone nature of marker-based tracking in cluttered surgical environments, potentially improving safety, reducing setup times, and enabling intelligent multi-robot interaction. This work extends the differentiable rendering-based markerless robot pose estimation framework roboreg for online real-time dynamic tracking in two ways. (i) Sequential optimisation propagates pose estimates across consecutive frames, with motion-adaptive hyperparameter tuning balancing convergence and precision during estimation. (ii) CUDA stream parallelisation of the segmentation and optimisation steps is combined with CUDA-graph-accelerated segmentation. We collect 38 displacement video sequence datasets with an unobstructed robot and 5 occluded-robot datasets, with static start/end ground-truth pose calibrations and dynamic marker-based reference tracking in between, for accuracy evaluation under different scenarios. Real-time localisation at 30 fps for 1080p video sequences is achieved, up from 14 fps in the vanilla roboreg, thereby matching the camera frame rate. Near-1 cm accuracy is demonstrated, with 1.7 cm translational and 0.6° rotational error against static ground-truth pose calibration, and with 1.2 cm average 3D error across 27,460 frames against a marker-based reference standard (1.53 cm over 1242 frames in the occlusion evaluation). Our method outperforms FoundationPose by 11% (63% in the occlusion dataset) in dynamic estimation and 250% in static estimation, while achieving 6× faster inference. We demonstrate real-time high-resolution marker-free tracking of surgical robots through stereo-differentiable rendering. Localisation accuracy was on par with marker-based approaches and improved upon foundational baselines.