ECG-based artificial intelligence may enable efficient prediction of incident heart failure (HF) risk to facilitate preventive efforts. Prior models are proprietary, with modest or inconsistent accuracy. We sought to develop and validate a generalizable and publicly available convolutional neural network to predict incident HF using the 12-lead ECG waveform (ECG-to-HF [ECG2HF]). We developed ECG2HF in 94 636 patients receiving longitudinal ambulatory care at Massachusetts General Hospital (MGH), and validated it in 3 test sets: MGH, Brigham and Women's Hospital (BWH), and Beth Israel Deaconess Medical Center (BIDMC), among 93 868 individuals aged 30 to 79 years without HF. HF events at 10 years were identified using a validated electronic health record-based natural language processing model. Discrimination was quantified using the area under the receiver operating characteristic curve. We then compared discrimination and net reclassification (at <10%, 10% to 20%, ≥20% 10-year risk categories) using ECG2HF versus the 15-component Pooled Cohorts Equations to Prevent HF score. The test sets comprised MGH (13 954 individuals, 441 events, age 57±13 years, 48% women), BWH (54 396 individuals, 1809 events, age 57±13 years, 55% women), and BIDMC (25 457 individuals, 901 events, age 57±13 years, 53% women). Over 10 years, the cumulative risk of HF was 4.6% (95% CI, 4.1-5.0) in MGH, 5.0% (4.8-5.2) in BWH, and 4.4% (4.1-4.7) in BIDMC. ECG2HF discriminated 10-year incident HF in each test set (area under the receiver operating characteristic curve: MGH 0.86 [0.84-0.87]; BWH 0.85 [0.84-0.86]; BIDMC 0.84 [0.83-0.86]). Compared with the Pooled Cohorts Equations to Prevent HF, ECG2HF provided favorable discrimination (improvement in area under the receiver operating characteristic curve MGH/BWH 0.061 [0.025-0.097]; BIDMC 0.038 [-0.0096 to 0.086]) and net reclassification (NRI MGH/BWH 0.16 [0.077-0.24]; BIDMC 0.23 [0.10-0.35]) of 10-year HF risk. 
ECG2HF is a publicly available 12-lead ECG-based artificial intelligence model that discriminates the risk of future HF with favorable and consistent performance across 3 large health care samples from the northeastern United States. ECG2HF may enable efficient prioritization of high-risk individuals for HF-related preventive measures.
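The categorical net reclassification comparison described above can be illustrated with a minimal sketch. It assumes simple binary 10-year outcomes without censoring, which is a simplification of a time-to-event analysis; the function names `category` and `categorical_nri` are hypothetical and are not taken from the ECG2HF release.

```python
from bisect import bisect_right

# Risk-category cut points from the abstract: <10%, 10% to 20%, >=20%.
CUTS = (0.10, 0.20)

def category(risk, cuts=CUTS):
    """Return 0, 1, or 2 for the 10-year risk category of a predicted risk."""
    return bisect_right(cuts, risk)

def categorical_nri(old_risk, new_risk, event):
    """Categorical NRI of new_risk over old_risk (assumes no censoring).

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for old, new, ev in zip(old_risk, new_risk, event):
        move = category(new) - category(old)
        if ev:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n
```

A positive value means the new score moves individuals with events into higher categories, and individuals without events into lower categories, more often than the reverse.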
The exponential growth of artificial intelligence in healthcare has created unprecedented computational demands, contributing significantly to carbon emissions while often lacking transparency in critical medical decisions. Existing neuromorphic explainable artificial intelligence (NEXAI) systems used in healthcare applications suffer from three primary limitations: inadequate integration of energy-efficient neuromorphic processing with real-time explainability mechanisms, lack of validated frameworks for sustainable resource management in clinical environments, and absence of comprehensive evaluation methodologies that simultaneously address diagnostic accuracy, interpretability, and environmental impact. We develop the NEXAI-Health framework by processing continuous spike streams, iteratively sampling spike rates in the range [Formula: see text]–520 spikes/s, with cycle-to-cycle variations of [Formula: see text] spikes confirming stable neuromorphic firing behavior. Event-driven thresholds are dynamically tuned to [Formula: see text], and simulation sweeps further validate threshold drift within the narrow interval [Formula: see text]. The integrated explainability module processes gradient-based attributions using sample magnitudes [Formula: see text]–0.94, internally expanding to per-layer saliency scores [Formula: see text] across representative trials. Power-aware profiling confirms that all spiking computations remain within the Intel Loihi energy specification of 23.6 pJ per event, supporting sustainable deployment. Experimental iterations on 109,446 MIT-BIH heartbeat samples yield mean diagnostic accuracy of [Formula: see text] with explainability scores of [Formula: see text], and projected energy-efficiency gains converging to [Formula: see text] over conventional AI baselines. 
Statistical validation employs 10-fold stratified cross-validation with Bonferroni-corrected paired t-tests ([Formula: see text]), demonstrating significant improvements over conventional approaches (Cohen’s [Formula: see text], [Formula: see text]). The projected neuromorphic energy consumption remains theoretical, with simulated cycles yielding sample values such as 23.6–28.2 pJ per spike under a modeled firing rate of [Formula: see text]–[Formula: see text]. Claims regarding biodegradable substrate integration are likewise conceptual, assuming provisional material constants [Formula: see text]–1.34 for tensile–thermal coupling. Clinical translation further mandates regulatory approval and structured physician training, while algorithmic correctness is supported through iterative validation on the MIT-BIH dataset (109,446 labeled beats). Ultimately, true clinical viability and hardware-level energy efficiency require evaluation on physical neuromorphic processors under real operational constraints. This study presents a theoretical framework validated through software simulation using the publicly available MIT-BIH Arrhythmia Database; no physical neuromorphic hardware implementation, clinical trials, or human participants were involved.
Pneumoconioses remain an important occupational health issue, particularly in low- and middle-income countries. The International Labour Organization (ILO) Classification standardizes chest radiograph interpretation but requires trained readers and is affected by inter-reader variability. This study evaluated whether generative multimodal artificial intelligence (AI) models can approximate ILO-based diagnostic reasoning. Eighty-two chest radiographs from the official NIOSH B Reader syllabus were analysed using four AI systems (GPT-4o, GPT-5, MedGemma-4B, MedGemma-27B). Each image was evaluated with a standardized prompt based on the 2022 revised ILO guidelines using deterministic settings. Model outputs were mapped to ILO codes and compared with the official answer keys of the ILO Standard Radiograph Set used for B Reader training and examination. Performance metrics included balanced accuracy, sensitivity, specificity, precision, and Matthews correlation coefficient (MCC). Bootstrap 95% confidence intervals, McNemar's test, and Cohen's κ assessed performance variability and agreement. All four AI models showed moderate diagnostic performance, with balanced accuracy ranging from 60.8% to 70.3%. Sensitivity remained limited (35.5%-54.9%), while specificity was consistently high (84.6%-86.2%). MedGemma-27B performed best for small opacities, GPT-5 for pleural abnormalities and for technical quality. Large opacities and rare findings were systematically under-detected. Statistical comparisons showed significant differences between models, although agreement patterns were broadly similar. All AI models partially followed structured ILO radiographic criteria but did not achieve expert-level performance, confirming that they cannot replace certified B Readers. Larger, real-world datasets are needed to assess their potential clinical utility as supportive tools in occupational health surveillance programs.
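The performance metrics named above follow directly from a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts in the usage note are illustrative assumptions, not values from the study.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (all counts assumed > 0)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }
```

For example, `binary_metrics(tp=40, fp=15, tn=85, fn=60)` shows how a balanced accuracy in the low 60% range can arise from limited sensitivity combined with high specificity, the pattern reported for all four models.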
This article presents a systematic review of research on the social effects and controversies surrounding artificial intelligence (AI) in Primary Care (PC). The review was conducted in accordance with the PRISMA 2020 guidelines. A search was performed in the Scopus and Web of Science databases using keywords and disciplinary filters. A total of 703 publications were identified, of which 63 were ultimately included. Publications from 2015 to 2025 were selected if they addressed the social effects of AI in PC and employed qualitative, quantitative, or mixed-methods approaches, or made conceptual contributions. Clinical studies were excluded. An inductive (non-automated) thematic analysis of the abstracts was conducted for all included articles to identify primary and secondary themes. Full-text readings were subsequently carried out to enrich the analysis. Ten themes were identified: (1) professionals' perceptions, perspectives, and attitudes; (2) patients' perceptions, perspectives, and attitudes; (3) future imaginaries; (4) ethics; (5) physician-patient relationship; (6) impact on management; (7) policy and governance; (8) bias and equity; (9) user experience with prototypes; and (10) job precarity. There is a considerable gap between studies focusing on perceptions and potentialities and empirical studies examining the social effects of AI in PC. Moreover, most analyses are based on studies of prototypes that have not been routinely implemented in PC settings.
Precision agriculture leverages advanced technologies to optimize crop management, increase yield and promote sustainable farming practices. Despite significant progress in agricultural automation, continuous field monitoring remains a challenge for farmers due to labor demands and variable environmental conditions. To address this, the use of mobile robots equipped with intelligent perception systems enables autonomous data collection and analysis in real agricultural environments. This work presents a dataset focused on crop monitoring, containing images of corn and beet fields captured by a ground mobile robot. The images were acquired using the Summit XL platform from Robotnik, equipped with an Intel RealSense D455 camera and collected under natural daylight conditions. The robot was teleoperated across the crop fields while recording rosbags that include RGB images, suitable for tasks such as plant detection. The dataset comprises 10,080 images organized following the YOLO object detection format, with 9104 training images, 493 validation images, and 483 test images. All images are annotated with bounding boxes in normalized YOLO format, distinguishing between two crop classes: beet and corn. To enhance model robustness, the dataset includes augmented versions created through geometric transformations and photometric variations. Privacy protection measures were implemented using automated person detection and anonymization. This dataset aims to support research in precision agriculture, particularly in developing intelligent systems for crop monitoring, plant health assessment, and autonomous agricultural inspection. All data are publicly available through a single Hugging Face repository.
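As an illustration of the normalized YOLO annotation format mentioned above, the following sketch converts a pixel-space bounding box into a label line of the form `class x_center y_center width height`, with all coordinates divided by the image size. The class indices (0 for beet, 1 for corn) and the image size in the usage note are assumptions; the dataset's own class mapping may differ.

```python
# Hypothetical class mapping; check the dataset's configuration for the real one.
CLASS_IDS = {"beet": 0, "corn": 1}

def to_yolo_line(class_name, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel bounding box to a normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    cls = CLASS_IDS[class_name]
    return f"{cls} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# The split sizes reported in the abstract add up to the stated total.
SPLITS = {"train": 9104, "val": 493, "test": 483}
TOTAL_IMAGES = sum(SPLITS.values())
```

For instance, a beet box spanning pixels (320, 180) to (960, 540) in a 1280 × 720 image becomes `0 0.500000 0.500000 0.500000 0.500000`.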
Atractylodes macrocephala Rhizoma (AMR) is a frequently used medicinal herb for treating gastrointestinal disorders, and its quality is influenced by factors such as origin and cultivation duration. Traditional quality control methods for AMR are time-consuming and invasive, so faster and more efficient alternatives are urgently needed. This study uses an electronic nose (E-nose) and an electronic tongue (E-tongue) to acquire two-dimensional odor and taste information on AMR. Integrating this approach with machine learning (ML) enables a transition from "experience-driven" to "data-driven" quality assessment, thereby providing a rapid and cost-effective quality control strategy for AMR. Feature-extraction and feature-selection techniques were employed to optimize back-propagation neural network (BPNN) classification and regression models for eight key quality markers and to select the optimal feature subset. Additionally, nine machine-learning algorithms were applied with the optimal feature subset to establish classification models for different AMR grades and quantitative regression models for eight components based on E-nose and E-tongue data. The results demonstrated that the E-tongue combined with the k-nearest neighbors (KNN) algorithm classified AMR grades rapidly with an accuracy of 95.56%. It also successfully predicted the contents of the extract, volatile oil, polysaccharides, atractylenolide I, atractylenolide II, atractylenolide III, bis-atractylenolide, and atractylone, with test-set coefficients of determination (R²) of 0.8874, 0.8313, 0.9628, 0.8406, 0.8736, 0.8532, 0.7758, and 0.8101, respectively. In conclusion, this study provides a comprehensive and rapid solution for AMR grade classification and quality evaluation, significantly improving efficiency compared with traditional methods. 
This strategy holds substantial promise for real-world applications, as it enables a high-throughput, non-destructive screening of AMR in settings such as post-harvest processing and market quality surveillance, thereby supporting the sustainable and intelligent development of the herbal medicine industry.
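To make the k-nearest neighbors grading step concrete, here is a minimal sketch of KNN classification by majority vote over Euclidean distance. The two-dimensional feature vectors and the grade labels in the usage note are invented for illustration; the study's actual E-tongue features and selected feature subset are not reproduced here.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest samples."""
    # Sort training samples by Euclidean distance to the query and keep k.
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With hypothetical sensor responses clustered near (0, 0) for one grade and near (5, 5) for another, a query at (0.2, 0.2) is assigned to the first grade because all three of its nearest neighbors carry that label.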
SUMMARY: Africa's ongoing struggles with emerging epidemics and antimicrobial resistance (AMR) underscore the urgency of integrating pathogen genomics and surveillance systems into the continent's One Health strategy, particularly given the existing limitations in preparedness and technological resources. This review brings together current evidence on the growth of sequencing infrastructure, the development of regional genomic hubs, and the establishment of governance frameworks, while identifying critical challenges in data integration, bioinformatics capacity, and sustainable financing. Special focus is placed on the lack of African-based genomic data, which our analysis shows account for only 1.82% of the global total. Case studies illustrate the immense potential and importance of pathogen genomics, giving policymakers a tangible sense of its impact. These examples demonstrate how genomic technologies integrated with artificial intelligence (AI) are transforming outbreak response, AMR surveillance, and stewardship programs by enabling early detection of zoonotic threats, mapping transmission pathways, and guiding vaccine development. However, to fully realize the value of this scientific intelligence, it is essential to embed One Health pathogen surveillance within strong policy and system frameworks so that technical progress translates into lasting institutional capacity and sustainable impact. Long-term implementation depends on coordinated investment and advocacy across four interdependent pillars: data architecture, governance and sovereignty, human capital, and technical capacity.
Accurate LiDAR point cloud registration on resource-constrained edge platforms is a prerequisite for intelligent robotics and industrial automation, yet it remains challenging because low-overlap matching, false correspondences, and fine alignment must be handled under limited computing budgets without GPU acceleration. While learning-based methods have advanced the field, their heavy hardware dependency and training requirements often hinder their practical deployment on mobile edge devices. To bridge this gap, this paper proposes GeoRescue, a training-free geometric registration framework designed for high-precision perception under stringent hardware limits. The method consists of three modular stages: Asymmetric Correspondence Expansion (ACE), which enlarges the candidate correspondence set to reduce the loss of true matches; Dynamic Geometric Topology Gating (DGTG), which suppresses false matches through distance-consistency-based hypothesis filtering; and Uncertainty-Aware Manifold Refinement (UAMR), which improves fine alignment by explicitly modeling local anisotropic noise via covariance-guided optimization. Experiments show that GeoRescue achieves registration recall rates of 84.84% on 3DMatch and 41.27% on 3DLoMatch, and a 94.95% success rate on KITTI. Remarkably, the framework matches the accuracy of high-capacity learning models while running on a GPU-free, 15 W edge CPU platform (Intel Core i5-8265U). These results indicate that GeoRescue provides a deployment-ready solution with an optimal efficiency-accuracy trade-off for LiDAR sensing and robotics perception in complex, real-world scenarios.
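The idea behind distance-consistency filtering can be sketched briefly: a rigid transform preserves pairwise distances, so two correct correspondences should agree on the distance between their source points and the distance between their target points. The code below is a generic voting filter written under that assumption; it is not the DGTG stage itself, and the names `consistency_filter`, `tol`, and `min_support` are hypothetical.

```python
import math

def consistency_filter(src, dst, tol=0.05, min_support=2):
    """Return indices of correspondences (src[i], dst[i]) whose pairwise
    distances agree, within `tol`, with at least `min_support` others."""
    n = len(src)
    keep = []
    for i in range(n):
        support = 0
        for j in range(n):
            if i == j:
                continue
            d_src = math.dist(src[i], src[j])
            d_dst = math.dist(dst[i], dst[j])
            # Rigid motion preserves length, so inlier pairs should match.
            if abs(d_src - d_dst) <= tol:
                support += 1
        if support >= min_support:
            keep.append(i)
    return keep
```

This brute-force version costs O(n²) distance comparisons, so a practical CPU-only pipeline would typically apply it to a pruned candidate set rather than to all putative matches.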
This article provides a salmon fillet dataset to support the detection of distinct regions and undesirable spots and, potentially, the estimation of nutrient content. Because the belly of salmon is high in omega-3 fatty acids, computer vision and image processing can be used to identify the belly areas of salmon fillets (for trim A, B, and C cuts, of which trim A has the largest belly area) and to estimate the percentage of these fatty acids. This dataset is therefore useful for training models that identify and examine the belly regions. Images were acquired at Lerøy Aurora, a salmon processing plant in Skjervøy, Norway, and in our lab during experiments. At the Lerøy plant, two setups were used: (i) a stand with 3 Intel RealSense RGB-D cameras and (ii) a stand with 1 Intel RealSense RGB-D camera, depending on the amount of space available near the production line. The camera equipment was positioned close to the production line. In total, 712 RGB images, 10 ROS (Robot Operating System) bags recorded with the 3-camera setup, and 5 ROS bags recorded with the 1-camera setup were collected at the Lerøy plant, while 60 RGB images were captured at the NMBU lab. ROS nodes were used to capture both the ROS bags (which carry RGB-D information) and the RGB images. To facilitate further research on salmon fillets, the collection also contains 509 multispectral images of fish fillets. The dataset is intended primarily as a benchmarking and pre-training resource, demonstrating the potential of computer vision for salmon fillet analysis. In conclusion, this dataset provides a solid base for research on automated salmon fillet analysis, enabling computer vision and image processing to enhance quality control and nutritional evaluation of salmon fillets.
Polarization imaging using division-of-focal-plane (DoFP) sensors enables simultaneous capture of polarization information, but their super-pixel structure introduces aliasing artifacts after demosaicking. This paper presents a three-stage polarization image demosaicking (PIDM) method using inter-channel interpolation to guide the reconstruction of the missing components. In addition, multi-scale texture-aware guided filtering with confidence-aware fusion is employed to refine both textured and smooth regions. Finally, an objective function combining confidence levels with correlations among the demosaicked image, DoFP input, and Stokes parameters is minimized using the Adam optimizer. The method is implemented in two variants with different complexities. Experiments with real DoFP sensor data show that both variants surpass existing methods in the root mean square error (RMSE) and structural similarity index measure (SSIM) by at least 33.02% and 7.85%, respectively. The results on simulated skylight polarization images further validate the method's accuracy. The method is highly parallelizable, achieving a 16× speedup on an Nvidia GTX 1060 GPU over an Intel i5-8300H CPU for a 512 × 612 × 4 input.
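For readers unfamiliar with the Stokes parameters referred to above, the sketch below applies the standard linear-polarization relations to the four DoFP channel intensities at one pixel (0°, 45°, 90°, 135°). It shows only the textbook conversion, not the PIDM objective; the function names are illustrative.

```python
import math

def stokes(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer-channel intensities."""
    s0 = (i0 + i45 + i90 + i135) / 2  # total intensity
    s1 = i0 - i90                     # horizontal vs. vertical component
    s2 = i45 - i135                   # +45 deg vs. -45 deg component
    return s0, s1, s2

def dolp_aolp(s0, s1, s2):
    """Degree and angle (radians) of linear polarization."""
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp
```

Fully horizontally polarized light of unit intensity gives channel values (1, 0.5, 0, 0.5), hence S0 = 1, S1 = 1, S2 = 0, a degree of linear polarization of 1, and an angle of 0. Demosaicking errors in any one channel propagate directly into S1 or S2, which is why these quantities are a natural consistency check.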
Internet of Things (IoT) agents that trigger network enforcement actions must be both well-calibrated (for safe triage) and tail-latency predictable (for service level objectives, SLOs). We present Confidence-Calibrated HP-FedGAT-Trust-IBN, a federated, graph-attention architecture that closes the loop from IoMT sensing to SDN enforcement via parameter-efficient (LoRA/PEFT) updates ([Formula: see text] MB/round), trust-weighted secure aggregation, and intent verification (IBN) triage. Evaluation follows a two-plane protocol: a learning plane with [Formula: see text] simulated clients under a matched comparator harness (Graph-FL and uncertainty-aware FL baselines), and a serving plane that replays exported checkpoints on real edge devices (Raspberry Pi 5, Jetson Orin Nano, Intel NUC 11) and validates SLOs using hardware ECDFs and empirical [Formula: see text]. The model achieves high discrimination (ROC-AUC/PR-AUC [Formula: see text]-[Formula: see text]) with improved calibration (low ECE) under the matched harness, while the serving loop satisfies the [Formula: see text] ms requirement by device-measured [Formula: see text] (e.g., enforcement [Formula: see text] ms, vs. [Formula: see text] ms for an efficient-UQ baseline) and explicit compliance [Formula: see text]. The latency decomposition includes all calibration costs and Monte-Carlo expectations ([Formula: see text], with measured MC share reported), and security modes are quantified end-to-end: CKKS + SMPC adds device-measured [Formula: see text] and crypto-attributable Joules (e.g., [Formula: see text] ms and [Formula: see text] J/round on Raspberry Pi 5). Energy/round is measured on identical hardware and mapped to CO2e for carbon-aware selection of operating points.
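Calibration is summarized above by the expected calibration error (ECE). A minimal sketch of the usual equal-width-bin estimator follows; the bin count of 10 is a common default and an assumption here, since the abstract does not state the binning used.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean gap between accuracy and mean confidence."""
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hit_sums[b] += int(hit)
    n = len(confidences)
    return sum(
        abs(hit_sums[b] / counts[b] - conf_sums[b] / counts[b]) * counts[b] / n
        for b in range(n_bins)
        if counts[b]
    )
```

If four predictions are each made with confidence 0.9 and three are correct, the single occupied bin has accuracy 0.75 and mean confidence 0.9, giving an ECE of 0.15.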
The ERBB4 gene encodes a tyrosine kinase receptor for neuregulins and EGF family members, and plays a crucial role in various neurobiological processes. At present, the phenotypic manifestations of genetic variants that disrupt ERBB4 gene function (null variants) are not well established. A search for new patients with null variants in ERBB4 was initiated through an international data-sharing collaboration via GeneMatcher, and by searching the databases Decipher and ClinVar. Diagnosis had been performed using chromosomal microarray analysis, whole-exome sequencing, or whole-genome sequencing. Twenty-four new patients from 13 unrelated families with null variants in ERBB4 were identified. Genetic findings included single- or multiple-exon deletions in eight families, a reciprocal translocation disrupting ERBB4 in one family, and sequence variants in four. Variants arose de novo in four probands, were inherited in eight, and had an unknown inheritance pattern in one. Co-segregation of variants with clinical manifestations was observed within families. The predominant clinical features included neurodevelopmental disorders (intellectual disability, neurodevelopmental delay, autism spectrum disorder, and attention deficit hyperactivity disorder), speech delay, challenging behaviors, hypotonia, psychiatric conditions and seizures. This study represents the largest case series of patients with neurological disorders and null variants in the ERBB4 gene. Our findings support haploinsufficiency as the most plausible pathophysiological mechanism underlying ERBB4-related disorders and broaden the spectrum of associated phenotypes. Autism spectrum disorders and psychiatric manifestations have emerged as frequent, previously underrecognized features. Penetrance appears to be high but incomplete, and expressivity is highly variable, with a tendency toward intrafamilial phenotypic conservation.
Formal verification using temporal logics such as computation tree logic (CTL) is essential for validating safety and correctness in complex systems. However, traditional model-checking techniques face severe scalability limitations due to the state explosion problem and their reliance on exhaustive symbolic traversal. Moreover, existing learning-based verification methods often lack formal guarantees and interpretability. These challenges create a pressing need for scalable, learning-based verification methods that preserve verification reliability while improving computational efficiency. This article introduces a novel deep reinforcement learning (DRL)-based model checking framework that learns to verify CTL formulas directly through interaction with system models. Unlike traditional symbolic model checkers such as NuSMV, the proposed DRL-CTL checker, trained using proximal policy optimization (PPO), interprets CTL semantics over system models represented as Kripke structures without performing symbolic state-space traversal at inference time. Reward functions are designed for individual CTL operators, and fixed-point reasoning is incorporated to handle global temporal properties such as $AG(\phi)$ and $EG(\phi)$. Experimental results show that the proposed method achieves near-constant inference time of approximately 2 ms per formula on a system with an Intel Core i9-13900K CPU (24 cores, 3.0 GHz), 64 GB RAM, and an NVIDIA RTX 4090 GPU (24 GB VRAM); reduces verification time by up to 90% compared with traditional model checkers; and scales to models with more than $10^{1192}$ reachable states. The framework also produces witnesses and counterexamples and yields verification outcomes identical to those of symbolic checkers in our experiments. These results highlight the potential of DRL to serve as a scalable, efficient, and explainable alternative to classical CTL model checking.
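The fixed-point semantics of $EG(\phi)$ and $AG(\phi)$ can be made concrete with a small explicit-state sketch. This is the classical set-based computation that a learned checker would be compared against, not the DRL-CTL method; the Kripke structure is represented by an assumed successor dictionary `succ`.

```python
def eg(succ, sat_phi):
    """EG(phi): greatest fixed point of Z = sat(phi) & pre_exists(Z)."""
    z = set(sat_phi)
    while True:
        nxt = {s for s in z if any(t in z for t in succ[s])}
        if nxt == z:
            return z
        z = nxt

def ef(succ, sat_phi):
    """EF(phi): least fixed point of Y = sat(phi) | pre_exists(Y)."""
    y = set(sat_phi)
    while True:
        nxt = y | {s for s in succ if any(t in y for t in succ[s])}
        if nxt == y:
            return y
        y = nxt

def ag(succ, sat_phi):
    """AG(phi) = not EF(not phi)."""
    states = set(succ)
    return states - ef(succ, states - set(sat_phi))
```

Take states 0 to 3 with transitions 0→{1, 2}, 1→{0}, 2→{3}, 3→{3}, and let $\phi$ hold in {0, 1, 2}. Then $EG(\phi)$ = {0, 1}, because the cycle between 0 and 1 stays inside $\phi$, while $AG(\phi)$ is empty, because every state can reach state 3. Each iteration touches every state, which is the cost that grows with the state space.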
Study Design: Retrospective multicenter cohort study. Objectives: To develop and validate an AI-based high-speed multi-class instance segmentation system for lumbar spinal endoscopic surgery using multicenter surgical video data and to assess performance across hardware environments. Methods: Endoscopic videos from 112 patients at 5 hospitals (2020-2025) were analyzed. One frame per 300 frames was sampled, yielding 58,087 annotated images for 7 classes (instrument, fat, soft tissue, bone, nerve, disc, vessel). A Segment Anything Model (SAM)-assisted workflow improved annotation efficiency, followed by expert refinement. A YOLOv11-seg model was trained with a patient-level 4:1 split. Performance was evaluated using precision, recall, F1-score, mAP50, and mAP50-95, stratified by surgical approach. Inference speed was benchmarked across CPU (Intel i5/i7) and GPU (RTX 4080/5080) configurations. Results: In the biportal group, overall precision, recall, F1-score, and mAP50 were 0.975, 0.633, 0.768, and 0.629, respectively. The uniportal group demonstrated 0.659, 0.670, 0.664, and 0.682, respectively. Class-wise performance varied substantially by surgical approach: the instrument class showed exceptionally high mAP50 (0.949) in uniportal settings, whereas anatomical structures like vessels were detected with superior accuracy in biportal settings (mAP50 = 0.863). Benchmarking yielded 21.86-27.45 FPS with CPU-only, ∼92 FPS with RTX 4080, and ∼117 FPS with RTX 5080. Conclusions: This multicenter study highlights the potential of high-speed, multi-class instance segmentation in endoscopic spine surgery. Improving model robustness in visually degraded environments requires further research. Prioritizing high precision to prevent surgeon distraction, supported by rapid inference to maintain temporal continuity, is a practical direction for future surgical AI models.
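The reported F1-scores can be related to the reported precision and recall through the harmonic mean, and the frame rates can be read as per-frame time budgets. The sketch below performs both conversions; it assumes the overall F1 is computed from the overall precision and recall, which the abstract does not state explicitly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps
```

Under that assumption, `f1_score(0.975, 0.633)` rounds to 0.768 and `f1_score(0.659, 0.670)` rounds to 0.664, matching the biportal and uniportal figures above.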
In this article, we present an efficient and concise OpenMP implementation of the Nussinov RNA folding algorithm, a well-known representative of non-serial polyadic dynamic programming (NPDP). Our goal is to develop an optimized implementation that can serve as a template for related dynamic programming applications. The proposed code is derived from a detailed analysis of manual implementations, emphasizing the separation of problematic and non-problematic instances and structuring computations in a way analogous to matrix multiplication. This design enables the semi-automatic extraction of data locality using tools based on Presburger arithmetic, techniques widely employed in classical loop transformations and advanced source-to-source compilers grounded in the polyhedral model. In the experimental evaluation, we assess the performance of our implementation on modern massively parallel AMD and Intel processors with 64, 128, and 192 threads. Our approach leverages cache-aware tiling, parallelism, and explicit vectorization to maximize computational efficiency, achieving performance that surpasses both automatically generated compiler-based solutions and manually tuned implementations on the evaluated platforms. Specifically, our implementation achieves execution times up to two orders of magnitude faster than polyhedral code, while also outperforming unvectorized manual approaches, being at least 30× faster than array transposition-based methods and at least 5× faster than the tiled sparsified Four Russians variant. Additionally, our results indicate that CPU implementations do not exhibit significantly worse performance compared to their corresponding GPU counterparts. These results demonstrate the importance of leveraging Advanced Vector Extensions (AVX) to fully exploit the capabilities of modern multi-core processors, particularly those in the AMD Epyc family.
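For reference, the Nussinov recurrence itself can be written as a short, unoptimized dynamic program. The sketch below is a plain serial version in Python, intended only to show the loop structure; it has none of the tiling, OpenMP parallelism, or vectorization described above, and the `min_loop` parameter is an added assumption.

```python
# Canonical and wobble base pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov(seq, min_loop=0):
    """Maximum number of nested base pairs in `seq` (Nussinov recurrence)."""
    n = len(seq)
    if n == 0:
        return 0
    table = [[0] * n for _ in range(n)]
    for span in range(1, n):              # process diagonals of increasing length
        for i in range(n - span):
            j = i + span
            best = max(table[i + 1][j], table[i][j - 1])
            if (seq[i], seq[j]) in PAIRS and j - i > min_loop:
                best = max(best, table[i + 1][j - 1] + 1)
            for k in range(i + 1, j):     # bifurcation term
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]
```

The innermost loop over `k` combines two previously computed table entries for every cell, which is the non-serial polyadic dependence pattern and the reason the computation resembles a triangular matrix product.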
Accurate grading of cervical intraepithelial neoplasia (CIN1-3) from colposcopic images is clinically critical yet challenging due to subtle inter-grade morphology and substantial imaging variability. We propose an attention-guided mixture-of-experts (MoE) framework that ensembles five pretrained DenseNet-121 experts and employs an attention mechanism over pooled intermediate features to drive a gating network that adaptively weights expert outputs for each image. Operating on feature representations rather than raw pixels allows the gating network to perform input-specific expert selection and improves robustness to ambiguous cases. Using the Intel & MobileODT cervical screening dataset with a strict patient-wise 70/10/20 split, we report mean performance over five runs with 95% confidence intervals. On the independent test set, the proposed MoE achieves 74.0% ± 1.6 accuracy and 72.1% ± 1.8 F1, with per-class AUCs of 0.88 (CIN1), 0.82 (CIN2), and 0.85 (CIN3). The method yields statistically significant improvements over single-network DenseNet-121 baselines and alternative MoE backbones (MobileNet, EfficientNet, ShuffleNet) (p < 0.01). Ablation studies show that attention-guided gating contributes approximately 5-8% absolute accuracy gain over uniform weighting, and that five experts provide the optimal accuracy-efficiency balance. We further present attention visualizations and limited external validation to assess interpretability and generalizability. Although performance remains below that of recent transformer-ensemble models evaluated on smaller or less diverse test sets, the modular and interpretable MoE architecture offers a practical foundation for integrating segmentation or transformer-based experts to advance clinical utility. Code and trained models will be released to support reproducibility.
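The gating step described above, in which a gating network weights each expert's output per image, reduces to a softmax-weighted average of the experts' class probabilities. The sketch below shows only that combination rule; the gate logits are taken as given, whereas in the proposed framework they would come from attention over pooled intermediate features. Function names are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mix_experts(gate_logits, expert_probs):
    """Combine per-expert class probabilities using softmax gate weights."""
    weights = softmax(gate_logits)
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
```

With equal gate logits the result is the uniform average used as the ablation baseline; as one logit grows, the output approaches that single expert's prediction.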
Volume is an important shape descriptor in postharvest quality evaluation and breeding programs of sweetpotatoes and is also valuable for other agricultural engineering applications. Traditional volume measurement methods based on water displacement are, however, laborious, destructive, and unsuitable for high-throughput online scenarios. To address this gap, this dataset was developed to support the advancement of non-destructive, automated online volume estimation using a LiDAR (light detection and ranging)-based three-dimensional (3-D) machine vision system. A total of 200 sweetpotato storage roots of the cultivar "Beauregard" were collected for constructing a 3-D multi-view imagery dataset. Each sample was imaged online using a short-range LiDAR camera (Intel RealSense™ L515) while traveling on a custom-built roller conveyor system that enables simultaneous translation and rotation for full-surface coverage. The curated dataset comprises raw color images (1280 × 720 pixels, .png format) and corresponding raw and segmented point clouds (1280 × 720 pixels, .laz format) for individual samples, alongside the reference volume measurements obtained using the standard water displacement method. In addition, to illustrate the modeling pipeline for volume prediction, the dataset provides the extracted geometric features derived from the segmented two-dimensional (2-D) masks and point clouds, and volume prediction results obtained through regression modeling. As the first publicly available LiDAR-based dataset for sweetpotato volume estimation, this dataset provides a valuable resource for developing and validating image processing pipelines, optimizing machine learning models, and advancing 3-D vision technologies for non-destructive, rapid measurement of the volume of irregularly shaped agricultural products.
Traditional visual SLAM pipelines are typically designed under the static-world assumption and often degrade severely in indoor environments with frequent human motion. To improve trajectory accuracy and front-end stability in such scenarios while maintaining real-time throughput, we present SY-SLAM, an RGB-D SLAM system designed for these environments. (S stands for SuperPoint, which is used as a detector-only learned keypoint front-end, and Y stands for YOLO, which provides asynchronous person-aware keypoint suppression based on detected human bounding boxes.) We integrate a TensorRT-deployed detector-only SuperPoint module to improve keypoint repeatability and robustness while retaining ORB binary descriptors for efficient matching and place recognition within the ORB-SLAM3 framework. To avoid feature starvation while preserving keypoint quality, we further introduce an adaptive SuperPoint keypoint selection strategy that applies stricter filtering when keypoints are abundant and relaxes the selection constraints when they are scarce. In parallel, an asynchronous YOLOv8s TensorRT thread performs person detection with temporal bounding-box memory, and keypoints inside detected person regions are removed before ORB descriptor computation and matching to reduce dynamic-feature contamination in the front end. We evaluate SY-SLAM on five dynamic TUM RGB-D fr3 sequences using ATE and RPE metrics. Compared with ORB-SLAM3, SY-SLAM reduces ATE RMSE by 93.45% across four dynamic walking sequences. On the widely reported fr3/w/x sequence, SY-SLAM achieves accuracy competitive with recent dynamic SLAM methods. The system runs in real time at 46.8 Hz (21.36 ms per frame) on an Intel i9-13900H CPU with an NVIDIA RTX 4070 Laptop GPU.
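The two front-end filtering steps described above, adaptive keypoint selection and person-region suppression, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the threshold values, the `target` count, and the `margin` parameter are hypothetical, and the temporal bounding-box memory and the asynchronous detector thread are omitted.

```python
def adaptive_select(scores, target=1000, strict_thr=0.015, relaxed_thr=0.005):
    """Return indices of selected keypoints, strongest first.

    Applies the strict score threshold when it leaves at least `target`
    keypoints, and falls back to the relaxed threshold when keypoints
    are scarce. All numeric defaults are illustrative assumptions.
    """
    idx = [i for i, s in enumerate(scores) if s >= strict_thr]
    if len(idx) < target:
        idx = [i for i, s in enumerate(scores) if s >= relaxed_thr]
    idx.sort(key=lambda i: -scores[i])
    return idx[:target]


def suppress_in_boxes(keypoints, boxes, margin=0.0):
    """Return indices of keypoints lying outside every person box.

    keypoints : list of (x, y) pixel coordinates
    boxes     : list of (x1, y1, x2, y2) person detections
    margin    : optional padding in pixels around each box (assumption)
    """
    kept = []
    for i, (x, y) in enumerate(keypoints):
        inside = any(
            x1 - margin <= x <= x2 + margin and y1 - margin <= y <= y2 + margin
            for (x1, y1, x2, y2) in boxes
        )
        if not inside:
            kept.append(i)
    return kept
```

In a pipeline of this kind, the surviving indices would be passed on to descriptor computation, so that features on moving people never enter matching.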
Automated anomaly detection in transportation infrastructure is essential for enhancing safety and reducing the operational costs associated with manual inspection protocols. This study presents an improved neuromorphic vision system, which extends the prior SIFT-SNN (scale-invariant feature transform-spiking neural network) proof-of-concept by incorporating temporal feature aggregation for context-aware and sequence-stable detection. Analysis of classical stitching-based pipelines exposed sensitivity to motion and lighting variations, motivating the proposed temporally smoothed neuromorphic design. SIFT keypoints are encoded into latency-based spike trains and classified using a leaky integrate-and-fire (LIF) spiking neural network implemented in PyTorch. Evaluated across three hardware configurations (an NVIDIA RTX 4060 GPU, an Intel i7 CPU, and a simulated Jetson Nano), the system achieved 92.3% accuracy and a macro F1 score of 91.0% under five-fold cross-validation. Inference latencies were measured at 9.5 ms, 26.1 ms, and ~48.3 ms per frame, respectively. Memory footprints were under 290 MB, and power consumption was estimated to be between 5 and 65 W. The classifier distinguishes between safe, partially dislodged, and fully dislodged barrier pins, which are critical failure modes for the Auckland Harbour Bridge's Movable Concrete Barrier (MCB) system. Temporal smoothing further improves recall for ambiguous cases. By achieving a compact model size (2.9 MB), low-latency inference, and minimal power demands, the proposed framework offers a deployable, interpretable, and energy-efficient alternative to conventional CNN-based inspection tools. Future work will explore the generalisability and transferability of the approach, additional input sources, and human-computer interaction paradigms for a range of deployment infrastructures.
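Latency encoding followed by a leaky integrate-and-fire neuron can be illustrated with a minimal sketch in plain Python. The abstract states that the actual network is implemented in PyTorch; the linear latency mapping, the decay factor `beta`, the hard reset, and all function names below are assumptions made for illustration only.

```python
def latency_encode(features, t_max=20):
    """Map feature magnitudes in [0, 1] to spike times.

    Stronger features fire earlier; a zero feature never fires (None).
    The linear mapping is an illustrative assumption.
    """
    times = []
    for f in features:
        f = min(max(f, 0.0), 1.0)  # clamp to [0, 1]
        times.append(None if f == 0.0 else int(round((1.0 - f) * (t_max - 1))))
    return times


def lif_run(spike_times, weights, t_max=20, beta=0.9, threshold=1.0):
    """Simulate one LIF neuron and return the time steps at which it fires.

    The membrane potential decays by `beta` each step, integrates the
    weighted input spikes arriving at that step, and is reset to zero
    after crossing `threshold`.
    """
    v = 0.0
    fired = []
    for t in range(t_max):
        current = sum(w for s, w in zip(spike_times, weights) if s == t)
        v = beta * v + current
        if v >= threshold:
            fired.append(t)
            v = 0.0  # hard reset after an output spike
    return fired
```

For example, two input spikes of weight 0.6 arriving one step apart trigger an output spike with `beta=0.9` but not with `beta=0.5`, which shows how the leak controls sensitivity to spike timing.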
The integration and miniaturization of chips lead to significant power consumption and heat accumulation. Typically, cooling systems account for more than 50% of the input energy. Current thermal management technologies do not offer solutions for on-chip thermal energy loss. Herein, we propose an on-chip integrated thermal recovery system that simultaneously achieves efficient heat dissipation. The system is based on hydrovoltaic generator technology and consists of electrodes and gel. Using a deep ultraviolet LED (236 nm) chip that suffers from severe heat accumulation as a prototype, we show that integration with the thermal recovery system not only maintains the chip temperature below 40 °C but also converts waste heat into stored electrical energy, resulting in a 610.70% improvement in overall energy utilization efficiency. To demonstrate its general applicability to commercial CPU systems, we used a commercial Intel G3220 chip as an example: incorporating four HEG units reduced the temperature from 93 °C to below 60 °C, effectively enhancing computational performance and extending the chip's lifespan.