Digital cameras [1] and displays [2] use picture elements (pixels [3]) that perform a single function: detecting or emitting light intensity. To exploit the full information content of electromagnetic waves, more advanced elements are required. This has driven the development of multifunctional components that, for example, simultaneously detect and emit intensity [4,5] or extract intensity and spectral information [6-8]. However, no pixel exists that both senses and generates optical wavefronts with full control over amplitude, phase and polarization, limiting bidirectional control and feedback of sophisticated light fields. Here we present a route to such pixels by demonstrating a versatile platform of miniaturized diffractive elements based on Fourier optics [9]. We use plasmonic surface waves [10], which propagate coherently [11] and efficiently [12-15] across metallic surfaces. When these plasmons are launched towards wavy microstructures [16] designed with simple Fourier analysis, arbitrary and background-free optical wavefronts are generated. Conversely, incoming light can be sensed, and its amplitude, phase and polarization can be fully characterized. By combining or superposing several such components, we create multifunctional 'Fourier pixels' that provide compact and accurate control over the optical field. Our approach, which we extend to photonic waveguide modes, establishes a scalable, universal architecture for vectorially programmable pixels with applications in adaptive optics [17,18], holographic displays [19-21], optical communication [22,23] and quantum information processing [24].
Although SARS-CoV-2 has been extensively studied from clinical, virological, and diagnostic perspectives, the problem of accurate automatic semantic segmentation of SARS-CoV-2 particles in electron microscopy images remains inadequately explored. Existing studies have largely focused on virus detection, classification, morphometry, or conventional image analysis, while comparatively little attention has been paid to pixel-level delineation of viral structures using specialised deep learning segmentation frameworks. To address this gap, we propose here a deep learning system based on convolutional neural networks (CNNs) combined with image processing techniques to establish semantic segmentation tools for the automatic identification of SARS-CoV-2. Our approach utilises the super-Euclidean pixels method as an intermediate layer within the CNN for semantic segmentation. We then compare its performance against the gradient vector flow (GVF) and Poisson inverse gradient (PIG) segmenters. The proposed CNN model surpassed the traditional GVF and PIG segmentation models, achieving the following metrics (mean ± variance): Dice similarity coefficient (DSC) = 0.9345 ± 0.0006; intersection over union (IoU) = 0.8782 ± 0.0018; sensitivity/true positive rate (TPR) = 0.9373 ± 0.0018; specificity/true negative rate (SPC) = 0.9517 ± 0.0012; accuracy = 0.9449 ± 0.0004; area under the ROC curve (AUC) = 0.9446 ± 0.0431; and Cohen's Kappa = 0.9137 ± 0.0011. This method enables virologists to employ an automatic CNN-based segmentation tool for detecting SARS-CoV-2 and demonstrates superiority over GVF and PIG.
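The overlap metrics reported above (DSC, IoU, sensitivity, specificity) all follow from the pixel-wise confusion counts between a predicted mask and a reference mask. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function names and the toy masks are hypothetical, and masks are assumed to be equally sized nested lists of 0/1 values.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return Dice, IoU, sensitivity and specificity as a dict.

    A ratio with a zero denominator is reported as 1.0 here; that
    convention is an assumption of this sketch, not taken from the text.
    """
    tp, fp, fn, tn = confusion_counts(pred, truth)

    def ratio(num, den):
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical 2x3 masks used only for illustration.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
metrics = segmentation_metrics(predicted, reference)
```

For a single pair of masks, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU); averages over many images, such as those reported above, need not satisfy this relation exactly.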
Images acquired with point-scanning instruments, line-scanning instruments, and rolling shutter cameras are distorted by motion of the instrument or the scene. This distortion arises from the sequential capture of pixels or rows of pixels and is known as the rolling shutter effect. Here, we demonstrate the correction of rolling shutter distortion caused by in-plane motion with a dual-imaging technique inspired by strabismus, an eye misalignment condition. When the velocity observed in pairs of synchronously captured and misaligned images can be considered constant within each image row or column, the distortion estimation problem can be formulated as a system of linear equations describing the displacement of image strips along the slow scanning direction or as an optimization problem that maximizes the similarity between the two distortion-corrected images. Both methods can be modified to correct motion distortion caused by in-plane rotation and scaling changes. Strabismic image pairs, acquired with a point-scanning instrument and a dual-rolling shutter camera setup, are used to demonstrate both approaches.
Stable three-dimensional hand landmark reconstruction using low-cost RGB-D sensors is important for human-computer interaction, robot teleoperation, and vision-based motion analysis. RGB-based hand landmark detectors provide stable semantic 2D landmarks, but their depth output is not a metric measurement in the physical camera coordinate system. Stereo cameras can provide metric depth, but direct landmark-level back-projection is sensitive to invalid pixels, local depth holes, boundary noise, and partial occlusion. To address these problems, this paper presents a lightweight RGB-D sensing front-end that combines MediaPipe semantic hand landmarks with ZED2 stereo depth. The proposed pipeline detects 21 semantic hand landmarks in the RGB image, obtains landmark-level metric depth from the aligned ZED2 depth map using local median sampling, reconstructs 3D landmarks by camera back-projection, and further applies exponential moving average filtering and a bone-length consistency constraint. Experiments were conducted on a self-collected SVO dataset containing 13 hand actions and 26 recorded sequences, and an additional checkerboard-based reference-distance validation was performed to evaluate the metric depth sampling and 3D back-projection component. Compared with single-pixel sampling, the 5×5 local median strategy slightly increased the valid-depth ratio from 0.9731 to 0.9738 and reduced the temporal smoothness metric from 1.7163 mm to 1.6902 mm. To further justify the temporal filtering choice, an additional comparison with the 1 Euro Filter was conducted using the reconstructed win5 trajectories. The 1 Euro Filter produced stronger smoothing, reducing the temporal smoothness metric to 0.196 mm, but also reduced the path-length ratio to 0.484, indicating substantial motion attenuation. EMA0.7 was therefore retained as a more balanced setting, reducing the temporal smoothness metric to 0.826 mm while maintaining a path-length ratio of 0.803. 
The BL0.5 bone-length constraint reduced the bone-length standard deviation from 2.0727 mm to 1.1995 mm with limited trajectory modification. The final configuration provides a practical low-cost RGB-D front-end for stable 3D hand landmark reconstruction under controlled indoor conditions.
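The core of the pipeline described above (local median depth sampling, pinhole back-projection, and exponential moving average filtering) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, invalid depth is assumed to be encoded as zero, NaN or infinity, the intrinsics `fx`, `fy`, `cx`, `cy` are assumed to come from the camera calibration, and `alpha` is assumed to weight the newest sample (the abstract does not say whether the 0.7 in "EMA0.7" weights the new or the previous value). The bone-length constraint is not shown.

```python
import math
from statistics import median


def local_median_depth(depth, u, v, window=5):
    """Median of the valid depths in a window x window patch centred on (u, v).

    depth is indexed as depth[row][col], i.e. depth[v][u].
    Returns None when the patch contains no valid depth.
    """
    half = window // 2
    height, width = len(depth), len(depth[0])
    samples = []
    for y in range(max(0, v - half), min(height, v + half + 1)):
        for x in range(max(0, u - half), min(width, u + half + 1)):
            z = depth[y][x]
            if math.isfinite(z) and z > 0:
                samples.append(z)
    return median(samples) if samples else None


def back_project(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth z to camera coordinates."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def ema(previous, current, alpha):
    """Exponential moving average of two 3D points; alpha weights the new sample."""
    return tuple(alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current))


# Hypothetical 5x5 depth patch (metres) with a hole at the landmark pixel.
patch = [[1.0] * 5 for _ in range(5)]
patch[2][2] = 0.0            # depth hole exactly at the landmark
patch[0][0] = float("nan")   # another invalid pixel

single_pixel = local_median_depth(patch, 2, 2, window=1)  # None: the hole is hit
windowed = local_median_depth(patch, 2, 2, window=5)      # 1.0: the hole is bridged
point = back_project(420, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
smoothed = ema((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), alpha=0.7)
```

The toy patch illustrates why a windowed median can raise the valid-depth ratio relative to single-pixel sampling: a hole at the landmark pixel yields no measurement on its own but is bridged by its valid neighbours.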
Scale-Invariant Feature Transform (SIFT) features are widely used in target recognition and tracking. This study aimed to exploit the robust performance of the SIFT algorithm to accurately calculate tumor tissue displacement during respiratory motion. A thoracic phantom was employed in this study. Eight synthetic nodules with different shapes and gray values were inserted into the phantom. First, the nodules were displaced upward, downward, leftward, and rightward by 10 pixels to simulate motion in different directions during respiration. The nodules were then rotated clockwise and counterclockwise by 5°, 10°, 30°, and 45°. Subsequently, the SIFT and cross-correlation algorithms were applied to analyze the phantoms. Finally, a t-test was used to assess differences among motion directions. The t-test result indicated no significant difference in the detection of moving phantom across all motion directions (p > 0.05). No statistically significant difference was observed between SIFT and cross-correlation in detecting translational motion. In contrast, the results obtained from rotating phantoms demonstrated that SIFT could effectively detect rotational distortion. The SIFT algorithm can be used to calculate tissue distortion caused by thoracic motion during respiration.
This study aimed to develop and validate a fully automated quality assurance method for high-dose rate (HDR) brachytherapy, enabling the precise evaluation of source dwell-position accuracy and dwell-time linearity using radiochromic film analysis and advanced image processing algorithms. A novel analytical framework was established to verify HDR source positional precision and dwell-time linearity. Radiochromic films were irradiated under predefined treatment plans. Image preprocessing involved grayscale conversion, median filtering, rotation correction, and absolute coordinate calibration. The centroid of each dwell position was determined using two-dimensional Gaussian fitting. Temporal linearity analysis employed radial grayscale summation within radii of 250-400 pixels. Linearity was quantified using the coefficient of determination (R²), with R² approaching 1 indicating optimal performance. Automated Gaussian fitting achieved sub-millimeter positional accuracy (maximum error ≤ 0.6 mm), eliminating manual intervention. Rotation correction algorithms effectively mitigated positional errors caused by film misalignment. Comparative analysis with commercial software yielded nearly identical results, validating the reliability of the method. Dwell-time analysis demonstrated excellent linearity (R² ≥ 0.999) over summation radii ranging from 150 to 400 dpi. This study presents a robust and fully automated quality assurance (QA) solution for HDR brachytherapy that simultaneously verifies sub-millimeter-level dwell-position accuracy and temporal linearity. The integration of radiochromic film dosimetry with algorithmic image processing eliminates manual intervention and standardizes QA workflows, representing a substantial advancement in treatment safety and efficiency.
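The dwell-time linearity check above rests on the coefficient of determination of a straight-line fit between dwell time and summed film response. The following is a minimal sketch of that calculation using ordinary least squares; the function name and the sample values are hypothetical and are not taken from the study.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = slope * x + intercept and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical dwell times (s) and summed grayscale responses (arbitrary units).
dwell_times = [1.0, 2.0, 3.0, 4.0]
responses = [2.0, 4.0, 6.0, 8.0]
slope, intercept, r2 = linear_fit_r2(dwell_times, responses)
```

A perfectly proportional response, as in the toy data, gives R² = 1; measured responses would be substituted for `responses` in practice.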
Multitask vehicle routing problems (VRPs) play a critical role in enhancing efficiency across various industries and service sectors. These problems consist of multiple variants that optimize routing costs while meeting diverse customer constraints. Existing multitask VRP solvers solely utilize a graph-based modality, limiting their ability to address variants with multiple constraints. As a format to represent complex semantics, the vision modality shows great potential for encoding diverse VRP constraints. This motivates us to learn patch-level semantics from images, and then integrate them into a graph-based model to solve various VRP variants simultaneously. However, directly applying this approach to multitask VRPs presents three challenges: 1) existing VRP images lack constraint representations, which are essential for multitask VRPs; 2) the fixed receptive field of individual patches cannot effectively accommodate varying requirements across tasks; and 3) imbalanced pixel distribution among constraints may cause the model to overlook constraints with fewer pixels. In this article, we propose a vision-assisted foundation model (VaFM) to address these challenges. In the vision modality, input images tailored to all constraints are encoded by a convolutional neural network (CNN). The obtained patch embeddings are fused with graph-based nodes to generate solutions, with an auxiliary task designed to address the pixel-imbalance issue. In particular, we design a hybrid cross-attention fusion module to enable adaptive receptive fields for different tasks. It incorporates feature maps from shallow layers to focus on local details and uses cross-patch attention to capture global information. Moreover, we design a constraint-aware auxiliary task and utilize binary cross-entropy (BCE) loss to ensure balanced learning of all constraints. The performance of VaFM is evaluated across 16 different VRP variants.
The experimental results demonstrate the superiority of VaFM over state-of-the-art (SOTA) methods, especially for variants with complex constraints.
Accurate and automatic segmentation of the liver and liver tumors from computed tomography (CT) images is essential for computer-assisted diagnosis, treatment planning, and clinical decision-making. Although deep learning-based segmentation models, particularly U-Net and its variants, have achieved promising results in medical image analysis, many existing approaches mainly focus on local pixel-level feature extraction and may have limited ability to explicitly model long-range spatial relationships among anatomically meaningful regions. In addition, liver tumor segmentation remains challenging due to low contrast, irregular tumor boundaries, heterogeneous tumor appearances, and noise or artifacts in CT images. To address these limitations, this study proposes a hybrid ensemble neural network architecture that integrates an improved U-Net and a Graph U-Net for automatic liver and liver tumor segmentation. The improved U-Net is designed to capture fine-grained local features and preserve detailed spatial information through an encoder-decoder structure with skip connections, while the Graph U-Net uses Simple Linear Iterative Clustering (SLIC)-based superpixels to construct a graph representation of CT images and model spatial dependencies between adjacent image regions. By combining these complementary representations through an ensemble learning strategy, the proposed framework enhances both pixel-level segmentation accuracy and robustness against noisy imaging conditions. The proposed method was evaluated on the LiTS17 dataset, where CT images were preprocessed using intensity filtering, resizing, data augmentation, and normalization. Experimental results demonstrate that the proposed ensemble architecture achieves 99.2% accuracy for liver segmentation and 98.1% accuracy for liver tumor segmentation, outperforming representative segmentation models such as MultiresUnet and R2U-Net. 
Furthermore, robustness experiments under different signal-to-noise ratio conditions show that the proposed model maintains stable performance in noisy CT images, achieving 85% accuracy even under severe noise at -4 dB SNR. This result highlights the advantage of integrating convolutional feature learning with graph-based spatial relationship modeling for improving segmentation stability when image quality is degraded by noise or artifacts. These findings indicate that the integration of improved U-Net, SLIC-based graph construction, and Graph U-Net provides an effective and noise-robust solution for liver and liver tumor segmentation, with potential applicability as a computer-assisted tool in clinical image analysis after further validation on larger and external datasets.
Accurate positioning is essential for inspection robots in caged chicken houses, where long straight corridors, sparse textures, and repetitive structures challenge conventional methods. This paper proposes CVIWM (Coupled Visual-Inertial-Wheel Odometry with Markers), a tightly coupled state estimation method that fuses visual, inertial measurement unit (IMU), wheel odometry (WO), and fiducial marker observations within a factor graph optimization framework. Wheel odometry preintegration suppresses IMU horizontal drift and provides absolute scale, while sparse AprilTag markers (10 m spacing) periodically reset accumulated errors. Experiments in an 80 m corridor of a commercial caged chicken house at 0.116 m/s and 0.232 m/s showed that CVIWM achieves average positioning errors of 2.402 cm and 3.253 cm. This high precision ensured reliable image acquisition (image shift <83 pixels), enabling 95.7% dead hen detection and 98.9% egg detection accuracy. CVIWM offers a low-cost, easy-to-deploy, high-accuracy solution for automated poultry house inspection, supporting smart livestock farming.
Proton therapy can achieve high radiation dose to the tumor while sparing normal tissue beyond the dose fall-off. Accurate estimation of proton stopping power ratio (SPR) and range from computed tomography (CT) data is a prerequisite for minimizing range uncertainty and treatment margins. Photon-counting CT (PCCT) could potentially improve the accuracy of SPR estimation with the increased number of energy measurements. This work aims to assess the fundamental limits of SPR and range estimation via material decomposition (MD) with PCCT, while evaluating the feasibility of higher MD dimensionalities for proton treatment planning. Eigentissue decomposition is used for computing optimal basis materials for elemental composition estimation. We model a water phantom with a centered test tissue insert, and use the Cramér-Rao lower bound to estimate the covariance of basis material sinogram noise for MD dimensionalities 1, 2, 3, and 4. We model an ideal photon-counting detector with 1 mm² detector pixels, and fluence corresponding to 260 mAs at 120 kV. For each dimensionality, the noise is propagated to SPR level and the corresponding SPR bias error is calculated. Range uncertainties are estimated through simulations in RayStation, with SPR volumes corresponding to calculated bias and noise for each dimensionality. The three-material decomposition was optimal for soft tissues, while two-material decomposition was optimal for bone when estimating proton range. The noise increase for the three-material decomposition did not generally translate to greater deviations at range level. The lowest range RMSEs with an ideal photon-counting detector were <0.1-0.7 %, depending on tissue type and depth. This work indicates the limits for SPR and range estimation accuracy using PCCT with an MD-based approach, and shows that multimaterial decomposition through quantitative imaging with PCCT could improve tissue characterization and range estimation in proton therapy.
We present a curated dataset of planar displacement fields from eight fatigue crack growth experiments obtained via full-field digital image correlation (DIC). The dataset covers multiple aerospace-grade aluminium alloys, specimen geometries, material orientations, and load configurations, providing a diverse experimental basis for data-driven fracture mechanics research. Crack tip locations are consistently annotated using an iterative correction procedure applied to all measurements, and fracture mechanical descriptors like stress-intensity factors are provided as additional labels. The dataset comprises 8,794 unique experimentally observed displacement fields and a total of 70,352 supervised samples generated through standardized interpolation and augmentation. DIC data is provided as uniformly interpolated displacement grids at three standardized resolutions (28 × 28, 64 × 64, and 128 × 128 pixels), each available in three dataset sizes to support scalable use cases ranging from educational applications to high-capacity model development. Accompanying metadata and a Python interface facilitate filtering, loading, and integration into reproducible machine learning and fracture mechanics workflows.
Pixelated detectors based on inorganic scintillation materials are widely used in radiation detection systems for medical imaging and many other fields of science and technology. A substantial application is X-ray scanning using flat-panel detectors (FPDs) for both fluorography and mammography. In this article, the detection properties of the monolithic planar ceramic scintillation elements are reported for the first time. A high-light yield (Gd,Y)3Al2Ga3O12:Ce,Mg garnet-type scintillation material was used to form square-shaped pixels, while a material of similar composition was used as a substrate. Green bodies were successfully fabricated by a digital light processing (DLP) 3D printing method. Subsequent debinding and pressureless high-temperature sintering resulted in composite elements consisting of two layers with different chemical compositions. The lower bulk layer consisted of transparent, non-luminescent garnet, whereas the upper pixelated layer, with pixel dimensions of 230 × 230 µm, was made of scintillation material. The spatial resolution of the matrices under UV light and alpha-particle excitation was evaluated. It was confirmed that the spatial resolution of the matrices produced by the developed technology is approximately 0.4 times the pixel size. The proven ability of the integrated technology of inorganic scintillation matrix production opens the way for future improvement in spatial resolution through optimizing the printed pixel dimensions.
With the rapid expansion of global photovoltaic (PV) installed capacity, hot spot defects have become a major hidden danger that reduces power generation efficiency and threatens the safe and stable operation of PV stations. Unmanned aerial vehicle (UAV) infrared remote sensing is a key technology for the efficient intelligent monitoring of large-scale PV stations. However, detecting tiny hotspots in such infrared images poses severe challenges. Most of these defects are ultra-small targets with extremely low pixel size and weak contrast, which are easily submerged by complex background noise, leading to prominent issues including low detection accuracy and high miss rates. To address these issues, we propose a lightweight detection network based on YOLO11n, named PHSNet, for PV hotspot detection in UAV infrared images. Its core designs include the dynamic convolution integrated C3k2 (Dy-C3k2) for small target feature enhancement, context-guided downsampling (CG-Down) to alleviate feature loss during downsampling, optimized detection layers, and a lightweight shared deconvolutional detection head (LSDECD) for small target adaptation in low-altitude aerial scenes, forming a full-link optimization architecture for tiny target feature perception. Experiments on a dedicated dataset (4025 images, 25,181 annotations, 92% targets < 20 pixels) show that PHSNet achieves 0.73 AP50 and 0.315 AP, surpassing YOLO11n by 0.1 in AP50 and 0.058 in AP, respectively. With only 1.8 M parameters and 98.8 FPS, it outperforms mainstream lightweight models, including YOLOv8n and RT-DETR-R18, strikes a superior accuracy-efficiency balance, and provides an efficient solution for real-time intelligent monitoring and edge deployment of PV stations.
The use of image-sensing and real-time processing in Intelligent Transportation Systems (ITS) has introduced a sudden surge in transmitting and gathering high-resolution visual data from vehicle cameras, road infrastructures, and user devices. Such image data are, however, exceedingly susceptible to interception, tampering, and privacy breaches with regard to imminent quantum computing attacks that can break classical encryption algorithms. With such constraints in view, the paper presents a new Hybrid Quantum-Classical Image Encryption Framework that integrates chaos-based bit-level image encryption and quantum-resistant encryption measures to ensure high-security protection of image information in ITS infrastructures. The new framework integrates a customized bit-level chaotic permutation scheme using a Rearranged Arnold Cat Map (R-ACM) and 2D Logistic-Sine Chaotic Maps for confusion and diffusion, and the inclusion of a Quantum Key Distribution (QKD) or post-quantum lattice-based Kyber Key Encapsulation Mechanism (KEM) for secure key negotiation. The two-pyramidal security architecture enhances sensitivity to key and plaintext variations, offers chosen-plaintext, differential, noise, and occlusion attack immunity, and supports efficient encryption of RGB and grayscale image information without excessively large time overhead. Experimental results on representative ITS-relevant image data sets verify superior performance with mean NPCR > 99.60%, UACI ≈ 33.5%, entropy measures close to 8.0, and significantly suppressed correlation between neighboring pixels. Further, key space analysis demonstrates a combinatorial complexity of over 2²⁵⁶, making brute-force and quantum-type attacks computationally infeasible. 
The new framework is extremely suitable for real-time implementation in autonomous vehicles, roadside edge nodes, and intelligent traffic monitoring systems, thereby enabling secure, intelligent, and privacy-preserving ITS infrastructure in the post-quantum era.
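The differential-attack figures quoted above, NPCR and UACI, compare two cipher images whose plain images differ in a single pixel. The following is a minimal sketch of the usual definitions for 8-bit images, not the evaluation code of the paper; the function name and the toy images are hypothetical. For two statistically independent, uniformly random 8-bit images the expected values are about 99.609% (NPCR) and 33.464% (UACI), which is why results near those numbers are read as good diffusion.

```python
def npcr_uaci(cipher_a, cipher_b, max_value=255):
    """Return (NPCR, UACI) in percent for two equally sized grayscale images.

    NPCR: share of pixel positions whose values differ.
    UACI: mean absolute difference, normalised by the maximum pixel value.
    """
    total = changed = 0
    abs_diff_sum = 0
    for row_a, row_b in zip(cipher_a, cipher_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if a != b:
                changed += 1
            abs_diff_sum += abs(a - b)
    npcr = 100.0 * changed / total
    uaci = 100.0 * abs_diff_sum / (max_value * total)
    return npcr, uaci


# Hypothetical 2x2 cipher images used only to exercise the formulas.
image_a = [[0, 255], [10, 10]]
image_b = [[255, 255], [10, 20]]
npcr, uaci = npcr_uaci(image_a, image_b)
```

For RGB images the same computation is typically applied per channel and averaged; that extension is left out of the sketch.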
With the increasing reliance on intelligent transportation systems, securing traffic images against unauthorized access and tampering has become a critical concern. This work introduces a novel 4D hybrid chaotic system that integrates the memristive Rucklidge system with a discrete nonlinear map, together with an image encryption scheme as its application. The proposed encryption framework ensures high security and resistance against cryptographic attacks by leveraging chaotic dynamics for both permutation and diffusion processes. The encryption scheme has three main phases: (i) the generation of the Quadrant Hybrid Chaotic Matrix (QHCM), a dynamically structured chaotic matrix divided into four quadrants, each governed by different chaotic equations; (ii) concentric ring permutation, where traffic image pixels are rearranged within concentric rings using QHCM to enhance security; and (iii) QHCM-based chain diffusion, ensuring a nonlinear and spatially distributed transformation of pixel values by processing rings from the outermost region towards the center. The number of permutation and diffusion rounds is dynamically determined based on the discrete evolution of the proposed chaotic system. Extensive experimental analysis demonstrates the robustness of the encryption scheme in terms of statistical and differential properties, key space and randomness. In particular, the method achieves a Net Pixel Change Rate (NPCR) of 99.62%, a Unified Average Changing Intensity (UACI) of 50.09%, a key space of [Formula: see text], and a highest entropy of 7.999769, outperforming other existing systems. The proposed approach is well-suited for real-time encryption of traffic surveillance images, ensuring secure transmission and storage in intelligent transportation networks.
Building extraction from disaster scenes is critical for emergency response and post-disaster assessment. Unlike conventional static remote sensing imagery, multi-phase disaster imagery contains scenes spanning early, middle, and late disaster stages, where building morphology, class distribution, and boundary characteristics exhibit significant cross-phase heterogeneity. Such phase-dependent variations substantially increase the difficulty of stable semantic segmentation, particularly under complex damage conditions. To address these challenges, we propose MSS-MambaNet for building extraction from multi-phase disaster imagery. A multi-scale architecture is designed to overcome the limitations of single-scale scanning in Mamba, enabling more effective perception of diverse building morphologies. To enhance feature discrimination, a Dual-Domain Cross-Gated Fusion (DDCGF) module is introduced through complementary interactions between spatial and frequency-domain representations. In addition, a Pixel-Aware Dynamic Weighting (PADW) strategy is developed to adaptively emphasize imbalanced foreground pixels and ambiguous boundary regions, thereby improving segmentation consistency under complex disaster conditions. Extensive experiments demonstrate that MSS-MambaNet consistently outperforms state-of-the-art methods, achieving an average mIoU of 92.78% and mF1 of 96.25% with only 12.37 M parameters. These results indicate that the proposed method effectively handles the heterogeneity of multi-phase data, providing a stable and efficient solution for building extraction from multi-phase disaster imagery.
Cerebrovascular diseases are major causes of death and disability worldwide, highlighting the critical importance of early diagnosis and accurate acquisition of vascular information. However, conventional imaging techniques using direct subtraction computed tomography (CT) and dual energy CT have limitations, including invasiveness, radiation exposure, and artifacts caused by metal and bone. This study investigated the feasibility of cerebral artery segmentation in single-exposure CT angiography (CTA) using a projection-domain framework derived from patient-based CTA data. The proposed method employs the DeepLab V3+ model to segment brain vessels directly in the projection. A total of 103 patients were included in the dataset, and 61, 17, and 25 patients were allocated to the training, validation, and test sets, respectively. This approach eliminates the risk of double exposure and motion artifacts while preserving clinical information. Additionally, this approach minimizes beam-hardening artifacts from high-density materials and reduces the operator's workload. The cerebral artery images reconstructed using the proposed method were quantitatively compared to those of the labeled images, and the intersection over union, Dice similarity coefficient, bidirectional Hausdorff distance, bfscore, F1-score, and precision were measured to be approximately 0.89 [95% CI, 0.87-0.91], 0.90 [0.88-0.91], 219.12 pixels [201.54-236.70], 0.90 [0.89-0.92], 0.90 [0.89-0.92], and 0.89 [0.88-0.91], respectively. Performance metrics consistently demonstrated high agreement between reconstructed vessel maps and reference labels. In addition, the proposed method confirmed that reconstructing cerebral arteries and metallic implant components may yield clinically relevant vascular image information with limited information loss. 
These results support the feasibility of the proposed method for generating cerebral arterial 3D images in CTA systems and suggest potential utility for improved vascular visualization and image quality, pending further clinical validation.
Mechanical metasurfaces are deformable surfaces that reconfigure their shape in response to external stimuli, with growing applications in interactive displays and human-machine interfaces. However, many existing systems are either too fragile for physical interaction or too slow to match human response. Moreover, the difficulty of integrating functional systems into them constrains seamless interaction with humans. We developed a robust, rapidly responsive, and multifunctional soft metasurface capable of intuitive interaction with humans. The soft surface reconfigures through magnetic actuation, enabling mechanical metasurface functions with simultaneous visual and tactile feedback for human interaction. The system features a six-by-six array of elastomeric pixels, each actuated by attractive or repulsive magnetic forces from an underlying electromagnet array. The induced magnetic force modulates the surface height over a range of 8.5 millimeters (-5 to 3.5 millimeters), enabling coordinated actuation of 36 electromagnets to reconstruct over 10³⁰ discrete surface morphologies. Its mechanically compliant design allows human interaction such as pushing, pulling, and pinching. Embedded inertial measurement unit sensors reconstruct surface shape in real time, and an integrated light-emitting diode array provides immediate visual feedback. This platform enables fast, reversible, and intuitive interaction between users and programmable surfaces, laying the groundwork for next-generation human-machine interaction systems.
Efficient textile recycling depends on accurate identification of fibre types and compositions to support high-value material recovery and automated sorting. Existing commercial systems based on near-infrared (NIR) spectroscopy offer robust performance, but their model architectures and development methods are proprietary, and they often struggle to detect materials when carbon-black (graphite-based) dyes suppress the spectral signatures. This paper presents a hyperspectral imaging approach for textile fibre identification, combined with an artificial intelligence model capable of detecting cotton, polyester, elastane, and regions affected by carbon-black dye. Sixty-five textile samples were laboratory-verified to determine constituent materials and compositions, with 52 used in model development and testing. A semi-automatic algorithm detected textile boundaries and sampled 100 spectral patches per image. For materials exhibiting two distinct spectral signatures, typically due to carbon-black dye regions, 100 samples were collected for each signature, producing a database of 6500 spectra. A convolutional neural network model was trained using these signatures to predict fibre composition and identify any regions with carbon-black dye. The system achieved mean absolute errors below 2.1% for cotton, polyester, and elastane. A spatial clustering step groups pixels with similar spectra prior to detection, enabling region-wise material identification and allowing the model to classify clusters likely affected by carbon-black dye. This approach demonstrates high precision in fibre identification and reliable detection of carbon-black regions, highlighting its suitability for real-world textile analysis workflows.
The programmed cell death-1/programmed cell death ligand-1 (PD-1/PD-L1) pathway contributes to tumor immune evasion and represents an emerging topic in canine oncology. In canine urothelial carcinoma (UC), PD-L1 immunohistochemical evaluation may be challenging because of staining heterogeneity, and only membranous labeling is considered specific. This study assessed the feasibility of a computer-assisted image analysis workflow incorporating supervised classification tools for PD-L1 scoring on whole-slide images of canine UC. Immunohistochemistry was performed using a validated anti-PD-L1 antibody. Digital slides were analyzed with QuPath software using object- and pixel-based classifiers to identify PD-L1-positive tumor and immune cells and to calculate the combined positive score (CPS). Ten of 48 tumors (21%) showed PD-L1 expression. CPS values obtained by computer-assisted analysis correlated with manual counts. Interobserver agreement for manual CPS assessment among multiple pathologists was good (intraclass correlation coefficient = 0.795). These results indicate that computer-assisted whole-slide analysis is a feasible supportive approach for PD-L1 scoring in canine UC under pathologist supervision.