As the data volume of astronomical imaging surveys rapidly increases, traditional methods for image anomaly detection, such as visual inspection by human experts, are becoming impractical. We introduce a machine-learning-based approach to detect poor-quality exposures in large imaging surveys, with a focus on the DECam Legacy Survey (DECaLS) in regions of low extinction (i.e., $E(B-V)<0.04$). Our semi-supervised pipeline integrates a vision transformer (ViT), trained via self-supervised learning (SSL), with a k-Nearest Neighbor (kNN) classifier. We train and validate our pipeline using a small set of labeled exposures observed by surveys with the Dark Energy Camera (DECam). A clustering-space analysis of where our pipeline places images labeled in ``good'' and ``bad'' categories suggests that our approach can efficiently and accurately determine the quality of exposures. Applied to new imaging being reduced for DECaLS Data Release 11, our pipeline identifies 780 problematic exposures, which we subsequently verify through visual inspection. Being highly efficient and adaptable, our method offers a scalable solution for quality control in other large imaging surveys.
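As an illustration only, the kNN stage of such a pipeline can be sketched as a majority vote among the nearest labeled embeddings. The function name `knn_predict`, the Euclidean metric, the value of `k`, and the toy 2-D vectors standing in for ViT embeddings are all assumptions made for this sketch and are not taken from the abstract.

```python
import math
from collections import Counter


def knn_predict(train_embeddings, train_labels, query, k=3):
    """Majority vote among the k labeled embeddings closest to `query`.

    Hypothetical helper: Euclidean distance and k=3 are assumptions.
    """
    # Pair every labeled embedding with its distance to the query, nearest first.
    neighbours = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_embeddings, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for ViT embeddings of labeled exposures.
train_embeddings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                    (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_labels = ["good", "good", "good", "bad", "bad", "bad"]

pred_good = knn_predict(train_embeddings, train_labels, (0.1, 0.1))
pred_bad = knn_predict(train_embeddings, train_labels, (2.0, 2.0))
```

In this toy setting, a query near the first cluster is voted ``good'' and one near the second cluster is voted ``bad''; a real pipeline would use high-dimensional embeddings and a tuned `k`.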
Eye tracking has emerged as a powerful tool for examining visual perception and search strategies in various domains, including medicine. While it is relatively straightforward to apply in 2D settings, its use in 3D medical imaging remains challenging and not yet well explored. This gap is particularly relevant for radiology, where volumetric images such as computed tomography (CT) scans are routinely read by medical experts. Radiologists typically interpret these images by navigating through hundreds of 2D slices, most often viewed in the axial projection. A taxonomy of eye movement data during navigation through a CT volume could be valuable to understand how radiologists approach diagnostic tasks. As an example of the derived taxonomy, we asked two radiologists to search abdominal CTs of the pancreas. We collected eye-tracking data and aligned gaze movements with slice navigation to visualize the representation of the pancreas through the volume and to analyze clinicians' gaze behavior in both space and time.
Accurate histopathologic interpretation is key for clinical decision-making; however, current deep learning models for digital pathology are often overconfident and poorly calibrated in out-of-distribution (OOD) settings, which limits trust and clinical adoption. Safety-critical medical imaging workflows benefit from intrinsic uncertainty-aware properties that can accurately reject OOD input. We implement the Spectral-normalized Neural Gaussian Process (SNGP), a set of lightweight modifications that apply spectral normalization and replace the final dense layer with a Gaussian process layer to improve single-model uncertainty estimation and OOD detection. We evaluate SNGP against deterministic and Monte Carlo dropout baselines on six datasets across three biomedical classification tasks: white blood cells, amyloid plaques, and colorectal histopathology. SNGP has comparable in-distribution performance while significantly improving uncertainty estimation and OOD detection. Thus, SNGP or related models offer a useful framework for uncertainty-aware classification in digital pathology, supporting safe deployment and building trust with pathologists.
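To make the spectral normalization ingredient concrete, the following is a minimal sketch that estimates the largest singular value of a weight matrix by power iteration and rescales the matrix only when that value exceeds a bound. The helper names, the `bound` parameter, the iteration count, and the all-ones starting vector are assumptions for illustration; this is not the implementation evaluated in the abstract, and it omits the Gaussian process output layer entirely.

```python
import math


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def spectral_norm(W, n_iter=50):
    """Estimate the largest singular value of W by power iteration on W^T W.

    Assumes W is non-zero and the all-ones start vector is not orthogonal
    to the leading right singular vector.
    """
    Wt = [list(col) for col in zip(*W)]
    v = [1.0] * len(W[0])
    for _ in range(n_iter):
        v = matvec(Wt, matvec(W, v))
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = matvec(W, v)
    return math.sqrt(sum(x * x for x in u))


def spectral_normalize(W, bound=1.0):
    """Rescale W so its spectral norm is at most `bound` (hypothetical parameter)."""
    scale = min(1.0, bound / spectral_norm(W))
    return [[scale * w for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(W)
W_sn = spectral_normalize(W, bound=1.0)
sigma_sn = spectral_norm(W_sn)
```

Bounding the spectral norm of each layer limits how much the layer can stretch distances between inputs, which is the property that distance-aware uncertainty estimates rely on.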
When conducting large-scale studies that collect brain MR images from multiple facilities, the impact of differences in imaging equipment and protocols at each site cannot be ignored, and this domain gap has become a significant issue in recent years. In this study, we propose a new low-dimensional representation (LDR) acquisition method called style encoder adversarial domain adaptation (SE-ADA) to realize content-based image retrieval (CBIR) of brain MR images. SE-ADA reduces domain differences while preserving pathological features by separating domain-specific information from LDR and minimizing domain differences using adversarial learning. In evaluation experiments comparing SE-ADA with recent domain harmonization methods on eight public brain MR datasets (ADNI1/2/3, OASIS1/2/3/4, PPMI), SE-ADA effectively removed domain information while preserving key aspects of the original brain structure and demonstrated the highest disease search accuracy.
Lensless illumination single-pixel imaging with a multicore fiber (MCF) is a computational imaging technique that enables potential endoscopic observations of biological samples at cellular scale. In this work, we show that this technique is tantamount to collecting multiple symmetric rank-one projections (SROP) of an interferometric matrix--a matrix encoding the spectral content of the sample image. In this model, each SROP is induced by the complex sketching vector shaping the incident light wavefront with a spatial light modulator (SLM), while the projected interferometric matrix collects up to $O(Q^2)$ image frequencies for a $Q$-core MCF. While this scheme subsumes previous sensing modalities, such as raster scanning (RS) imaging with beamformed illumination, we demonstrate that collecting the measurements of $M$ random SLM configurations--and thus acquiring $M$ SROPs--allows us to estimate an image of interest if $M$ and $Q$ scale log-linearly with the image sparsity level. This demonstration is achieved both theoretically, with a specific restricted isometry analysis of the sensing scheme, and with extensive Monte Carlo experiments. On a practical side, we perform a single ca
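As a small worked illustration of a symmetric rank-one projection, the sketch below evaluates $a^H J a$ for a Hermitian matrix $J$ and a complex sketching vector $a$. The symbols `J` and `a`, the helper name `srop`, and the 2-by-2 toy values are assumptions chosen for this example and do not come from the abstract.

```python
def srop(matrix, a):
    """Return a^H M a, i.e. the projection of M onto the rank-one matrix a a^H."""
    q = len(a)
    return sum(
        a[j].conjugate() * matrix[j][k] * a[k]
        for j in range(q)
        for k in range(q)
    )


# Toy Hermitian matrix standing in for an interferometric matrix (Q = 2).
J = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

# Toy complex sketching vector standing in for one SLM configuration.
a = [1 + 0j, 0 + 1j]

y = srop(J, a)
```

Because the toy matrix is Hermitian, the measurement is real valued (here it equals 7), consistent with a single-pixel intensity reading; acquiring $M$ such values with $M$ different vectors gives $M$ SROPs.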
We outline some basics of imaging using both fully-coherent and partially-coherent X-ray beams, with an emphasis on phase-contrast imaging. We open with some of the basic notions of X-ray imaging, including the vacuum wave equations and the physical meaning of the intensity and phase of complex scalar fields. The projection approximation is introduced, together with the concepts of attenuation contrast and phase contrast. We also outline the multi-slice approach to X-ray propagation through thick samples or optical elements, together with the Fresnel scaling theorem. Having introduced the fundamentals, we then consider several aspects of the forward problem, of modelling the formation of phase-contrast X-ray images. Several topics related to this forward problem are considered, including the transport-of-intensity equation, arbitrary linear imaging systems, shift-invariant linear imaging systems, the transfer-function formalism, blurring induced by finite source size, the space-frequency model for partially-coherent fields, and the Fokker-Planck equation for paraxial X-ray imaging. Having considered these means for modelling the formation of X-ray phase-contrast images, we then con
For acute ischemic stroke (AIS) patients with large vessel occlusions, clinicians must decide if the benefit of mechanical thrombectomy (MTB) outweighs the risks and potential complications following an invasive procedure. Pre-treatment computed tomography (CT) and angiography (CTA) are widely used to characterize occlusions in the brain vasculature. If a patient is deemed eligible, a modified thrombolysis in cerebral infarction (mTICI) score will be used to grade how well blood flow is reestablished throughout and following the MTB procedure. An estimation of the likelihood of successful recanalization can support treatment decision-making. In this study, we propose a fully automated prediction of a patient's recanalization score using pre-treatment CT and CTA imaging. We designed a spatial cross attention network (SCANet) that utilizes vision transformers to localize to pertinent slices and brain regions. Our top model achieved an average cross-validated ROC-AUC of 77.33 $\pm$ 3.9\%. This is a promising result that supports future applications of deep learning on CT and CTA for the identification of eligible AIS patients for MTB.
In this paper, we present a novel approach that can exactly recover extended targets in wave-based multistatic interferometric imaging, based on Generalized Wirtinger Flow (GWF) theory [1]. Interferometric imaging is a generalization of phase retrieval, which arises from cross-correlation of measurements from pairs of receivers in multistatic configuration. Unlike standard Wirtinger Flow, GWF theory guarantees exact recovery for arbitrary lifted forward models that satisfy the restricted isometry property over rank-1, positive semi-definite (PSD) matrices with a sufficiently small restricted isometry constant (RIC). To this end, we design a deterministic, lifted forward model for interferometric multistatic radar satisfying the exact recovery conditions of the GWF theory. Our results quantify a lower limit on the pixel spacing and the minimal sample complexity for exact multistatic radar imaging via GWF. We provide a numerical study of our RIC and pixel spacing bounds, which shows that GWF can achieve exact recovery with super-resolution. While our primary interest lies in radar imaging, our method is also applicable to other multistatic wave-based imaging problems such as those ar
Spectral imaging is a fundamental diagnostic technique with widespread application. Conventional spectral imaging approaches have intrinsic limitations on spatial and spectral resolutions due to the physical components they rely on. To overcome these physical limitations, in this paper, we develop a novel multi-spectral imaging modality that enables higher spatial and spectral resolutions. In the developed computational imaging modality, we exploit a diffractive lens, such as a photon sieve, for both dispersing and focusing the optical field, and achieve measurement diversity by changing the focusing behavior of this lens. Because the focal length of a diffractive lens is wavelength-dependent, each measurement is a superposition of differently blurred spectral components. To reconstruct the individual spectral images from these superimposed and blurred measurements, model-based fast reconstruction algorithms are developed with deep and analytical priors using alternating minimization and unrolling. Finally, the effectiveness and performance of the developed technique are illustrated for an application in astrophysical imaging under various observation scenarios in the extreme ultrav
Fiber-like features are an important aspect of breast imaging. Vessels and ducts are present in all breast images, and spiculations radiating from a mass can indicate malignancy. Accordingly, fiber objects are one of the three types of signals used in the American College of Radiology digital mammography (ACR-DM) accreditation phantom. This work focuses on the image properties of fiber-like structures in digital breast tomosynthesis (DBT) and how image reconstruction can affect their appearance. The impact of DBT image reconstruction algorithm and regularization strength on the conspicuity of fiber-like signals of various orientations is investigated in simulation. A metric is developed to characterize this orientation dependence and allow for quantitative comparison of algorithms and associated parameters in the context of imaging fiber signals. The imaging properties of fibers, characterized in simulation, are then demonstrated in detail with physical DBT data of the ACR-DM phantom. The characterization of imaging of fiber signals is used to explain features of an actual clinical DBT case. For the algorithms investigated, at low regularization setting, the results show a striking
In recent years, minimum variance (MV) beamforming has been widely studied due to its high resolution and contrast in B-mode ultrasound imaging (USI). However, the performance of the MV beamformer is degraded in the presence of noise, because inaccurate covariance matrix estimation leads to a low-quality image. Second harmonic imaging (SHI) provides many advantages over conventional pulse-echo USI, such as enhanced axial and lateral resolutions. However, the low signal-to-noise ratio (SNR) is a major problem in SHI. In this paper, an eigenspace-based minimum variance (EIBMV) beamformer is employed for second harmonic USI. Tissue harmonic imaging (THI) is achieved by the pulse inversion (PI) technique. Using the EIBMV weights instead of the MV ones leads to reduced sidelobes and improved contrast, without compromising the high resolution of the MV beamformer (even in the presence of strong noise). In addition, we have investigated the effects of variations of the important parameters in computing EIBMV weights, i.e., K, L, and δ, on the resolution and contrast obtained in SHI. The results are evaluated using numerical data (using point target and
The potential benefit of hybrid X-ray and MR imaging in the interventional environment is large due to the combination of fast imaging with high contrast variety. However, many existing image enhancement methods require the image information of both modalities to be present in the same domain. To unlock this potential, we present a solution to image-to-image translation from MR projections to corresponding X-ray projection images. The approach is based on a state-of-the-art image generator network that is modified to fit the specific application. Furthermore, we propose the inclusion of a gradient map in the loss function to allow the network to emphasize high-frequency details in image generation. Our approach is capable of creating X-ray projection images with natural appearance. Additionally, our extensions show clear improvement compared to the baseline method.
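One possible way to bring a gradient map into a loss function is sketched below: a pixel-wise mean absolute error is combined with the mean absolute error between gradient-magnitude maps of the prediction and the target. The abstract does not specify the gradient operator or how the map enters the loss, so the forward-difference operator, the additive form, and the names `gradient_magnitude`, `gradient_aware_loss`, and `weight` are all assumptions for illustration.

```python
def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2-D image (list of rows).

    The last row and column reuse the border pixel, giving zero difference there.
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            gx = img[i][min(j + 1, w - 1)] - img[i][j]
            gy = img[min(i + 1, h - 1)][j] - img[i][j]
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized 2-D images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def gradient_aware_loss(pred, target, weight=1.0):
    """Pixel term plus a weighted edge term (hypothetical formulation)."""
    pixel_term = mean_abs_diff(pred, target)
    edge_term = mean_abs_diff(gradient_magnitude(pred), gradient_magnitude(target))
    return pixel_term + weight * edge_term


target = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]   # sharp edge
blurred = [[0.0, 0.5, 0.5, 1.0], [0.0, 0.5, 0.5, 1.0]]  # smeared edge

loss_exact = gradient_aware_loss(target, target)
loss_blurred = gradient_aware_loss(blurred, target)
```

In this toy case the blurred prediction is penalized by the edge term in addition to the pixel term, which is the kind of behavior that encourages sharper high-frequency detail.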
Ultraviolet imaging of nearby disk galaxies reveals the star-forming activity in these systems with unprecedented clarity. UV images recently obtained with the Shuttle-borne Ultraviolet Imaging Telescope (UIT) reveal a remarkable variety of star-forming morphologies. The respective roles of tides, waves, and resonances in orchestrating the observed patterns of starbirth activity are discussed in terms of the extant UV data.
A multispectral filter array (MSFA) is one solution for capturing a multispectral image (MSI) in a single shot at low cost. We introduce our optimization method of the spectral sensitivity of the MSFAs and demosaicking, and show a new prototype filter array for snapshot imaging based on a photonic crystal.
Automatic organ segmentation is an important prerequisite for many computer-aided diagnosis systems. The high anatomical variability of organs in the abdomen, such as the pancreas, prevents many segmentation methods from achieving accuracies as high as those obtained for other organs such as the liver, heart or kidneys. Recently, the availability of large annotated training sets and the accessibility of affordable parallel computing resources via GPUs have made it feasible for "deep learning" methods such as convolutional networks (ConvNets) to succeed in image classification tasks. These methods have the advantage that the classification features are learned directly from the imaging data. We present a fully-automated bottom-up method for pancreas segmentation in computed tomography (CT) images of the abdomen. The method is based on hierarchical coarse-to-fine classification of local image regions (superpixels). Superpixels are extracted from the abdominal region using Simple Linear Iterative Clustering (SLIC). An initial probability response map is generated, using patch-level confidences and a two-level cascade of random forest classifiers, from which superpixel regions wi
Segmenting curvilinear structures in medical images is essential for analyzing morphological patterns in clinical applications. Integrating topological properties, such as connectivity, improves segmentation accuracy and consistency. However, extracting and embedding such properties, especially from Persistence Diagrams (PD), is challenging due to their non-differentiability and computational cost. Existing approaches mostly encode topology through handcrafted loss functions, which generalize poorly across tasks. In this paper, we propose PIs-Regressor, a simple yet effective module that learns persistence images (PIs), which are finite, differentiable representations of topological features, directly from data. Together with Topology SegNet, which fuses these features in both downsampling and upsampling stages, our framework integrates topology into the network architecture itself rather than auxiliary losses. Unlike existing methods that depend heavily on handcrafted loss functions, our approach directly incorporates topological information into the network structure, leading to more robust segmentation. Our design is flexible and can be seamlessly combined with other topology-based meth
Color quantization represents an image using a fraction of its original number of colors while only minimally losing its visual quality. The $k$-means algorithm is commonly used in this context, but has mostly been applied in the machine-based RGB colorspace composed of the three primary colors. However, some recent studies have indicated its improved performance in human perception-based colorspaces. We investigated the performance of $k$-means color quantization at four quantization levels in the RGB, CIE-XYZ, and CIE-LUV/CIE-HCL colorspaces, on 148 varied digital images spanning a wide range of scenes, subjects and settings. The Visual Information Fidelity (VIF) measure numerically assessed the quality of the quantized images, and showed that in about half of the cases, $k$-means color quantization is best in the RGB space, while at other times, and especially for higher quantization levels ($k$), the CIE-XYZ colorspace is where it usually does better. There are also some cases, especially at lower $k$, where the best performance is obtained in the CIE-LUV colorspace. Further analysis of the performances in terms of the distributions of the hue, chromaticity and luminance in an
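For readers unfamiliar with the technique, $k$-means color quantization in the RGB space can be sketched as follows. This is a minimal illustration and not the study's code: the function name `kmeans_quantize`, the deterministic evenly spaced initialization, the fixed iteration count, and the four toy pixels are assumptions, and conversion to the CIE colorspaces is not shown.

```python
def kmeans_quantize(pixels, k, n_iter=10):
    """Quantize a list of RGB tuples to at most k colors with plain k-means."""
    n = len(pixels)
    # Deterministic initialization from k evenly spaced pixels (an assumption;
    # random or k-means++ seeding is more common in practice).
    centers = [tuple(float(c) for c in pixels[i * n // k]) for i in range(k)]
    assignment = [0] * n
    for _ in range(n_iter):
        # Assignment step: each pixel goes to its nearest center.
        for idx, p in enumerate(pixels):
            assignment[idx] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [pixels[i] for i in range(n) if assignment[i] == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return [centers[a] for a in assignment], centers


# Toy "image": two reddish and two bluish pixels, quantized to k = 2 colors.
pixels = [(250, 0, 0), (255, 10, 5), (0, 0, 250), (5, 10, 255)]
quantized, centers = kmeans_quantize(pixels, k=2)
```

Each pixel is replaced by the mean color of its cluster, so the toy image ends up with one red and one blue representative color.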
Current methods for searching brain MR images rely on text-based approaches, highlighting a significant need for content-based image retrieval (CBIR) systems. Directly applying 3D brain MR images to machine learning models offers the benefit of effectively learning the brain's structure; however, building a generalized model necessitates a large amount of training data. While models that consider depth direction and utilize continuous 2D slices have demonstrated success in segmentation and classification tasks involving 3D data, concerns remain. Specifically, using general 2D slices may lead to the oversight of pathological features and discontinuities in depth direction information. Furthermore, to the best of the authors' knowledge, there have been no attempts to develop a practical CBIR system that preserves the entire brain's structural information. In this study, we propose an interpretable CBIR method for brain MR images, named iCBIR-Sli (Interpretable CBIR with 2D Slice Embedding), which is the first to utilize a series of 2D slices. iCBIR-Sli addresses the challenges associated with using 2D slices by effectively aggregating slice information, thereby achie
Alzheimer's disease (AD) is a progressive neurodegenerative disorder leading to cognitive decline. [$^{18}$F]-Fluorodeoxyglucose positron emission tomography ([$^{18}$F]-FDG PET) is used to monitor brain metabolism, aiding in the diagnosis and assessment of AD over time. However, the feasibility of multi-time point [$^{18}$F]-FDG PET scans for diagnosis is limited due to radiation exposure, cost, and patient burden. To address this, we have developed a predictive image-to-image translation (I2I) model to forecast future [$^{18}$F]-FDG PET scans using baseline and year-one data. The proposed model employs a convolutional neural network architecture with long short-term memory and was trained on [$^{18}$F]-FDG PET data from 161 individuals from the Alzheimer's Disease Neuroimaging Initiative. Our I2I network showed high accuracy in predicting year-two [$^{18}$F]-FDG PET scans, with a mean absolute error of 0.031 and a structural similarity index of 0.961. Furthermore, the model successfully predicted PET scans up to seven years post-baseline. Notably, the predicted [$^{18}$F]-FDG PET signal in an AD-susceptible meta-region was highly accurate for individuals with mild cognitive impairment
Data limitation is a significant challenge in applying deep learning to medical images. Recently, the diffusion probabilistic model (DPM) has shown the potential to generate high-quality images by converting Gaussian random noise into realistic images. In this paper, we apply the DPM to augment the deep ultraviolet fluorescence (DUV) image dataset with the aim of improving breast cancer classification for intraoperative margin assessment. For classification, we divide the whole-surface DUV image into small patches and extract convolutional features for each patch using a pre-trained ResNet. We then feed them into an XGBoost classifier for patch-level decisions and fuse these decisions with a regional importance map computed by Grad-CAM++ for whole-surface-level prediction. Our experimental results show that augmenting the training dataset with the DPM significantly improves breast cancer detection performance in DUV images, increasing accuracy from 93% to 97%, compared to using affine transformations and ProGAN.