Confocal laser endomicroscopy (CLE), although capable of obtaining images at cellular resolution during surgery of brain tumors in real time, creates as many nondiagnostic as diagnostic images. Nondiagnostic images are often distorted by relative motion between the probe and the brain or by blood artifacts. Many images, however, simply lack diagnostic features immediately informative to the physician. Examining all the hundreds or thousands of images from a single case to discriminate diagnostic images from nondiagnostic ones can be tedious. A real-time assessment of the diagnostic value of images (fast enough to be used during surgical acquisition and accurate enough for the pathologist to rely on) that automatically detects diagnostic frames would streamline image analysis and filter useful images for the pathologist or surgeon. We sought to automatically classify images as diagnostic or nondiagnostic. AlexNet, a deep-learning architecture, was used with 4-fold cross-validation. Our dataset includes 16,795 images (8572 nondiagnostic and 8223 diagnostic) from 74 CLE-aided brain tumor surgery patients. The ground truth for all the images is provided by the pathologi
Eye tracking has emerged as a powerful tool for examining visual perception and search strategies in various domains, including medicine. While it is relatively straightforward to apply in 2D settings, its use in 3D medical imaging remains challenging and not yet well explored. This gap is particularly relevant for radiology, where volumetric images such as computed tomography (CT) scans are routinely read by medical experts. Radiologists typically interpret these images by navigating through hundreds of 2D slices, most often viewed in the axial projection. A taxonomy of eye movement data during navigation through a CT volume could be valuable for understanding how radiologists approach diagnostic tasks. As an example of the derived taxonomy, we asked two radiologists to search abdominal CTs for the pancreas. We collected eye-tracking data and aligned gaze movements with slice navigation to visualize how the pancreas is represented through the volume and to analyze clinicians' gaze behavior in both space and time.
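The alignment of gaze samples with slice navigation described above can be illustrated with a minimal sketch. It assumes, hypothetically, that gaze is logged as `(t, x, y)` tuples and that the viewer logs a `(t, slice_index)` event whenever the displayed slice changes; these formats and the function name `align_gaze_to_slices` are illustrative and not taken from the study.

```python
from bisect import bisect_right

def align_gaze_to_slices(gaze, slice_events):
    """Attach to each gaze sample the slice index on screen at its timestamp.

    gaze: list of (t, x, y) tuples (hypothetical log format).
    slice_events: list of (t, slice_index) sorted by t; each event means the
    viewer switched to slice_index at time t (hypothetical log format).
    Samples recorded before the first slice event are dropped.
    """
    event_times = [t for t, _ in slice_events]
    aligned = []
    for t, x, y in gaze:
        i = bisect_right(event_times, t) - 1  # last event at or before t
        if i >= 0:
            aligned.append((t, x, y, slice_events[i][1]))
    return aligned

# Toy data: the reader scrolls from slice 40 to 42 while gaze is sampled.
slice_events = [(0.0, 40), (1.5, 41), (2.0, 42)]
gaze = [(0.2, 310, 255), (1.5, 300, 260), (1.9, 298, 262), (2.4, 305, 270)]
aligned = align_gaze_to_slices(gaze, slice_events)
slices_seen = [s for _, _, _, s in aligned]
```

Each aligned sample then carries a position in the volume (x, y, slice) as well as a time, which is the form needed to analyze gaze in both space and time.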
Spectral imaging is a fundamental diagnostic technique with widespread application. Conventional spectral imaging approaches have intrinsic limitations on spatial and spectral resolutions due to the physical components they rely on. To overcome these physical limitations, in this paper, we develop a novel multi-spectral imaging modality that enables higher spatial and spectral resolutions. In the developed computational imaging modality, we exploit a diffractive lens, such as a photon sieve, for both dispersing and focusing the optical field, and achieve measurement diversity by changing the focusing behavior of this lens. Because the focal length of a diffractive lens is wavelength-dependent, each measurement is a superposition of differently blurred spectral components. To reconstruct the individual spectral images from these superimposed and blurred measurements, model-based fast reconstruction algorithms are developed with deep and analytical priors using alternating minimization and unrolling. Finally, the effectiveness and performance of the developed technique are illustrated for an application in astrophysical imaging under various observation scenarios in the extreme ultrav
We have set up a diagnostic magnet (D-Mag) laboratory for a wide range of applications in plasma physics. It consists of a superconducting magnet for field strengths of up to 5.9 T. The main purpose is to provide an experimental environment for the development of plasma diagnostics for nuclear fusion studies and the investigation of dusty plasmas in strong magnetic fields. In this article, we describe the setup and operation of the D-Mag. Some applications are presented for the development of plasma diagnostics, such as neutral pressure gauges and Langmuir probes that have to be operated in strong magnetic fields. Among the examples is the test of the long-pulse capability and stability of the diagnostic pressure gauge (DPG) for the ITER device.
Accurate histopathologic interpretation is key for clinical decision-making; however, current deep learning models for digital pathology are often overconfident and poorly calibrated in out-of-distribution (OOD) settings, which limits trust and clinical adoption. Safety-critical medical imaging workflows benefit from intrinsic uncertainty-aware properties that can accurately reject OOD input. We implement the Spectral-normalized Neural Gaussian Process (SNGP), a set of lightweight modifications that apply spectral normalization and replace the final dense layer with a Gaussian process layer to improve single-model uncertainty estimation and OOD detection. We evaluate SNGP against deterministic and Monte Carlo dropout baselines on six datasets across three biomedical classification tasks: white blood cells, amyloid plaques, and colorectal histopathology. SNGP has comparable in-distribution performance while significantly improving uncertainty estimation and OOD detection. Thus, SNGP or related models offer a useful framework for uncertainty-aware classification in digital pathology, supporting safe deployment and building trust with pathologists.
Learning systems deployed in nonstationary and safety-critical environments often suffer from instability, slow convergence, or brittle adaptation when learning dynamics evolve over time. While modern optimization, reinforcement learning, and meta-learning methods adapt to gradient statistics, they largely ignore the temporal structure of the error signal itself. This paper proposes a diagnostic-driven adaptive learning framework that explicitly models error evolution through a principled decomposition into bias, capturing persistent drift; noise, capturing stochastic variability; and alignment, capturing repeated directional excitation leading to overshoot. These diagnostics are computed online from lightweight statistics of loss or temporal-difference (TD) error trajectories and are independent of model architecture or task domain. We show that the proposed bias-noise-alignment decomposition provides a unifying control backbone for supervised optimization, actor-critic reinforcement learning, and learned optimizers. Within this framework, we introduce three diagnostic-driven instantiations: the Human-inspired Supervised Adaptive Optimizer (HSAO), Hybrid Error-Diagnostic Reinforce
As the data volume of astronomical imaging surveys rapidly increases, traditional methods for image anomaly detection, such as visual inspection by human experts, are becoming impractical. We introduce a machine-learning-based approach to detect poor-quality exposures in large imaging surveys, with a focus on the DECam Legacy Survey (DECaLS) in regions of low extinction (i.e., $E(B-V)<0.04$). Our semi-supervised pipeline integrates a vision transformer (ViT), trained via self-supervised learning (SSL), with a k-Nearest Neighbor (kNN) classifier. We train and validate our pipeline using a small set of labeled exposures observed by surveys with the Dark Energy Camera (DECam). A clustering-space analysis of where our pipeline places images labeled in ``good'' and ``bad'' categories suggests that our approach can efficiently and accurately determine the quality of exposures. Applied to new imaging being reduced for DECaLS Data Release 11, our pipeline identifies 780 problematic exposures, which we subsequently verify through visual inspection. Being highly efficient and adaptable, our method offers a scalable solution for quality control in other large imaging surveys.
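The final stage of such a pipeline, a kNN classifier over learned embeddings, can be sketched as follows. This is a toy illustration under stated assumptions: the 2D vectors stand in for ViT embeddings, and the Euclidean metric, majority vote, `k=3`, and the name `knn_predict` are choices made here, not details from the abstract.

```python
import math
from collections import Counter

def knn_predict(query, embeddings, labels, k=3):
    """Label a query embedding by majority vote among its k nearest neighbors.

    Euclidean distance and an unweighted vote are assumptions for this sketch.
    """
    by_distance = sorted(
        (math.dist(query, emb), lab) for emb, lab in zip(embeddings, labels)
    )
    votes = Counter(lab for _, lab in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy 2D "embeddings" standing in for features of labeled exposures.
embeddings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
              (5.0, 5.1), (4.8, 5.2), (5.3, 4.9)]
labels = ["good", "good", "good", "bad", "bad", "bad"]

pred_near_good = knn_predict((0.1, 0.1), embeddings, labels)
pred_near_bad = knn_predict((5.0, 5.0), embeddings, labels)
```

Because the classifier only compares distances in the embedding space, a small labeled set can suffice when the self-supervised features already separate good and bad exposures.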
The muon anomalous magnetic moment, $a_\mu=\frac{g-2}{2}$, is a low-energy observable which can be both measured and computed to high precision, making it a sensitive test of the Standard Model and a probe for new physics. This anomaly was measured with a precision of $0.20$~parts per million (ppm) by Fermilab's Muon g-2 (E989) experiment. The final goal of the E989 experiment is to reach a precision of $0.14$~ppm. The experiment measures the anomalous precession frequency of the muon spin, $\omega_a$, from the arrival time distribution of high-energy decay positrons observed by 24 electromagnetic calorimeters placed around the inner circumference of a $14$~m diameter storage ring, and relies on precise knowledge of the storage ring magnetic field and of the beam time and space distribution. Achieving this level of precision requires strict control over systematics, which is ensured through several diagnostic devices. At the accelerator level, these devices monitor the quality of the injected beam (e.g., verifying that it has the correct momentum), while at the detector level, they track both the magnetic field and the gain of the calorimeters. In this work the device
Color quantization represents an image using a fraction of its original number of colors while only minimally losing its visual quality. The $k$-means algorithm is commonly used in this context, but has mostly been applied in the machine-based RGB colorspace composed of the three primary colors. However, some recent studies have indicated its improved performance in human perception-based colorspaces. We investigated the performance of $k$-means color quantization at four quantization levels in the RGB, CIE-XYZ, and CIE-LUV/CIE-HCL colorspaces, on 148 varied digital images spanning a wide range of scenes, subjects and settings. The Visual Information Fidelity (VIF) measure numerically assessed the quality of the quantized images, and showed that in about half of the cases, $k$-means color quantization is best in the RGB space, while at other times, and especially for higher quantization levels ($k$), the CIE-XYZ colorspace is where it usually does better. There are also some cases, especially at lower $k$, where the best performance is obtained in the CIE-LUV colorspace. Further analysis of the performances in terms of the distributions of the hue, chromaticity and luminance in an
We consider the problem of answering observational, interventional, and counterfactual queries in a causally sufficient setting where only observational data and the causal graph are available. Building on recent developments in diffusion models, we introduce diffusion-based causal models (DCM) to learn causal mechanisms that generate unique latent encodings. These encodings enable us to directly sample under interventions and perform abduction for counterfactuals. Diffusion models are a natural fit here, since they can encode each node to a latent representation that acts as a proxy for exogenous noise. Our empirical evaluations demonstrate significant improvements over existing state-of-the-art methods for answering causal queries. Furthermore, we provide theoretical results that offer a methodology for analyzing counterfactual estimation in general encoder-decoder models, which could be useful in settings beyond our proposed approach.
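The way an encoder-decoder pair supports counterfactual queries can be illustrated with a toy sketch. This is not the diffusion-based model: a hand-written invertible additive-noise mechanism Y = 2X + U on a two-node graph X → Y is swapped in for the learned encoder and decoder, and all names (`encode`, `decode`, `counterfactual_y`) are hypothetical.

```python
def encode(y, x):
    """Abduction: map an observed child value to its latent, given the parent.

    In this toy mechanism Y = 2X + U, so the latent is exactly the noise U.
    """
    return y - 2.0 * x

def decode(u, x):
    """Generate the child value from a latent and a (possibly intervened) parent."""
    return 2.0 * x + u

def counterfactual_y(x_obs, y_obs, x_do):
    """Answer: what would Y have been had X been set to x_do?"""
    u = encode(y_obs, x_obs)   # abduction: recover the unit-specific latent
    return decode(u, x_do)     # action and prediction: re-decode under do(X = x_do)

# Observed unit (x = 1.0, y = 2.5); query the counterfactual under do(X = 3.0).
y_cf = counterfactual_y(1.0, 2.5, 3.0)
```

The latent recovered in the abduction step is held fixed while the parent is replaced, which is what distinguishes a counterfactual from a plain interventional sample, where the latent would be drawn fresh.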
When conducting large-scale studies that collect brain MR images from multiple facilities, the impact of differences in imaging equipment and protocols at each site cannot be ignored, and this domain gap has become a significant issue in recent years. In this study, we propose a new low-dimensional representation (LDR) acquisition method called style encoder adversarial domain adaptation (SE-ADA) to realize content-based image retrieval (CBIR) of brain MR images. SE-ADA reduces domain differences while preserving pathological features by separating domain-specific information from LDR and minimizing domain differences using adversarial learning. In evaluation experiments comparing SE-ADA with recent domain harmonization methods on eight public brain MR datasets (ADNI1/2/3, OASIS1/2/3/4, PPMI), SE-ADA effectively removed domain information while preserving key aspects of the original brain structure and demonstrated the highest disease search accuracy.
With the proliferation of image-based applications in various domains, the need for accurate and interpretable image similarity measures has become increasingly critical. Existing image similarity models often lack transparency, making it challenging to understand why two images are considered similar. In this paper, we propose the concept of explainable image similarity, where the goal is to develop an approach capable of providing similarity scores along with visual factual and counterfactual explanations. Along this line, we present a new framework that integrates Siamese Networks and Grad-CAM to provide explainable image similarity, and we discuss the potential benefits and challenges of adopting this approach. In addition, we provide a comprehensive discussion of the factual and counterfactual explanations provided by the proposed framework for assisting decision making. The proposed approach has the potential to enhance the interpretability, trustworthiness and user acceptance of image-based systems in real-world image similarity applications. The implementation code can be found at https://github.com/ioannislivieris/Grad_CAM_Siamese.git.
Purpose: To demonstrate the feasibility and performance of a fully automated deep learning framework to estimate myocardial strain from short-axis cardiac magnetic resonance tagged images. Methods and Materials: In this retrospective cross-sectional study, 4508 cases from the UK Biobank were split randomly into 3244 training and 812 validation cases, and 452 test cases. Ground truth myocardial landmarks were defined and tracked by manual initialization and correction of deformable image registration using previously validated software with five readers. The fully automatic framework consisted of 1) a convolutional neural network (CNN) for localization, and 2) a combination of a recurrent neural network (RNN) and a CNN to detect and track the myocardial landmarks through the image sequence for each slice. Radial and circumferential strain were then calculated from the motion of the landmarks and averaged on a slice basis. Results: Within the test set, myocardial end-systolic circumferential Green strain errors were -0.001 +/- 0.025, -0.001 +/- 0.021, and 0.004 +/- 0.035 in basal, mid, and apical slices respectively (mean +/- std. dev. of differences between predicted and manual stra
For acute ischemic stroke (AIS) patients with large vessel occlusions, clinicians must decide if the benefit of mechanical thrombectomy (MTB) outweighs the risks and potential complications following an invasive procedure. Pre-treatment computed tomography (CT) and angiography (CTA) are widely used to characterize occlusions in the brain vasculature. If a patient is deemed eligible, a modified treatment in cerebral ischemia (mTICI) score will be used to grade how well blood flow is reestablished throughout and following the MTB procedure. An estimation of the likelihood of successful recanalization can support treatment decision-making. In this study, we propose a fully automated method for predicting a patient's recanalization score from pre-treatment CT and CTA imaging. We designed a spatial cross attention network (SCANet) that uses vision transformers to localize pertinent slices and brain regions. Our top model achieved an average cross-validated ROC-AUC of 77.33 $\pm$ 3.9\%. This is a promising result that supports future applications of deep learning on CT and CTA for the identification of eligible AIS patients for MTB.
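A figure of the form "average cross-validated ROC-AUC, mean ± spread" can be computed as in the following sketch. It assumes binary labels (1 for successful recanalization), computes AUC as the fraction of positive-negative pairs ranked correctly, and reports the sample standard deviation across folds; the abstract does not state which spread measure was used, and the fold data here are invented toy values.

```python
from statistics import mean, stdev

def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy held-out predictions for two folds (made-up numbers).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
]
fold_aucs = [roc_auc(y, s) for y, s in folds]
auc_mean = mean(fold_aucs)
auc_spread = stdev(fold_aucs)  # sample standard deviation (an assumption)
```

Reporting the spread alongside the mean shows how much the score depends on which patients fall into the held-out fold.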
We outline some basics of imaging using both fully-coherent and partially-coherent X-ray beams, with an emphasis on phase-contrast imaging. We open with some of the basic notions of X-ray imaging, including the vacuum wave equations and the physical meaning of the intensity and phase of complex scalar fields. The projection approximation is introduced, together with the concepts of attenuation contrast and phase contrast. We also outline the multi-slice approach to X-ray propagation through thick samples or optical elements, together with the Fresnel scaling theorem. Having introduced the fundamentals, we then consider several aspects of the forward problem, of modelling the formation of phase-contrast X-ray images. Several topics related to this forward problem are considered, including the transport-of-intensity equation, arbitrary linear imaging systems, shift-invariant linear imaging systems, the transfer-function formalism, blurring induced by finite source size, the space-frequency model for partially-coherent fields, and the Fokker-Planck equation for paraxial X-ray imaging. Having considered these means for modelling the formation of X-ray phase-contrast images, we then con
Lensless illumination single-pixel imaging with a multicore fiber (MCF) is a computational imaging technique that enables potential endoscopic observations of biological samples at cellular scale. In this work, we show that this technique is tantamount to collecting multiple symmetric rank-one projections (SROP) of an interferometric matrix--a matrix encoding the spectral content of the sample image. In this model, each SROP is induced by the complex sketching vector shaping the incident light wavefront with a spatial light modulator (SLM), while the projected interferometric matrix collects up to $O(Q^2)$ image frequencies for a $Q$-core MCF. While this scheme subsumes previous sensing modalities, such as raster scanning (RS) imaging with beamformed illumination, we demonstrate that collecting the measurements of $M$ random SLM configurations--and thus acquiring $M$ SROPs--allows us to estimate an image of interest if $M$ and $Q$ scale log-linearly with the image sparsity level. This demonstration is achieved both theoretically, with a specific restricted isometry analysis of the sensing scheme, and with extensive Monte Carlo experiments. On the practical side, we perform a single ca
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning, leveraging large amounts of unlabelled data. This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging, concentrating on studies that compare self-supervised pretraining to fully supervised learning for diagnostic tasks such as classification and segmentation. The most pertinent finding is that self-supervised pretraining generally improves downstream task performance compared to full supervision, most prominently when unlabelled examples greatly outnumber labelled examples. Based on the aggregate evidence, recommendations are provided for practitioners considering using self-supervised learning. Motivated by limitations identified in current research, directions and practices for future study are suggested, such as integrating clinical knowledge with theoretically justified self-supervised learning methods, evaluating on public datasets, growing the modest body of evidence for ultrasound, and characterizing the impact of self-supervised pretraining on generalization.
Image guidance for minimally invasive interventions is usually performed by acquiring fluoroscopic images using a C-arm system. However, the projective data provide only limited information about the spatial structure and position of interventional tools such as stents, guide wires or coils. In this work we propose a deep learning-based pipeline for real-time tomographic (four-dimensional) interventional guidance at acceptable dose levels. In the first step, interventional tools are extracted from four cone-beam CT projections using a deep convolutional neural network (CNN). These projections are then reconstructed and fed into a second CNN, which maps this highly undersampled reconstruction to a segmentation of the interventional tools. Our pipeline reconstructs interventional tools with very high accuracy from only four x-ray projections, without the need for a patient prior. Therefore, the proposed approach is capable of overcoming the drawbacks of today's interventional guidance and could enable the development of new minimally invasive radiological interventions by providing full spatiotemporal information about the interventional tools.
Fiber-like features are an important aspect of breast imaging. Vessels and ducts are present in all breast images, and spiculations radiating from a mass can indicate malignancy. Accordingly, fiber objects are one of the three types of signals used in the American College of Radiology digital mammography (ACR-DM) accreditation phantom. This work focuses on the image properties of fiber-like structures in digital breast tomosynthesis (DBT) and how image reconstruction can affect their appearance. The impact of the DBT image reconstruction algorithm and regularization strength on the conspicuity of fiber-like signals of various orientations is investigated in simulation. A metric is developed to characterize this orientation dependence and allow for quantitative comparison of algorithms and associated parameters in the context of imaging fiber signals. The imaging properties of fibers, characterized in simulation, are then demonstrated in detail with physical DBT data of the ACR-DM phantom. This characterization of fiber-signal imaging is used to explain features of an actual clinical DBT case. For the algorithms investigated, at a low regularization setting, the results show a striking
Photoacoustic imaging (PAI) is a novel medical imaging modality that combines the spatial resolution of ultrasound imaging with the high contrast of pure optical imaging. Analytical algorithms are usually employed to reconstruct photoacoustic (PA) images because they are simple to implement. However, they yield images of low accuracy. Model-based (MB) algorithms improve image quality and accuracy, but they require a large number of transducers and data acquisitions. In this paper, we combine the theory of compressed sensing (CS) with MB algorithms to reduce the number of transducers. A smoothed version of the L0-norm (SL0) is proposed as the reconstruction method and compared with simple iterative reconstruction (IR) and basis pursuit. The results show that SL0 provides higher image quality than the other methods while using a small number of transducers. Quantitative comparison demonstrates that, under the same conditions, SL0 yields a peak signal-to-noise ratio about twice that of basis pursuit.
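The smoothed L0 idea can be sketched for a generic underdetermined system x = A s. The sketch follows the commonly described form of the method: replace the L0 count with a sum of Gaussians of width sigma, take a few gradient steps, project back onto the constraint A s = x, and shrink sigma. The 2×3 matrix, the parameter values (`mu`, `decay`, `inner`, `sigma_min`) and the helper names are illustrative assumptions, not the paper's PA forward model or settings.

```python
import math

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def solve(M, b):
    """Solve the square system M y = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def sl0(A, x, sigma_min=1e-4, decay=0.5, mu=1.0, inner=3):
    """Seek a sparse s with A s = x by graduated smoothing of the L0 norm."""
    m, n = len(A), len(A[0])
    At = [list(col) for col in zip(*A)]
    gram = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]

    def project(s):
        # Orthogonal projection onto {s : A s = x}: s - A^T (A A^T)^{-1} (A s - x)
        residual = [r - xi for r, xi in zip(matvec(A, s), x)]
        correction = matvec(At, solve(gram, residual))
        return [si - ci for si, ci in zip(s, correction)]

    s = project([0.0] * n)                 # minimum L2-norm solution as the start
    sigma = 2.0 * max(abs(v) for v in s)
    while sigma > sigma_min:
        for _ in range(inner):
            # Gradient step on the Gaussian-smoothed sparsity measure
            s = [v - mu * v * math.exp(-v * v / (2.0 * sigma * sigma)) for v in s]
            s = project(s)
        sigma *= decay
    return s

# Two measurements of a 3-element signal whose only nonzero entry is the last.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
x = [1.0, 1.0]
recovered = sl0(A, x)
```

In this toy case the minimum-norm start is (1/3, 1/3, 2/3), and the shrinking sigma steers the iterate toward the sparse solution (0, 0, 1). Fewer measurements than unknowns is the situation that reducing the number of transducers creates.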