At the heart of radiological practice is the challenge of integrating complex imaging data with clinical information to produce actionable insights. Nuanced application of language is key for various activities, including managing requests, describing and interpreting imaging findings in the context of clinical data, and concisely documenting and communicating the outcomes. The emergence of large language models (LLMs) offers an opportunity to improve the management and interpretation of the vast data in radiology. Despite being primarily general-purpose, these advanced computational models demonstrate impressive capabilities in specialized language-related tasks, even without specific training. Unlocking the potential of LLMs for radiology requires a basic understanding of their foundations and a strategic approach to navigating their idiosyncrasies. This review, drawing from practical radiology and machine learning expertise and recent literature, provides readers with insight into the potential of LLMs in radiology. It examines best practices that have so far stood the test of time in the rapidly evolving landscape of LLMs. This includes practical advice for optimizing LLM characteristic
Vascular segmentation in medical images is crucial for disease diagnosis and surgical navigation. However, the segmented vascular structure is often discontinuous due to its slender nature and inadequate prior modeling. In this paper, we propose a novel Serpentine Window Mamba (SWinMamba) to achieve accurate vascular segmentation. The proposed SWinMamba innovatively models the continuity of slender vascular structures by incorporating serpentine window sequences into bidirectional state space models. The serpentine window sequences enable efficient feature capturing by adaptively guiding global visual context modeling to the vascular structure. Specifically, the Serpentine Window Tokenizer (SWToken) adaptively splits the input image using overlapping serpentine window sequences, enabling flexible receptive fields (RFs) for vascular structure modeling. The Bidirectional Aggregation Module (BAM) integrates coherent local features in the RFs for vascular continuity representation. In addition, dual-domain learning with Spatial-Frequency Fusion Unit (SFFU) is designed to enhance the feature representation of vascular structure. Extensive experiments on three challenging datasets demons
The diffusion of minimally invasive, endovascular interventions motivates the development of visualization methods for complex vascular networks. We propose a planar representation of blood vessel trees which preserves the properties that are most relevant to catheter navigation: topology, length and curvature. Taking as input a three-dimensional digital angiography, our algorithm produces a faithful two-dimensional map of the patient's vessels within a few seconds. To this end, we propose optimized implementations of standard morphological filters and a new recursive embedding algorithm that preserves the global orientation of the vascular network. We showcase our method on peroperative images of the brain, pelvic and knee artery networks. On the clinical side, our method simplifies the choice of devices prior to and during the intervention. This lowers the risk of failure during navigation or device deployment and may help to reduce the gap between expert and common intervention centers. From a research perspective, our method simulates the cadaveric display of artery trees from anatomical dissections. This opens the door to large population studies on the branching patterns and
Small Language Models (SLMs) have shown remarkable performance in general domain language understanding, reasoning and coding tasks, but their capabilities in the medical domain, particularly concerning radiology text, are less explored. In this study, we investigate the application of SLMs to general radiology knowledge, specifically question answering related to understanding symptoms, radiological appearances of findings, differential diagnosis, assessing prognosis, and suggesting treatments for diseases pertaining to different organ systems. Additionally, we explore the utility of SLMs in handling text-related tasks with respect to radiology reports within AI-driven radiology workflows. We fine-tune Phi-2, an SLM with 2.7 billion parameters, using high-quality educational content from Radiopaedia, a collaborative online radiology resource. The resulting language model, RadPhi-2-Base, exhibits the ability to address general radiology queries across various systems (e.g., chest, cardiac). Furthermore, we investigate Phi-2 for instruction tuning, enabling it to perform specific tasks. By fine-tuning Phi-2 on both general domain tasks and radiology-specific tasks related to chest
Most natural language tasks in the radiology domain use language models pre-trained on a biomedical corpus. There are few pretrained language models trained specifically for radiology, and fewer still that have been trained in a low-data setting and gone on to produce comparable results in fine-tuning tasks. We present RadLing, a continuously pretrained language model using the Electra-small (Clark et al., 2020) architecture, trained on over 500K radiology reports, that can compete with state-of-the-art results for fine-tuning tasks in the radiology domain. Our main contribution in this paper is knowledge-aware masking, a taxonomic knowledge-assisted pretraining task that dynamically masks tokens to inject knowledge during pretraining. In addition, we introduce a knowledge base-aided vocabulary extension to adapt the general tokenization vocabulary to the radiology domain.
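The idea of dynamically masking tokens with the help of a taxonomy can be illustrated with a minimal sketch. This is only one plausible reading of the abstract, not the RadLing implementation: the function name `knowledge_aware_mask`, the probabilities `p_known` and `p_other`, and the tiny term set are all assumptions made for illustration. The sketch masks tokens found in a domain term set more often than other tokens, and redraws the mask on every call.

```python
import random


def knowledge_aware_mask(tokens, knowledge_terms, p_known=0.5, p_other=0.1,
                         mask_token="[MASK]", rng=None):
    """Mask tokens, favouring those found in a domain term set.

    Hypothetical illustration: tokens in `knowledge_terms` are masked with
    probability `p_known`, all other tokens with probability `p_other`.
    Returns the masked sequence and the labels to predict (None = unmasked).
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        p = p_known if tok.lower() in knowledge_terms else p_other
        if rng.random() < p:
            masked.append(mask_token)
            labels.append(tok)       # the model must recover this token
        else:
            masked.append(tok)
            labels.append(None)      # no loss on unmasked positions
    return masked, labels


# Toy example with an assumed two-term "taxonomy".
terms = {"pleural", "effusion"}
report = ["small", "pleural", "effusion", "is", "seen"]
# With p_known=1.0 and p_other=0.0 the result is deterministic.
masked, labels = knowledge_aware_mask(report, terms, p_known=1.0, p_other=0.0)
```

Because the mask is sampled at call time rather than fixed once in preprocessing, the same report yields different masked views across epochs, which is what "dynamic" masking usually refers to.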
Recent advances in artificial intelligence have witnessed the emergence of large-scale deep learning models capable of interpreting and generating both textual and imaging data. Such models, typically referred to as foundation models, are trained on extensive corpora of unlabeled data and demonstrate high performance across various tasks. Foundation models have recently received extensive attention from academic, industry, and regulatory bodies. Given the potentially transformative impact that foundation models can have on the field of radiology, this review aims to establish a standardized terminology concerning foundation models, with a specific focus on the requirements of training data, model training paradigms, model capabilities, and evaluation strategies. We further outline potential pathways to facilitate the training of radiology-specific foundation models, with a critical emphasis on elucidating both the benefits and challenges associated with such models. Overall, we envision that this review can unify technical advances and clinical needs in the training of foundation models for radiology in a safe and responsible manner, for ultimately benefiting patients, providers, a
The segmentation of 2D vascular structures via deep learning holds significant clinical value but is hindered by the scarcity of annotated data, severely limiting its widespread application. Developing a universal few-shot vascular segmentation model is highly desirable, yet remains challenging due to the need for extensive training and the inherent complexities of vascular imaging. In this work, we propose UniVG (Generative Data-engine Foundation Model for Universal Few-shot 2D Vascular Image Segmentation), a novel approach that learns the compositionality of vascular images and constructs a generative foundation model for robust vascular segmentation. UniVG enables the synthesis and learning of diverse and realistic vascular images through two key innovations: 1) Compositional learning for flexible and diverse vascular synthesis: It decomposes and recombines vascular structures with varying morphological features and diverse foreground-background configurations to generate richly diverse synthetic image-label pairs. 2) Few-shot generative adaptation for transferable segmentation: It fine-tunes pre-trained models with minimal annotated data to bridge the gap between synthetic an
Cerebral blood flow regulation is critical for brain function, and its disruption is implicated in various neurological disorders. Many existing models do not fully capture the complex, multiscale interactions among neuronal activity, astrocytic signaling, and vascular dynamics--especially in key brainstem regions. In this work, we present a 3D-1D-0D multiscale computational framework for modeling the neuro-glial-vascular unit (NGVU) in the dorsal vagal complex (DVC). Our approach integrates a quadripartite synapse model--which represents the interplay among excitatory and inhibitory neurons, astrocytes, and vascular smooth muscle cells--with a hierarchical description of vascular dynamics that couples a three-dimensional microcirculatory network with a one-dimensional macrocirculatory representation and a zero-dimensional synaptic component. By linking neuronal spiking, astrocytic calcium and gliotransmitter signaling, and vascular tone regulation, our model reproduces key features of functional hyperemia and elucidates the feedback loops that help maintain cerebral blood flow. Simulation results demonstrate that neurotransmitter release triggers astrocytic responses that modulate
We present three variants of a lightweight, fully connected artificial neural network, suited for interactive estimation of three-dimensional, spatially resolved volumes of scattered radiation fields, and a corresponding training pipeline for radiation protection dosimetry in medical radiation fields, such as those found in interventional radiology and cardiology. Alongside these, we present three different synthetically generated datasets with increasing complexity for training, generated using RadField3D, a Monte Carlo simulation application based on Geant4. As the primary scatter object, we employed the torso of a male Alderson RANDO phantom. On those datasets, we evaluate convolutional and fully connected architectures of neural networks to demonstrate which design decisions work well for reconstructing the fluence and spectra distributions over the spatial domain of such radiation fields. All our datasets, as well as our training pipeline, are published as open source in separate repositories. To evaluate the presented neural networks, we define and assess several metrics. Across these measures, the model variants demonstrate good spatial agreement between predicted and ground-tru
Autonomous microrobots in blood vessels could enable minimally invasive therapies, but navigation is challenged by dense, moving obstacles. We propose a real-time path planning framework that couples an analytic geometry global planner (AGP) with two reactive local escape controllers, one based on rules and one based on reinforcement learning, to handle sudden moving obstacles. Using real-time imaging, the system estimates the positions of the microrobot, obstacles, and targets and computes collision-free motions. In simulation, AGP yields shorter paths and faster planning than weighted A* (WA*), particle swarm optimization (PSO), and rapidly exploring random trees (RRT), while maintaining feasibility and determinism. We extend AGP from 2D to 3D without loss of speed. In both simulations and experiments, the combined global planner and local controllers reliably avoid moving obstacles and reach targets. The average planning time is 40 ms per frame, compatible with 25 fps image acquisition and real-time closed-loop control. These results advance autonomous microrobot navigation and targeted drug delivery in vascular environments.
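The compatibility between a 40 ms planning time and 25 fps acquisition follows from simple arithmetic: at 25 frames per second, a new frame arrives every 1000 / 25 = 40 ms. The short sketch below makes this check explicit. The function names are illustrative and not taken from the paper.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_real_time(planning_ms: float, fps: float) -> bool:
    """True when planning finishes within one frame interval."""
    return planning_ms <= frame_budget_ms(fps)


budget = frame_budget_ms(25)      # 40.0 ms between frames at 25 fps
ok = fits_real_time(40.0, 25)     # True: planning exactly fills the budget
```

Note that an average of 40 ms leaves no headroom at 25 fps, so individual frames that take longer than average would miss the interval unless the control loop tolerates occasional skipped frames.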
Retinal vascular diseases affect the well-being of the human body and sometimes provide vital signs of otherwise undetected bodily damage. Recently, deep learning techniques have been successfully applied to the detection of diabetic retinopathy (DR). The main obstacle to applying deep learning techniques to detect most other retinal vascular diseases is the limited amount of data available. In this paper, we propose a transfer learning technique that aims to utilize the feature similarities for detecting retinal vascular diseases. We choose the well-studied DR detection as a source task and identify the early detection of retinopathy of prematurity (ROP) as the target task. Our experimental results demonstrate that our DR-pretrained approach outperforms, on all metrics, the conventional ImageNet-pretrained transfer learning approach currently adopted in medical image analysis. Moreover, our approach is more robust with respect to the stochasticity in the training process and with respect to reduced training samples. This study suggests the potential of our proposed transfer learning approach for a broad range of retinal vascular diseases or pathologies, where data is limited.
This paper presents a new multimodal interventional radiology dataset, called PoCaP (Port Catheter Placement) Corpus. This corpus consists of speech and audio signals in German, X-ray images, and system commands collected from 31 PoCaP interventions by six surgeons with an average duration of 81.4 $\pm$ 41.0 minutes. The corpus aims to provide a resource for developing a smart speech assistant in operating rooms. In particular, it may be used to develop a speech-controlled system that enables surgeons to control operation parameters such as C-arm movements and table positions. In order to record the dataset, we obtained consent from the institutional review board and workers' council at the University Hospital Erlangen and from the patients for data privacy. We describe the recording set-up, data structure, workflow and preprocessing steps, and report the first PoCaP Corpus speech recognition analysis results, with a word error rate of 11.52 $\%$ using pretrained models. The findings suggest that the data has the potential to build a robust command recognition system and will allow the development of novel intervention support systems using speech and image processing in the medical domain
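For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and a recognizer's hypothesis, divided by the number of reference words. The sketch below shows that standard definition. It is not the evaluation code used for the PoCaP Corpus, and the English command in the example is invented for illustration (the corpus itself is in German).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words (row-by-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of a hypothesis word
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command: one deleted word ("up") and one substitution
# ("centimeters" -> "centimeter") over five reference words.
wer = word_error_rate("move table up ten centimeters",
                      "move table ten centimeter")   # 2 / 5 = 0.4
```

Since insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.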
The understanding of the mechanisms driving vascular development is still limited. Techniques to generate vascular trees synthetically have been developed to tackle this problem. However, most algorithms are limited to single trees inside convex perfusion volumes. We introduce a new framework for generating multiple trees inside general nonconvex perfusion volumes. Our framework combines topology optimization and global geometry optimization into a single algorithmic approach. Our first contribution is defining a baseline problem based on Murray's original formulation, which accommodates efficient solution algorithms. The problem of finding the global minimum is cast into a nonlinear optimization problem (NLP) with merely super-linear solution effort. Our second contribution extends the NLP to constrain multiple vascular trees inside any nonconvex boundary while avoiding intersections. We test our framework against a benchmark of an anatomic region of brain tissue and a vasculature of the human liver. In all cases, the total tree energy is improved significantly compared to local approaches. By avoiding intersections globally, we can reproduce key physiological features such as par
Estimating clinically-relevant vascular features following vessel segmentation is a standard pipeline for retinal vessel analysis, which provides potential ocular biomarkers for both ophthalmic disease and systemic disease. In this work, we integrate these clinical features into a novel vascular feature optimised loss function (VAFO-Loss), in order to regularise networks to produce segmentation maps, with which more accurate vascular features can be derived. Two common vascular features, vessel density and fractal dimension, are identified to be sensitive to intra-segment misclassification, which is a well-recognised problem in multi-class artery/vein segmentation particularly hindering the estimation of these vascular features. Thus we encode these two features into VAFO-Loss. We first show that incorporating our end-to-end VAFO-Loss in standard segmentation networks indeed improves vascular feature estimation, yielding quantitative improvement in stroke incidence prediction, a clinical downstream task. We also report a technically interesting finding that the trained segmentation network, albeit biased by the feature optimised loss VAFO-Loss, shows statistically significant impro
MR vascular Fingerprinting proposes to use the MR Fingerprinting framework to quantitatively and simultaneously map several microvascular characteristics at a sub-voxel scale. The initial implementation assessed the local blood oxygenation saturation (SO2), blood volume fraction (BVf) and vessel averaged radius (R) in human and rodent brains using simple 2D representations of the vascular network during dictionary generation. In order to improve the results and possibly extend the approach to pathological environments and other biomarkers, we propose in this study to use 3D realistic vascular geometries in the numerical simulations. We created 28,000 different synthetic voxels containing vascular networks segmented from whole-brain microscopy images of healthy mice. A Bayesian-based regression model was used for map reconstruction. We show on 8 healthy and 9 tumor-bearing rats that realistic vascular representations yield microvascular estimates in better agreement with the literature than 2D or 3D cylindrical models. Furthermore, tumoral blood oxygenation estimates obtained with the proposed approach are the only ones correlating with in vivo optic-fiber measurements performed in
Accurate identification of brain function is necessary to understand the neurobiology of cognitive ageing, and thereby promote well-being across the lifespan. A common tool used to investigate neurocognitive ageing is functional magnetic resonance imaging (fMRI). However, although fMRI data are often interpreted in terms of neuronal activity, the blood-oxygen-level-dependent (BOLD) signal measured by fMRI includes contributions of both vascular and neuronal factors, which change differentially with age. While some studies investigate vascular ageing factors, the results of these studies are not well known within the field of neurocognitive ageing and therefore vascular confounds in neurocognitive fMRI studies are common. In contrast to over 10,000 BOLD-fMRI papers on ageing, fewer than 20 have applied techniques to correct for vascular effects. However, neurovascular ageing is not only a confound in fMRI, but an important feature in its own right, to be assessed alongside measures of neuronal ageing. We review current approaches to dissociate neuronal and vascular components of BOLD-fMRI of regional activity and functional connectivity. We highlight emerging evidence that vascular
In recent years, the use of expressive surface visualizations in the representation of vascular structures has gained significant attention. These visualizations provide a comprehensive understanding of complex anatomical structures and are crucial for treatment planning and medical education. However, to aid decision-making, physicians require visualizations that accurately depict anatomical structures and their spatial relationships in a clear and well-perceivable manner. This work extends a previous paper and presents a thorough examination of common techniques for encoding distance information of 3D vessel surfaces and provides an implementation of these visualizations. A Unity environment and detailed implementation instructions for sixteen different visualizations are provided. These visualizations can be classified into four categories: fundamental, surface-based, auxiliary, and illustrative. Furthermore, this extension includes tools to generate endpoint locations for vascular models. Overall this framework serves as a valuable resource for researchers in the field of vascular surface visualization by reducing the barrier to entry and promoting further research in this area
Angiography is a medical imaging technique that enhances the visibility of blood vessels within the body by using contrast agents. Angiographic images can effectively assist in the diagnosis of vascular diseases. However, the use of contrast agents may bring extra radiation exposure, which is harmful to patients with health risks. To mitigate these concerns, in this paper, we aim to automatically generate angiography from non-angiographic inputs, by leveraging and enhancing the inherent physical properties of vascular structures. Previous methods relying on 2D slice-based angiography synthesis struggle with maintaining continuity in 3D vascular structures and exhibit limited effectiveness across different imaging modalities. We propose VasTSD, a 3D vascular tree-state space diffusion model to synthesize angiography from 3D non-angiographic volumes, with a novel state space serialization approach that dynamically constructs vascular tree topologies, integrating these with a diffusion-based generative model to ensure the generation of anatomically continuous vasculature in 3D volumes. A pre-trained vision embedder is employed to construct vascular state space representations, enabling co
Image guidance for minimally invasive interventions is usually performed by acquiring fluoroscopic images using a C-arm system. However, the projective data provide only limited information about the spatial structure and position of interventional tools such as stents, guide wires or coils. In this work we propose a deep learning-based pipeline for real-time tomographic (four-dimensional) interventional guidance at acceptable dose levels. In the first step, interventional tools are extracted from four cone-beam CT projections using a deep convolutional neural network (CNN). These projections are then reconstructed and fed into a second CNN, which maps this highly undersampled reconstruction to a segmentation of the interventional tools. Our pipeline is capable of reconstructing interventional tools from only four x-ray projections without the need for a patient prior with very high accuracy. Therefore, the proposed approach is capable of overcoming the drawbacks of today's interventional guidance and could enable the development of new minimally invasive radiological interventions by providing full spatiotemporal information about the interventional tools.
Radiology report generation aims to automatically provide clinically meaningful descriptions of radiology images such as MRI and X-ray. Although great success has been achieved in natural scene image captioning tasks, radiology report generation remains challenging and requires prior medical knowledge. In this paper, we propose PromptRRG, a method that utilizes prompt learning to activate a pretrained model and incorporate prior knowledge. Since prompt learning for radiology report generation has not been explored before, we begin by investigating prompt designs and categorise them based on varying levels of knowledge: common, domain-specific and disease-enriched prompts. Additionally, we propose an automatic prompt learning mechanism to alleviate the burden of manual prompt engineering. This is the first work to systematically examine the effectiveness of prompt learning for radiology report generation. Experimental results on the largest radiology report generation benchmark, MIMIC-CXR, demonstrate that our proposed method achieves state-of-the-art performance. Code will be made available upon acceptance.