Radiologic technology (RT) education faces challenges in bridging theory and practice due to limited clinical opportunities. While virtual reality (VR) enables safe and repeatable practice, a systematic instructional design framework is needed to develop scalable, procedure-focused modules. This study evaluates the Radiologic Technology Virtual Reality (RTVR) framework that integrates 360-degree video capture, instructional overlays, interactive assets, and an immersive content authoring platform to deliver a contrast-enhanced computed tomography (CECT) brain scan module. In this open-label, parallel-group, randomized controlled trial, 36 year-2 and year-3 RT students with no prior clinical training in diagnostic radiology at a university hospital in Thailand were randomly allocated (1:1) to a VR group or a conventional document-based instruction (control) group. The VR group completed the VR module, a grounded instructional design framework using 360-degree videos and a structured prebrief and debrief, for 20 minutes using a head-mounted display. The control group studied standard curriculum materials for the same duration. Blinding of participants was not possible. Outcome assessment was blinded. The primary outcome was declarative knowledge gain, assessed using a 20-item multiple-choice test before and after intervention. Secondary outcomes included technology acceptance, student satisfaction, and physiological responses during VR immersion. All 36 randomized participants (VR: n=18, control: n=18) completed the study and were included in the analysis. Experts validated the module as suitable and highly appropriate. Students reported high technology acceptance and satisfaction. Both VR and conventional methods produced substantial gains in declarative knowledge. No statistically significant difference in knowledge gain was detected between groups (test × group: unstandardized regression coefficient β=.056, 95% CI -1.360 to 1.473, P=.94). 
Year-2 students, who had less prior clinical exposure, showed larger pretest-to-posttest knowledge gains than year-3 students. Physiological monitoring showed a reduction in heart rate across the session, while blood pressure remained stable. No adverse events or VR-related discomfort requiring discontinuation were observed. The RTVR framework, which uses real 360-degree video of authentic clinical settings, offers a scalable approach to procedural VR content creation without requiring specialist technical skills, distinguishing it from prior VR studies in radiography. These findings support the RTVR framework as a feasible, evidence-informed supplement to RT curricula for knowledge-focused procedural teaching, with learning outcomes comparable to those of conventional instruction in this context.
Background Radiologic diagnosis accuracy and efficiency depend on effective metacognitive processes (eg, ongoing confidence monitoring and regulatory decisions). Identifying pitfalls in metacognitive processes is essential for improving diagnostic accuracy and workflow. Purpose To examine how the metacognitive bias termed the hard-easy effect and temporal thinking dynamics manifest in radiologic decision-making and differ with expertise levels. Materials and Methods In this metacognitive analysis, radiologists (at residency year 2 or later) and nonradiology medical professionals reviewed bone radiographs. They determined whether bone lesions were present or absent, which were compared with a reference standard based on CT or MRI findings validated by a panel of two board-certified musculoskeletal radiologists. For each image, participants rated confidence (50% [ie, a guess] to 100%) and recommended the next step (submitting, ordering additional imaging, and consulting a senior colleague). Success, confidence, overconfidence, response time (RT), resolution (confidence-accuracy correlation), efficiency, control sensitivity (confidence-decision correlation), and RT association with confidence were determined. Results Among 28 radiologists and 50 nonradiologists (2806 decisions), the task was challenging, and overconfidence was common (14.73%). Radiologists demonstrated higher success (73.59% [989 of 1344] vs 64.71% [946 of 1462]; P < .001) and confidence (85.12% vs 81.24%; P = .01), less overconfidence (overconfidence of 11.52% vs 16.52%; P = .03), and better resolution (γ = 0.37 vs 0.23; P = .04) than nonradiologists. In both groups, confidence predicted subsequent actions (γ = 0.32 vs 0.53; P = .05). A consistent hard-easy effect emerged: confidence was well-calibrated for easy cases but showed substantial overconfidence for difficult cases (>40%). 
Fast decisions were accurate and well-calibrated, whereas prolonged deliberation predicted mistakes and increased overconfidence. Lesion-present diagnoses were associated with lower confidence and more deferrals to seniors. Conclusion Metacognitive analysis of radiologic decision-making exposed the hard-easy effect and detrimental temporal dynamics. Although radiologic expertise increased success and confidence in monitoring accuracy, radiologists and nonradiologists were susceptible to overconfidence, particularly in challenging lengthy cases. © RSNA, 2026 Supplemental material is available for this article. See also the article by Krupinski in this issue.
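The group-level overconfidence figures reported above can be approximately reproduced from the reported means. The sketch below assumes that overconfidence is defined as mean confidence minus success rate; the abstract does not state the formula, so this definition and the function names are assumptions for illustration only.

```python
def success_rate(correct: int, total: int) -> float:
    """Percentage of correct decisions."""
    return 100.0 * correct / total


def overconfidence(mean_confidence_pct: float, success_pct: float) -> float:
    """Assumed definition: mean confidence minus success rate (percentage points)."""
    return mean_confidence_pct - success_pct


# Counts and mean confidence values are taken from the abstract.
rad_success = success_rate(989, 1344)      # radiologists
nonrad_success = success_rate(946, 1462)   # nonradiologists

rad_over = overconfidence(85.12, rad_success)
nonrad_over = overconfidence(81.24, nonrad_success)

print(f"Radiologists: success {rad_success:.2f}%, overconfidence {rad_over:.2f}")
print(f"Nonradiologists: success {nonrad_success:.2f}%, overconfidence {nonrad_over:.2f}")
```

Under this assumption the results land within a few hundredths of a percentage point of the reported 11.52% and 16.52%; the small residual would be expected if the authors averaged per participant rather than per decision.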
Background: Minimally invasive interventional radiology (IR) offers effective, uterus-preserving treatments for several gynecologic and obstetric conditions such as uterine fibroids, adenomyosis, and postpartum hemorrhage. Despite their efficacy, these methods remain underused, partly owing to limited awareness among clinicians and patients. Large language models (LLMs) may help bridge this gap by providing accessible, reliable information. Objective: To evaluate how current LLMs address knowledge gaps and promote awareness of minimally invasive IR methods in gynecology and obstetrics. Methods: A structured ten-question instrument was used to query three publicly available LLMs (OpenEvidence, ChatGPT, and Google Gemini). Responses were analyzed for accuracy, completeness, safety considerations, and patient-centered communication. Results: All three models accurately identified a range of medical, minimally invasive, and surgical treatments for uterine fibroids, adenomyosis, and postpartum hemorrhage, with OpenEvidence and ChatGPT providing more detailed and clinically nuanced responses. OpenEvidence achieved the highest scores overall, closely followed by ChatGPT, while Google Gemini scored lower, particularly in completeness and patient-centered communication. In more complex scenarios, performance differences became more pronounced, with OpenEvidence again leading, ChatGPT performing strongly, and Google Gemini lagging behind. Overall, OpenEvidence and ChatGPT demonstrated higher accuracy, completeness, and safety considerations, whereas Google Gemini showed comparatively weaker and less consistent performance. Conclusions: LLMs may support the promotion of minimally invasive IR methods in gynecology and obstetrics, but their outputs vary considerably in quality. Ongoing refinement and integration of evidence-based sources are essential before routine use in clinical practice. 
Therefore, effective collaboration between artificial intelligence (AI) developers and medical professionals is essential to harness this technology's full potential.
Appendiceal diverticular disease (ADD) is an uncommon condition that may clinically and radiologically mimic acute appendicitis and is associated with a high risk of perforation. Limited awareness and diagnostic challenges frequently result in a diagnosis only after histopathological examination. Among 6,853 emergency appendectomies performed between January 2015 and June 2024, 153 patients with histopathologically confirmed ADD were retrospectively analyzed in this study. The demographic data, clinical presentation, imaging findings, operative outcomes, and pathological characteristics were reviewed. All CT scans and pathology slides were re-evaluated by gastrointestinal radiologists and a gastrointestinal pathologist. The diverticula were classified according to the Lipton system. The frequency of ADD detection in histopathological reports was 2.23%. The mean age was 40.5 ± 15.0 years, and 60.8% of the patients were male. Acquired diverticula predominated (99.3%), with type 2 being the most common subtype (68.6% of cases). Perforation was identified intraoperatively in 32.7% and histologically in 40.5% of the cases, occurring almost exclusively in type 1 and type 2 diverticula. ADD was not diagnosed on initial emergency CT reports. However, a retrospective review identified diverticula in 59.1% of cases, most commonly cystic and located at the appendiceal apex. No malignancies were detected. ADD is a rare but clinically important entity with high perforation rates and substantial diagnostic limitations in routine imaging. Multidisciplinary reassessment improves diagnostic accuracy and may support optimized management strategies.
MRI can be challenging for patients with autism spectrum disorder (ASD) due to sensory sensitivities, and radiologic technologists (RTs) play a key role in managing these challenges. This study therefore aimed to assess Saudi RTs' knowledge, practices, and experiences in managing individuals with ASD undergoing MRI, and to identify barriers to and facilitators of effective practice. A cross-sectional, mixed-methods study was conducted using an online questionnaire (n = 335) and semi-structured interviews (n = 8). Data collection occurred between April and May 2023, and the results were analyzed using SPSS v25 and thematic analysis. Only 13% of RTs had formal training in ASD care. While knowledge scores were relatively high (61% high-level), practice and training scores were weak (51% and 50% low-level, respectively). In addition, a modest positive correlation was observed between training and practice domains (r = 0.121, p = 0.027). Moreover, thematic analysis highlighted three themes: RTs' knowledge gaps, variability in practice, and the need for clear guidelines. These results indicate that there is a significant gap between RTs' knowledge and practical readiness in handling individuals with ASD during MRI. Consequently, tailored training programs and standardized protocols are recommended.
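The reported correlation (r = 0.121, p = 0.027) can be sanity-checked with the standard t statistic for a Pearson correlation. The sketch below assumes that all 335 questionnaire respondents contributed to the correlation, which the abstract does not state explicitly, and uses a normal approximation for the p value because the Python standard library has no t distribution. The function names are hypothetical.

```python
import math


def correlation_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def two_sided_p_normal(t: float) -> float:
    """Two-sided p value using a normal approximation (reasonable for large df)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# r is taken from the abstract; n = 335 is an assumption (all respondents included).
t_stat = correlation_t(0.121, 335)
p_approx = two_sided_p_normal(t_stat)

print(f"t = {t_stat:.3f}, approximate two-sided p = {p_approx:.3f}")
```

The approximation gives a p value close to the reported 0.027, which illustrates that with a sample of this size even a weak correlation can reach statistical significance.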
Ductal carcinoma in situ (DCIS) diagnosed on core biopsy is frequently upgraded to invasive carcinoma at surgery, which may change indications for sentinel lymph node biopsy. Routine breast MRI has limited ability to detect occult invasion preoperatively. This study aimed to develop and externally validate an MRI-based model combining clinical variables, conventional MRI findings, and dynamic contrast-enhanced (DCE) MRI radiomics to predict invasive upgrade in biopsy-proven DCIS. This retrospective multicenter study enrolled 478 patients from three hospitals (2014-2019). Center 1 contributed 314 patients, randomly split into a training set (n = 251) and an internal test set (n = 63); Centers 2 (n = 39) and 3 (n = 62) formed two independent external test sets. Radiologists assessed conventional MRI features, including lesion size, enhancement descriptors, and diffusion-derived apparent diffusion coefficient metrics. Tumors were segmented on DCE MRI. Radiomics features with intraclass correlation coefficient > 0.85 were z-score normalized, selected using least absolute shrinkage and selection operator regression, and used to train multiple machine learning classifiers; the best-performing model generated a radiomics score. Model selection and hyperparameter tuning were performed by cross-validation within the training set only. Clinico-radiologic, radiomics, and combined models were evaluated using receiver operating characteristic (ROC) curve analysis, calibration, and decision curve analysis, and the area under the curve (AUC) was calculated. Six clinico-radiologic factors and 13 radiomic features were retained. In the two external test sets, the clinico-radiologic, radiomics, and combined models achieved AUCs of 0.61 (95% CI, 0.43-0.79) and 0.71 (0.58-0.83), 0.70 (0.54-0.86) and 0.71 (0.58-0.84), and 0.76 (0.60-0.91) and 0.77 (0.65-0.89), respectively. The combined model provided the highest net benefit on decision curve analysis. 
A combined clinico-radiologic and DCE-MRI radiomics model showed multicenter, externally validated performance for preoperative prediction of invasive upgrade in DCIS, supporting risk stratification for surgical planning.
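The z-score normalization step mentioned above can be illustrated with a minimal sketch. It assumes that the mean and standard deviation are estimated on the training set and then reused unchanged on test data, in keeping with the abstract's statement that model selection was confined to the training set; the abstract does not say how normalization was fitted, so this is an assumption. The feature values and function names are hypothetical.

```python
from statistics import mean, stdev


def fit_zscore(train_values):
    """Estimate normalization parameters from training data only."""
    return mean(train_values), stdev(train_values)


def apply_zscore(values, mu, sigma):
    """Normalize values using previously fitted parameters."""
    return [(v - mu) / sigma for v in values]


# Hypothetical values of a single radiomics feature.
train_feature = [2.0, 4.0, 6.0, 8.0]
test_feature = [5.0, 10.0]

mu, sigma = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mu, sigma)
test_z = apply_zscore(test_feature, mu, sigma)  # reuses training statistics

print(train_z)
print(test_z)
```

Reusing the training statistics on the external sets keeps information from the test data out of the preprocessing step.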
Computed tomography (CT) is a major source of medical radiation exposure, and radiologic technologists play a central role in applying dose-minimization principles. To date, no nationwide data exist on Moroccan technologists' knowledge, attitudes, and practices regarding CT dose optimization. This study aimed to assess awareness of dose-minimization concepts, implementation of optimization techniques, and perceived barriers among radiologic technologists in Morocco, and to identify educational and structural interventions to enhance radiation safety. We conducted a cross-sectional, web-based survey from May 2025 to June 2025, targeting certified CT technologists across public and private hospitals in all 12 administrative regions of Morocco. The 22-item questionnaire covered demographics, familiarity with the as low as reasonably achievable (ALARA) principle and dosimetric indices (CT dose index [CTDIvol], dose length product [DLP], and size-specific dose estimate), use of dose-modulation techniques, adherence to diagnostic reference levels (DRLs), protocol adjustment practices (including for pediatric and obese patients), and training needs. Data from 168 complete responses were analyzed using descriptive statistics and Chi-square tests (SPSS 21.0), with P < 0.05 denoting significance. Of respondents, 66.9% were aged 25-35 years, 57% were female, and 50% had 1-5 years of experience. Although 82.6% reported familiarity with dose-minimization principles, only 46.1% understood dose-modulation technology, and 28.7% fully grasped CTDIvol/DLP metrics. Awareness of national DRLs was low (23.5%), with just 4.2% routinely comparing their practice to reference benchmarks. While 44.3% "always" considered dose reduction when selecting protocols, only 26.3% "regularly" modified exposure parameters. Key barriers included insufficient training (69.1%), technical limitations of existing CT scanners (53.3%), and high workload (39.4%). 
Half of the participants (50.9%) expressed a strong interest in further dose-optimization training. This first national survey reveals substantial gaps between general awareness and practical mastery of CT dose minimization among Moroccan technologists. To strengthen patient protection, we recommend implementing structured continuous education programs, hands-on optimization workshops, integration of medical physicists for protocol auditing, and development of context-specific DRLs and standardized CT protocols. These measures will be critical for advancing radiation safety and ensuring compliance with international best-practice guidelines.
Anterior cervical discectomy and fusion (ACDF) with plate and screw fixation is a standard treatment for degenerative cervical disc disease. However, the advent of 3D-printed porous titanium cages (3D-PTCs) has prompted ongoing attempts to integrate this technology into ACDF procedures. This study compared the 2-year clinical and radiologic outcomes of single-level ACDF using a stand-alone 3D-PTC versus conventional ACDF with allograft and anterior plating. In this retrospective, propensity-matched cohort study, 30 patients per group were analyzed. Clinical outcomes included visual analog scale (VAS) scores for neck and arm pain, neck disability index (NDI), and dysphagia. Radiologic outcomes were fusion rate, segmental range of motion, and subsidence. Intraoperative blood loss, drainage, and fluoroscopy counts were recorded. Neck pain improvement was greater in the 3D-PTC group at all follow-ups (all P < 0.05). Dysphagia was less frequent in the 3D-PTC group at day 3 (10 vs. 19, P = 0.020) and 1 month (3 vs. 10, P = 0.028). Prevertebral swelling was smaller at day 3 (11.53 vs. 18.37 mm, P < 0.001), 1 month (P = 0.001), and 3 months (P = 0.007). Fluoroscopy counts were lower (5.7 vs. 7.9, P = 0.031). Fusion at 2 years was 100% in both groups, with similar subsidence (33.3% vs. 30.0%, P = 0.781). No device-related complications occurred. Stand-alone 3D-PTC ACDF achieved comparable fusion and safety to conventional plated ACDF, with reduced dysphagia, swelling, and fluoroscopy use, and greater neck pain relief, supporting its viability as an alternative technique.
Histotripsy is a novel, non-invasive, non-ionising, non-thermal method of mechanical tumour disruption that received US FDA approval in October 2023 for the treatment of liver tumours. This study aims to summarise and evaluate the safety and outcomes data following histotripsy of primary and secondary liver tumours. This systematic review and meta-analysis followed PRISMA guidelines. Records were identified through review of PubMed, Embase, and Scopus (database inception to December 2025) and manual review. Eligible studies were prospective or retrospective cohort studies and clinical trials published in English between Jan 1, 2015 and Dec 31, 2025 reporting clinical outcomes of histotripsy for liver tumours in three or more patients. Literature reviews, editorials, conference abstracts, animal studies, and case reports of two or fewer patients were excluded. Evaluated outcomes included post-procedural complications, local tumour control (LTC), overall survival, procedural technical success, tumour volume reduction, and off-target effects. Study quality was assessed using the Risk Of Bias In Non-randomised Studies of Interventions (ROBINS-I) tool. Heterogeneity was quantified using the I2 statistic and Cochran's Q test. Publication bias was assessed using funnel plot visual inspection and Egger's regression test. Finally, the histotripsy technology was assessed using the IDEAL framework to inform the design of future trials. This work was registered with PROSPERO (CRD420261299804). Ten studies (553 patients) met inclusion criteria. No studies exhibited a high risk of bias; seven demonstrated a moderate risk of bias in at least one domain. Using a random-effects model, the pooled technical success rate was 94.1% (95% CI: 90.4%-96.4%; I2 = 0%). Four studies reported major complications (grade ≥3), with a pooled event rate of 7.0% (95% CI: 2.0%-21.5%; I2 = 76.4%). The pooled event rate for absence of nodular enhancement was 89.3% (95% CI: 48.2%-98.7%; I2 = 61.6%). 
The pooled mean tumour volume reduction was 49.4% (95% CI: 9.0%-89.8%; I2 = 99.6%). The pooled off-target tumour effect rate was 10.0% (95% CI: 5.0%-20.0%; I2 = 6.6%). No significant publication bias was detected for mortality and safety outcomes; however, formal assessment of publication bias was limited by the small number of studies available for every outcome. All radiological control outcomes showed substantial heterogeneity across studies. Despite this heterogeneity, pooled results indicate that histotripsy has high rates of technical feasibility and local control with a favourable side effect profile. Interpretation of these findings is limited by the small number of available studies, variability in outcome definitions and imaging assessment methods, and short follow-up durations. These results underscore the need for larger, prospectively designed studies with standardised reporting frameworks and longer follow-up to more precisely characterise the clinical, radiologic, and quantitative imaging outcomes following histotripsy.
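The I2 values quoted in this abstract are derived from Cochran's Q. As a minimal sketch, the standard relationship I2 = max(0, (Q - df) / Q) x 100, with df equal to the number of studies minus one, can be written as follows. The Q value in the example is hypothetical (the abstract reports no Q statistics); it was chosen only to show what magnitude of Q would correspond to an I2 near 76% with four studies.

```python
def i_squared(q: float, num_studies: int) -> float:
    """I-squared heterogeneity statistic (percent) from Cochran's Q."""
    df = num_studies - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical Q for a pooled estimate based on four studies.
example_i2 = i_squared(12.7, 4)
print(f"I2 = {example_i2:.1f}%")

# When Q is smaller than its degrees of freedom, I2 is reported as 0%.
print(f"I2 = {i_squared(2.0, 10):.1f}%")
```

Because I2 depends on Q relative to its degrees of freedom, estimates based on only a handful of studies are imprecise, which is consistent with the caution the authors express about the small evidence base.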
Background Studies have demonstrated that large language models (LLMs) can perform differential diagnosis based on textual radiologic findings; however, it is unclear how variations in reader-generated inputs affect LLM performance and clinical utility. Purpose To evaluate how reader experience influences the diagnostic benefit of LLM assistance in brain MRI differential diagnosis. Materials and Methods In this retrospective multireader study, neuroradiologists (n = 4), radiology residents (n = 4), and neurology/neurosurgery residents (n = 4) provided textual radiologic findings and their top three differential diagnoses for brain MRI scans with confirmed diagnoses obtained between January 2009 and April 2024 from a single academic center. Confirmed diagnoses were established histopathologically or through consensus of at least two neuroradiologists. Three LLMs (GPT-4.1 [OpenAI], Gemini 2.5 Pro [Google DeepMind], and DeepSeek-R1 [Hangzhou DeepSeek Artificial Intelligence Basic Technology Research]) generated differential diagnoses based on reader-provided findings. Readers revised their diagnoses after reviewing the suggestions of GPT-4.1. A cumulative link mixed model was fitted to evaluate the association between reader experience and diagnostic benefit, with change in diagnostic result as an ordinal outcome, reader experience as a predictor, and random intercepts for rater and patient. Results Forty brain MRI scans (mean patient age, 50 years ± 18 [SD]; 23 female) were included. LLM-generated diagnoses achieved the highest top-three accuracy based on imaging findings from neuroradiologists (78.8%-83.8% across LLMs), followed by radiology residents (71.8%-77.6%) and neurology/neurosurgery residents (63.2%-67.1%). 
Mean absolute gains in top-three accuracy with LLM assistance diminished with increasing experience: +19.4% for neurology/neurosurgery residents (from 43.2% to 62.6%), +14.7% for radiology residents (from 59.6% to 74.4%), and +4.4% for neuroradiologists (from 83.1% to 87.5%). Models demonstrated a negative association between reader experience and diagnostic benefit from LLM assistance (β = -0.10; P = .005) and a positive association of reader experience with correctness (β = 0.11; P < .001) and completeness (β = 0.18; P = .002) of imaging findings. Conclusion With increasing reader experience, LLM accuracy with reader-generated input improved, whereas accuracy gains from LLM assistance diminished. © RSNA, 2026 Supplemental material is available for this article. See also the editorial by McMillan in this issue.
Accurate differentiation between benign and malignant pulmonary nodules ≤ 3 cm remains a clinical challenge. This study aimed to develop and internally validate a clinically interpretable nomogram integrating clinical variables and quantitative computed tomography (CT) features for predicting malignancy in pulmonary nodules. This retrospective single-center study included 1,419 patients with pulmonary nodules ≤ 3 cm who underwent surgical resection between January 2012 and July 2025 with pathologic confirmation. The cohort was randomly divided into a training set (n = 994) for model development and a validation set (n = 425) for internal validation. Clinical data, conventional imaging findings, serum biomarkers, and quantitative CT measurements from preoperative thin-section CT were collected. Multivariable logistic regression was used to identify variables associated with malignancy and construct the nomogram. Among the 1,419 nodules, 1,150 (81.0%) were malignant and 269 (19.0%) were benign. The final nomogram incorporated seven variables: suspicious radiologic features, nodule size, sex, symptoms at detection, consolidation-to-tumor ratio, minimum CT attenuation, and age. Age was retained in the final model on clinical grounds despite lacking statistical significance in multivariable analysis. Suspicious radiologic features (adjusted odds ratio [aOR] = 6.61, 95% confidence interval [CI]: 4.51-9.84; P < 0.001), nodule diameter > 2 cm (aOR = 4.07, 95% CI: 2.16-7.62; P < 0.001), female sex (aOR = 1.69, 95% CI: 1.23-2.33; P = 0.001), asymptomatic presentation (aOR = 0.48, 95% CI: 0.34-0.69; P < 0.001), consolidation-to-tumor ratio > 0.50 (aOR = 0.20, 95% CI: 0.06-0.61; P = 0.005), and minimum CT attenuation per 100-HU increase (aOR = 0.82, 95% CI: 0.74-0.92; P < 0.001) were independently associated with malignancy. 
The nomogram showed good discrimination, with area under the receiver operating characteristic curve values of 0.809 in the training set and 0.782 in the validation set. Calibration analysis showed agreement between predicted and observed risks, and decision curve analysis supported usefulness. We developed and internally validated a clinical nomogram incorporating quantitative CT features for malignancy risk estimation in surgically resected pulmonary nodules ≤ 3 cm. The model showed good discrimination, calibration, and potential utility in a malignancy-enriched preoperative cohort. External validation in broader, less selected, screening-detected, incidental, and multicenter populations is warranted before routine clinical application.
Report turnaround time (R-TAT) is widely used as a radiology quality metric and is increasingly considered for national benchmarking. Beginning in 2026, CMS will no longer accept continuous R-TAT reporting in the Merit-Based Incentive Payment System and instead recommends threshold-based benchmarks. This study evaluates how radiology practices have used R-TAT in quality improvement and examines the evidence supporting implementation of national threshold-based R-TAT measures. A systematic literature review was performed in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines and included English-language studies published between 2000 and 2024 that reported radiology R-TAT defined as time from examination completion to report finalization. Two independent reviewers screened titles, abstracts, and full texts. Eligible studies were analyzed for practice setting and intervention type. Descriptive statistics were generated, and a narrative review discusses risks and feasibility of threshold-based benchmarking. Of 1,722 screened records, 13 studies met inclusion criteria. Most originated from US academic practices (92%) and evaluated technology adoption (46%), targeted improvement projects (31%), or operational changes (23%). All studies relied on local pre- and postintervention comparisons; none reported use of the ACR General Radiology Improvement Database TAT registry for benchmarking. Reported improvements in R-TAT were context specific and heterogeneous. The narrative review identified important concerns regarding clinical risk, unintended consequences, and measurement validity associated with unadjusted threshold-based R-TAT metrics. R-TAT is a useful internal performance indicator for evaluating workflow and operational interventions. 
However, existing evidence does not support adoption of uniform national threshold-based R-TAT benchmarks without adjustment for case complexity, modality, practice setting, and workflow factors. Continuous R-TAT measures remain better suited for local quality improvement, and future policy implementation should incorporate balancing measures to ensure patient-centered, high-quality radiologic care.
Intestinal-type and pancreatobiliary-type periampullary carcinomas exhibit distinct biological behaviours and prognostic outcomes, yet accurate preoperative subtyping remains a major clinical challenge. This study aimed to evaluate the value of clinical variables and computed tomography (CT) imaging features in the differential diagnosis of intestinal-type and pancreatobiliary-type periampullary carcinoma, and to develop a subtype prediction model. This retrospective study included 83 patients with pathologically confirmed periampullary carcinoma, comprising 21 intestinal-type and 62 pancreatobiliary-type periampullary carcinomas. Clinical variables and conventional CT imaging features were evaluated using univariable and multivariable logistic regression analyses to identify predictors associated with periampullary carcinoma subtype. Based on these predictors, clinical, radiologic, and combined prediction models were developed, and a nomogram was constructed from the combined model. Model performance was evaluated using receiver operating characteristic (ROC) curves and the area under the curve (AUC). The DeLong test was employed to compare the diagnostic performance among different models. The AUCs of the clinical model and radiologic model were 0.82 (95% CI, 0.70-0.92) and 0.85 (95% CI, 0.79-0.93), respectively. The combined model showed significantly better diagnostic performance than either model alone, with an AUC of 0.92 (95% CI, 0.84-0.97), a sensitivity of 87.1%, and a specificity of 90.5%. Serum carbohydrate antigen 19-9, total bilirubin, tumor location, and enhancement degree of lesion were identified as the most important predictors in the combined model. In addition, the nomogram derived from the combined model demonstrated good discriminative ability for predicting histologic subtype. 
The combined model integrating clinical and CT imaging features enables more accurate preoperative differentiation between intestinal-type and pancreatobiliary-type periampullary carcinoma and yields higher sensitivity and specificity than models based on clinical or imaging features alone.
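The sensitivity and specificity reported for the combined model are consistent with whole-patient counts if the pancreatobiliary type (n = 62) is treated as the positive class and the intestinal type (n = 21) as the negative class. That class assignment and the counts of 54 and 19 below are inferred from the percentages, not stated in the abstract, so the sketch is illustrative only.

```python
def proportion_pct(hits: int, total: int) -> float:
    """Proportion expressed as a percentage."""
    return 100.0 * hits / total


# Inferred counts (assumptions): 54 of 62 pancreatobiliary-type tumours
# correctly identified, 19 of 21 intestinal-type tumours correctly identified.
sensitivity = proportion_pct(54, 62)
specificity = proportion_pct(19, 21)

print(f"Sensitivity: {sensitivity:.1f}%")
print(f"Specificity: {specificity:.1f}%")
```

With only 21 patients in the smaller class, each additional misclassified case shifts specificity by almost 5 percentage points, so these estimates carry considerable uncertainty.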
Cognitive and psychiatric impairments are common in patients with frontotemporal meningiomas. While meningiomas are often histologically benign, they can cause significant morbidity through mass effect and peritumoral edema. Compared to gliomas, the reversibility of these impairments following surgical resection is relatively under-investigated. This study aimed to evaluate postoperative neuropsychological changes and identify tumor-related factors associated with recovery in patients with frontotemporal meningiomas. We retrospectively reviewed 29 patients who underwent surgical resection for frontotemporal meningiomas and completed both pre- and postoperative neuropsychological assessments (Neurooncologic Psychological Test [NOPT], Seoul Neuropsychological Screening Battery-II [SNSB-II], or Bundang Neuropsychological Testing Protocol-M1 [BNTP-M1]). Exploratory multivariable linear regression analysis was performed using Δ scores (postoperative minus preoperative) to identify radiologic factors associated with recovery in each domain. Postoperatively, patients demonstrated significant improvements across all tested domains: attention (P = 0.002), language (P = 0.041), memory (P < 0.001), visuospatial function (P = 0.024), executive function (P < 0.001), and psychiatric symptoms (P = 0.043). In exploratory multivariable analysis, mass effect was associated with greater improvement in attention (B = 1.33, 95% CI, 0.36 to 2.30; P = 0.009), whereas frontal lobe involvement was associated with less improvement in language function (B = -3.00, 95% CI, -5.06 to -0.94; P = 0.006). Convexity origin was associated with less improvement in Stroop test performance (B = -29.79, 95% CI, -59.48 to -0.11; P = 0.049), and frontal base origin was associated with less improvement in psychiatric symptoms (B = -14.91, 95% CI, -24.83 to -4.98; P = 0.005). Edema index was not significantly associated with executive function recovery in the final exploratory model. 
Neuropsychological impairments in patients with frontotemporal meningiomas demonstrated statistically significant postoperative improvements following surgical resection. Recovery trajectories may be associated with tumor-related factors such as mass effect and anatomical origin. These findings should be interpreted cautiously given the limited sample size and the exploratory nature of the analyses, and support further investigation of tailored rehabilitation strategies based on preoperative radiologic characteristics.
Olanzapine is a very rare cause of necrotizing pancreatitis. The development of a pancreatico-pleural fistula following olanzapine-induced necrotizing pancreatitis, requiring minimally invasive pancreatectomy for resolution, has not previously been described in the literature. This report presents the history, findings, and strategies used to resolve this unique case. A 53-year-old man presented with more than a year of chronic abdominal pain and was diagnosed with pancreatitis. His condition worsened, leading to necrosis, peripancreatic fluid collections, and ultimately a pancreatico-pleural fistula. Despite a low-fat diet, confirmed alcohol abstinence, absence of gallstones, pancreatic duct stenting, and peritoneal and pleural drainage, the fistula persisted and his weight loss worsened. Hepatobiliary surgery was consulted and performed a successful minimally invasive pancreatectomy. His drains were removed sequentially, and at final follow-up he was free of symptoms and regaining weight. Olanzapine is a rare cause of acute necrotizing pancreatitis; progression to necrosis, peripancreatic fluid collections, and a pancreatico-pleural fistula is rarer still. This case failed to resolve with nonsurgical management. Robotic technology combined with appropriate endoscopic, interventional radiologic, and nutritional therapy resolved this case, highlighting strategies that may be useful to physicians and their patients.
Introduction Large language models (LLMs) have demonstrated promising performance on standardized medical examinations, yet systematic comparisons of contemporary multimodal and text-only models on radiology-specific assessments remain limited. Updated and newly released LLMs, including Grok 4.1 (xAI, San Francisco, USA), Bing Copilot GPT-5 (Microsoft, Redmond, USA), DeepSeek V3.2 (DeepSeek AI, Beijing, China), and OpenEvidence (Chalmers University of Technology, Gothenburg, Sweden), have not been evaluated on the American College of Radiology Diagnostic Imaging In-Training (ACR DXIT) examination. This study aimed to compare the performance of seven contemporary LLMs on the 2022 DXIT examination, stratified by question format and radiology subject domain. Methods Seven LLMs were evaluated on all 106 multiple-choice questions from the 2022 DXIT examination, comprising 42 written-only and 64 image-based questions. Five multimodal models [ChatGPT-5.1 (OpenAI, San Francisco, USA), Gemini 3 Pro (Google, Mountain View, USA), Claude Sonnet 4.5 (Anthropic, San Francisco, USA), Grok 4.1, and Bing Copilot GPT-5] were assessed on all questions. Two text-only models (DeepSeek V3.2 and OpenEvidence) were evaluated on written-only questions only. A standardized orientation prompt was applied uniformly across all models. Statistical comparisons accounted for the paired nature of the data, as all models answered identical questions; Cochran's Q test was used for comparisons across three or more models, and McNemar's test for two-model comparisons. Ninety-five percent confidence intervals for accuracy proportions were calculated using the Wilson score method. For subgroups with fewer than 10 questions, p-values were not reported, and only descriptive statistics are presented.
Results Overall accuracy among multimodal models ranged from 65.1% [Claude Sonnet 4.5; 95% confidence interval (CI): 55.6%-73.5%] to 76.4% (Gemini 3 Pro; 95% CI: 67.5%-83.5%), with no statistically significant differences among models (Cochran's Q=5.07, df=4, p=0.281). All multimodal models performed substantially better on written-only questions (88.1%-95.2%) than on image-based questions (46.9%-64.1%), representing an average gap of approximately 35 percentage points. Neither written-only nor image-based comparisons reached significance (p=0.813 and p=0.226, respectively). Domain-level analysis identified consistent strengths in ultrasound (80%-90%; p=0.948) and chest radiology (70%-90%; p=0.870), and persistent weakness in musculoskeletal imaging (40%-60%; p=0.898). Among text-only models, OpenEvidence and DeepSeek V3.2 achieved overall accuracies of 83.3% (95% CI: 69.4%-91.7%) and 88.1% (95% CI: 75.0%-94.8%), respectively, with no significant difference between them (McNemar's p=0.773). Conclusion Contemporary multimodal LLMs achieve moderately high accuracy on radiology in-training examination questions, exceeding earlier-generation model benchmarks and junior resident performance levels, yet no single model demonstrated statistically significant superiority. A consistent and substantial performance gap between written and image-based questions persists across all architectures, underscoring unresolved limitations in radiologic image interpretation. These findings suggest that current LLMs may support circumscribed roles in radiology education, particularly for conceptual and non-interpretive content, but remain unsuitable for tasks requiring visual diagnostic reasoning.
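The Wilson score intervals reported in this abstract can be checked with a short calculation. The sketch below is illustrative only and is not the authors' analysis code. The function name `wilson_interval` is hypothetical, and the correct-answer counts (69/106, 81/106, 35/42, 37/42) are assumptions back-calculated from the reported percentages, not figures given in the abstract.

```python
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return the (lower, upper) Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * (p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return center - half_width, center + half_width


# ASSUMPTION: counts inferred from the reported accuracies and question totals.
inferred_counts = [
    ("Claude Sonnet 4.5, all questions", 69, 106),
    ("Gemini 3 Pro, all questions", 81, 106),
    ("OpenEvidence, written-only", 35, 42),
    ("DeepSeek V3.2, written-only", 37, 42),
]

for label, correct, total in inferred_counts:
    low, high = wilson_interval(correct, total)
    print(f"{label}: {correct / total:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Under these inferred counts, the computed intervals agree with the reported ones to one decimal place. Unlike the simpler Wald interval, the Wilson interval stays within 0% to 100% and behaves well for the small written-only subset (n=42), which may be why it was chosen here.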
Artificial intelligence (AI) has the potential to profoundly transform surgical decision-making (SDM) by enabling more predictive, personalized, and data-driven care. Its integration across the surgical pathway can improve clinical outcomes, efficiency, and patient safety. This narrative review provides an overview of the current and emerging applications of AI in SDM. A structured search of PubMed, Scopus, Web of Science, and Google Scholar was conducted. The search primarily focused on peer-reviewed publications from 2015 to 2025. Preoperative AI applications include radiologic image analysis for surgical planning, electronic health record mining for individualized surgical strategies, risk and immunological response prediction, and genomic analysis to guide treatment selection. Intraoperatively, AI-based processing of video, image, and physiological data can support real-time decision-making by improving precision, identifying anatomical targets, and predicting complications earlier. Postoperatively, AI systems can monitor patient data to detect complications, evaluate outcomes, and tailor follow-up therapy. Despite these advantages, challenges remain, including limited data quality and availability and poor model explainability, among others. Overcoming these barriers requires explainable and secure AI models, scalable infrastructures, clinician engagement, and robust regulatory frameworks. Advances in AI-assisted robotics and interpretability are expected to support safer, more ethical, and more effective SDM.
This study evaluated the feasibility, accuracy, and reproducibility of an augmented reality (AR) interface for enhancing spatial surface-level tracking on a canine head hologram. We hypothesized that spatial accuracy would be better when virtual guides are overlaid directly onto the head than when annotations are transferred from a separate screen. We developed an AR application in Unity, compatible with XREAL glasses, that allowed users to annotate and interact with a realistic 3D hologram of a dog head. Coordinate (distance) and outline (area) annotations were tested under 2 conditions: (1) transfer, in which targets were viewed on a computer screen, and (2) direct, in which targets were viewed directly on the head. The mean distance error was significantly lower for direct versus transfer coordinates (2.73 vs 3.42 mm). Similarly, accurate area coverage was higher (83.7% vs 63.3%) with AR guidance. Completion times differed significantly between the groups for coordinate tasks (11.2 vs 8.19 seconds) but not for area tracing (25.7 vs 26.8 seconds). AR-guided visualization improved spatial accuracy for both distance and area metrics without reducing speed. These findings show that AR may be used to optimize surface-level spatial tracking on the canine head. Future work will focus on whether these findings can be translated to a clinical setting and whether diagnostic imaging scans can be similarly overlaid onto the surgical plane to expand the medical applications of this technology.
While guidelines exist for returning genetic, biomarker, and radiologic research results, there is little guidance regarding the return of patient-reported outcome (PRO) research results. As a supplement to an ongoing trial of palliative care for Parkinson's disease, we conducted 30 qualitative, semi-structured interviews with persons with Parkinson's disease, their carepartners, community neurologists, primary care physicians, and researchers to elicit their perceptions of returning PRO research data to persons with Parkinson's disease and carepartners. Interviews were audio-recorded, transcribed, double-coded, and analyzed for themes. Participants supported returning tailored PRO research results when feasible. Clinicians wanted clinically actionable results. Researchers thought that returning aggregate results should be standard practice. However, persons with Parkinson's disease and carepartners reported privacy concerns. Clinicians had mixed reactions about receiving non-actionable research results. Researchers worried that returning individual results during a study could affect research integrity. Additional research is needed to develop optimal guidelines for returning PRO results in future research studies.