Search results: Medical

20 results found

A Benchmark for Long-Form Medical Question Answering

arXiv

There is a lack of benchmarks for evaluating large language models (LLMs) in long-form medical question answering (QA). Most existing medical QA evaluation benchmarks focus on automatic metrics and multiple-choice questions. While valuable, these benchmarks fail to fully capture or assess the complexities of real-world clinical applications where LLMs are being deployed. Furthermore, existing studies on evaluating long-form answer generation in medical QA are primarily closed-source, lacking access to human medical expert annotations, which makes it difficult to reproduce results and enhance existing baselines. In this work, we introduce a new publicly available benchmark featuring real-world consumer medical questions with long-form answer evaluations annotated by medical doctors. We performed pairwise comparisons of responses from various open and closed-source medical and general-purpose LLMs based on criteria such as correctness, helpfulness, harmfulness, and bias. Additionally, we performed a comprehensive LLM-as-a-judge analysis to study the alignment between human judgments and LLMs. Our preliminary results highlight the strong potential of open LLMs in medical QA compared t

Medical Knowledge Intervention Prompt Tuning for Medical Image Classification

arXiv, 2025-11-16. Authors: Ye Du, Nanxi Yu, Shujun Wang

Vision-language foundation models (VLMs) have shown great potential in feature transfer and generalization across a wide spectrum of medical-related downstream tasks. However, fine-tuning these models is resource-intensive due to their large number of parameters. Prompt tuning has emerged as a viable solution to mitigate memory usage and reduce training time while maintaining competitive performance. Nevertheless, the challenge is that existing prompt tuning methods cannot precisely distinguish different kinds of medical concepts, which miss essentially specific disease-related features across various medical imaging modalities in medical image classification tasks. We find that Large Language Models (LLMs), trained on extensive text corpora, are particularly adept at providing this specialized medical knowledge. Motivated by this, we propose incorporating LLMs into the prompt tuning process. Specifically, we introduce the CILMP, Conditional Intervention of Large Language Models for Prompt Tuning, a method that bridges LLMs and VLMs to facilitate the transfer of medical knowledge into VLM prompts. CILMP extracts disease-specific representations from LLMs, intervenes within a low-ra

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

Adaptive Differential Privacy for Federated Medical Image Segmentation Across Diverse Modalities

GAN-GA: A Generative Model based on Genetic Algorithm for Medical Image Generation

A comprehensive survey on deep active learning in medical image analysis

Ambient-Pix2PixGAN for Translating Medical Images from Noisy Data

DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model

Invariant Scattering Transform for Medical Imaging

Active Learning on Medical Image

Test-time generative augmentation for medical image segmentation

Introduction of Medical Imaging Modalities

The Need for Medically Aware Video Compression in Gastroenterology

MedIAnomaly: A comparative study of anomaly detection in medical images

Segment Anything Model for Medical Image Analysis: an Experimental Study

Towards objective and systematic evaluation of bias in artificial intelligence for medical imaging

Fréchet Radiomic Distance (FRD): A Versatile Metric for Comparing Medical Imaging Datasets

HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation

AutoML Systems For Medical Imaging