Search - ResearchTracker

The practical deployment of medical vision-language models (Med-VLMs) necessitates seamless integration of textual data with diverse visual modalities, including 2D/3D images and videos, yet existing models typically employ separate encoders for different modalities. To address this limitation, we present OmniV-Med, a unified framework for multimodal medical understanding. Our technical contributions are threefold. First, we construct OmniV-Med-Instruct, a comprehensive multimodal medical dataset containing 252K instructional samples spanning 14 medical image modalities and 11 clinical tasks. Second, we devise a rotary position-adaptive encoder that processes multi-resolution 2D/3D images and videos within a unified architecture, diverging from conventional modality-specific encoders. Third, we introduce a medical-aware token pruning mechanism that exploits spatial-temporal redundancy in volumetric data (e.g., consecutive CT slices) and medical videos, effectively removing 60% of visual tokens without performance degradation. Empirical evaluations demonstrate that OmniV-Med-7B achieves state-of-the-art performance on 7 benchmarks spanning 2D/3D medical imaging and video understand

Med-U1: Incentivizing Unified Medical Reasoning in LLMs via Large-scale Reinforcement Learning

arXiv, 2025-06-14. Authors: Xiaotian Zhang, Yuan Wang, Zhaopeng Feng

Medical Question-Answering (QA) encompasses a broad spectrum of tasks, including multiple choice questions (MCQ), open-ended text generation, and complex computational reasoning. Despite this variety, a unified framework for delivering high-quality medical QA has yet to emerge. Although recent progress in reasoning-augmented large language models (LLMs) has shown promise, their ability to achieve comprehensive medical understanding is still largely unexplored. In this paper, we present Med-U1, a unified framework for robust reasoning across medical QA tasks with diverse output formats, ranging from MCQs to complex generation and computation tasks. Med-U1 employs pure large-scale reinforcement learning with mixed rule-based binary reward functions, incorporating a length penalty to manage output verbosity. With multi-objective reward optimization, Med-U1 directs LLMs to produce concise and verifiable reasoning chains. Empirical results reveal that Med-U1 significantly improves performance across multiple challenging Med-QA benchmarks, surpassing even larger specialized and proprietary models. Furthermore, Med-U1 demonstrates robust generalization to out-of-distribution (OOD) tasks.
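
The abstract above describes rule-based binary rewards combined with a length penalty. As a rough illustration only, and not the Med-U1 implementation, the Python sketch below shows one way such a reward could be composed. The function names, the exact-match rule, the linear penalty shape, and the `budget` and `weight` values are all assumptions made for this example; the abstract does not specify any of them.

```python
def binary_reward(prediction: str, reference: str) -> float:
    """Hypothetical rule-based check: 1.0 on a normalized exact match, else 0.0."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def length_penalty(num_tokens: int, budget: int = 512, weight: float = 0.5) -> float:
    """Assumed penalty shape: zero within the budget, then linear in the
    overflow fraction, capped at `weight`."""
    if num_tokens <= budget:
        return 0.0
    overflow = (num_tokens - budget) / budget
    return weight * min(overflow, 1.0)


def total_reward(prediction: str, reference: str, num_tokens: int,
                 budget: int = 512, weight: float = 0.5) -> float:
    """Binary correctness reward minus the verbosity penalty."""
    return binary_reward(prediction, reference) - length_penalty(num_tokens, budget, weight)
```

Under these assumptions, a correct answer within the token budget receives the full reward of 1.0, while a correct but overly long answer receives less, which is the kind of pressure toward concise reasoning chains that the abstract mentions.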

Search results: JMIRx med

OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Med-U1: Incentivizing Unified Medical Reasoning in LLMs via Large-scale Reinforcement Learning

DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models

VIVID-Med: LLM-Supervised Structured Pretraining for Deployable Medical ViTs

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation

Clinical Data Goes MEDS? Let's OWL make sense of it

Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards

Med-Art: Diffusion Transformer for 2D Medical Text-to-Image Generation

Med-REFL: Medical Reasoning Enhancement via Self-Corrected Fine-grained Reflection

Shap-MeD

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

Med-CAM: Minimal Evidence for Explaining Medical Decision Making

Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents

PRS-Med: Position Reasoning Segmentation in Medical Imaging

EAFP-Med: An Efficient Adaptive Feature Processing Module Based on Prompts for Medical Image Detection

Med-PU: Point Cloud Upsampling for High-Fidelity 3D Medical Shape Reconstruction

ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models

Med-Flamingo: a Multimodal Medical Few-shot Learner