Search - ResearchTracker

Large-scale reinforcement learning (RL) methods have proven highly effective in enhancing the reasoning abilities of large language models (LLMs), particularly for tasks with verifiable solutions such as mathematics and coding. However, applying this idea to machine translation (MT), where outputs are flexibly formatted and difficult to automatically evaluate with explicit rules, remains underexplored. In this work, we introduce MT-R1-Zero, the first open-source adaptation of the R1-Zero RL framework for MT without supervised fine-tuning or cold-start. We propose a rule-metric mixed reward mechanism to guide LLMs towards improved translation quality via emergent reasoning. On the WMT 24 English-Chinese benchmark, our MT-R1-Zero-3B-Mix achieves competitive performance, surpassing TowerInstruct-7B-v0.2 by an average of 1.26 points. Meanwhile, our MT-R1-Zero-7B-Mix attains a high average score of 62.25 across all metrics, placing it on par with advanced proprietary models such as GPT-4o and Claude-3.5-Sonnet, while the MT-R1-Zero-7B-Sem variant achieves state-of-the-art scores on semantic metrics. Moreover, our work exhibits strong generalization capabilities on out-of-distribution MT

CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

arXiv, 2025-03-24. Authors: Weichen Fan, Amber Yijia Zheng, Raymond A. Yeh

Classifier-Free Guidance (CFG) is a widely adopted technique in diffusion/flow models to improve image fidelity and controllability. In this work, we first analytically study the effect of CFG on flow matching models trained on Gaussian mixtures where the ground-truth flow can be derived. We observe that in the early stages of training, when the flow estimation is inaccurate, CFG directs samples toward incorrect trajectories. Building on this observation, we propose CFG-Zero*, an improved CFG with two contributions: (a) optimized scale, where a scalar is optimized to correct for the inaccuracies in the estimated velocity, hence the * in the name; and (b) zero-init, which involves zeroing out the first few steps of the ODE solver. Experiments on both text-to-image (Lumina-Next, Stable Diffusion 3, and Flux) and text-to-video (Wan-2.1) generation demonstrate that CFG-Zero* consistently outperforms CFG, highlighting its effectiveness in guiding Flow Matching models. (Code is available at github.com/WeichenFan/CFG-Zero-star)
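The two ingredients named in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The abstract says only that "a scalar is optimized"; the closed form used here (the least-squares projection of the conditional velocity onto the unconditional one) is an assumption. The function names, the `zero_init_steps` parameter, and the use of plain Python lists for velocities are all hypothetical.

```python
from typing import List, Sequence


def optimized_scale(v_cond: Sequence[float], v_uncond: Sequence[float],
                    eps: float = 1e-12) -> float:
    """Scalar s minimizing ||v_cond - s * v_uncond||^2 (assumed form of the
    'optimized scale'; the abstract does not give the formula)."""
    dot = sum(c * u for c, u in zip(v_cond, v_uncond))
    norm_sq = sum(u * u for u in v_uncond)
    return dot / (norm_sq + eps)


def guided_velocity(v_cond: Sequence[float], v_uncond: Sequence[float],
                    guidance_scale: float, step: int,
                    zero_init_steps: int = 1) -> List[float]:
    """Guided velocity for one ODE-solver step (hypothetical helper).

    zero-init: the first `zero_init_steps` steps return a zero velocity.
    Afterwards, standard CFG is applied with the unconditional branch
    rescaled by the optimized scalar.
    """
    if step < zero_init_steps:
        return [0.0] * len(v_cond)
    s = optimized_scale(v_cond, v_uncond)
    return [s * u + guidance_scale * (c - s * u)
            for c, u in zip(v_cond, v_uncond)]


# Toy usage with made-up 2-D velocities.
v_c, v_u = [2.0, 0.0], [1.0, 0.0]
first_step = guided_velocity(v_c, v_u, guidance_scale=3.0, step=0)
later_step = guided_velocity(v_c, v_u, guidance_scale=3.0, step=5)
```

With s fixed to 1 and `zero_init_steps=0`, the function reduces to ordinary classifier-free guidance, which makes the two modifications easy to ablate separately.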

Search results: Zero

MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning

CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations

Cyber-Zero: Training Cybersecurity Agents without Runtime

VLN-Zero: Rapid Exploration and Cache-Enabled Neurosymbolic Vision-Language Planning for Zero-Shot Transfer in Robot Navigation

The Zero Slice of Quaternionic Real Bordism

SecureBank: A Financially-Aware Zero Trust Architecture for High-Assurance Banking Systems

Intent-Aware Authorization for Zero Trust CI/CD

Video models are zero-shot learners and reasoners

Identity Control Plane: The Unifying Layer for Zero Trust Infrastructure

Modularized Zero-shot VQA with Pre-trained Models

Agent3D-Zero: An Agent for Zero-shot 3D Understanding

FlashSpeech: Efficient Zero-Shot Speech Synthesis

DFVEdit: Conditional Delta Flow Vector for Zero-shot Video Editing

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

GET-Zero: Graph Embodiment Transformer for Zero-shot Embodiment Generalization

On the Zero-Error Capacity of Semantic Channels with Input and Output Memories

Hecke algebras and local Langlands correspondence for non-singular depth-zero representations

AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description