Search — ResearchTracker

Recent work synthesizes agentic tasks for post-training tool-using LLMs, yet robust generalization under shifts in tasks and toolsets remains an open challenge. We trace this brittleness to insufficient diversity in synthesized tasks. Scaling diversity is difficult because training requires tasks to remain executable and verifiable, while generalization demands coverage of diverse tool types, toolset combinations, and heterogeneous tool-use patterns. We propose DIVE, an evidence-driven recipe that inverts synthesis order, executing diverse, real-world tools first and reverse-deriving tasks strictly entailed by the resulting traces, thereby providing grounding by construction. DIVE scales structural diversity along two controllable axes, tool-pool coverage and per-task toolset variety, and an Evidence Collection–Task Derivation loop further induces rich multi-step tool-use patterns across 373 tools in five domains. Training Qwen3-8B on DIVE data (48k SFT + 3.2k RL) improves by +22 average points across 9 OOD benchmarks and outperforms the strongest 8B baseline by +68. Remarkably, controlled scaling analysis reveals that diversity scaling consistently outperforms quantity scaling fo

DIVE: Towards Descriptive and Diverse Visual Commonsense Generation

arXiv · 2024-08-15 · Authors: Jun-Hyung Park, Hyuntae Park, Youjin Kang

Towards human-level visual understanding, visual commonsense generation has been introduced to generate commonsense inferences beyond images. However, current research on visual commonsense generation has overlooked an important human cognitive ability: generating descriptive and diverse inferences. In this work, we propose a novel visual commonsense generation framework, called DIVE, which aims to improve the descriptiveness and diversity of generated inferences. DIVE involves two methods, generic inference filtering and contrastive retrieval learning, which address the limitations of existing visual commonsense resources and training objectives. Experimental results verify that DIVE outperforms state-of-the-art models for visual commonsense generation in terms of both descriptiveness and diversity, while showing a superior quality in generating unique and novel inferences. Notably, DIVE achieves human-level descriptiveness and diversity on Visual Commonsense Graphs. Furthermore, human evaluations confirm that DIVE aligns closely with human judgments on descriptiveness and diversity. Our code and dataset are available at https://github.com/Park-ing-lot/DIVE.

Search results: Dive

DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use

DIVE: Towards Descriptive and Diverse Visual Commonsense Generation

DIVE: Diversified Iterative Self-Improvement

DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts

RAG-DIVE: A Dynamic Approach for Multi-Turn Dialogue Evaluation in Retrieval-Augmented Generation

DIVE: Taming DINO for Subject-Driven Video Editing

Pose-dIVE: Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

"DIVE" into Hydrogen Storage Materials Discovery with AI Agents

DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer

Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

DIVE: Deep-search Iterative Video Exploration A Technical Report for the CVRR Challenge at CVPR 2025

Learning To Dive In Branch And Bound

DiVE-k: Differential Visual Reasoning for Fine-grained Image Recognition

A Deep Dive into the Impact of Solar Storms on LEO Satellite Networks

A Deep Dive into Large Language Models for Automated Bug Localization and Repair

DIVE: A spatiotemporal progression model of brain pathology in neurodegenerative disorders

A Deep Dive Into How Open-Source Project Maintainers Review and Resolve Bug Bounty Reports

DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization

Dispersal and dive patterns in gravid leatherback turtles during the nesting season in French Guiana

Short-term effect of hyperbaric exposure on Ventilation: A Control Study of 12m-depth Single No-decompression Dive Experiment