Search - ResearchTracker

Video understanding requires identifying and reasoning over semantically discriminative visual objects across frames, yet existing object-agnostic solutions struggle to effectively handle substantial object variations over time. To address this, we introduce Chain-of-Glimpse, a search-guided progressive object-grounded reasoning framework that explicitly anchors each reasoning step to specific visual evidence regions, enabling compositional and multi-step decision-making. Formally, Chain-of-Glimpse formulates video reasoning as a step-by-step process that incrementally builds spatially grounded traces around task-relevant visual objects, thereby mitigating over-reliance on saliency-driven cues. Specifically, Chain-of-Glimpse features a search-guided controller, optimized via reinforcement learning with a format reward that significantly incentivizes grounding capability, to iteratively ground visual evidence regions and form reliable reasoning trajectories, yielding accurate and interpretable multi-step decisions. Extensive evaluations on both in-domain NExTQA and out-of-domain Video-Holmes, CG-Bench Reasoning, and VRBench benchmarks demonstrate consistent performance gains, robust

AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale

arXiv, 2024-04-04. Authors: Adam Pardyl, Michał Wronka, Maciej Wołczyk

Active Visual Exploration (AVE) is a task that involves dynamically selecting observations (glimpses), which is critical to facilitate comprehension and navigation within an environment. While modern AVE methods have demonstrated impressive performance, they are constrained to fixed-scale glimpses from rigid grids. In contrast, existing mobile platforms equipped with optical zoom capabilities can capture glimpses of arbitrary positions and scales. To address this gap between software and hardware capabilities, we introduce AdaGlimpse. It uses Soft Actor-Critic, a reinforcement learning algorithm tailored for exploration tasks, to select glimpses of arbitrary position and scale. This approach enables our model to rapidly establish a general awareness of the environment before zooming in for detailed analysis. Experimental results demonstrate that AdaGlimpse surpasses previous methods across various visual tasks while maintaining greater applicability in realistic AVE scenarios.

Search results: glimpse

Chain-of-Glimpse: Search-Guided Progressive Object-Grounded Reasoning for Video Understanding

AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale

Other red dots: A possible GLIMPSE of normal AGB stars at Cosmic Noon through extreme lensing

GLIMPSE-D: Metallicity Decline in Faint Galaxies: Implications for [O III]+Hb Luminosity Function and Reionisation Budget

JWST's GLIMPSE: an overview of the deepest probe of early galaxy formation and cosmic reionization

A GLIMPSE into the UV Continuum Slopes of the Faintest Galaxies in the Epoch of Reionization

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

A Glimpse of the Low-Mass End of the Direct Mass-Metallicity Relation at $z\sim6-8$

GLIMPSE: Do Large Vision-Language Models Truly Think With Videos or Just Glimpse at Them?

GLIMPSE: Holistic Cross-Modal Explainability for Large Vision-Language Models

GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction

A glimpse into the magical world of quantum gravity

Glimpse: Generalized Locality for Scalable and Robust CT

Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection

Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning

A Glimpse of the New Redshift Frontier Through Abell S1063

A GLIMPSE into the very faint-end of the H$β$+[OIII]$λλ$4960,5008 luminosity function at z=7-9 behind Abell S1063

Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points

GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews

Near-infrared spectra of Galactic stellar clusters detected on Spitzer/GLIMPSE images