Search - ResearchTracker

Human communication is inherently multimodal, where language is often accompanied by non-verbal cues such as gestures to convey intentions. However, current Vision-Language-Action (VLA) models treat robotic manipulation as a pure text-driven task, overlooking the important role of gestures in Human-Robot Interaction (HRI). This often leads to inaccurate intent grounding and unreliable manipulation when language instructions are ambiguous or underspecified. To address this challenge, we propose GIVE (Gesture Intent via Visual-Semantic Enhancement), an effective approach that enhances pre-trained VLA models with human gesture understanding without architectural modifications. Specifically, GIVE incorporates gesture information through two complementary pathways: a visual pathway that overlays hand skeletons and fingertip rays onto robot observations for explicit object grounding, and a semantic pathway that generates high-level descriptions of human gestures and task instructions for robust intent grounding. By jointly leveraging visual and semantic guidance, GIVE enables VLA policies to better associate gestures with manipulation behaviors and adapt to dynamic interaction intents. I

GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation

arXiv, 2024-10-11. Authors: Jiashu He, Mingyu Derek Ma, Jinxuan Fan

Existing approaches based on context prompting or reinforcement learning (RL) to improve the reasoning capacities of large language models (LLMs) depend on the LLMs' internal knowledge to produce reliable Chain-Of-Thought (CoT). However, no matter the size of LLMs, certain problems cannot be resolved in a single forward pass. Meanwhile, agent-based reasoning systems require access to a comprehensive nonparametric knowledge base, which is often costly or not feasible for use in scientific and niche domains. We present Graph Inspired Veracity Extrapolation (GIVE), a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input. GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak). Extensive experiments demonstrated the following benefits of our framework: (1) GIVE boosts the performance of LLMs across various sizes. (2) In some scenarios, GIVE allows smaller LLMs to surpass larger, more sophisticated ones in scientific tasks (GPT3.5T + GIVE > GPT4). (3) GIVE is

Search results: give

GIVE: Grounding Human Gestures in Vision-Language-Action Models

GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation

Curvature batching gives single-exponential integer quadratic programming

Recommendations to OSCE/ODIHR (on how to give better recommendations for Internet voting)

It's Safer to Give Personhood to Bears than to Artificial Intelligence

Quantum oracles give an advantage for identifying classical counterfactuals

Weakly elliptic damping gives sharp decay

Driver Heterogeneity in Willingness to Give Control to Conditional Automation

GiVE: Guiding Visual Encoder to Perceive Overlooked Information

How many adjunctions give rise to the same monad?

On allocations that give intersecting groups their fair share

Cellular automata that generate symmetrical patterns give singular functions

The Weierstrass Representation always gives a minimal surface

Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?

Can Maxwell's fish eye lens really give perfect imaging?

Does Distributionally Robust Supervised Learning Give Robust Classifiers?

Positive Hamiltonians can give purely exponential decay

Why do experts give simple advice?

Strategy in Ulam's Game and Tree Code Give Error-Resistant Protocols

A closed quantum system giving ergodicity