Search - ResearchTracker

The adoption of AI agents is increasing rapidly. Terminal AI agents, i.e., AI agents that run in terminal environments, are a widely used type of AI agent. They rely heavily on shell command execution to interact with the host system, and they adopt a three-list command-gating mechanism to mitigate the security risks that command execution introduces, with denylists serving as the load-bearing component. However, modern operating systems often ship a large, ever-expanding set of shell commands with complex functionality. Our observation is that even a built-in denylist of Claude Code, well maintained by its developers, can overlook bypass commands that invalidate its effectiveness. Such oversights lead to fragile command denylists that fail to block operations that practitioners expect them to block. This paper presents the first systematic characterization of command denylist fragility in terminal AI agents. The paper formalizes the command denylist fragility problem and proposes an LLM-driven pipeline, ShellSieve, to detect such fragility. It prompts the LLM to propose possible bypasses and iteratively repairs them using feedback from a validator that executes them i

Commander-GPT: Dividing and Routing for Multimodal Sarcasm Detection

arXiv, 2025-06-24. Authors: Yazhou Zhang, Chunwang Zou, Bo Wang

Multimodal sarcasm understanding is a high-order cognitive task. Although large language models (LLMs) have shown impressive performance on many downstream NLP tasks, growing evidence suggests that they struggle with sarcasm understanding. In this paper, we propose Commander-GPT, a modular decision routing framework inspired by military command theory. Rather than relying on a single LLM's capability, Commander-GPT orchestrates a team of specialized LLM agents, each of which is selectively assigned to a focused sub-task such as keyword extraction or sentiment analysis. Their outputs are then routed back to the commander, which integrates the information and performs the final sarcasm judgment. To coordinate these agents, we introduce three types of centralized commanders: (1) a trained lightweight encoder-based commander (e.g., multi-modal BERT); (2) four small autoregressive language models, serving as moderately capable commanders (e.g., DeepSeek-VL); (3) two large LLM-based commanders (Gemini Pro and GPT-4o) that perform task routing, output aggregation, and sarcasm decision-making in a zero-shot fashion. We evaluate Commander-GPT on the MMSD and MMSD 2.0 benchmarks, c

Search results: Command

One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents

Commander-GPT: Dividing and Routing for Multimodal Sarcasm Detection

CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research

Revealing NVIDIA Closed-Source Driver Command Streams for CPU-GPU Runtime Behavior Insight

SCENIC: Semantic-Conditioned Edge-Aware Neural Framework for Structured IoT Command Generation

RACONTEUR: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer

A Numerical Investigation of Extremum-Seeking-Based Command Generation for Adaptively Controlled Systems

Improving Pretrained YAMNet for Enhanced Speech Command Detection via Transfer Learning

SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations

Command-line Risk Classification using Transformer-based Neural Architectures

Admittance-Guided Inverter Dispatch Command Manipulation Attack: A Grid Stability-Oriented Approach

Multimodal Deep Learning for ATCO Command Lifecycle Modeling and Workload Prediction

The Command Line GUIde: Graphical Interfaces from Man Pages via AI

Speech Command + Speech Emotion: Exploring Emotional Speech Commands as a Compound and Playful Modality

Commanding the Foul Shot: A New Ensemble of Free Throw Metrics

CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents

Evaluating Synthetic Command Attacks on Smart Voice Assistants

Hello Afrika: Speech Commands in Kinyarwanda

Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning

Minimizing Sequential Confusion Error in Speech Command Recognition