Search - ResearchTracker

Reinforcement learning (RL) agents under partial observability often condition actions on internally accumulated information such as memory or inferred latent context. We formalise such information-conditioned interaction patterns as behavioural dependency: variation in action selection with respect to internal information under fixed observations. This induces a probe-relative notion of $ε$-behavioural equivalence and a within-policy behavioural distance that quantifies probe sensitivity. We establish three structural results. First, the set of policies exhibiting non-trivial behavioural dependency is not closed under convex aggregation. Second, behavioural distance contracts under convex combination. Third, we prove a sufficient local condition under which gradient ascent on a skewed mixture objective decreases behavioural distance when a dominant-mode gradient aligns with the direction of steepest contraction. Minimal bandit and partially observable gridworld experiments provide controlled witnesses of these mechanisms. In the examined settings, behavioural distance decreases under convex aggregation and under continued optimisation with skewed latent priors, and in these experi

Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback

arXiv, 2026-03-24. Authors: Teerthaa Parakh, Karen M. Feigh

Human decision-making is strongly influenced by cognitive biases, particularly under conditions of uncertainty and risk. While prior work has examined bias in single-step decisions with immediate outcomes and in human interaction with a single autonomous agent, comparatively little attention has been paid to decision-making under delayed outcomes involving multiple AI agents, where decisions at each step affect subsequent states. In this work, we study how delayed outcomes shape decision-making and responsibility attribution in a multi-agent human-AI task. Using a controlled game-based experiment, we analyze how participants adjust their behavior following positive and negative outcomes. We observe asymmetric responses to gains and losses, with stronger corrective adjustments after negative outcomes. Importantly, participants often fail to correctly identify the actions that caused failure and misattribute responsibility across AI agents, leading to systematic revisions of decisions that are weakly related to the underlying causes of poor performance. We refer to this phenomenon as a form of attribution bias, manifested as biased error attribution under delayed feedback. Our findin

Search results: under

On the Structural Non-Preservation of Epistemic Behaviour under Policy Transformation

Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback

Coverage Guarantees for Pseudo-Calibrated Conformal Prediction under Distribution Shift

FARM: Few-shot Adaptive Malware Family Classification under Concept Drift

2BRobust -- Overcoming TCP BBR Performance Degradation in Virtual Machines under CPU Contention

Bosses, Kings, and the Commons: Cooperation Under Power Asymmetry in LLM Societies

Confident Learning for Object Detection under Model Constraints

Generative models for decision-making under distributional shift

Training Under Attentional Competition Produces Persistent Biases in Visual Appearance

Physically Grounded 3D Generative Reconstruction under Hand Occlusion using Proprioception and Multi-Contact Touch

Dynamic Incentivized Cooperation under Changing Rewards

Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

Explainability-Aware Evaluation of Transfer Learning Models for IoT DDoS Detection Under Resource Constraints

Net-Zero: A Comparative Study on Neural Network Design for Climate-Economic PDEs Under Uncertainty

Supervisory Control of Discrete Event Systems for Small Language Under Cyber Attacks

Soliton Dynamics of a Gauged Fokas-Lenells Equation Under Varying Effects of Dispersion and Nonlinearity

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

Discovering Causal Relationships using Proxy Variables under Unmeasured Confounding

Optimizing Flexible Complex Systems with Coupled and Co-Evolving Subsystems under Operational Uncertainties

Asymptotic breakdown point analysis of the minimum density power divergence estimator under independent non-homogeneous setups