Reinforcement learning (RL) agents under partial observability often condition actions on internally accumulated information such as memory or inferred latent context. We formalise such information-conditioned interaction patterns as behavioural dependency: variation in action selection with respect to internal information under fixed observations. This induces a probe-relative notion of $ε$-behavioural equivalence and a within-policy behavioural distance that quantifies probe sensitivity. We establish three structural results. First, the set of policies exhibiting non-trivial behavioural dependency is not closed under convex aggregation. Second, behavioural distance contracts under convex combination. Third, we prove a sufficient local condition under which gradient ascent on a skewed mixture objective decreases behavioural distance when a dominant-mode gradient aligns with the direction of steepest contraction. Minimal bandit and partially observable gridworld experiments provide controlled witnesses of these mechanisms. In the examined settings, behavioural distance decreases under convex aggregation and under continued optimisation with skewed latent priors, and in these experi
Human decision-making is strongly influenced by cognitive biases, particularly under conditions of uncertainty and risk. While prior work has examined bias in single-step decisions with immediate outcomes and in human interaction with a single autonomous agent, comparatively little attention has been paid to decision-making under delayed outcomes involving multiple AI agents, where decisions at each step affect subsequent states. In this work, we study how delayed outcomes shape decision-making and responsibility attribution in a multi-agent human-AI task. Using a controlled game-based experiment, we analyze how participants adjust their behavior following positive and negative outcomes. We observe asymmetric responses to gains and losses, with stronger corrective adjustments after negative outcomes. Importantly, participants often fail to correctly identify the actions that caused failure and misattribute responsibility across AI agents, leading to systematic revisions of decisions that are weakly related to the underlying causes of poor performance. We refer to this phenomenon as a form of attribution bias, manifested as biased error attribution under delayed feedback. Our findin
Conformal prediction (CP) offers distribution-free marginal coverage guarantees under an exchangeability assumption, but these guarantees can fail if the data distribution shifts. We analyze the use of pseudo-calibration as a tool to counter this performance loss under a bounded label-conditional covariate shift model. Using tools from domain adaptation, we derive a lower bound on target coverage in terms of the source-domain loss of the classifier and a Wasserstein measure of the shift. Using this result, we provide a method to design pseudo-calibrated sets that inflate the conformal threshold by a slack parameter to keep target coverage above a prescribed level. Finally, we propose a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels as a function of classifier uncertainty. Numerical experiments show that our bounds qualitatively track pseudo-calibration behavior and that the source-tuned scheme mitigates coverage degradation under distribution shift while maintaining nontrivial prediction set sizes.
Communities can sustainably manage shared resources (commons) through self-governance and cooperative norms, a central finding of Ostrom's theory of self-governance. However, real-world commons (e.g., fisheries, forests, and irrigation systems) are often governed under asymmetric power structures, where certain individuals or institutions possess disproportionate control over resource extraction and collective outcomes. As Large Language Models (LLMs) are increasingly explored as agents in synthetic governance simulations, understanding how LLM societies behave under asymmetric power structures is becoming increasingly important, yet existing evaluations largely ignore such asymmetries. We introduce Sovereignty over the Commons Simulation (SovSim), a generative multi-agent simulation framework that incorporates an agent with asymmetric power (boss or king) into a society of symmetric agents (workers or peasants), where all agents extract from a shared resource, collectively determining its sustainability over time. Across eleven state-of-the-art models, we find that introducing asymmetric power leads to severe breakdowns in cooperation and sustainability, with up to an 87.3% degrad
We introduce c-Lattice Aggregation, a fault-tolerant reconstruction problem for distributed verification under crash and Byzantine failures. In our setting, n asynchronous processes supervise a concurrent execution I: each process holds a local sample, and must collaboratively reconstruct I from partial, potentially overlapping observations. A protocol solves c-Lattice Aggregation if at least c correct processes output the complete execution I, while all correct outputs are comparable and bounded by I. This strengthens Lattice Agreement [Attiya, Herlihy and Rachman, 1995] and Byzantine Lattice Agreement [Di Luna et al., 2020; Zheng and Garg, 2020]. We parameterize inputs by a redundancy parameter x -- every element of I appears in at least x initial samples -- and establish tight feasibility thresholds. Under crash failures with at most t faulty processes, Lattice Aggregation is solvable if and only if x >= t + 1. Under Byzantine failures with t < n/3, c-Lattice Aggregation is solvable if and only if x >= 2t + c. All bounds are tight: we present matching algorithms based on SCD-broadcast [Imbs et al., 2018; Khanchandani and Wattenhofer, 2024] and indistinguishability-based
Motivated by the recent introduction and large-scale deployment of BBR congestion control algorithms, multiple studies have investigated the performance and fairness implications of this shift from loss-based to delay-based congestion control. Given the potential Internet-wide adoption of BBR, we must also consider its robustness in network and system scenarios. One such scenario is Cloud-based Virtual Machine (VM) networking - highly relevant in today's CDN-centric Internet. Interestingly, previous work has shown significant performance problems of BBRv1-2 running in Xen VMs, with BBR performance dropping to almost zero when CPU credit is low. In this paper, we develop a framework for measuring TCP throughput under fully controlled CPU contention, which uses Linux deadline scheduling to emulate generalized CPU contention conditions. Our measurements reveal that - in stark contrast to Cubic! - BBR throughput can break down during CPU contention under any hypervisor and all tested BDP conditions. Characterizing this performance degradation on a fine-granular level, we show that CPU limited BBR senders are capped at very low throughput levels below 10-20 Mbps. This finding implies th
Malware classification models often suffer performance degradation under concept drift due to evolving threat landscapes and the emergence of novel malware families. This paper presents FARM (Few-shot Adaptive Recognition of Malware), a unified framework for detecting and adapting to both covariate drift and label drift in Windows Portable Executable (PE) malware family classification. FARM uses a triplet autoencoder to project samples into a discriminative latent space, enabling unsupervised drift detection through DBSCAN clustering and dynamic thresholding. To enable rapid adaptation, the framework employs a few-shot strategy that can incorporate new classes from only a small number of labeled samples. FARM also supports full retraining when sufficient drifted samples accumulate, allowing longer-term model updating. Experiments on the BenchMFC dataset show that FARM improves classification performance under covariate drift by 5.6%, and achieves an average F1 score of 0.85 on unseen malware families using few-shot adaptation, increasing to 0.94 after retraining. These results indicate that FARM provides an effective approach for drift-aware malware family classification in dynamic
Selective attention can momentarily alter visual appearance, but can such effects be learned? We tested whether training attention under sensory competition produces lasting changes in perceived contrast. Across seven days, participants trained on an orientation task with a fixed target location, with or without a salient distractor. Before and after training, we measured the point of subjective equality (PSE). Training under competition produced a reliable push-pull shift. Stimuli at the trained location appeared higher in contrast, whereas stimuli at the untrained location appeared lower. Conversely, training without distractors improved performance but did not alter appearance. Crucially, these opponent shifts were robust to task variations, persisting even in equality judgments designed to minimize response bias. Furthermore, the effect generalized to stimuli with novel orientation and contrast levels. These findings demonstrate that resolving sensory competition does not merely improve discrimination, but durably recalibrates the subjective appearance of the visual world.
Polymer chain scission is a key mechanism for fracture of soft materials. It is well known from single-molecule force spectroscopy experiments that the critical condition for chain scission depends on the loading rate and other environmental effects (e.g., temperature and solvent). Common approaches to describing the kinetics of chain scission often assume force-controlled conditions, that is, when a polymer chain is stretched by a prescribed force. As a result of this assumption, chain scission is irreversible, excluding the possibility of healing. In many soft materials, however, self-healing has been observed after fracture, suggesting possibly reversible chain scission. Here, we show that reversible chain scission is possible under displacement-controlled conditions, that is, when a polymer chain is stretched with a prescribed end-to-end distance. We present a breakable freely-jointed chain model, assuming that a polymer chain breaks when one of its links breaks while the other links remain nearly rigid. At a prescribed end-to-end distance, the free energy of the chain has two local minima and a local maximum (the transition state), giving rise to energy barriers for chain scis
Distributed denial-of-service (DDoS) attacks threaten the availability of Internet of Things (IoT) infrastructures, particularly under resource-constrained deployment conditions. Although transfer learning models have shown promising detection accuracy, their reliability, computational feasibility, and interpretability in operational environments remain insufficiently explored. This study presents an explainability-aware empirical evaluation of seven pre-trained convolutional neural network architectures for multi-class IoT DDoS detection using the CICDDoS2019 dataset and an image-based traffic representation. The analysis integrates performance metrics, reliability-oriented statistics (MCC, Youden Index, confidence intervals), latency and training cost assessment, and interpretability evaluation using Grad-CAM and SHAP. Results indicate that DenseNet and MobileNet-based architectures achieve strong detection performance while demonstrating superior reliability and compact, class-consistent attribution patterns. DenseNet169 offers the strongest reliability and interpretability alignment, whereas MobileNetV3 provides an effective latency-accuracy trade-off for fog-level deployment.
Large Language Model (LLM) cascade systems are designed to balance efficiency and performance by processing queries with lightweight models while selectively escalating complex cases to more powerful ones. Such systems seek to reduces computational cost and latency while maintaining task performance, making it an appealing choice for large-scale deployment. However, the cascade design introduces new vulnerabilities through an expanded attack surface: the inclusion of lightweight front-end models and internal decision mechanisms introduces new weaknesses. In this work, we present the first study demonstrating that LLM cascade systems are susceptible to targeted adversarial manipulation, which disrupts both performance objectives and the intended cost advantages of the cascade design. We propose a novel attack framework that employs constrained sequential collaborative optimization of adversarial suffix under cascade dependencies, enabling simultaneous exploitation of lightweight models and decision mechanisms. This framework adapts to adversaries with varying capabilities, inducing controllable degradation in both cost-efficiency and accuracy. Unlike prior attacks targeting standalo
Peer incentivization (PI) is a popular multi-agent reinforcement learning approach where all agents can reward or penalize each other to achieve cooperation in social dilemmas. Despite their potential for scalable cooperation, current PI methods heavily depend on fixed incentive values that need to be appropriately chosen with respect to the environmental rewards and thus are highly sensitive to their changes. Therefore, they fail to maintain cooperation under changing rewards in the environment, e.g., caused by modified specifications, varying supply and demand, or sensory flaws - even when the conditions for mutual cooperation remain the same. In this paper, we propose Dynamic Reward Incentives for Variable Exchange (DRIVE), an adaptive PI approach to cooperation in social dilemmas with changing rewards. DRIVE agents reciprocally exchange reward differences to incentivize mutual cooperation in a completely decentralized way. We show how DRIVE achieves mutual cooperation in the general Prisoner's Dilemma and empirically evaluate DRIVE in more complex sequential social dilemmas with changing rewards, demonstrating its ability to achieve and maintain cooperation, in contrast to curr
We generalize the convergence results of an explosive autoregression, pioneered in Anderson (1959), in three ways: First, we demonstrate that the centered least-squares estimator converges geometrically to a ratio of limits, even in settings where the innovations are correlated and not centered around zero. Secondly, we demonstrate that the requirement of independent innovations in Anderson (1959), Theorem 2.3, can be relaxed to $α$-mixing. Third, we provide an autocorrelation-robust feasible test statistic for the explosive parameter under Gaussian ARMA innovations.
We introduce Ergodic Deviation-Robust Equilibrium (EDRE), a dynamics-relative equilibrium concept for repeated finite games in which agents learn via entropic mirror descent (EMD). EDRE requires three properties to hold simultaneously for the same profile and learning run: (E1) the limit profile is an $\varepsilon$-Nash equilibrium at a product distribution; (E2) along the entire learning trajectory, every fixed coalition's cumulative aggregate (summed-unilateral) deviation gain is $\tilde{\mathcal{O}}(\sqrt{T})$ with high probability; and (E3) the limit profile is a fixed point of the EMD map, so that it is selected by the dynamics rather than merely certified as an equilibrium. We prove that the $\sqrt{T}$ deviation-regret rate is order-tight, establish existence in exact-potential games (via Nash's theorem, with a constructive proximal route under concavity) together with Lyapunov monotonicity of EMD (and pointwise convergence when the fixed-point set is a singleton), and extend the selection property to monotone polymatrix games through variational inequalities. Although a static EDRE coincides with an $\varepsilon$-Nash equilibrium, its content is dynamic: robust (positive-mea
Inferring causal relationships between variable pairs in the observational study is crucial but challenging, due to the presence of unmeasured confounding. While previous methods employed the negative controls to adjust for the confounding bias, they were either restricted to the discrete setting (i.e., all variables are discrete) or relied on strong assumptions for identification. To address these problems, we develop a general nonparametric approach that accommodates both discrete and continuous settings for testing causal hypothesis under unmeasured confounders. By using only a single negative control outcome (NCO), we establish a new identification result based on a newly proposed integral equation that links the outcome and NCO, requiring only the completeness and mild regularity conditions. We then propose a kernel-based testing procedure that is more efficient than existing moment-restriction methods. We derive the asymptotic level and power properties for our tests. Furthermore, we examine cases where our procedure using only NCO fails to achieve identification, and introduce a new procedure that incorporates a negative control exposure (NCE) to restore identifiability. We
Agricultural weed detection on edge devices is subject to strict constraints on model capacity, computational resources, and real-time inference latency, which prevent performance improvements through model scaling or ensembling. This paper proposes Model-Driven Data Correction (MDDC), a data-centric framework that enhances detection performance by iteratively diagnosing and correcting data quality deficiencies. An automated error analysis procedure categorizes detection failures into four types: false negatives, false positives, class confusion, and localization errors. These error patterns are systematically addressed through a structured train-fix-retrain pipeline with version-controlled data management. Experimental results on multiple weed detection datasets demonstrate consistent improvements of 5-25 percent in mAP at 0.5 using a fixed lightweight detector (YOLOv8n), indicating that systematic data quality optimization can effectively alleviate performance bottlenecks under fixed model capacity constraints.
Large Language Models (LLMs) have shown remarkable progress in multiple-choice question answering (MCQA), but their inherent unreliability, such as hallucination and overconfidence, limits their application in high-risk domains. To address this, we propose a frequency-based uncertainty quantification method under black-box settings, leveraging conformal prediction (CP) to ensure provable coverage guarantees. Our approach involves multiple independent samplings of the model's output distribution for each input, with the most frequent sample serving as a reference to calculate predictive entropy (PE). Experimental evaluations across six LLMs and four datasets (MedMCQA, MedQA, MMLU, MMLU-Pro) demonstrate that frequency-based PE outperforms logit-based PE in distinguishing between correct and incorrect predictions, as measured by AUROC. Furthermore, the method effectively controls the empirical miscoverage rate under user-specified risk levels, validating that sampling frequency can serve as a viable substitute for logit-based probabilities in black-box scenarios. This work provides a distribution-free model-agnostic framework for reliable uncertainty quantification in MCQA with guaran
Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation models, rather than relying on model-specific fingerprints. In this respect, we hypothesize that while large language models excel at local semantic consistency, their autoregressive nature results in a specific kind of structural fragility compared to human writing. We propose Luminol-AIDetect, a novel, zero-shot statistical approach that exposes this fragility through coherence disruption. By applying a simple randomized text-shuffling procedure, we demonstrate that the resulting shift in perplexity serves as a principled, model-agnostic discriminant, as MGT displays a characteristic dispersion in perplexity-under-shuffling that differs markedly from the more stable structural variability of human-written text. Luminol-AIDetect leverages this distinction to inform its decision process, where a handful of perplexity-based scalar features are extracted from an input text and its shuffled version, then detection is performed via density estimation and ensemble-based prediction. Evaluated across 8 content domains, 11 adversarial attack types, and 18 languages, Luminol-AIDetect
Cyber attacks are unavoidable in networked discrete event systems where the plant and the supervisor communicate with each other via networks. Because of the nondeterminism in observation and control caused by cyber attacks, the language generated by the supervised system becomes nondeterministic. The small language is defined as the lower bound on all possible languages that can be generated by the supervised system, which is needed for a supervised system to perform some required tasks under cyber attacks. In this paper, we investigate supervisory control for the small language. After introducing CA-S-controllability and CA-S-observability, we prove that the supervisory control problem of achieving a required small language is solvable if and only if the given language is CA-Scontrollable and CA-S-observable. If the given language is not CA-S controllable and/or CA-S-observable, we derive conditions under which the infimal CA-S-controllable and CA-S-observable superlanguage exists and can be used to design a supervisor satisfying the given requirement.
The paper develops a novel design optimization framework and associated computational techniques for staged deployment optimization of complex systems under operational uncertainties. It proposes a local scenario discretization method that offers a computationally efficient approach to optimize staged co-deployment of multiple coupled subsystems by decoupling weak dynamic interaction among subsystems. The proposed method is applied to case studies and is demonstrated to provide an effective and scalable strategy to determine the optimal and flexible systems design under uncertainty. The developed optimization framework is expected to improve the staged deployment design of various complex engineering systems, such as water, energy, food, and other infrastructure systems.