Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions: binary-only feedback, no gradient access, and strict query budgets. We formalize this strict black-box threat model and propose a two-agent evasion framework operating in a semantic perturbation space. An Attacker Agent generates meaning-preserving rewrites while a Prompt Optimization Agent refines the attack strategy using only binary decision feedback within a 10-query budget. Evaluated against four evidence-based misinformation detection pipelines, the framework achieves evasion rates of 19.95% to 40.34% on modern large language model (LLM)-based systems, compared to at most 3.90% for token-level perturbation baselines that rely on surrogate models because they cannot operate under our threat model. A legacy system relying on static lexical retrieval exhibits near-total vulnerability (97.02%), establishing a lower bound that exposes how architectural choices govern the attack surface. Evasion effectiveness is associated with three architectural properties: evidence retrieval mechanis
Stylistic personalization - making LLMs write in a specific individual's style, rather than merely adapting to task preferences - lacks evaluation grounded in authorship science. We show that grounding evaluation in authorship verification theory transforms what benchmarks can measure. Drawing on three measurement traditions - LUAR, a trained authorship verification model; an LLM-as-judge with decoupled trait matching; and classical function-word stylometrics - we evaluate four inference-time personalization methods across 50 authors and 1,000 generations. The theory-grounded metric, LUAR, provides what ad hoc alternatives cannot: calibrated baselines, with a human ceiling of 0.756 and a cross-author floor of 0.626, that give scores absolute meaning. All methods score below this floor, from 0.484 to 0.508, exposing an authorship gap invisible to uncalibrated metrics. The three metrics produce near-zero pairwise correlations, with absolute r less than 0.07, confirming that without theoretical grounding, metric choice determines conclusions: an LLM judge declares a clear winner while LUAR finds no meaningful differentiation. These findings demonstrate the theory-benchmark cycle in ac
The SWE-Bench Verified leaderboard is approaching saturation, with the top system achieving 78.80%. However, we show that this performance is inflated. Our re-evaluation reveals that one in five "solved" patches from the top-30 agents is semantically incorrect, passing only because weak test suites fail to expose the errors. We present SWE-ABS, an adversarial framework that strengthens test suites through a two-stage pipeline: (1) coverage-driven augmentation using program slicing to target untested code regions, and (2) mutation-driven adversarial testing that synthesizes plausible but incorrect patches to expose semantic blind spots. On SWE-Bench Verified (500 instances), SWE-ABS strengthens 50.2% of instances, a 25.1x improvement over prior work, and rejects 19.71% of previously passing patches. As a result, the top agent's score decreases from 78.80% to 62.20%, leading to significant leaderboard reshuffling, with the previous top-ranked agent dropping to fifth place.
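To make the reported percentages concrete, the short calculation below converts them into instance counts on the 500-instance benchmark. It uses only the figures quoted in the abstract. The prior-work rate is inferred from the stated 25.1x ratio rather than reported directly, so treat it as an assumption.

```python
# Figures quoted in the abstract.
N_INSTANCES = 500
SCORE_BEFORE = 78.80   # top agent, percent resolved before strengthening
SCORE_AFTER = 62.20    # top agent, percent resolved after strengthening
STRENGTHENED = 50.2    # percent of instances whose tests were strengthened
IMPROVEMENT = 25.1     # stated ratio over prior work

solved_before = round(N_INSTANCES * SCORE_BEFORE / 100)   # 394 instances
solved_after = round(N_INSTANCES * SCORE_AFTER / 100)     # 311 instances
lost = solved_before - solved_after                       # 83 instances

strengthened_count = round(N_INSTANCES * STRENGTHENED / 100)  # 251 instances

# Inferred, not reported: the rate prior work would need for a 25.1x ratio.
implied_prior_rate = round(STRENGTHENED / IMPROVEMENT, 2)     # 2.0 percent

print(f"top agent: {solved_before} -> {solved_after} solved ({lost} rejected)")
print(f"share of the top agent's patches rejected: {lost / solved_before:.1%}")
print(f"instances strengthened: {strengthened_count}")
print(f"implied prior-work rate: {implied_prior_rate}%")
```

The top agent loses a slightly larger share of its patches than the 19.71% quoted in the abstract. The two numbers are measured over different patch sets, so they need not agree.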
The ACLU is suing two Florida police departments over the arrest of a Fort Myers man in a child-abduction case, saying officers treated a flawed face-recognition match as a near-certain ID
We illustrate shape mode analysis as a simple yet powerful technique to concisely describe complex biological shapes and their dynamics. We characterize undulatory bending waves of beating flagella and reconstruct a limit cycle of flagellar oscillations, paying particular attention to the periodicity of angular data. As a second example, we analyze non-convex boundary outlines of gliding flatworms, which allows us to expose stereotypic body postures that can be related to two different locomotion mechanisms. Further, shape mode analysis based on principal component analysis allows us to discriminate between different flatworm species, despite large motion-associated shape variability. Thus, complex shape dynamics are characterized by a small number of shape scores that change in time. We present this method using descriptive examples, explaining abstract mathematics in a graphic way.
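The idea of reducing a time series of shapes to a few shape scores can be sketched with principal component analysis on tangent-angle profiles. The example below is a minimal illustration on synthetic data, not the authors' pipeline. The traveling-wave signal, the noise level, and all variable names are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for tracked flagellar shapes (assumed, not real data):
# tangent angle psi(s, t) at n_s arclength positions for n_t frames.
n_s, n_t, n_periods = 50, 400, 8
s = np.linspace(0.0, 1.0, n_s)
t = np.linspace(0.0, n_periods, n_t, endpoint=False)
psi = 0.8 * np.cos(2 * np.pi * (s[None, :] - t[:, None]))  # traveling wave
psi += 0.02 * rng.standard_normal(psi.shape)               # measurement noise

# Subtract the time-averaged shape, then run PCA via SVD.
mean_shape = psi.mean(axis=0)
centered = psi - mean_shape
_, sing, vt = np.linalg.svd(centered, full_matrices=False)
explained = sing**2 / np.sum(sing**2)   # variance fraction per shape mode

# Keep the two dominant shape modes; each frame becomes two shape scores.
modes = vt[:2]
scores = centered @ modes.T             # shape (n_t, 2)

# The scores trace a closed loop (a limit cycle). Its polar angle is a phase,
# which must be unwrapped because angles are only defined modulo 2*pi.
phase = np.unwrap(np.arctan2(scores[:, 1], scores[:, 0]))

print(f"variance captured by two modes: {explained[:2].sum():.4f}")
print(f"phase advance over the recording: {abs(phase[-1] - phase[0]):.2f} rad")
```

A pure traveling wave is a sum of a cosine and a sine profile, which is why two modes suffice here. Real shape data would typically need a check of how many modes are required.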
Retrieval benchmarks are increasingly saturating, but we argue that efficient search is far from a solved problem. We identify a class of queries we call oblique, which seek documents that instantiate a latent pattern, like finding all tweets that express an implicit stance, chat logs that demonstrate a particular failure mode, or transcripts that match an abstract scenario. We study three mechanisms through which obliqueness may arise and introduce OBLIQ-Bench, a suite of five oblique search problems over real long-tail corpora. OBLIQ-Bench exposes an overlooked asymmetry between retrieval and verification, where reasoning LLMs reliably recognize latent relevance whenever relevant documents are surfaced, but even sophisticated retrieval pipelines fail to surface most relevant documents in the first place. We hope that OBLIQ-Bench will drive research into retrieval architectures that efficiently capture latent patterns and implicit signals in large corpora.
The widespread adoption of NoSQL databases has made digital forensics increasingly difficult as storage formats are diverse and often opaque, and audit logs cannot be assumed trustworthy when privileged insiders, such as DevOps or administrators, can disable, suppress, or manipulate logging to conceal activity. We present RADAR (Record & Artifact Detection, Alignment & Reporting), a log-adversary-aware framework that derives forensic ground truth by cross-referencing low-level storage artifacts against high-level application logs. RADAR analyzes artifacts reconstructed by the Automated NoSQL Carver (ANOC), which infers layouts and carves records directly from raw disk bytes, bypassing database APIs and the management system entirely, thereby treating physical storage as the independent evidence source. RADAR then reconciles carved artifacts with the audit log to identify delta artifacts such as unlogged insertions, silent deletions, and field-level updates that exist on disk but are absent from the logical history. We evaluate RADAR across ten NoSQL engines, including BerkeleyDB, LMDB, MDBX, etcd, ZODB, Durus, LiteDB, Realm, RavenDB, and NitriteDB, spanning key-value and do
The emergence of a hantavirus variant aboard a commercial cruise ship presents a significant public health concern. This study develops a discrete-time stochastic Susceptible-Exposed-Infectious-Recovered-Dead model to estimate transmission dynamics, hidden exposed infections, and outbreak risk among passengers and crew. Epidemiological parameters and latent disease states were inferred using an Ensemble Adjustment Kalman Filter calibrated to reported case data from WHO and ECDC situation reports. The estimated basic reproduction number was 2.76, with a 95% confidence interval of 2.52-2.99, indicating substantial potential for sustained onboard transmission before strict quarantine measures. Simulations further suggest that several exposed individuals may remain unidentified during the early outbreak phase, creating a hidden reservoir that symptom-based surveillance alone may fail to detect. These findings highlight the importance of rapid surveillance, widespread testing, targeted quarantine, and active monitoring of exposed individuals in confined travel settings. The proposed modeling framework can support timely outbreak assessment and intervention planning for infectious-disea
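A discrete-time stochastic SEIRD model of the kind named above can be sketched with binomial draws for each daily transition. This is a generic textbook formulation, not the study's model, and it omits the Kalman-filter inference step entirely. Only the reproduction number 2.76 comes from the abstract. The population size, latent and infectious periods, and fatality fraction are placeholder assumptions.

```python
import numpy as np

def simulate_seird(n_days, pop, e0, r0, latent_days, infectious_days, cfr, rng):
    """Daily binomial-chain SEIRD simulation; returns rows of (S, E, I, R, D)."""
    p_onset = 1.0 - np.exp(-1.0 / latent_days)       # daily E -> I probability
    p_exit = 1.0 - np.exp(-1.0 / infectious_days)    # daily probability of leaving I
    beta = r0 / infectious_days                      # approximate transmission rate
    s, e, i, r, d = pop - e0, e0, 0, 0, 0
    history = []
    for _ in range(n_days):
        living = s + e + i + r
        p_inf = 1.0 - np.exp(-beta * i / living) if living > 0 else 0.0
        new_e = rng.binomial(s, p_inf)       # new exposures
        new_i = rng.binomial(e, p_onset)     # exposed becoming infectious
        out_i = rng.binomial(i, p_exit)      # infectious resolving today
        new_d = rng.binomial(out_i, cfr)     # resolved cases that die
        new_r = out_i - new_d                # resolved cases that recover
        s -= new_e
        e += new_e - new_i
        i += new_i - out_i
        r += new_r
        d += new_d
        history.append((s, e, i, r, d))
    return np.array(history)

# Placeholder values for illustration only, apart from r0 = 2.76.
rng = np.random.default_rng(42)
history = simulate_seird(n_days=120, pop=3000, e0=5, r0=2.76,
                         latent_days=14.0, infectious_days=7.0, cfr=0.1, rng=rng)
peak_exposed = history[:, 1].max()
print(f"peak number of exposed (not yet infectious) individuals: {peak_exposed}")
```

The E column is the hidden reservoir that the abstract refers to: these individuals are infected but not yet symptomatic, so symptom-based surveillance would not count them.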
Mixture-of-Experts (MoE) has demonstrated strong performance in video understanding tasks, yet its adversarial robustness remains underexplored. Existing attack methods often treat MoE as a unified architecture, overlooking the independent and collaborative weaknesses of key components such as routers and expert modules. To fill this gap, we propose Temporal Lipschitz-Guided Attacks (TLGA) to thoroughly investigate component-level vulnerabilities in video MoE models. We first design attacks on the router, revealing its independent weaknesses. Building on this, we introduce Joint Temporal Lipschitz-Guided Attacks (J-TLGA), which collaboratively perturb both routers and experts. This joint attack significantly amplifies adversarial effects and exposes the Achilles' Heel (collaborative weaknesses) of the MoE architecture. Based on these insights, we further propose Joint Temporal Lipschitz Adversarial Training (J-TLAT). J-TLAT performs joint training to further defend against collaborative weaknesses, enhancing component-wise robustness. Our framework is plug-and-play and reduces inference cost by more than 60% compared with dense models. It consistently enhances adversarial robustnes
Public debate links worsening job prospects for AI-exposed occupations to the release of ChatGPT in late 2022. Using monthly U.S. unemployment insurance records, we measure occupation- and location-specific unemployment risk and find that risk rose in AI-exposed occupations beginning in early 2022, months before ChatGPT. Analyzing millions of LinkedIn profiles, we show that graduate cohorts from 2021 onward entered AI-exposed jobs at lower rates than earlier cohorts, with gaps opening before late 2022. Finally, from millions of university syllabi, we find that graduates taking more AI-exposed curricula had higher first-job pay and shorter job searches after ChatGPT. Together, these results point to forces pre-dating generative AI and to the ongoing value of LLM-relevant education.
Electrodynamic balances (EDBs) have been widely used to investigate reactions between levitated particles and background gases. In this paper, we report the development of an EDB that exposes trapped particles to alkali-metal vapor. The apparatus was developed principally to investigate the interactions between such vapor and the paraffin used as a spin anti-relaxation coating for alkali-metal vapor cells by atomic physicists. The trap electrodes of the EDB were installed in a vacuum glass cell. Particles were loaded via laser launching, without venting or contaminating the cell. Alkali-metal vapor was released from a dedicated dispenser. We found changes in the charge-to-mass ratios of trapped particles irradiated with ultraviolet light after exposure to alkali-metal vapor. These results demonstrate the utility of the apparatus.
A convex cone is said to be projectionally exposed (p-exposed) if every face arises as a projection of the original cone. It is known that, in dimension at most four, the intersection of two p-exposed cones is again p-exposed. In this paper we construct two p-exposed cones in dimension $5$ whose intersection is not p-exposed. This construction also leads to the first example of an amenable cone that is not projectionally exposed, showing that these properties, which coincide in dimension at most $4$, are distinct in dimension $5$. In order to achieve these goals, we develop a new technique for constructing arbitrarily tight inner convex approximations of compact convex sets with desired facial structure. These inner approximations have the property that all proper faces are extreme points, with the exception of a specific exposed face of the original set.
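For reference, the verbal definition above can be stated formally as follows. This is the standard formulation, with "projection" meaning an idempotent linear map that need not be orthogonal. The symbols $K$, $F$, and $P$ are notation chosen here, not taken from the abstract.

```latex
% Face: a convex subcone F of K that absorbs summands.
F \trianglelefteq K
\iff
\bigl( x, y \in K,\; x + y \in F \;\Longrightarrow\; x, y \in F \bigr).

% Projectional exposedness: every face is the image of K under a projection.
K \text{ is p-exposed}
\iff
\forall\, F \trianglelefteq K \;\;
\exists\, P \text{ linear with } P^2 = P \text{ and } P(K) = F .
```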
A criterion for a point in the unit ball to be a strongly exposed point is given. Necessary and sufficient conditions for Orlicz-Lorentz spaces to possess the strongly exposed property are established. In addition, some useful methods for handling issues related to decreasing rearrangements are obtained.
We provide a complete and explicit characterization of the exposed extreme rays of the cone of sums of nonnegative circuit (SONC) polynomials. The criterion we derive is purely combinatorial and depends only on the existence of certain circuits within the ground set and on the nature of the corresponding extreme ray. Our constructive proofs also yield explicit exposing functionals, offering a basis for algorithmic detection of exposed rays in SONC-based optimization.
The rise of deep learning in image classification has brought unprecedented accuracy but also highlighted a key issue: the use of 'shortcuts' by models. Such shortcuts are easy-to-learn patterns from the training data that fail to generalise to new data. Examples include the use of a copyright watermark to recognise horses, snowy background to recognise huskies, or ink markings to detect malignant skin lesions. The explainable AI (XAI) community has suggested using instance-level explanations to detect shortcuts without external data, but this requires the examination of many explanations to confirm the presence of such shortcuts, making it a labour-intensive process. To address these challenges, we introduce Counterfactual Frequency (CoF) tables, a novel approach that aggregates instance-based explanations into global insights, and exposes shortcuts. The aggregation implies the need for some semantic concepts to be used in the explanations, which we solve by labelling the segments of an image. We demonstrate the utility of CoF tables across several datasets, revealing the shortcuts learned from them.
Backdoor attacks covertly implant triggers into deep neural networks (DNNs) by poisoning a small portion of the training data with pre-designed backdoor triggers. This vulnerability is exacerbated in the era of large models, where extensive (pre-)training on web-crawled datasets is susceptible to compromise. In this paper, we introduce a novel two-step defense framework named Expose Before You Defend (EBYD). EBYD unifies existing backdoor defense methods into a comprehensive defense system with enhanced performance. Specifically, EBYD first exposes the backdoor functionality in the backdoored model through a model preprocessing step called backdoor exposure, and then applies detection and removal methods to the exposed model to identify and eliminate the backdoor features. In the first step of backdoor exposure, we propose a novel technique called Clean Unlearning (CUL), which proactively unlearns clean features from the backdoored model to reveal the hidden backdoor features. We also explore various model editing/modification techniques for backdoor exposure, including fine-tuning, model sparsification, and weight perturbation. Using EBYD, we conduct extensive experiments on 10 im
Trusted Platform Modules (TPMs) constitute an integral building block of modern security features. Moreover, as Windows 11 made TPM 2.0 mandatory, they are subject to ever-increasing academic scrutiny. While discrete TPMs (dTPMs) - as found in higher-end systems - have been susceptible to attacks on their exposed communication interface, more common firmware TPMs (fTPMs) are immune to this attack vector as they do not communicate with the CPU via an exposed bus. In this paper, we analyze a new class of attacks against fTPMs: attacking their Trusted Execution Environment (TEE) can lead to a full TPM state compromise. We experimentally verify this attack by compromising the AMD Secure Processor, which constitutes the TEE for AMD's fTPMs. In contrast to previous dTPM sniffing attacks, this vulnerability exposes the complete internal TPM state of the fTPM. It allows us to extract any cryptographic material stored or sealed by the fTPM regardless of authentication mechanisms such as Platform Configuration Register validation or passphrases with anti-hammering protection. First, we demonstrate the impact of our findings by - to the best of our knowledge - enabling the first attack against Full Disk Enc
Generative text-to-image (TTI) models produce high-quality images from short textual descriptions and are widely used in academic and creative domains. Like humans, TTI models have a worldview, a conception of the world learned from their training data and task that influences the images they generate for a given prompt. However, the worldviews of TTI models are often hidden from users, making it challenging for users to build intuition about TTI outputs, and they are often misaligned with users' worldviews, resulting in output images that do not match user expectations. In response, we introduce DiffusionWorldViewer, an interactive interface that exposes a TTI model's worldview across output demographics and provides editing tools for aligning output images with user perspectives. In a user study with 18 diverse TTI users, we find that DiffusionWorldViewer helps users represent their varied viewpoints in generated images and challenge the limited worldview reflected in current TTI models.
A menu description exposes strategyproofness by presenting a mechanism to player $i$ in two steps. Step (1) uses others' reports to describe $i$'s menu of potential outcomes. Step (2) uses $i$'s report to select $i$'s favorite outcome from her menu. We provide novel menu descriptions of the Deferred Acceptance (DA) and Top Trading Cycles (TTC) matching mechanisms. For TTC, our description additionally yields a proof of the strategyproofness of TTC's traditional description, in a way that we prove is impossible for DA.
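The two-step structure can be illustrated by brute force on a small example. The sketch below is not one of the paper's menu descriptions. It computes player $i$'s menu by enumerating every report $i$ could make while holding others' reports fixed, then checks that the mechanism gives $i$ her favorite outcome from that menu. The mechanism used is the housing-market variant of TTC, in which agent $a$ initially owns house $a$. The function names and the example preferences are assumptions made for illustration.

```python
from itertools import permutations

def ttc(prefs):
    """Top Trading Cycles for a housing market where agent a owns house a.

    prefs[a] lists houses from best to worst. Returns {agent: house}.
    """
    remaining = set(prefs)
    assignment = {}
    while remaining:
        # Each remaining agent points at her favorite remaining house,
        # which is identified with that house's owner.
        points_to = {a: next(h for h in prefs[a] if h in remaining)
                     for a in remaining}
        # Follow pointers until an agent repeats; that closes a cycle.
        a = next(iter(remaining))
        path = []
        while a not in path:
            path.append(a)
            a = points_to[a]
        cycle = path[path.index(a):]
        for b in cycle:
            assignment[b] = points_to[b]   # trade along the cycle
        remaining -= set(cycle)
    return assignment

def menu(mechanism, prefs, i):
    """Step (1): outcomes i can obtain, computed from others' reports only."""
    houses = list(prefs)
    return {mechanism({**prefs, i: list(r)})[i] for r in permutations(houses)}

def favorite_from_menu(prefs, i, menu_i):
    """Step (2): i's most preferred outcome within her menu."""
    return next(h for h in prefs[i] if h in menu_i)

prefs = {0: [1, 0, 2], 1: [0, 2, 1], 2: [0, 1, 2]}
outcome = ttc(prefs)
for i in prefs:
    m = menu(ttc, prefs, i)
    assert outcome[i] == favorite_from_menu(prefs, i, m)
    print(f"agent {i}: menu {sorted(m)}, receives house {outcome[i]}")
```

Because the menu is built without looking at $i$'s own report, and $i$ always receives her favorite item from it, no misreport can help her. That is the sense in which a menu description exposes strategyproofness. Enumerating all reports is exponential, so this serves only as a check on small instances.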
In situations with non-manipulable exposures, interventions can be targeted to shift the distribution of intermediate variables between exposure groups to define interventional disparity indirect effects. In this work, we present a theoretical study of identification and nonparametric estimation of the interventional disparity indirect effect among the exposed. The targeted estimand is intended for applications examining the outcome risk among an exposed population for which the risk is expected to be reduced if the distribution of a mediating variable were changed by a (hypothetical) policy or health intervention that targets the exposed population specifically. We derive the nonparametric efficient influence function, study its double robustness properties, and present a targeted minimum loss-based estimation (TMLE) procedure. All theoretical results and algorithms are provided for both uncensored and right-censored survival outcomes. Taking the ongoing discussion of the interpretation of non-manipulable exposures as our starting point, we discuss relevant interpretations of the estimand under different sets of assumptions of no unmeasured confounding and provide a comparison of our estimand to