Search - ResearchTracker

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions: binary-only feedback, no gradient access, and strict query budgets. We formalize this strict black-box threat model and propose a two-agent evasion framework operating in a semantic perturbation space. An Attacker Agent generates meaning-preserving rewrites while a Prompt Optimization Agent refines the attack strategy using only binary decision feedback within a 10-query budget. Evaluated against four evidence-based misinformation detection pipelines, the framework achieves evasion rates of 19.95% to 40.34% on modern large language model (LLM)-based systems, compared to at most 3.90% for token-level perturbation baselines that rely on surrogate models because they cannot operate under our threat model. A legacy system relying on static lexical retrieval exhibits near-total vulnerability (97.02%), establishing a lower bound that exposes how architectural choices govern the attack surface. Evasion effectiveness is associated with three architectural properties: evidence retrieval mechanis

Theory-Grounded Evaluation Exposes the Authorship Gap in LLM Personalization

arXiv, 2026-04-29. Author: Yash Ganpat Sawant

Stylistic personalization - making LLMs write in a specific individual's style, rather than merely adapting to task preferences - lacks evaluation grounded in authorship science. We show that grounding evaluation in authorship verification theory transforms what benchmarks can measure. Drawing on three measurement traditions - LUAR, a trained authorship verification model; an LLM-as-judge with decoupled trait matching; and classical function-word stylometrics - we evaluate four inference-time personalization methods across 50 authors and 1,000 generations. The theory-grounded metric, LUAR, provides what ad hoc alternatives cannot: calibrated baselines, with a human ceiling of 0.756 and a cross-author floor of 0.626, that give scores absolute meaning. All methods score below this floor, from 0.484 to 0.508, exposing an authorship gap invisible to uncalibrated metrics. The three metrics produce near-zero pairwise correlations, with absolute r less than 0.07, confirming that without theoretical grounding, metric choice determines conclusions: an LLM judge declares a clear winner while LUAR finds no meaningful differentiation. These findings demonstrate the theory-benchmark cycle in ac

Search results: Exposes

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Theory-Grounded Evaluation Exposes the Authorship Gap in LLM Personalization

SWE-ABS: Adversarial Benchmark Strengthening Exposes Inflated Success Rates on Test-based Benchmark

Wrongful Arrest Exposes Failures in One of the Oldest Police Face-Recognition Tools in the US

Shape mode analysis exposes movement patterns in biology: flagella and flatworms as case studies

OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries

RADAR: Exposing Unlogged NoSQL Operations

Modeling the Impact of Exposed Cases in a Hantavirus Outbreak on a Cruise Ship

Exposing and Defending the Achilles' Heel of Video Mixture-of-Experts

AI-exposed jobs deteriorated before ChatGPT

Development of an electrodynamic balance to study single levitated particles exposed to alkali-metal vapor

Inner approximations of convex sets and intersections of projectionally exposed cones

Strongly exposed points in Orlicz-Lorentz spaces equipped with the Orlicz norm

Exposed extreme rays of the SONC cone

Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables

Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models

faulTPM: Exposing AMD fTPMs' Deepest Secrets

DiffusionWorldViewer: Exposing and Broadening the Worldview Reflected by Generative Text-to-Image Models

Strategyproofness-Exposing Descriptions of Matching Mechanisms

Nonparametric estimation of the interventional disparity indirect effect among the exposed