Permutation tests are a popular choice for distinguishing distributions and testing independence, due to their exact, finite-sample control of false positives and their minimax optimality when paired with U-statistics. However, standard permutation tests are also expensive, requiring a test statistic to be computed hundreds or thousands of times to detect a separation between distributions. In this work, we offer a simple approach to accelerate testing: group your datapoints into bins and permute only those bins. For U and V-statistics, we prove that these cheap permutation tests have two remarkable properties. First, by storing appropriate sufficient statistics, a cheap test can be run in time comparable to evaluating a single test statistic. Second, cheap permutation power closely approximates standard permutation power. As a result, cheap tests inherit the exact false positive control and minimax optimality of standard permutation tests while running in a fraction of the time. We complement these findings with improved power guarantees for standard permutation testing and experiments demonstrating the benefits of cheap permutations over standard maximum mean discrepancy (MMD), H
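The binning idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes one-dimensional data, equal sample sizes, a Gaussian kernel, and a squared-MMD V-statistic, and it takes the matrix of bin-to-bin kernel sums as the stored "sufficient statistic". All function names are hypothetical.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between two scalars (kernel choice is an assumption)."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def bin_kernel_sums(bins, bandwidth=1.0):
    """sums[i][j] = sum of k(a, b) over a in bins[i] and b in bins[j]."""
    num_bins = len(bins)
    sums = [[0.0] * num_bins for _ in range(num_bins)]
    for i in range(num_bins):
        for j in range(i, num_bins):
            total = sum(gaussian_kernel(a, b, bandwidth)
                        for a in bins[i] for b in bins[j])
            sums[i][j] = sums[j][i] = total
    return sums


def mmd_from_bin_sums(sums, x_bins, y_bins, bin_size):
    """Squared-MMD V-statistic when x_bins form one sample and y_bins the other."""
    n = len(x_bins) * bin_size
    m = len(y_bins) * bin_size
    xx = sum(sums[i][j] for i in x_bins for j in x_bins)
    yy = sum(sums[i][j] for i in y_bins for j in y_bins)
    xy = sum(sums[i][j] for i in x_bins for j in y_bins)
    return xx / (n * n) + yy / (m * m) - 2.0 * xy / (n * m)


def cheap_permutation_test(x, y, bins_per_sample=4, num_permutations=199, seed=0):
    """Permute bins instead of individual points; return (statistic, p-value)."""
    if not x or len(x) != len(y) or len(x) % bins_per_sample != 0:
        raise ValueError("sketch assumes equal, non-empty samples divisible into bins")
    bin_size = len(x) // bins_per_sample
    # The first bins_per_sample bins come from x, the rest from y.
    bins = [list(sample[k * bin_size:(k + 1) * bin_size])
            for sample in (x, y) for k in range(bins_per_sample)]
    sums = bin_kernel_sums(bins)  # kernel evaluations happen only here
    labels = list(range(2 * bins_per_sample))
    s = bins_per_sample
    observed = mmd_from_bin_sums(sums, labels[:s], labels[s:], bin_size)
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed statistic counts as one permutation
    for _ in range(num_permutations):
        rng.shuffle(labels)
        stat = mmd_from_bin_sums(sums, labels[:s], labels[s:], bin_size)
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            at_least_as_large += 1
    return observed, at_least_as_large / (num_permutations + 1)
```

In this sketch, each permutation touches only the small bin-sum matrix, so its cost depends on the number of bins rather than on the number of datapoints.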
As large language models (LLMs) continue to grow in size, fewer users are able to host and run models locally. This has led to increased use of third-party hosting services. However, in this setting, there is a lack of guarantees on the computation performed by the inference provider. For example, a dishonest provider may replace an expensive large model with a cheaper-to-run weaker model and return the results from the weaker model to the user. Existing tools to verify inference typically rely on methods from cryptography such as zero-knowledge proofs (ZKPs), but these add significant computational overhead and remain infeasible for large models. In this work, we develop a new insight -- that given a method for performing private LLM inference, one can obtain forms of verified inference at marginal extra cost. Specifically, we propose two new protocols that leverage privacy-preserving LLM inference to provide guarantees over the inference that was carried out. Our approaches are cheap, requiring the addition of a few extra tokens of computation, and have little to no downstream impact. As the fastest privacy-preserving inference methods are typically faster than
Machine learning training places immense demands on cluster networks, motivating specialized architectures and co-design with parallelization strategies. Recent designs incorporating optical circuit switches (OCSes) are promising, offering better cost, power efficiency, and long-term bandwidth scaling than packet switches. However, most existing approaches rely on costly high-radix OCSes and/or combine them with packet switches to achieve competitive performance at scale. Unfortunately, high-radix OCSes are both expensive and slow to reconfigure, limiting both scalability and performance. We propose Arrays of Cheap Optical Switches (ACOS), which bring application co-design directly to the structure of the reconfigurable fabric. Using low-radix OCSes as building blocks, ACOS supports the forms of reconfiguration needed in training clusters including topology selection, workload adaptation, and failure resilience. The cost of ACOS scales with supported topologies and adaptations rather than with port count, breaking past the scalability barriers of current specialized ML networks. We show through simulation that ACOS-based deployments match the performance of fully provisioned pack
To scale optimization and simulation, prior work has explored training machine-learning surrogates that map problem parameters to solutions inexpensively at inference time. Unfortunately, commonly used approaches, including supervised and self-supervised learning with either soft or hard feasibility enforcement, face inherent challenges such as reliance on expensive high-quality labels or difficult optimization landscapes. To address their trade-offs, we propose a novel framework that collects "cheap" imperfect labels, performs supervised model pretraining with a merit loss-based termination scheme, and finally refines the model through self-supervised learning to improve final performance. Empirical validation across challenging domains -- including nonconvex constrained optimization, power-grid operation, and stiff dynamical systems -- shows that this three-stage strategy yields faster convergence; improved accuracy, feasibility, and optimality; and up to 59x reductions in total offline computational cost. We further analyze why and when our framework improves surrogate model training, finding that (i) merit loss is an informative signal and (ii) only small numbers of cheap, inex
Demand for expert-annotated data on the part of leading AI labs has created an expert gig economy with the potential to reshape white collar work and society's understanding of expertise. In this research, we study the vision for the future of expertise described in the public communication of five industry data annotation organizations and their CEOs, as reflected on social media feeds and public appearances on podcasts. We find that the industry envisions AI expertise as cheap, meaning that it can offer a better return on investment than human expertise. Human expertise, meanwhile, is viewed as an extractable resource, the value of which can be judged relative to AI expertise. Finally, institutional expertise (such as that created or possessed by universities and corporations) is viewed as in need of liberation or reform, such that it can be incorporated into the latest artificial intelligence systems. Our findings have implications for human experts, whose professional lives may be transformed and revalued by this industry, as well as for societal institutions that mediate expertise. We close this work with a series of provocations intended to elicit consideration of how society
Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak--strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. At the population level, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
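A two-threshold policy of the kind described above can be sketched as follows. This is an illustrative sketch, not the paper's algorithm: the thresholds `low` and `high` are taken as given rather than learned online, and the three metrics are assumed to be simple fractions of all queries, which may differ from the paper's definitions.

```python
ACCEPT, REJECT, DEFER = "accept", "reject", "defer"


def two_threshold_policy(weak_score, low, high):
    """Accept above `high`, reject below `low`, otherwise defer to strong verification."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if weak_score >= high:
        return ACCEPT
    if weak_score <= low:
        return REJECT
    return DEFER


def evaluate_policy(weak_scores, is_correct, low, high):
    """Empirical metrics for a fixed threshold pair (definitions are assumptions)."""
    if not weak_scores or len(weak_scores) != len(is_correct):
        raise ValueError("need equally long, non-empty score and label lists")
    wrong_accepts = wrong_rejects = deferrals = 0
    for score, correct in zip(weak_scores, is_correct):
        decision = two_threshold_policy(score, low, high)
        if decision == DEFER:
            deferrals += 1
        elif decision == ACCEPT and not correct:
            wrong_accepts += 1
        elif decision == REJECT and correct:
            wrong_rejects += 1
    n = len(weak_scores)
    return {
        "incorrect_acceptance": wrong_accepts / n,
        "incorrect_rejection": wrong_rejects / n,
        "strong_verification_frequency": deferrals / n,
    }
```

Moving the two thresholds further apart sends more queries to strong verification and leaves fewer decisions to the noisy weak signal, which is the cost-reliability trade-off the abstract describes.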
A single seller offers one or more goods to a single buyer. The buyer's values and the seller's costs are private information. Each player has a commonly known prior over the other player's value or cost, supported on a finite set. What is the optimal selling mechanism? We argue that, despite this question's importance and apparent simplicity, prior work offers no satisfactory answer. If the seller simply chooses an optimal menu given her realized costs, she fails to exploit her informational advantage. At the other extreme, the optimal trade mechanism that satisfies IC/IR constraints for both parties fails in practice, as it conditions prices on the seller's unknown costs in an unenforceable way. The seller's realistic capabilities lie somewhere in between: she may leverage private information but lacks unlimited commitment power. To bridge this gap, we consider a solution concept built on the realistic assumption that the seller can commit to prices but nothing more. Similar -- albeit technically distinct -- solution concepts have been studied in the context of auctions with multiple buyers. Our concept proves surprisingly rich even with a single buyer. In our model, the buyer an
We study linguistic indirectness when speakers attend to social ties. Social ties are modeled by a graph, and conferences are the sets of nodes that hear a message. Conference worth is a distance polynomial on the graph; allocations are given by the Myerson value of the conference-restricted worth, which yields the bargaining-power components for each participant. Aggregating these components gives an effective bias that, via a Partition-Threshold rule, pins down the number of equilibrium message partitions in a cheap talk game. Results: (i) among trees, stars maximize worth, leading to weakly fewer equilibrium partitions; (ii) on stars, we derive closed-form effective biases, with a witness-hub marginal effect of adding leaves changing sign at $\delta^{\ast}=0.6$; (iii) for two stars joined by one link, two-star (hub-hub) vs big-star (hub-leaf) precision flips at 8/15 for the same number of nodes; private leaf-leaf conferences are most informative.
An informed Advisor and an uninformed Decision-Maker, with conflicting interests, engage in repeated cheap talk communication in always new decision problems. While the Decision-Maker's optimal payoff is attainable in some subgame-perfect equilibrium, no payoff profile close to the Decision-Maker's optimal one is immune to renegotiation. Pareto efficient renegotiation-proof equilibria entail a compromise between the Advisor and the Decision-Maker. This could involve the Advisor being truthful and the Decision-Maker not fully utilizing this information to their advantage, or the Advisor exaggerating the truth and the Decision-Maker pretending to believe them.
Bootstrapping is often applied to get confidence limits for semiparametric inference of a target parameter in the presence of nuisance parameters. Bootstrapping with replacement can be computationally expensive and problematic when cross-validation is used in the estimation algorithm due to duplicate observations in the bootstrap samples. We provide a valid, fast, easy-to-implement subsampling bootstrap method for constructing confidence intervals for asymptotically linear estimators and discuss its application to semiparametric causal inference. Our method, inspired by the Cheap Bootstrap (Lam, 2022), leverages the quantiles of a t-distribution and has the desired coverage with few bootstrap replications. We show that the method is asymptotically valid if the subsample size is chosen appropriately as a function of the sample size. We illustrate our method with data from the LEADER trial (Marso et al., 2016), obtaining confidence intervals for a longitudinal targeted minimum loss-based estimator (van der Laan and Gruber, 2012). Through a series of empirical experiments, we also explore the impact of subsample size, sample size, and the number of bootstrap repetitions on the perform
We discuss a cheap tetrahedra-free approach to the numerical integration of polynomials on polyhedral elements, based on hyperinterpolation in a bounding box and Chebyshev moment computation via the divergence theorem. No conditioning issues arise, since no matrix factorization or inversion is needed. The resulting quadrature formula is theoretically stable even in the presence of some negative weights.
The literature on strategic communication originated with the influential cheap talk model, which precedes the Bayesian persuasion model by three decades. This model describes an interaction between two agents: sender and receiver. The sender knows some state of the world which the receiver does not know, and tries to influence the receiver's action by communicating a cheap talk message to the receiver. This paper initiates the systematic algorithmic study of cheap talk in a finite environment (i.e., a finite number of states and receiver's possible actions). We first prove that approximating the sender-optimal or the welfare-maximizing cheap talk equilibrium up to a certain additive constant or multiplicative factor is NP-hard. We further prove that deciding whether there exists an equilibrium in which the receiver gets utility higher than the trivial utility he can guarantee is NP-hard. Fortunately, we identify two naturally-restricted cases that admit efficient algorithms for finding a sender-optimal equilibrium - a constant number of states or a receiver having only two actions.
By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be constantly accessible and known to the agents a priori. In this work, we lift these requirements such that the agents must discover the cheap talk channels and learn how to use them. Hence, the problem has two main parts: cheap talk discovery (CTD) and cheap talk utilization (CTU). We introduce a novel conceptual framework for both parts and develop a new algorithm based on mutual information maximization that outperforms existing algorithms in CTD/CTU settings. We also release a novel benchmark suite to stimulate future research in CTD/CTU.
Recurrent neural networks (RNNs) that are capable of modeling long-distance dependencies are widely used in various speech tasks, e.g., keyword spotting (KWS) and speech enhancement (SE). Due to the limited power and memory of low-resource devices, efficient RNN models are urgently required for real-world applications. In this paper, we propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations. In particular, we observe that some dimensions of the hidden states are similar to others in trained RNN models, suggesting that redundancy exists in specific RNNs. To reduce the redundancy and hence computational cost, we propose to first generate a few intrinsic states, and then apply cheap operations to produce ghost states based on the intrinsic states. Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces the memory usage (~40%) and computation cost while keeping performance similar.
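The intrinsic-then-ghost construction can be illustrated with a toy recurrent cell. This is a sketch under loud assumptions, not the GhostRNN architecture itself: it uses a plain tanh cell, and the "cheap operation" is assumed to be an elementwise scale-and-shift of an intrinsic state, which the abstract does not specify. The class and function names are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class GhostRNNCellSketch:
    """Toy tanh cell: dense weights only for intrinsic states, cheap ops for the rest."""

    def __init__(self, input_size, intrinsic_size, ghost_ratio=2, seed=0):
        rng = random.Random(seed)
        self.intrinsic_size = intrinsic_size
        self.ghost_size = intrinsic_size * (ghost_ratio - 1)
        self.hidden_size = intrinsic_size + self.ghost_size
        # Dense weights produce only the intrinsic part of the hidden state.
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
                     for _ in range(intrinsic_size)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(self.hidden_size)]
                      for _ in range(intrinsic_size)]
        # Assumed cheap operation: one scale and one shift per ghost unit.
        self.ghost_scale = [1.0] * self.ghost_size
        self.ghost_shift = [0.0] * self.ghost_size

    def step(self, x, h_prev):
        """Return the next hidden state: intrinsic states followed by ghost states."""
        pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_rec, h_prev))]
        intrinsic = [math.tanh(p) for p in pre]
        ghost = [math.tanh(self.ghost_scale[k] * intrinsic[k % self.intrinsic_size]
                           + self.ghost_shift[k])
                 for k in range(self.ghost_size)]
        return intrinsic + ghost


def multiply_counts(input_size, intrinsic_size, ghost_ratio=2):
    """Multiplications per step: (fully dense cell, this sketch)."""
    hidden = intrinsic_size * ghost_ratio
    dense = hidden * (input_size + hidden)
    ghost = intrinsic_size * (input_size + hidden) + (hidden - intrinsic_size)
    return dense, ghost
```

For example, with 3 inputs, 4 intrinsic states, and a ghost ratio of 2, `multiply_counts(3, 4, 2)` gives 88 multiplications for a dense cell of the same hidden size against 48 for the sketch.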
We present an axiomatic approach to combination theorems for various homological properties of groups and, more generally, of chain complexes. Examples of such properties include algebraic finiteness properties, $\ell^2$-invisibility, $\ell^2$-acyclicity, lower bounds for Novikov--Shubin invariants, and vanishing of homology growth. As a key example, we introduce an algebraic version of Abért--Bergeron--Frączyk--Gaboriau's cheap rebuilding property that implies vanishing of torsion homology growth and fits into our axiomatic framework for combination theorems. In particular, we obtain that certain graphs of groups with amenable vertex groups and elementary amenable edge groups have vanishing torsion homology growth.
We propose semantic entropy probes (SEPs), a cheap and reliable method for uncertainty quantification in Large Language Models (LLMs). Hallucinations, which are plausible-sounding but factually incorrect and arbitrary model generations, present a major challenge to the practical adoption of LLMs. Recent work by Farquhar et al. (2024) proposes semantic entropy (SE), which can detect hallucinations by estimating uncertainty in the space of semantic meaning for a set of model generations. However, the 5-to-10-fold increase in computation cost associated with SE computation hinders practical adoption. To address this, we propose SEPs, which directly approximate SE from the hidden states of a single generation. SEPs are simple to train and do not require sampling multiple model generations at test time, reducing the overhead of semantic uncertainty quantification to almost zero. We show that SEPs retain high performance for hallucination detection and generalize better to out-of-distribution data than previous probing methods that directly predict model accuracy. Our results across models and tasks suggest that model hidden states capture SE, and our ablation studies give further insights
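The contrast between sampling-based SE and a probe can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the sampled generations have already been grouped into meaning clusters (the clustering step is not shown), uses a simple count-based entropy, and models the probe as a logistic function of one hidden-state vector with weights that are assumed to be already trained.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Entropy of the empirical distribution over meaning clusters.

    `cluster_ids` holds one cluster label per sampled generation.
    """
    if not cluster_ids:
        raise ValueError("need at least one generation")
    total = len(cluster_ids)
    counts = Counter(cluster_ids)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def probe_predict(hidden_state, weights, bias):
    """Hypothetical linear probe: estimated probability that SE is high."""
    if len(hidden_state) != len(weights):
        raise ValueError("hidden state and weights must have equal length")
    logit = sum(w * h for w, h in zip(weights, hidden_state)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

Computing `semantic_entropy` requires several sampled generations per query, whereas `probe_predict` needs only the hidden state of a single generation, which is the source of the cost saving the abstract describes.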
This paper considers the dynamics of cheap talk interactions between an oblivious receiver and a sender with different amounts of information. Even though it may seem that having additional information about the state of the game is always beneficial to the sender, we show that there are cases in which garbling the information of a fully informed sender can improve not only receiver's utility in equilibrium, but also that of the sender herself. We also provide efficient algorithms that output the optimal amount of information in sender-receiver scenarios with binary actions and extend some of these results to settings with multiple senders and one receiver.
Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability. We propose tuning-free CutDiffusion, aimed at simplifying and accelerating the diffusion extrapolation process, making it more affordable and improving performance. CutDiffusion abides by the existing patch-wise extrapolation but cuts a standard patch diffusion process into an initial phase focused on comprehensive structure denoising and a subsequent phase dedicated to specific detail refinement. Comprehensive experiments highlight the numerous advantages of CutDiffusion: (1) simple method construction that enables a concise higher-resolution diffusion process without third-party engagement; (2) fast inference speed achieved through a single-step higher-resolution diffusion process, and fewer inference patches required; (3) cheap GPU cost resulting from patch-wise inference and fewer patches during the comprehensive structure denoising; (4) strong generation performance, stemming from the emphasis on specific detail refinement.
A dramatic influx of diffusion-generated images has marked recent years, posing unique challenges to current detection technologies. While the task of identifying these images falls under binary classification, a seemingly straightforward category, the computational load is significant when employing the "reconstruction then compare" technique. This approach, known as DIRE (Diffusion Reconstruction Error), not only identifies diffusion-generated images but also detects those produced by GANs, highlighting the technique's broad applicability. To address the computational challenges and improve efficiency, we propose distilling the knowledge embedded in diffusion models to develop rapid deepfake detection models. Our approach, aimed at creating a small, fast, cheap, and lightweight detector for diffusion-synthesized deepfakes, maintains robust performance while significantly reducing operational demands. With performance maintained, our experimental results indicate inference 3.2 times faster than the existing DIRE framework. This advance not only enhances the practicality of deploying these systems in real-world settings but also paves the way for future research endeavors that seek
In this work, we reexamine the vulnerability of Payment Channel Networks (PCNs) to bribing attacks, where an adversary incentivizes blockchain miners to deliberately ignore a specific transaction to undermine the punishment mechanism of PCNs. While previous studies have posited a prohibitive cost for such attacks, we show that this cost may be dramatically reduced (to approximately \$125), thereby increasing the likelihood of these attacks. To this end, we introduce Bribe & Fork, a modified bribing attack that leverages the threat of a so-called feather fork, which we analyze with a novel formal model for the mining game with forking. We empirically analyze historical data from several real-world blockchain implementations to evaluate the scale of this cost reduction. Our findings shed more light on the potential vulnerability of PCNs and highlight the need for robust solutions.