Both cluster randomized trials and quasi-experimental designs are used to evaluate the impact of health and social policies and interventions. Stepped-wedge cluster randomized trials randomize a staggered adoption approach, while recent difference-in-differences methods allow analysis of non-randomized settings where similar policies are adopted at different time points. These approaches have become common, but the sheer variety of methods for analyzing observational studies with staggered adoption makes it challenging to clearly design and report such studies. We propose that observational and quasi-experimental study investigators can address these challenges by emulating stepped-wedge cluster randomized trials in the target trial emulation framework. The conceptual framework and reporting standards of trial emulation will encourage consideration of key features of these designs, such as policy heterogeneity and time-varying effects, and clear reporting of the estimand and assumptions. It also highlights areas where those interested in randomized trials and quasi-experimental designs can benefit from one another's experience by bringing insights across disciplines. Questions of t
The performance of Metropolis-Hastings algorithms is highly sensitive to the choice of step size, and misspecification can lead to severe loss of efficiency. We study algorithms with randomized step sizes, considering both auxiliary-variable and marginalized constructions. We show that algorithms with a randomized step size inherit weak Poincaré inequalities/spectral gaps from their fixed-step-size counterparts under minimal conditions, and that the marginalized kernel should always be preferred in terms of asymptotic variance to the auxiliary-variable choice if it is implementable. In addition, we show that both types of randomization make an algorithm robust to tuning, meaning that spectral gaps decay polynomially as the step size is increasingly poorly chosen. We further show that step-size randomization often preserves high-dimensional scaling limits and algorithmic complexity, while increasing the optimal acceptance rate for Langevin and Hamiltonian samplers when an Exponential or Uniform distribution is chosen to randomize the step size. Theoretical results are complemented with a numerical study on challenging benchmarks such as Poisson regression, Neal's funnel and the Ros
``Block what you can and randomize what you cannot'' is the core principle for treatment effect estimation in randomized controlled trials. Although a wealth of allocation strategies has been developed, an explicit trade-off between the covariate balance achieved by blocking and the robustness guaranteed by randomization is seldom quantified. Motivated by the second law of thermodynamics, this work posits a new criterion that lowers the covariate imbalance while maximizing the entropy that quantifies contrast and allocation diversity. The resulting optimal strategy, termed the minimum free energy randomized design, is then derived, thereby formally achieving such a trade-off. To facilitate practical implementation, we further develop a computationally efficient dynamic allocation algorithm with theoretical guarantees. Using a finite-sample variance decomposition, the proposed randomization strategy is shown to control covariate imbalance while preventing unobserved heterogeneity from dominating the mean squared error, thus retaining minimax efficiency under the prescribed design constraints. Extensive numerical simulations demonstrate that our method achieves superior statistical e
Without randomization, escaping the saddle points of $f \colon \mathbb{R}^d \to \mathbb{R}$ requires at least $\Omega(d)$ pieces of information about $f$ (values, gradients, Hessian-vector products). With randomization, this can be reduced to a polylogarithmic dependence in $d$. The prototypical algorithm to that effect is perturbed gradient descent (PGD): through sustained jitter, it reliably escapes strict saddle points. However, it also never settles: there is no convergence. What is more, PGD requires precise tuning based on Lipschitz constants and a preset target accuracy. To improve on this, we modify the time-tested trust-region method with truncated conjugate gradients (TR-tCG). Specifically, we randomize the initialization of tCG (the subproblem solver), and we prove that tCG automatically amplifies the randomization near saddles (to escape) and absorbs it near local minimizers (to converge). Saddle escape happens over several iterations. Accordingly, our analysis is multi-step, with several novelties. The proposed algorithm is practical: it essentially tracks the good behavior of TR-tCG, with three minute modifications and a single new hyperparameter (the noise scale $\sigma$). We
Recent work presented at USENIX Security 2025 (SEC'25) claims that occupancy-based attacks can recover AES keys from the MIRAGE randomized cache. In this paper, we examine these claims and find that they arise from a modeling flaw in the SEC'25 paper. Most critically, the SEC'25 paper's simulation of MIRAGE uses a constant seed to initialize the random number generator used for global evictions in MIRAGE, causing every AES encryption they trace to evict the same deterministic sequence of cache lines. This artificially creates a highly repeatable timing pattern that is not representative of a realistic implementation of MIRAGE, where eviction sequences vary randomly between encryptions. When we instead randomize the eviction seed for each run, reflecting realistic operation, the correlation between AES T-table accesses and attacker runtimes disappears, and the attack fails. These findings show that the reported leakage is an artifact of incorrect modeling, and not an actual vulnerability in MIRAGE.
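The seeding issue described above can be illustrated with a toy model. The sketch below is not the MIRAGE simulator or the SEC'25 code: the cache size, the number of evictions per encryption, and the names `eviction_sequence` and `trace_runs` are all hypothetical, chosen only to show how a constant seed makes every traced run evict the same lines while a fresh seed per run does not.

```python
import random

NUM_CACHE_LINES = 1024   # hypothetical cache size, in lines
EVICTIONS_PER_RUN = 8    # hypothetical number of global evictions per encryption


def eviction_sequence(seed):
    """Return the cache lines chosen for global eviction during one encryption."""
    rng = random.Random(seed)
    return [rng.randrange(NUM_CACHE_LINES) for _ in range(EVICTIONS_PER_RUN)]


def trace_runs(num_runs, reseed_each_run):
    """Collect one eviction sequence per traced encryption."""
    if reseed_each_run:
        # Fresh entropy for every run: eviction sequences differ between runs.
        entropy = random.SystemRandom()
        return [eviction_sequence(entropy.getrandbits(64)) for _ in range(num_runs)]
    # Constant seed: every run replays the same deterministic sequence.
    return [eviction_sequence(0) for _ in range(num_runs)]


fixed_seed_runs = trace_runs(5, reseed_each_run=False)
fresh_seed_runs = trace_runs(5, reseed_each_run=True)

# Number of distinct eviction sequences observed in each configuration.
distinct_fixed = len({tuple(run) for run in fixed_seed_runs})
distinct_fresh = len({tuple(run) for run in fresh_seed_runs})
print(distinct_fixed, distinct_fresh)
```

With the constant seed, the five runs collapse to a single distinct sequence, which is the kind of repeatable pattern the abstract identifies as a modeling artifact.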
We randomize the implicit two-stage Runge-Kutta scheme in order to improve the rate of convergence (with respect to a deterministic scheme) and stability of the approximate solution (with respect to the solution generated by the explicit scheme). For stability analysis, we use Dahlquist's concept of A-stability, adapted to randomized schemes by considering three notions of stability: asymptotic, mean-square, and in probability. The randomized implicit RK2 scheme proves to be A-stable asymptotically and in probability but not in the mean-square sense.
Contrary to traditional deterministic notions of algorithmic fairness, this paper argues that fairly allocating scarce resources using machine learning often requires randomness. We address why, when, and how to randomize by proposing stochastic procedures that more adequately account for all of the claims that individuals have to allocations of social goods or opportunities.
Smolyak's method, also known as hyperbolic cross approximation or sparse grid method, is a powerful tool to tackle multivariate tensor product problems solely with the help of efficient algorithms for the corresponding univariate problem. In this paper we study the randomized setting, i.e., we randomize Smolyak's method. We provide upper and lower error bounds for randomized Smolyak algorithms with explicitly given dependence on the number of variables and the number of information evaluations used. The error criteria we consider are the worst-case root mean square error (the typical error criterion for randomized algorithms, often referred to as "randomized error") and the root mean square worst-case error (often referred to as "worst-case error"). Randomized Smolyak algorithms can be used as building blocks for efficient methods such as multilevel algorithms, multivariate decomposition methods or dimension-wise quadrature methods to successfully tackle high-dimensional or even infinite-dimensional problems. As an example, we provide a very general and sharp result on the convergence rate of N-th minimal errors of infinite-dimensional integration on weighted reproducing kernel Hilbe
Tiny Object Detection is challenging due to small object size, low resolution, occlusion, background clutter, lighting conditions, and a small object-to-image ratio. Further, object detection methodologies often make the underlying assumption that training and testing data remain congruent. However, this presumption often leads to a decline in performance when the model is applied to out-of-domain (unseen) data. Techniques like synthetic image generation are employed to improve model performance by leveraging variations in input data. Such an approach typically presumes access to 3D-rendered datasets. In contrast, we propose a novel two-stage methodology, Synthetic Randomized Image Augmentation (SRIA), carefully devised to enhance the generalization capabilities of models encountering 2D datasets, particularly those with lower resolution, which is more practical in real-world scenarios. The first stage employs a weakly supervised technique to generate pixel-level segmentation masks. Subsequently, the second stage generates a batch-wise synthesis of artificial images, carefully designed with an array of diverse augmentations. The efficacy of the proposed technique is illustrated on challenging foreign object debri
We provide a complete characterization of the randomized Kaczmarz algorithm (RKA) for inconsistent linear systems. The Kaczmarz algorithm, known in some fields as the algebraic reconstruction technique, is a classical method for solving large-scale overdetermined linear systems through a sequence of projection operators; the randomized Kaczmarz algorithm is a recent proposal by Strohmer and Vershynin to randomize the sequence of projections in order to guarantee exponential convergence (in mean square) to the solutions. A flurry of work followed this development, with renewed interest in the algorithm, its extensions, and various bounds on their performance. Earlier, we studied the special case of consistent linear systems and provided an exact formula for the mean squared error (MSE) in the value reconstructed by RKA, as well as a simple way to compute the exact decay rate of the error. In this work, we consider the case of inconsistent linear systems, which is a more relevant scenario for most applications. First, by using a "lifting trick", we derive an exact formula for the MSE given a fixed noise vector added to the measurements. Then we show how to average over the noise when
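For readers unfamiliar with the iteration, the following is a minimal sketch of the randomized Kaczmarz update, with rows sampled with probability proportional to their squared norm as in the Strohmer and Vershynin proposal. It does not reproduce the exact MSE formulas studied in this work. The problem sizes, noise level, iteration count, and the helper names `randomized_kaczmarz` and `error` are assumptions made for illustration.

```python
import math
import random


def randomized_kaczmarz(A, b, num_iters=3000, seed=0):
    """Project the iterate onto one randomly chosen row constraint per step."""
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    row_ids = range(len(A))
    x = [0.0] * len(A[0])
    for _ in range(num_iters):
        # Sample row i with probability proportional to its squared norm.
        i = rng.choices(row_ids, weights=row_norms_sq)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms_sq[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x


# Hypothetical test problem: 60 equations, 5 unknowns, Gaussian entries.
data_rng = random.Random(1)
A = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
x_true = [data_rng.gauss(0, 1) for _ in range(5)]
b_clean = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]
b_noisy = [bi + 0.1 * data_rng.gauss(0, 1) for bi in b_clean]  # inconsistent system


def error(x):
    """Euclidean distance between an estimate and the true solution."""
    return math.sqrt(sum((xj - tj) ** 2 for xj, tj in zip(x, x_true)))


err_clean = error(randomized_kaczmarz(A, b_clean))
err_noisy = error(randomized_kaczmarz(A, b_noisy))
print(err_clean, err_noisy)
```

In the consistent case the error shrinks to floating-point precision, whereas with noisy measurements the iterate keeps moving and the error settles at a nonzero level, which is the regime this work characterizes.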
A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals--such as households, communities, firms, medical practices, schools or classrooms--even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary; its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level noncompliance, which is common in applications but not allowed for in mo
Network theory has often disregarded many-body relationships, solely focusing on pairwise interactions: neglecting them, however, can lead to misleading representations of complex systems. Hypergraphs represent a suitable framework for describing polyadic interactions. Here, we leverage the representation of hypergraphs based on the incidence matrix for extending the entropy-based approach to higher-order structures: in analogy with the Exponential Random Graphs, we introduce the Exponential Random Hypergraphs (ERHs). After exploring the asymptotic behaviour of thresholds generalising the percolation one, we apply ERHs to study real-world data. First, we generalise key network metrics to hypergraphs; then, we compute their expected value and compare it with the empirical one, in order to detect deviations from random behaviours. Our method is analytically tractable, scalable and capable of revealing structural patterns of real-world hypergraphs that differ significantly from those emerging as a consequence of simpler constraints.
Training Neural Networks (NNs) without overfitting is difficult; detecting that overfitting is difficult as well. We present a novel Random Matrix Theory method that detects the onset of overfitting in deep learning models without access to train or test data. For each model layer, we randomize each weight matrix element-wise, $\mathbf{W} \to \mathbf{W}^{\mathrm{rand}}$, fit the randomized empirical spectral distribution with a Marchenko-Pastur distribution, and identify large outliers that violate self-averaging. We call these outliers Correlation Traps. During the onset of overfitting, which we call the "anti-grokking" phase in long-horizon grokking, Correlation Traps form and grow in number and scale as test accuracy decreases while train accuracy remains high. Traps may be benign or may harm generalization; we provide an empirical approach to distinguish between them by passing random data through the trained model and evaluating the JS divergence of output logits. Our findings show that anti-grokking is an additional grokking phase with high train accuracy and decreasing test accuracy, structurally distinct from pre-grokking through its Correlation Traps. More broadly, we find
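A simplified sketch of the randomize-then-compare step might look as follows. It is not the authors' method: instead of fitting a Marchenko-Pastur distribution to the full spectrum, it only compares the largest eigenvalue of the shuffled matrix against the Marchenko-Pastur upper edge computed from the sample variance. The matrix sizes, the tolerance `tol`, the injected large entry, and all function names are assumptions for illustration.

```python
import math
import random


def shuffle_elementwise(W, rng):
    """Return a copy of W with its entries permuted at random (W -> W_rand)."""
    n_cols = len(W[0])
    flat = [w for row in W for w in row]
    rng.shuffle(flat)
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(len(W))]


def top_eigenvalue(W, num_iters=100):
    """Largest eigenvalue of (1/N) W^T W via power iteration (N = number of rows)."""
    n_rows, n_cols = len(W), len(W[0])
    W_T = list(zip(*W))
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    lam = 0.0
    for _ in range(num_iters):
        u = [sum(w * vj for w, vj in zip(row, v)) for row in W]               # u = W v
        z = [sum(w * ur for w, ur in zip(col, u)) / n_rows for col in W_T]    # z = W^T u / N
        lam = math.sqrt(sum(zc * zc for zc in z))
        v = [zc / lam for zc in z]
    return lam


def mp_upper_edge(W):
    """Marchenko-Pastur bulk edge sigma^2 * (1 + sqrt(M/N))^2 for an N x M matrix."""
    n_rows, n_cols = len(W), len(W[0])
    flat = [w for row in W for w in row]
    mean = sum(flat) / len(flat)
    var = sum((w - mean) ** 2 for w in flat) / len(flat)
    return var * (1.0 + math.sqrt(n_cols / n_rows)) ** 2


def has_outlier(W, rng, tol=0.25):
    """Flag a shuffled matrix whose top eigenvalue clearly exceeds the MP edge."""
    W_rand = shuffle_elementwise(W, rng)
    return top_eigenvalue(W_rand) > (1.0 + tol) * mp_upper_edge(W_rand)


rng = random.Random(0)
W_clean = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(100)]
W_trap = [row[:] for row in W_clean]
W_trap[0][0] = 40.0  # hypothetical anomalously large element

clean_flag = has_outlier(W_clean, rng)
trap_flag = has_outlier(W_trap, rng)
print(clean_flag, trap_flag)
```

The design point is that element-wise shuffling destroys learned correlations, so any eigenvalue that still escapes the bulk must come from a few unusually large entries rather than from structure in the weights.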
We present an efficient scheme to randomize a spin-state ensemble in a nonlinear spin-1 system by tuning chaos with an external periodic drive. Without modulation, the system exhibits a mixed phase space featuring regular islands embedded in a chaotic sea, where global mixing is inhibited by energy conservation. Using numerical simulations, we demonstrate that weak modulation of a linear Zeeman field not only facilitates transport between different energy shells but also drives ensembles toward a Haar-random distribution over spin states. Under optimized conditions, complete randomization is achieved on a timescale set by the inverse nonlinear interaction energy. In the overdriven regime, randomization is unexpectedly suppressed at specific modulation amplitudes, accompanied by the formation of sticky regions in phase space. We attribute this behavior to the dynamical cancellation of the leading low-order harmonic component of the periodic drive. These results illustrate how time-periodic driving can be used to engineer chaotic systems and achieve controllable randomization in nonlinear spin systems.
Strategic randomization is a key principle in game theory, yet it remains underexplored in large language models (LLMs). Prior work often conflates the cognitive decision to randomize with the mechanical generation of randomness, leading to incomplete evaluations. To address this, we propose a novel zero-sum game inspired by the Tian Ji Horse Race, where the Nash equilibrium corresponds to a maximal entropy strategy. The game's complexity masks this property from untrained humans and underdeveloped LLMs. We evaluate five LLMs across prompt styles -- framed, neutral, and hinted -- using competitive multi-tournament gameplay with system-provided random choices, isolating the decision to randomize. Results show that weaker models remain deterministic regardless of prompts, while stronger models exhibit increased randomization under explicit hints. When facing weaker models, strong LLMs adopt deterministic strategies to exploit biases, but converge toward equilibrium play when facing peers. Through win/loss outcomes and Bayes factor analysis, we demonstrate meaningful variation in LLMs' strategic reasoning capabilities, highlighting opportunities for improvement in abstract reasoning a
Evaluations of large language models (LLMs) suffer from instability, where small changes in random factors such as few-shot examples can lead to drastic fluctuations of scores and even model rankings. Moreover, different LLMs can have different preferences for a certain setting of random factors. As a result, using a fixed setting of random factors, which is often adopted as the paradigm of current evaluations, can lead to potentially unfair comparisons between LLMs. To mitigate the volatility of evaluations, we first theoretically analyze the sources of variance induced by changes in random factors. Targeting these specific sources, we then propose the instance-level randomization (ILR) method to reduce variance and enhance fairness in model comparisons. Instead of using a fixed setting across the whole benchmark in a single experiment, we randomize all factors that affect evaluation scores for every single instance, run multiple experiments and report the averaged score. Theoretical analyses and empirical results demonstrate that ILR can reduce the variance and unfair comparisons caused by random factors, as well as achieve a similar robustness level with less than half computational
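The contrast between a fixed setting and instance-level randomization can be sketched with a toy scorer. Everything below is hypothetical: the few-shot pool, the templates, their biases, and `toy_score` stand in for a real model and benchmark, and the sketch only illustrates the procedure of drawing a fresh setting per instance and averaging over several runs.

```python
import random
import statistics

FEW_SHOT_POOL = ["ex_a", "ex_b", "ex_c", "ex_d", "ex_e"]            # hypothetical examples
TEMPLATE_BIAS = {"tmpl_0": -0.2, "tmpl_1": 0.0, "tmpl_2": 0.2}      # hypothetical effect on accuracy
INSTANCES = list(range(200))                                        # hypothetical benchmark


def toy_score(instance_id, setting):
    """Stand-in for model correctness: 1.0 or 0.0, deterministic per (instance, setting)."""
    p_correct = 0.6 + TEMPLATE_BIAS[setting["template"]]
    coin = random.Random(f"{instance_id}|{setting['shots']}|{setting['template']}").random()
    return 1.0 if coin < p_correct else 0.0


def sample_setting(rng, num_shots=2):
    """Draw one random configuration of all random factors."""
    return {
        "shots": tuple(rng.sample(FEW_SHOT_POOL, num_shots)),
        "template": rng.choice(list(TEMPLATE_BIAS)),
    }


def evaluate_fixed(setting):
    """Conventional evaluation: one setting shared by the whole benchmark."""
    return statistics.mean(toy_score(i, setting) for i in INSTANCES)


def evaluate_ilr(seed, num_runs=3):
    """ILR-style evaluation: a fresh setting per instance, averaged over runs."""
    rng = random.Random(seed)
    run_scores = [
        statistics.mean(toy_score(i, sample_setting(rng)) for i in INSTANCES)
        for _ in range(num_runs)
    ]
    return statistics.mean(run_scores)


fixed_scores = [
    evaluate_fixed({"shots": ("ex_a", "ex_b"), "template": t}) for t in TEMPLATE_BIAS
]
ilr_scores = [evaluate_ilr(seed) for seed in range(5)]

print(statistics.stdev(fixed_scores), statistics.stdev(ilr_scores))
```

In this toy setup the fixed-setting score depends heavily on which template happened to be chosen, while the ILR scores cluster together because each one averages over the template effect.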
Most of the current studies on autonomous vehicle decision-making and control tasks based on reinforcement learning are conducted in simulated environments. The training and testing of these studies are carried out under rule-based microscopic traffic flow, with little consideration of migrating them to real or near-real environments to test their performance. This can lead to a degradation in performance when the trained model is tested in more realistic traffic scenes. In this study, we propose a method to randomize the driving style and behavior of surrounding vehicles by randomizing certain parameters of the car-following model and the lane-changing model of rule-based microscopic traffic flow in SUMO. We trained policies with deep reinforcement learning algorithms under the domain randomized rule-based microscopic traffic flow in freeway and merging scenes, and then tested them separately in rule-based microscopic traffic flow and high-fidelity microscopic traffic flow. Results indicate that the policy trained under domain randomization traffic flow achieves a significantly better success rate and calculative reward compared to the models trained under other microscopic traffic flows.
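One way to picture the randomization step is to sample a separate SUMO vehicle type per surrounding vehicle. The sketch below only builds `vType` XML strings and does not call SUMO. The abstract does not say which parameters or ranges were randomized, so the attribute selection and the sampling ranges here are assumptions, and `sample_vtype` is a hypothetical helper.

```python
import random

# Hypothetical sampling ranges for car-following attributes of a SUMO vType.
CAR_FOLLOWING_RANGES = {
    "accel": (1.5, 3.5),    # maximum acceleration, m/s^2
    "decel": (3.0, 6.0),    # comfortable deceleration, m/s^2
    "tau": (0.8, 2.0),      # desired time headway, s
    "sigma": (0.0, 0.8),    # driver imperfection
    "minGap": (1.5, 3.5),   # standstill gap, m
}

# Hypothetical sampling ranges for lane-changing attributes.
LANE_CHANGE_RANGES = {
    "lcAssertive": (0.5, 2.0),
    "lcSpeedGain": (0.5, 2.0),
    "lcCooperative": (0.2, 1.0),
}

ALL_RANGES = {**CAR_FOLLOWING_RANGES, **LANE_CHANGE_RANGES}


def sample_vtype(type_id, rng):
    """Sample one driving style and render it as a SUMO vType element."""
    params = {
        name: round(rng.uniform(low, high), 2)
        for name, (low, high) in ALL_RANGES.items()
    }
    attrs = " ".join(f'{name}="{value}"' for name, value in params.items())
    return params, f'<vType id="{type_id}" {attrs}/>'


rng = random.Random(42)
fleet = [sample_vtype(f"veh_{i}", rng) for i in range(3)]
for _, xml in fleet:
    print(xml)
```

Each surrounding vehicle then gets its own sampled style, so the learning agent sees a range of aggressive and cautious drivers rather than a single rule-based behavior.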
The assignment of papers to reviewers is a crucial part of the peer review processes of large publication venues, where organizers (e.g., conference program chairs) rely on algorithms to perform automated paper assignment. As such, a major challenge for the organizers of these processes is to specify paper assignment algorithms that find appropriate assignments with respect to various desiderata. Although the main objective when choosing a good paper assignment is to maximize the expertise of each reviewer for their assigned papers, several other considerations make introducing randomization into the paper assignment desirable: robustness to malicious behavior, the ability to evaluate alternative paper assignments, reviewer diversity, and reviewer anonymity. However, it is unclear in what way one should randomize the paper assignment in order to best satisfy all of these considerations simultaneously. In this work, we present a practical, one-size-fits-all method for randomized paper assignment intended to perform well across different motivations for randomness. We show theoretically and experimentally that our method outperforms currently-deployed methods for randomized paper ass
Address Space Layout Randomization (ASLR) is a crucial defense mechanism employed by modern operating systems to mitigate exploitation by randomizing processes' memory layouts. However, the stark reality is that real-world implementations of ASLR are imperfect and subject to weaknesses that attackers can exploit. This work evaluates the effectiveness of ASLR on major desktop platforms, including Linux, macOS, and Windows, by examining the variability in the placement of memory objects across various processes, threads, and system restarts. In particular, we collect samples of memory object locations, conduct statistical analyses to measure the randomness of these placements, and examine the memory layout to find any patterns among objects that could decrease this randomness. The results show that while some systems, like Linux distributions, provide robust randomization, others, like Windows and macOS, often fail to adequately randomize key areas like executable code and libraries. Moreover, we find a significant reduction in the entropy of libraries after the Linux 5.18 version and identify correlation paths that an attacker could leverage to reduce exploitation complexity