Large language models (LLMs) have achieved notable performance in code synthesis; however, data-aware augmentation remains a limiting factor, handled via heuristic design or brute-force approaches. We introduce a performance-aware, closed-loop solution in the NNGPT ecosystem of projects that enables LLMs to autonomously engineer optimal transformations by internalizing empirical performance cues. We fine-tune LLMs with Low-Rank Adaptation on a novel repository of more than 6,000 empirically evaluated PyTorch augmentation functions, each annotated solely by downstream model accuracy. Training uses pairwise performance ordering (better-worse transformations), enabling alignment through empirical feedback without reinforcement learning, reward models, or symbolic objectives. This reduces the need for exhaustive search, evaluating up to 600 times fewer candidates than brute-force discovery while maintaining competitive peak accuracy and shifting generation from random synthesis to task-aligned design. Ablation studies show that structured Chain-of-Thought prompting introduces syntactic noise and degrades performance, whereas direct prompting ensures stable optimization in per
The $\ell$-matroid intersection ($\ell$-MI) problem asks if $\ell$ given matroids share a common basis. Already for $\ell = 3$, notable canonical NP-complete special cases are $3$-Dimensional Matching and Hamiltonian Path on directed graphs. However, while these problems admit exponential-time algorithms that improve on simple brute force, the fastest known algorithm for $3$-MI is exactly brute force with runtime $2^{n}/\mathrm{poly}(n)$, where $n$ is the number of elements. Our first result shows that in fact, brute force cannot be significantly improved, by ruling out an algorithm for $\ell$-MI with runtime $o\left(2^{n-5 \cdot n^{\frac{1}{\ell-1}} \cdot \log (n)}\right)$, for any fixed $\ell\geq 3$. We further obtain: (i) an algorithm that solves $\ell$-MI faster than brute force in time $2^{n-\Omega\left(\log^2 (n)\right)}$; and (ii) a parameterized running time lower bound of $2^{(\ell-2) \cdot k \cdot \log k} \cdot \mathrm{poly}(n)$ for $\ell$-MI, where the parameter $k$ is the rank of the matroids. We obtain these two results by generalizing the Monotone Local Search technique of Fomin et al. (J. ACM'19). Broadly speaking, our generalization converts any parameterized algorithm for a subset problem i
Exploring sign structures of quantum wave functions attracts considerable attention due to the potential for advances in modeling complex phases of matter. This stimulates developing different optimization procedures for imitating and manipulating sign structures of quantum states. In this work, utilizing a brute force approach based on a set of single-qubit transformations we evaluate protocols enabling positivization of the one-dimensional $J_1 -J_2$ model ground states in the regime of strong frustration. Based on the obtained positivization results, we show the difference between the cases of periodic and open boundary conditions, and also establish the dependence of the sign structure on parity of the simulated spin chains.
Accuracy remains a standard metric for evaluating AI systems, but it offers limited insight into how models arrive at their solutions. In this work, we introduce a benchmark based on brainteasers written in long narrative form to probe more deeply into the types of reasoning strategies that models use. Brainteasers are well-suited for this goal because they can be solved with multiple approaches, such as a few-step solution that uses a creative insight or a longer solution that uses more brute force. We investigate large language models (LLMs) across multiple layers of reasoning, focusing not only on correctness but also on the quality and creativity of their solutions. We investigate many aspects of the reasoning process: (1) semantic parsing of the brainteasers into precise mathematical competition style formats; (2) generating solutions from these mathematical forms; (3) self-correcting solutions based on gold solutions; (4) producing step-by-step sketches of solutions; and (5) making use of hints. We find that LLMs are in many cases able to find creative, insightful solutions to brainteasers, suggesting that they capture some of the capacities needed to solve novel problems in
Although research on the control of networked systems has grown considerably, graph-theoretic and algorithmic studies on matrix-weighted graphs remain limited. To bridge this gap in the literature, this work introduces two algorithms, the brute-force search and the Warshall algorithm, for determining connectedness and clustering in undirected matrix-weighted graphs. The proposed algorithms, which are derived from a sufficient condition for connectedness, emphasize a key distinction between matrix-weighted and scalar-weighted graphs. While the existence of a path between two vertices guarantees connectedness in scalar-weighted graphs, connectedness in matrix-weighted graphs is a collective contribution of all paths joining the two vertices. Proofs of correctness and numerical examples are provided to illustrate and demonstrate the effectiveness of the algorithms.
The rapid development of the Internet of Things (IoT) environment has introduced unprecedented levels of connectivity and automation. The Message Queuing Telemetry Transport (MQTT) protocol has gained recognition in IoT applications due to its lightweight and efficient features; however, this simplicity also renders MQTT vulnerable to multiple attacks, including denial of service (DoS) and brute-force attacks. This study aims to improve the detection of DoS and brute-force attacks in an intrusion detection system (IDS) for MQTT traffic. Our approach utilizes the MQTT dataset for model training by employing effective feature engineering and ensemble learning techniques. Following our analysis and comparison, we identified the top 10 features demonstrating the highest effectiveness, leading to improved model accuracy. We used supervised machine learning models, including Random Forest, Decision Trees, k-Nearest Neighbors, and XGBoost, in combination with ensemble classifiers. Stacking, voting, and bagging ensembles utilize these four supervised machine-learning methods to combine models. This study's results illustrate the proposed techn
This report evaluates the efficiency of Graph Edit Distance (GED) computation for graph similarity search, comparing Cascading Metric Trees (CMT) with brute-force verification. Despite the anticipated advantages of CMT, our findings indicate it does not consistently outperform brute-force methods in speed. The study, based on graph data from PubChem, suggests that the computational complexity of GED-based GSS remains a challenge.
We propose a new algorithm that finds an $\varepsilon$-approximate fixed point of a smooth function from the $n$-dimensional $\ell_2$ unit ball to itself. We use the general framework of finding approximate solutions to a variational inequality, a problem that subsumes fixed point computation and the computation of a Nash Equilibrium. The algorithm's runtime is bounded by $e^{O(n)}/\varepsilon$, under the smoothed-analysis framework. This is the first known algorithm in such generality whose runtime is faster than $(1/\varepsilon)^{O(n)}$, which is a time that suffices for an exhaustive search. We complement this result with a lower bound of $e^{\Omega(n)}$ on the query complexity for finding an $O(1)$-approximate fixed point on the unit ball, which holds even in the smoothed-analysis model, yet without the assumption that the function is smooth. Existing lower bounds are only known for the hypercube, and adapting them to the ball does not give non-trivial results even for finding $O(1/\sqrt{n})$-approximate fixed points.
Fingerprint authentication has been widely adopted on smartphones to complement traditional password authentication, making it a tempting target for attackers. The smartphone industry is fully aware of existing threats, and especially for the presentation attack studied by most prior works, the threats are nearly eliminated by liveness detection and attempt limits. In this paper, we study the seemingly impossible fingerprint brute-force attack on off-the-shelf smartphones and propose a generic attack framework. We implement BrutePrint to automate the attack; it acts as a middleman to bypass the attempt limit and hijack fingerprint images. Specifically, the bypassing exploits two zero-day vulnerabilities in the smartphone fingerprint authentication (SFA) framework, and the hijacking leverages the simplicity of the SPI protocol. Moreover, we consider a practical cross-device attack scenario and tackle the liveness and matching problems with neural style transfer (NST). We also propose a method based on neural style transfer to generate valid brute-forcing inputs from arbitrary fingerprint images. A case study shows that we always bypass liveness detection and the attempt limit, while 71% spoofs ar
Web Vulnerability Assessment and Penetration Testing (Web VAPT) is a comprehensive cybersecurity process that uncovers a range of vulnerabilities which, if exploited, could compromise the integrity of web applications. In a VAPT, it is common to perform a \textit{Directory brute-forcing Attack}, aiming at the identification of accessible directories of a target website. Current commercial solutions are inefficient as they are based on brute-forcing strategies that use wordlists, resulting in enormous quantities of trials for a small amount of success. Offensive AI is a recent paradigm that integrates AI-based technologies in cyber attacks. In this work, we explore whether AI can enhance the directory enumeration process and propose a novel Language Model-based framework. Our experiments -- conducted in a testbed consisting of 1 million URLs from different web application domains (universities, hospitals, government, companies) -- demonstrate the superiority of the LM-based attack, with an average performance increase of 969%.
Cyber attacks are ubiquitous and a constantly growing threat in the age of digitization. In order to protect important data, developers and system administrators must be trained and made aware of possible threats. Practical training can likewise be used to introduce students to the topic. A constant threat to websites that require user authentication is the so-called brute-force attack, which attempts to crack a password by systematically trying every possible combination. As this is a typical threat, but comparatively easy to detect, it is ideal for beginners. Therefore, three open-source blue team scenarios are designed and systematically described. The scenarios build on one another to maximize the learning effect.
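To illustrate why such attacks are considered easy to detect, the following is a minimal Python sketch of a sliding-window detector for repeated failed logins. It is not taken from the described scenarios: the function name `detect_brute_force`, the event format, and the thresholds (5 failures within 60 seconds) are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def detect_brute_force(events, max_failures=5, window_seconds=60):
    """Return the set of sources with too many failed logins in a time window.

    events: iterable of (timestamp_seconds, source, success) tuples,
            assumed to be sorted by timestamp (hypothetical log format).
    """
    recent_failures = defaultdict(deque)  # source -> timestamps of recent failures
    flagged = set()
    for timestamp, source, success in events:
        if success:
            continue  # only failed attempts count toward the threshold
        window = recent_failures[source]
        window.append(timestamp)
        # Drop failures that are older than the sliding window.
        while window and timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= max_failures:
            flagged.add(source)
    return flagged
```

A real deployment would likely also track targeted account names and distributed sources; this sketch only counts failures per source.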
A minimal perfect hash function (MPHF) maps a set $S$ of $n$ keys to the first $n$ integers without collisions. There is a lower bound of $n\log_2e-O(\log n)$ bits of space needed to represent an MPHF. A matching upper bound is obtained using the brute-force algorithm that tries random hash functions until stumbling on an MPHF and stores that function's seed. In expectation, $e^n\textrm{poly}(n)$ seeds need to be tested. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables. ShockHash uses two hash functions $h_0$ and $h_1$, hoping for the existence of a function $f : S \rightarrow \{0,1\}$ such that $x \mapsto h_{f(x)}(x)$ is an MPHF on $S$. In graph terminology, ShockHash generates $n$-edge random graphs until stumbling on a pseudoforest - a graph where each component contains as many edges as nodes. Using cuckoo hashing, ShockHash then derives an MPHF from the pseudoforest in linear time. It uses a 1-bit retrieval data structure to store $f$ using $n + o(n)$ bits. By carefully analyzing the probability that a random gra
We determine the nucleation rates of hard spheres using brute-force molecular dynamics simulations. We overcome nucleation barriers of up to $28 k_B T$, leading to a rigorous test of nucleation rates obtained from rare-event methods and classical nucleation theory. Our brute-force nucleation rates show excellent agreement with umbrella sampling simulations by Filion et al. [J. Chem. Phys. 133, 244115 (2010)] and seeding simulations by Espinosa et al. [J. Chem. Phys. 144, 034501 (2016)].
A minimal perfect hash function (MPHF) maps a set S of n keys to the first n integers without collisions. There is a lower bound of n*log(e)=1.44n bits needed to represent an MPHF. This can be reached by a brute-force algorithm that tries e^n hash function seeds in expectation and stores the first seed leading to an MPHF. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables for minimal perfect hashing. ShockHash uses two hash functions h_0 and h_1, hoping for the existence of a function f : S->{0, 1} such that x -> h_{f(x)}(x) is an MPHF on S. It then uses a 1-bit retrieval data structure to store f using n + o(n) bits. In graph terminology, ShockHash generates n-edge random graphs until stumbling on a pseudoforest - where each component contains as many edges as nodes. Using cuckoo hashing, ShockHash then derives an MPHF from the pseudoforest in linear time. We show that ShockHash needs to try only about (e/2)^n=1.359^n seeds in expectation. This reduces the space for storing the seed by roughly n bits (maintaining
Since the introduction of bcrypt in 1999, adaptive password hashing functions, whereby brute-force resistance increases symmetrically with computational difficulty for legitimate users, have been our most powerful post-breach countermeasure against credential disclosure. Unfortunately, the relatively low tolerance of users to added latency places an upper bound on the deployment of this technique in most applications. In this paper, we present a multi-factor credential hashing function (MFCHF) that incorporates the additional entropy of multi-factor authentication into password hashes to provide asymmetric resistance to brute-force attacks. MFCHF provides full backward compatibility with existing authentication software (e.g., Google Authenticator) and hardware (e.g., YubiKeys), with support for common usability features like factor recovery. The result is a 10^6 to 10^48 times increase in the difficulty of cracking hashed credentials, with little added latency or usability impact.
This article describes an improved brute-force solving strategy for Quadratic Unconstrained Binary Optimization (QUBO) problems that is faster than naive approaches and easily parallelizable. It exploits the Gray code ordering of natural numbers to allow for a more efficient evaluation of the QUBO objective function. The implementation in Python is discussed in detail, and an additional C implementation is provided.
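The Gray code idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: it assumes the objective is $x^T Q x$ over binary vectors $x$ and that the goal is minimization, and the function name `qubo_brute_force_gray` is hypothetical. Because consecutive Gray codes differ in exactly one bit, each step updates the objective with an O(n) difference instead of an O(n^2) full evaluation.

```python
def qubo_brute_force_gray(Q):
    """Minimize x^T Q x over x in {0,1}^n by visiting all x in Gray code order.

    Q: square matrix given as a list of lists (assumed input format).
    Returns (best_value, best_x).
    """
    n = len(Q)
    x = [0] * n          # start at the all-zero vector, whose objective is 0
    value = 0
    best_value, best_x = value, x[:]
    for k in range(1, 1 << n):
        # The bit that changes between the Gray codes of k-1 and k is the
        # position of the lowest set bit of k.
        i = (k & -k).bit_length() - 1
        # Contribution of the off-diagonal terms that involve x[i].
        coupling = sum((Q[i][j] + Q[j][i]) * x[j] for j in range(n) if j != i)
        # Flipping 0 -> 1 adds (Q[i][i] + coupling); flipping 1 -> 0 removes it.
        delta = (1 - 2 * x[i]) * (Q[i][i] + coupling)
        x[i] ^= 1
        value += delta
        if value < best_value:
            best_value, best_x = value, x[:]
    return best_value, best_x
```

For example, with `Q = [[1, -2, 0], [0, -1, 3], [0, 0, 2]]` the minimum over all eight assignments is -2, attained at `x = [1, 1, 0]`. The search remains exponential in n; only the per-candidate cost is reduced.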
This paper proposes an operational measure of non-stochastic information leakage to formalize privacy against a brute-force guessing adversary. The information is measured by non-probabilistic uncertainty of uncertain variables, the non-stochastic counterparts of random variables. For $X$ that is related to released data $Y$, the non-stochastic brute-force leakage is measured by the complexity of exhaustively checking all the possibilities of the private attribute $U$ of $X$ by an adversary. The complexity refers to the number of trials to successfully guess $U$. Maximizing this leakage over all possible private attributes $U$ gives rise to the maximal (i.e., worst-case) non-stochastic brute-force guessing leakage. This is proved to be fully determined by the minimal non-stochastic uncertainty of $X$ given $Y$, which also determines the worst-case attribute $U$ indicating the highest privacy risk if $Y$ is disclosed. The maximal non-stochastic brute-force guessing leakage is shown to be proportional to the non-stochastic identifiability of $X$ given $Y$ and upper bounds the existing maximin information. The latter quantifies the information leakage when an adversary must perfectly
Astronomical applications greatly benefit from knowledge of the instrument PSF. We describe the PSF Reconstruction algorithm developed for the LBT LUCI instrument assisted by the SOUL SCAO module. The reconstruction procedure considers only synchronous wavefront sensor telemetry data and a few asynchronous calibrations. We do not compute the Optical Transfer Function and corresponding filters. Instead, we compute a temporal series of wavefront maps and, for each of these, the corresponding instantaneous PSF. We tested the algorithm both in a laboratory arrangement and at nighttime for different SOUL configurations, adapting it to the guide star magnitudes and seeing conditions. We nicknamed it "BRUTE", Blind Reconstruction Using TElemetry, also recalling the one-to-one approach the algorithm applies: one slope to one instantaneous PSF.
Line Speed Publish/Subscribe Inter-networking (LIPSIN) is one of the proposed forwarding mechanisms in Information Centric Networking (ICN). It is a stateless source-routing approach based on Bloom filters. However, it has been shown that LIPSIN is vulnerable to brute-force attacks, which may lead to distributed denial-of-service (DDoS) attacks and unsolicited messages. In this work, we propose a new forwarding approach that maintains the advantages of Bloom-filter-based forwarding while allowing forwarding nodes to statelessly verify whether packets have been previously authorized, thus preventing attacks on the forwarding mechanism. Analysis of the probability of attack, derived analytically, demonstrates that the technique is highly resistant to brute-force attacks.
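For readers unfamiliar with Bloom-filter-based source routing, the following minimal Python sketch shows the basic forwarding check only. It does not implement the authorization scheme proposed in this work. The filter width `M_BITS`, the number of bits per link ID `K_BITS`, the hash-based derivation of link IDs, and all function names are assumptions made for illustration.

```python
import hashlib

M_BITS = 256  # width of the in-packet filter (assumed value)
K_BITS = 5    # hash positions per link ID (assumed value)


def link_id(name):
    """Derive a deterministic link ID with at most K_BITS of M_BITS set."""
    lid = 0
    for i in range(K_BITS):
        digest = hashlib.sha256(f"{name}:{i}".encode()).digest()
        lid |= 1 << (int.from_bytes(digest, "big") % M_BITS)
    return lid


def build_filter(path_links):
    """Source side: OR together the IDs of all links on the delivery path."""
    packet_filter = 0
    for name in path_links:
        packet_filter |= link_id(name)
    return packet_filter


def matches(packet_filter, name):
    """Forwarding node: forward on a link if all of its ID bits are set."""
    lid = link_id(name)
    return (packet_filter & lid) == lid
```

Links on the encoded path always match, while other links may match by chance (a false positive). A filter with many bits set matches many links, which is what makes guessing filters by brute force a concern in this style of forwarding.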
This paper proposes an authentication-simplified and deceptive scheme (SEIGuard) to protect server-side social engineering information (SEI) and other information against brute-force attacks. In SEIGuard, the password check in authentication is omitted, and this design is further combined with the SEI encryption design using honey encryption. The login password merely serves as a temporary key to encrypt SEI, and no password plaintext or ciphertext is stored in the database. During login, the server does not check the login password: a correct password decrypts the ciphertext to the correct plaintext, while an incorrect password decrypts it to a phony but plausible-looking plaintext (sampled from the same distribution). These two situations share the same undifferentiated backend procedures. This scheme eliminates the anchor that both online and offline brute-force attacks depend on. Furthermore, this paper presents four SEIGuard scheme designs and algorithms for four typical social engineering information objects (mobile phone number, identification number, email address, personal name), which represent four different types of message space, i.e. 1) limited and uniformly dis