While some common fitness landscape characteristics are critical when determining the runtime of evolutionary algorithms (EAs), the relationship between fitness landscape structure and the runtime of EAs is poorly understood. Recently, Dang, Eremeev, and Lehre introduced a classification of pseudo-Boolean problems showing that "sparsity" of local optima and the "density" of fitness valleys can be crucial characteristics when determining the runtime of EAs Dang et al. (in Proceedings of the Genetic and Evolutionary Computation Conference. Association for Computing Machinery, New York, NY, USA, GECCO'21, pp 1133-1141, 10.1145/3449639.3459398, 2021c). However, their approach could only classify some classes of pseudo-Boolean functions and thus defined an incomplete hierarchy. We generalise the previous work to a complete hierarchy for all pseudo-Boolean functions, denoted Slo[Formula: see text]. The hierarchy is consistent with existing results for the runtime of EAs. The easiest problems are in Slo[Formula: see text] for [Formula: see text] and [Formula: see text]. As we increase [Formula: see text] and decrease [Formula: see text], the function class contains more interesting functions, including instances of hard combinatorial optimisation problems and problems perturbed by static noise. For [Formula: see text] and [Formula: see text] the problem class contains every problem, including problems closed under permutation (No Free Lunch). Problem classes where local optima sparsity exceed fitness valley density are shown to have exponential black-box complexity. We also study how random perturbations of a function can change its classification. E.g., randomly perturbing search points in OneMax with constant probability leads to a problem class that can still be optimised efficiently with appropriately tuned non-elitist EAs.
We settle the parameterized complexities of several variants of independent set reconfiguration and dominating set reconfiguration, parameterized by the number of tokens. We show that both problems are XL-complete when there is no limit on the number of moves, XNL-complete when a maximum length ℓ for the sequence is given in binary in the input, and XNLP-complete when ℓ is given in unary. The problems were known to be W [ 1 ] - and W [ 2 ] -hard respectively when ℓ is also a parameter. We complete the picture by showing membership in those classes. Moreover, we show that for all the variants that we consider, token sliding and token jumping are equivalent under pl-reductions. We introduce partitioned variants of token jumping and token sliding, and give pl-reductions between the four variants that have precise control over the number of tokens and the length of the reconfiguration sequence.
Several works have recently investigated the parameterized complexity of data completion problems, motivated by their applications in machine learning, and clustering in particular. Interestingly, these problems can be equivalently formulated as classical graph problems on induced subgraphs of powers of partially-defined hypercubes. In this paper, we follow up on this recent direction by investigating the Independent Set problem on this graph class, which has been studied in the data science setting under the name Diversity. We obtain a comprehensive picture of the problem's parameterized complexity and establish its fixed-parameter tractability w.r.t. the solution size plus the power of the hypercube. Given that several such First Order Logic (FO) definable problems have been shown to be fixed-parameter tractable on the considered graph class, one may ask whether fixed-parameter tractability could be extended to capture all FO-definable problems. We answer this question in the negative by showing that FO model checking on induced subgraphs of hypercubes is as difficult as FO model checking on general graphs.
A catalytic machine is a model of computation where a traditional space-bounded machine is augmented with an additional, significantly larger, "catalytic" tape, which, while being available as a work tape, has the caveat of being initialized with an arbitrary string, which must be preserved at the end of the computation. Despite this restriction, catalytic machines have been shown to have surprising additional power; a logspace machine with a polynomial length catalytic tape, known as catalytic logspace ( CL ), can compute problems which are believed to be impossible for L . A fundamental question of the model is whether the catalytic condition, of leaving the catalytic tape in its exact original configuration, is robust to minor deviations. This study was initialized by Gupta et al. (2024), who defined lossy catalytic logspace ( LCL [ e ] ) as a variant of CL where we allow up to e errors when resetting the catalytic tape. They showed that LCL [ e ] = CL for any e = O ( 1 ) , which remains the frontier of our understanding. In this work we completely characterize lossy catalytic space ( LCSPACE [ s , c , e ] ) in terms of ordinary catalytic space ( CSPACE [ s , c ] ). We show that LCSPACE [ s , c , e ] = CSPACE [ Θ ( s + e log c ) , Θ ( c ) ] In other words, allowing e errors on a catalytic tape of length c is equivalent, up to a constant stretch, to an equivalent errorless catalytic machine with an additional e log c bits of ordinary working memory. As a consequence, we show that for any e, LCL [ e ] = CL implies SPACE [ e log n ] ⊆ ZPP , thus giving a barrier to any improvement beyond LCL [ O ( 1 ) ] = CL . We also extend all our results to every variant of catalytic space.
The 2-opt heuristic is a very simple local search heuristic for the traveling salesperson problem. In practice it usually converges quickly to solutions within a few percentages of optimality. In contrast to this, its running-time is exponential and its approximation performance is poor in the worst case. Englert, Röglin, and Vöcking (Algorithmica, 2014) provided a smoothed analysis in the so-called one-step model in order to explain the performance of 2-opt on d-dimensional Euclidean instances, both in terms of running-time and in terms of approximation ratio. However, translating their results to the classical model of smoothed analysis, where points are perturbed by Gaussian distributions with standard deviation [Formula: see text], yields only weak bounds. We prove bounds that are polynomial in n and [Formula: see text] for the smoothed running-time with Gaussian perturbations. In addition, our analysis for Euclidean distances is much simpler than the existing smoothed analysis. Furthermore, we prove a smoothed approximation ratio of [Formula: see text]. This bound is almost tight, as we also provide a lower bound of [Formula: see text] for [Formula: see text]. Our main technical novelty here is that, different from existing smoothed analyses, we do not separately analyze objective values of the global and local optimum on all inputs (which only allows for a bound of [Formula: see text]), but simultaneously bound them on the same input.
MaxCut is a classical NP -complete problem and a crucial building block in many combinatorial algorithms. The famous Edwards-Erdös bound states that any connected graph on n vertices with m edges contains a cut of size at least m 2 + n - 1 4 . Crowston, Jones and Mnich [Algorithmica, 2015] showed that the MaxCut problem on simple connected graphs admits an FPT algorithm, where the parameter k is the difference between the desired cut size c and the lower bound given by the Edwards-Erdös bound. This was later improved by Etscheid and Mnich [Algorithmica, 2017] to run in parameterized linear time, i.e., f ( k ) · O ( m ) . We improve upon this result in two ways: Firstly, we extend the algorithm to work also for multigraphs (alternatively, graphs with positive integer weights). Secondly, we change the parameter; instead of the difference to the Edwards-Erdös bound, we use the difference to the Poljak-Turzík bound. The Poljak-Turzík bound states that any weighted graph G has a cut of weight at least w ( G ) 2 + w MSF ( G ) 4 , where w(G) denotes the total weight of G, and w MSF ( G ) denotes the weight of its minimum spanning forest. In connected simple graphs the two bounds are equivalent, but for multigraphs the Poljak-Turzík bound can be larger and thus yield a smaller parameter k. Our algorithm also runs in parameterized linear time, i.e., f ( k ) · O ( m + n ) .
The Matroid Secretary Conjecture is a notorious open problem in online optimization. It claims the existence of an O(1)-competitive algorithm for the Matroid Secretary Problem (MSP). Here, the elements of a weighted matroid appear one-by-one, revealing their weight at appearance, and the task is to select elements online with the goal to get an independent set of largest possible weight. O(1)-competitive MSP algorithms have so far only been obtained for restricted matroid classes and for MSP variations, including Random-Assignment MSP (RA-MSP), where an adversary fixes a number of weights equal to the ground set size of the matroid, which then get assigned randomly to the elements of the ground set. Unfortunately, these approaches heavily rely on knowing the full matroid upfront. This is an arguably undesirable requirement, and there are good reasons to believe that an approach towards resolving the MSP Conjecture should not rely on it. Thus, both Soto (SIAM Journal on Computing 42(1): 178-211, 2013.) and Oveis Gharan and Vondrák (Algorithmica 67(4): 472-497, 2013.) raised as an open question whether RA-MSP admits an O(1)-competitive algorithm even without knowing the matroid upfront. In this work, we answer this question affirmatively. Our result makes RA-MSP the first well-known MSP variant with an O(1)-competitive algorithm that does not need to know the underlying matroid upfront and without any restriction on the underlying matroid. Our approach is based on first approximately learning the rank-density curve of the matroid, which we then exploit algorithmically.
In bi-criteria optimization problems, the goal is typically to compute the set of Pareto-optimal solutions. Many algorithms for these types of problems rely on efficient merging or combining of partial solutions and filtering of dominated solutions in the resulting sets. In this article, we consider the task of computing the Pareto sum of two given Pareto sets A, B of size n. The Pareto sum C contains all non-dominated points of the Minkowski sum M = { a + b | a ∈ A , b ∈ B } . Since the Minkowski sum has a size of n 2 , but the Pareto sum C can be much smaller, the goal is to compute C without having to compute and store all of M. We present several new algorithms for efficient Pareto sum computation, including an output-sensitive successive algorithm with a running time of O ( n log n + n k ) and a space consumption of O ( n + k ) for k = | C | . If the elements of C are streamed, the space consumption reduces to O ( n ) . For output sizes k ≥ 2 n , we prove a conditional lower bound for Pareto sum computation, which excludes running times in O ( n 2 - δ ) for δ > 0 unless the (min,+)-convolution hardness conjecture fails. The successive algorithm matches this lower bound for k ∈ Θ ( n ) . However, for k ∈ Θ ( n 2 ) , the successive algorithm exhibits a cubic running time. But we also present an algorithm with an output-sensitive space consumption and a running time of O ( n 2 log n ) , which matches the lower bound up to a logarithmic factor even for large k. Furthermore, we describe suitable engineering techniques to improve the practical running times of our algorithms. Finally, we provide an extensive comparative experimental study on generated and real-world data. As a showcase application, we consider preprocessing-based bi-criteria route planning in road networks. Pareto sum computation is the bottleneck task in the preprocessing phase and in the query phase. We show that using our algorithms with an output-sensitive space consumption allows to tackle larger instances and reduces the preprocessing and query time compared to algorithms that fully store M.
A minimal perfect hash function (MPHF) maps a set S of n keys to the first n integers without collisions. There is a lower bound of n log 2 e - O ( log n ) ≈ 1.44 n bits needed to represent an MPHF. This can be reached by a brute-force algorithm that tries e n hash function seeds in expectation and stores the first seed that leads to an MPHF. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables for minimal perfect hashing. ShockHash uses two hash functions h 0 and h 1 , hoping for the existence of a function f : S → { 0 , 1 } such that x ↦ h f ( x ) ( x ) is an MPHF on S. It then uses a 1-bit retrieval data structure to store f using n + o ( n )  bits. In graph terminology, ShockHash generates n-edge random graphs until stumbling on a pseudoforest - where each component contains as many edges as nodes. Using cuckoo hashing, ShockHash then derives an MPHF from the pseudoforest in linear time. We show that ShockHash needs to try only about ( e / 2 ) n ≈ 1 . 359 n seeds in expectation. This reduces the space for storing the seed by roughly n bits (maintaining the asymptotically optimal space consumption) and speeds up construction by almost a factor of 2 n compared to brute-force. Bipartite ShockHash reduces the expected construction time again to about 1 . 166 n by maintaining a pool of candidate hash functions and checking all possible pairs. Using ShockHash as a building block within the RecSplit framework we obtain ShockHash-RS, which can be constructed up to 3 orders of magnitude faster than competing approaches. ShockHash-RS can build an MPHF for 10 million keys with 1.489 bits per key in about half an hour. When instead using ShockHash after an efficient k-perfect hash function, it achieves space usage similar to the best competitors, while being significantly faster to construct and query.
In this paper, we study the Eulerian Strong Component Arc Deletion problem, where the input is a directed multigraph and the goal is to delete the minimum number of arcs to ensure every strongly connected component of the resulting digraph is Eulerian. This problem is a natural extension of the Directed Feedback Arc Set problem and is also known to be motivated by certain scenarios arising in the study of housing markets. The complexity of the problem, when parameterized by solution size (i.e., size of the deletion set), has remained unresolved and has been highlighted in several papers. In this work, we answer this question by ruling out (subject to the usual complexity assumptions) a fixed-parameter algorithm (FPT algorithm) for this parameter and conduct a broad analysis of the problem with respect to other natural parameterizations. We prove both positive and negative results. Among these, we demonstrate that the problem is also hard (W[1]-hard or even para-NP-hard) when parameterized by either treewidth or maximum degree alone. Complementing our lower bounds, we establish that the problem is in XP when parameterized by treewidth and FPT when parameterized either by both treewidth and maximum degree or by both treewidth and solution size. We show that on simple digraphs, these algorithms have near-optimal asymptotic dependence on the treewidth assuming the Exponential Time Hypothesis.
Competitive co-evolutionary algorithms (CoEAs) do not rely solely on an external function to assign fitness values to sampled solutions. Instead, they use the aggregation of outcomes from interactions between competing solutions allowing to rank solutions and make selection decisions. This makes CoEAs a useful tool for optimisation problems that have intrinsically interactive domains. Over the past decades, many ways to aggregate the outcomes of interactions have been considered. At the moment, it is unclear which of these is the best choice. Previous research is fragmented and most of the fitness aggregation methods (fitness measures) proposed have only been studied empirically. We argue that a proper understanding of the dynamics of CoEAs and their fitness measures can only be achieved through rigorous analysis of their behaviour. In this work we make a step towards this goal by using runtime analysis to study two commonly used fitness measures. We show a dichotomy in the behaviour of a ( 1 , λ )  CoEA when optimising a Bilinear problem. The algorithm finds a solution near the Nash equilibrium in polynomial time with high probability if the worst interaction is used as a fitness measure but is inefficient if the average of all interactions is used instead.
This work investigates the parameterised complexity of counting temporal paths. The problem of counting temporal paths is mainly motivated by temporal betweenness computation. The betweenness centrality of a vertex v is an important centrality measure that quantifies how many optimal paths between pairs of other vertices visit v. Computing betweenness centrality in a temporal graph, in which the edge set may change over discrete timesteps, requires us to count temporal paths that are optimal with respect to some criterion. For several natural notions of optimality, including foremost or fastest temporal paths, this counting problem reduces to #Temporal Path, the problem of counting all temporal paths between a fixed pair of vertices; like the problems of counting foremost and fastest temporal paths, #Temporal Path is #P-hard in general. Motivated by the many applications of this intractable problem, we initiate a systematic study of the parameterised and approximation complexity of #Temporal Path. We show that the problem presumably does not admit an FPT-algorithm for the feedback vertex number of the static underlying graph, and that it is hard to approximate in general. On the positive side, we prove several exact and approximate FPT-algorithms for special cases.
The 2-opt heuristic is a simple local search heuristic for the travelling salesperson problem (TSP). Although it usually performs well in practice, its worst-case running time is exponential in the number of cities. Attempts to reconcile this difference between practice and theory have used smoothed analysis, in which adversarial instances are perturbed probabilistically. We are interested in the classical model of smoothed analysis for the Euclidean TSP, in which the perturbations are Gaussian. This model was previously used by Manthey and Veenstra, who obtained smoothed complexity bounds polynomial in n, the dimension d, and the perturbation strength σ - 1 . However, their analysis only works for d ≥ 4 . The only previous analysis for d ≤ 3 was performed by Englert, Röglin and Vöcking, who used a different perturbation model which can be translated to Gaussian perturbations. Their model yields bounds polynomial in n and σ - d , and super-exponential in d. As the fact that no direct analysis exists for Gaussian perturbations that yields polynomial bounds for all d is somewhat unsatisfactory, we perform this missing analysis. Along the way, we improve all existing smoothed complexity bounds for Euclidean 2-opt with Gaussian perturbations.
Let d be a (well-behaved) shortest-path metric defined on a path-connected subset of [Formula: see text] and let [Formula: see text] be a set of geodesic disks with respect to the metric d. We prove that [Formula: see text], the intersection graph of the disks in [Formula: see text], has a clique-based separator consisting of [Formula: see text] cliques. This significantly extends the class of objects whose intersection graphs have small clique-based separators. Our clique-based separator yields an algorithm for q-Coloring that runs in time [Formula: see text], assuming the boundaries of the disks [Formula: see text] can be computed in polynomial time. We also use our clique-based separator to obtain a simple, efficient, and almost exact distance oracle for intersection graphs of geodesic disks. Our distance oracle uses [Formula: see text] storage and can report the hop distance between any two nodes in [Formula: see text] in [Formula: see text] time, up to an additive error of one. So far, distance oracles with an additive error of one that use subquadratic storage and sublinear query time were not known for such general graph classes.
The farthest-color Voronoi diagram (FCVD) is defined on a set of n points in the plane, where each point is labeled with one of m colors. The colored points constitute a family P of m clusters (sets) of points in the plane whose farthest-site Voronoi diagram is the FCVD. The diagram finds applications in problems related to facility location, shape matching, data imprecision, and others. In this paper we present structural properties of the FCVD, refine its combinatorial complexity bounds, and present efficient algorithms for its construction. We show that the complexity of the diagram is O ( n α ( m ) + str ( P ) ) , where str ( P ) is a parameter reflecting the number of straddles between pairs of clusters, which is O ( m ( n - m ) ) . The bound reduces to O ( n + str ( P ) ) if the clusters are pairwise non-crossing. We also present a lower bound, establishing that the complexity of the FCVD can be Ω ( n + m 2 ) , even if the clusters have pairwise disjoint convex hulls. Our algorithm runs in O ( ( n + str ( P ) ) log 3 n ) -time, and in certain special cases in O ( n log n ) time.
We study the problem of finding maximal exact matches (MEMs) between a query string Q and a labeled graph G. MEMs are an important class of seeds, often used in seed-chain-extend type of practical alignment methods because of their strong connections to classical metrics. A principled way to speed up chaining is to limit the number of MEMs by considering only MEMs of length at least κ ( κ -MEMs). However, on arbitrary input graphs, the problem of finding MEMs cannot be solved in truly sub-quadratic time under SETH (Equi et al., TALG 2023) even on acyclic graphs. In this paper we show an O ( n · L · d L - 1 + m + M κ , L ) -time algorithm finding all κ -MEMs between Q and G spanning exactly L nodes in G, where n is the total length of node labels, d is the maximum degree of a node in G, m = | Q | , and M κ , L is the number of output MEMs. We use this algorithm to develop a κ -MEM finding solution on indexable Elastic Founder Graphs (Equi et al., Algorithmica 2022) running in time O ( n H 2 + m + M κ ) , where H is the maximum number of nodes in a block, and M κ is the total number of κ -MEMs. Our results generalize to the analysis of multiple query strings (MEMs between G and any of the strings). Additionally, we provide some experimental results showing that the number of graph MEMs is an order of magnitude smaller than the number of string MEMs of the corresponding concatenated collection. We show that seed-chain-extend type of alignment methods can be implemented on top of indexable Elastic Founder Graphs by providing an efficient way to produce the seeds between a set of queries and the graph. The code is available in https://github.com/algbio/efg-mems .
The classic Impagliazzo-Nisan-Wigderson (INW) pseudorandom generator (PRG) (STOC '94) for space-bounded computation uses a seed of length O ( log n · log ( n w / ε ) + log d ) to fool ordered branching programs of length n, width w, and alphabet size d to within error ε . A series of works have shown that the analysis of the INW generator can be improved for the class of permutation branching programs or the more general regular branching programs, improving the O ( log 2 n ) dependence on the length n to O ( log n ) or O ~ ( log n ) . However, when also considering the dependence on the other parameters, these analyses still fall short of the optimal PRG seed length O ( log ( n w d / ε ) ) . In this paper, we prove that any "spectral analysis" of the INW generator requires seed length Ω log n · log log min { n , d } + log n · log w / ε + log d to fool ordered permutation branching programs of length n, width w, and alphabet size d to within error ε . By "spectral analysis" we mean an analysis of the INW generator that relies only on the spectral expansion of the graphs used to construct the generator; this encompasses all prior analyses of the INW generator. Our lower bound matches the upper bound of Braverman-Rao-Raz-Yehudayoff (FOCS 2010, SICOMP 2014) for regular branching programs of alphabet size d = 2 except for a gap between their O log n · log log n term and our Ω log n · log log min { n , d } term. It also matches the upper bounds of Koucký-Nimbhorkar-Pudlák (STOC 2011), De (CCC 2011), and Steinke (ECCC 2012) for constant-width ( w = O ( 1 ) ) permutation branching programs of alphabet size d = 2 to within a constant factor. To fool permutation branching programs in the measure of spectral norm, we prove that any spectral analysis of the INW generator requires a seed of length Ω log n · log log n + log n · log ( 1 / ε ) when the width is at least polynomial in n ( w = n Ω ( 1 ) ), matching the recent upper bound of Hoza-Pyne-Vadhan (ITCS 2021) to within a constant factor.
Reconfiguring two shortest paths in a graph means modifying one shortest path to the other by changing one vertex at a time so that all the intermediate paths are also shortest paths. This problem has several natural applications, namely: (a) repaving road networks, (b) rerouting data packets in a synchronous multiprocessing setting, (c) the shipping container stowage problem, and (d) the train marshalling problem. When modelled as graph problems, (a) is the most general case while (b), (c), (d) are restrictions to different graph classes. We show that (a) does not admit polynomial-time algorithms (assuming P ≠ NP ), even for relaxed variants of the problem (assuming P ≠ PSPACE ). For (b), (c), (d), we present polynomial-time algorithms to solve the respective problems. We also generalize the problem to when at most k (for a fixed integer k ≥ 2 ) contiguous vertices on a shortest path can be changed at a time.
For sets of n points, n even, in general position in the plane, we consider straight-line drawings of perfect matchings on them. It is well known that such sets admit at least Cn/2 different plane perfect matchings, where Cn/2 is the n/2-th Catalan number. Generalizing this result we are interested in the number of drawings of perfect matchings which have k crossings. We show the following results. (1) For every k≤164n2-3532nn+122564n, any set with n points, n sufficiently large, admits a perfect matching with exactly k crossings. (2) There exist sets of n points where every perfect matching has at most 572n2-n4 crossings. (3) The number of perfect matchings with at most k crossings is superexponential in n if k is superlinear in n. (4) Point sets in convex position minimize the number of perfect matchings with at most k crossings for k=0,1,2, and maximize the number of perfect matchings with n/22 crossings and with n/22-1 crossings.
For a set Q of points in the plane and a real number δ≥0, let Gδ(Q) be the graph defined on Q by connecting each pair of points at distance at most δ.We consider the connectivity of Gδ(Q) in the best scenario when the location of a few of the points is uncertain, but we know for each uncertain point a line segment that contains it. More precisely, we consider the following optimization problem: given a set P of n-k points in the plane and a set S of k line segments in the plane, find the minimum δ≥0 with the property that we can select one point ps∈s for each segment s∈S and the corresponding graph Gδ(P∪{ps∣s∈S}) is connected. It is known that the problem is NP-hard. We provide an algorithm to exactly compute an optimal solution in O(f(k)nlogn) time, for a computable function f(·). This implies that the problem is FPT when parameterized by k. The best previous algorithm uses O((k!)kkk+1·n2k) time and computes the solution up to fixed precision.