Bloom and cuckoo filters provide fast approximate set membership while using little memory. Engineers use them to avoid expensive disk and network accesses. The recently introduced xor filters can be faster and smaller than Bloom and cuckoo filters. The xor filters are within 23% of the theoretical lower bound in storage as opposed to 44% for Bloom filters. Inspired by Dietzfelbinger and Walzer, we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound. We compare the performance against a wide range of competitive alternatives such as Bloom filters, blocked Bloom filters, vector quotient filters, cuckoo filters, and the recent ribbon filters. Our experiments suggest that binary fuse filters are superior to xor filters.
Probabilistic filters are approximate set membership data structures that represent a set of keys in small space, and answer set membership queries without false negative answers, but with a certain allowed false positive probability. Such filters are widely used in database systems, networks, storage systems and in biological sequence analysis because of their fast query times and low space requirements. Starting with Bloom filters in the 1970s, many filter data structures have been developed, each with its own advantages and disadvantages, e.g., Blocked Bloom filters, Cuckoo filters, XOR filters, Ribbon filters, and more. We introduce Blocked Bloom filters with choices that work similarly to Blocked Bloom filters, except that for each key there are two (or more) alternative choices of blocks where the key's information may be stored. The result is a filter that partially inherits the advantages of a Blocked Bloom filter, such as the ability to insert keys rapidly online or the ability to slightly overload the filter with only a small penalty to the false positive rate. At the same time, it avoids the major disadvantage of a Blocked Bloom filter, namely the larger space consumptio
We study privacy filters, which enable privacy accounting for differentially private (DP) mechanisms with adaptively chosen privacy characteristics. We develop a general theory that characterizes the worst-case privacy loss of an interaction involving an analyst that respects some restrictions on what queries they may issue. We apply this theory to develop residue filters, which unifies existing privacy filters. We develop the Gaussian DP (GDP) residue filter, which strictly improves upon the naïve GDP filter. We also show that residue filters capture the natural filter, which promises greater utility by leveraging exact privacy accounting techniques. Earlier privacy filters consider only simple privacy parameters such as Rényi-DP or GDP parameters. Natural filters account for the entire privacy profile of every query, promising more efficient use of a given privacy budget. We show that, contrary to other forms of DP, natural privacy filters are not free in general. We present a characterization of when a family of private queries admits free natural filters for a given budget. In particular, only families of privacy mechanisms that are totally-ordered when composed admit free natu
Neural network models are increasingly used for state estimation in control and decision-making, yet it remains unclear to what extent they behave as principled filters in nonlinear dynamical systems. Unlike classical filters, which rely on explicit dynamics and noise models, neural estimators can be trained purely from data. We present a systematic comparison between model-free neural estimators and classical filtering methods across multiple nonlinear scenarios. On the neural side, we evaluate Transformer-based models, recurrent neural networks, and state-space models; on the classical side, we compare against particle filters and nonlinear Kalman filters. Results show that structured state-space models (SSMs), in particular Mamba and Mamba-2, are consistently strong among neural estimators. They approach strong classical filters in several nonlinear systems and outperform weaker classical baselines without access to system models, while the evaluated neural implementations achieve substantially higher inference throughput on the tested hardware. Accurate model-based filters can still dominate when their assumptions are well matched. We attribute the relative strength of SSMs to
This paper revisits three backup-based safety filters -- Backup Control Barrier Functions (Backup CBF), Model Predictive Shielding (MPS), and gatekeeper -- through a unified comparative framework. Using a common safety-filter abstraction and shared notation, we make explicit both their common backup-policy structure and their key algorithmic differences. We compare the three methods through their filter-inactive sets, i.e., the states where the nominal policy is left unchanged. In particular, we show that MPS is a special case of gatekeeper, and we further relate gatekeeper to the interior of the Backup CBF inactive set within the implicit safe set. This unified view also highlights a key source of conservatism in backup-based safety filters: safety is often evaluated through the feasibility of a backup maneuver, rather than through the nominal policy's continued safe execution. The paper is intended as a compact tutorial and review that clarifies the theoretical connections and differences among these methods.
We present filters with rational exponents in order to provide a continuum of filter behavior not classically achievable. We discuss their stability, the flexibility they afford, and various representations useful for analysis, design and implementations. We do this for a generalization of second-order filters which we refer to as rational-exponent Generalized Exponent Filters (GEFs) that are useful for a diverse array of applications. We present equivalent representations for rational-exponent GEFs in the time and frequency domains: transfer functions, impulse responses, and integral expressions - the last of which allows for efficient real-time processing without preprocessing requirements. Rational-exponent filters enable filter characteristics to be on a continuum rather than limiting them to discrete values thereby resulting in greater flexibility in the behavior of these filters without additional complexity in causality and stability analyses compared with classical filters. In the case of GEFs, this allows for having arbitrary continuous rather than discrete values for filter characteristics such as (1) the ratio of 3dB quality factor to maximum group delay - particularly i
This paper challenges the prevailing view that convolutional neural network (CNN) filters become increasingly specialized in deeper layers. Motivated by recent observations of clusterable repeating patterns in depthwise separable CNNs (DS-CNNs) trained on ImageNet, we extend this investigation across various domains and datasets. Our analysis of DS-CNNs reveals that deep filters maintain generality, contradicting the expected transition to class-specific filters. We demonstrate the generalizability of these filters through transfer learning experiments, showing that frozen filters from models trained on different datasets perform well and can be further improved when sourced from larger datasets. Our findings indicate that spatial features learned by depthwise separable convolutions remain generic across all layers, domains, and architectures. This research provides new insights into the nature of generalization in neural networks, particularly in DS-CNNs, and has significant implications for transfer learning and model design.
Typical iterated filters, such as the iterated extended Kalman filter (IEKF), iterated unscented Kalman filter (IUKF), and iterated posterior linearization filter (IPLF), have been developed to improve the linearization point (or density) of the likelihood linearization in the well-known extended Kalman filter (EKF) and unscented Kalman filter (UKF). A shortcoming of typical iterated filters is that they do not treat the linearization of the transition model of the system. To remedy this shortcoming, we introduce dynamically iterated filters (DIFs), a unified framework for iterated linearization-based nonlinear filters that deals with nonlinearities in both the transition model and the likelihood, thereby constituting a generalization of the aforementioned iterated filters. We further establish a relationship between the general DIF and the approximate iterated Rauch-Tung-Striebel smoother. This relationship allows for a Gauss-Newton interpretation, which in turn enables explicit step-size correction, leading to damped versions of the DIFs. The developed algorithms, both damped and non-damped, are numerically demonstrated in three examples, showing superior mean-squared error as we
We study the parametric subfamily $p = 3m(m+1) + 1$ with $m = 2^a 3^b - 1$, $a,b \in \mathbb{N}^*$, a 3-smooth slice of the centred hexagonal numbers $3m^2 + 3m + 1 = (m+1)^3 - m^3,$ from the point of view of unconditional primality certification via the Pocklington-Lehmer criterion. The 3-smoothness of $m+1 = 2^a 3^b$ yields, for every $(a,b)$, a fully factored divisor $F = 2^a 3^(b+1)$ of $p-1$ satisfying $F > \sqrt(p)$ unconditionally, reducing the certificate to two witnesses, for $q = 2$ and $q = 3$. Our main new contribution is a complete, deterministic characterisation of the two canonical witnesses. We prove that $w_2 = 5$ is a valid witness if and only if $a - b$ = 1, 2 (mod 4), by quadratic reciprocity; and that $w_3 = 7$ is a valid witness if and only if $m$ is not congruent to 2 (mod 7), by cubic reciprocity in $\mathbb{Z}[omega]$ using the explicit Eisenstein factorisation $p = ((1+m) - m ω)((1+m) - m ω^2)$. These two results turn the heuristic "5 and 7 always work" (which is in fact false) into exact congruence conditions, and yield a deterministic witness-selection rule. Alongside, three elementary arithmetic filters (mod 6, a (-3) quadratic-residue sieve, and a m
In graph signal processing, one of the most important subjects is the study of filters, i.e., linear transformations that capture relations between graph signals. One of the most important families of filters is the space of shift invariant filters, defined as transformations commute with a preferred graph shift operator. Shift invariant filters have a wide range of applications in graph signal processing and graph neural networks. A shift invariant filter can be interpreted geometrically as an information aggregation procedure (from local neighborhood), and can be computed easily using matrix multiplication. However, there are still drawbacks to using solely shift invariant filters in applications, such as being restrictively homogeneous. In this paper, we generalize shift invariant filters by introducing and studying semi shift invariant filters. We give an application of semi shift invariant filters with a new signal processing framework, the subgraph signal processing. Moreover, we also demonstrate how semi shift invariant filters can be used in graph neural networks.
Adaptive filters, such as telescoping and adaptive cuckoo filters, update their representation upon detecting a false positive to avoid repeating the same error in the future. Adaptive filters require an auxiliary structure, typically much larger than the main filter and often residing on slow storage, to facilitate adaptation. However, existing adaptive filters are not practical and have seen no adoption in real-world systems due to two main reasons. Firstly, they offer weak adaptivity guarantees, meaning that fixing a new false positive can cause a previously fixed false positive to come back. Secondly, the sub-optimal design of the auxiliary structure results in adaptivity overheads so substantial that they can actually diminish the overall system performance compared to a traditional filter. In this paper, we design and implement AdaptiveQF, the first practical adaptive filter with minimal adaptivity overhead and strong adaptivity guarantees, which means that the performance and false-positive guarantees continue to hold even for adversarial workloads. The AdaptiveQF is based on the state-of-the-art quotient filter design and preserves all the critical features of the quotient
This paper focuses on spectral filters on graphs, namely filters defined as elementwise multiplication in the frequency domain of a graph. In many graph signal processing settings, it is important to transfer a filter from one graph to another. One example is in graph convolutional neural networks (ConvNets), where the dataset consists of signals defined on many different graphs, and the learned filters should generalize to signals on new graphs, not present in the training set. A necessary condition for transferability (the ability to transfer filters) is stability. Namely, given a graph filter, if we add a small perturbation to the graph, then the filter on the perturbed graph is a small perturbation of the original filter. It is a common misconception that spectral filters are not stable, and this paper aims at debunking this mistake. We introduce a space of filters, called the Cayley smoothness space, that contains the filters of state-of-the-art spectral filtering methods, and whose filters can approximate any generic spectral filter. For filters in this space, the perturbation in the filter is bounded by a constant times the perturbation in the graph, and filters in the Cayle
Understanding the intricate interplay between soot dynamics and chemical reactions within catalytic diesel particulate filters (CDPF) is crucial for enhancing both filtration efficiency and regeneration performance. In this paper, we establish a unified pore-scale multiphysics model based on the Eulerian-Lagrangian framework to comprehensively resolve the transport, deposition, and oxidation of soot. Distinguishing itself from conventional empirical correlations and stochastic-based approximations, the system models soot deposition through fundamental physical principles, integrating elastic deformation and surface adhesion mechanics at the particle-wall interface. Simultaneously, it incorporates a robust oxidation model that accounts for the competitive kinetics of both $\textrm{O}_2$ and $\textrm{NO}_2$ pathways, enabling comprehensive coverage of all CDPF operating regimes. Validated against three classical benchmark cases, the model demonstrates superior accuracy in capturing interfacial mass transfer and particle-wall interactions. Simulation under a typical CDPF low-temperature operating condition emphasizes the pivotal role of $\textrm{NO}_2$ and catalyst in promoting regene
We study linear filters for processing signals supported on abstract topological spaces modeled as simplicial complexes, which may be interpreted as generalizations of graphs that account for nodes, edges, triangular faces etc. To process such signals, we develop simplicial convolutional filters defined as matrix polynomials of the lower and upper Hodge Laplacians. First, we study the properties of these filters and show that they are linear and shift-invariant, as well as permutation and orientation equivariant. These filters can also be implemented in a distributed fashion with a low computational complexity, as they involve only (multiple rounds of) simplicial shifting between upper and lower adjacent simplices. Second, focusing on edge-flows, we study the frequency responses of these filters and examine how we can use the Hodge-decomposition to delineate gradient, curl and harmonic frequencies. We discuss how these frequencies correspond to the lower- and the upper-adjacent couplings and the kernel of the Hodge Laplacian, respectively, and can be tuned independently by our filter designs. Third, we study different procedures for designing simplicial convolutional filters and di
Filters approximately store a set of items while trading off accuracy for space-efficiency and can address the limited memory on accelerators, such as GPUs. However, there is a lack of high-performance and feature-rich GPU filters as most advancements in filter research has focused on CPUs. In this paper, we explore the design space of filters with a goal to develop massively parallel, high performance, and feature rich filters for GPUs. We evaluate various filter designs in terms of performance, usability, and supported features and identify two filter designs that offer the right trade off in terms of performance, features, and usability. We present two new GPU-based filters, the TCF and GQF, that can be employed in various high performance data analytics applications. The TCF is a set membership filter and supports faster inserts and queries, whereas the GQF supports counting which comes at an additional performance cost. Both the GQF and TCF provide point and bulk insertion API and are designed to exploit the massive parallelism in the GPU without sacrificing usability and necessary features. The TCF and GQF are up to $4.4\times$ and $1.4\times$ faster than the previous GPU fil
Bloom filters are probabilistic data structures commonly used for approximate membership problems in many areas of Computer Science (networking, distributed systems, databases, etc.). With the increase in data size and distribution of data, problems arise where a large number of Bloom filters are available, and all them need to be searched for potential matches. As an example, in a federated cloud environment, each cloud provider could encode the information using Bloom filters and share the Bloom filters with a central coordinator. The problem of interest is not only whether a given element is in any of the sets represented by the Bloom filters, but which of the existing sets contain the given element. This problem cannot be solved by just constructing a Bloom filter on the union of all the sets. Instead, we effectively have a multidimensional Bloom filter problem: given an element, we wish to receive a list of candidate sets where the element might be. To solve this problem, we consider 3 alternatives. Firstly, we can naively check many Bloom filters. Secondly, we propose to organize the Bloom filters in a hierarchical index structure akin to a B+ tree, that we call Bloofi. Final
We propose a new method for creating computationally efficient convolutional neural networks (CNNs) by using low-rank representations of convolutional filters. Rather than approximating filters in previously-trained networks with more efficient versions, we learn a set of small basis filters from scratch; during training, the network learns to combine these basis filters into more complex filters that are discriminative for image classification. To train such networks, a novel weight initialization scheme is used. This allows effective initialization of connection weights in convolutional layers composed of groups of differently-shaped filters. We validate our approach by applying it to several existing CNN architectures and training these networks from scratch using the CIFAR, ILSVRC and MIT Places datasets. Our results show similar or higher accuracy than conventional CNNs with much less compute. Applying our method to an improved version of VGG-11 network using global max-pooling, we achieve comparable validation accuracy using 41% less compute and only 24% of the original VGG-11 model parameters; another variant of our method gives a 1 percentage point increase in accuracy over
A low-cost measurement system using filtering of measurements for two-wheeled balancing robot stabilisation purposes has been addressed in this paper. In particular, a measurement system based on gyroscope, accelerometer, and encoder has been considered. The measurements have been corrected for deterministic disturbances and then filtered with Kalman, $α$-$β$ type, and complementary filters. A quantitative assessment of selected filters has been given. As a result, the complete structure of a measurement system has been obtained. The performance of the proposed measurement system has been validated experimentally by using a dedicated research rig.
Filter pruning is widely adopted to compress and accelerate the Convolutional Neural Networks (CNNs), but most previous works ignore the relationship between filters and channels in different layers. Processing each layer independently fails to utilize the collaborative relationship across layers. In this paper, we intuitively propose a novel pruning method by explicitly leveraging the Filters Similarity in Consecutive Layers (FSCL). FSCL compresses models by pruning filters whose corresponding features are more worthless in the model. The extensive experiments demonstrate the effectiveness of FSCL, and it yields remarkable improvement over state-of-the-art on accuracy, FLOPs and parameter reduction on several benchmark models and datasets.
To select suitable filters for a task or to improve existing filters, a deep understanding of their inner workings is vital. Diffusion echoes, which are space-adaptive impulse responses, are useful to visualise the effect of nonlinear diffusion filters. However, they have received little attention in the literature. There may be two reasons for this: Firstly, the concept was introduced specifically for diffusion filters, which might appear too limited. Secondly, diffusion echoes have large storage requirements, which restricts their practicality. This work addresses both problems. We introduce the filter echo as a generalisation of the diffusion echo and use it for applications beyond adaptive smoothing, such as image inpainting, osmosis, and variational optic flow computation. We provide a framework to visualise and inspect echoes from various filters with different applications. Furthermore, we propose a compression approach for filter echoes, which reduces storage requirements by a factor of 20 to 100.