The quadratic complexity of self-attention during the prefill phase impedes long-context inference in large language models. Existing sparse attention methods face a trade-off among context adaptivity, sampling overhead, and fine-tuning costs. We propose VSPrefill, a mechanism requiring lightweight training that uses the vertical-slash structural pattern in attention distributions. Our compact VSIndexer module predicts context-aware importance scores for vertical columns and slash diagonals from key-value representations augmented with RoPE. This approach constructs sparse masks with linear complexity without modifying the backbone parameters. During inference, an adaptive cumulative-threshold strategy allocates sparsity budgets per layer, while a fused kernel executes attention with on-the-fly index merging. Evaluated on Qwen3-4B-Instruct and LLaMA-3.1-8B-Instruct across the LongBench and RULER benchmarks, VSPrefill preserves 98.35% of the full attention accuracy while delivering a 4.95x average speedup at a context length of 128k. These results establish a new Pareto frontier in the trade-off between accuracy and efficiency.
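The vertical-slash idea in the abstract above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the VSIndexer or the fused kernel: the function names `select_by_cumulative_threshold` and `vertical_slash_mask` are hypothetical, the importance scores are supplied by hand rather than predicted, and the mask is materialised densely (a real implementation would avoid that).

```python
from typing import List, Sequence, Set


def select_by_cumulative_threshold(scores: Sequence[float], tau: float) -> List[int]:
    """Pick indices in descending score order until their scores cover at
    least a fraction tau of the total (hypothetical helper, for illustration)."""
    total = sum(scores)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    running = 0.0
    for idx in order:
        if running >= tau * total:
            break  # budget reached; the remaining indices are dropped
        chosen.append(idx)
        running += scores[idx]
    return sorted(chosen)


def vertical_slash_mask(n: int, columns: Set[int], offsets: Set[int]) -> List[List[bool]]:
    """Causal mask in which query i may attend to key j when j <= i and either
    j is a kept vertical column or i - j is a kept slash (diagonal) offset."""
    return [
        [j <= i and (j in columns or (i - j) in offsets) for j in range(n)]
        for i in range(n)
    ]
```

For example, `vertical_slash_mask(4, {0}, {0})` keeps the first key column and the main diagonal, and `select_by_cumulative_threshold` could be applied separately to column scores and to diagonal-offset scores to decide which entries go into `columns` and `offsets`.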
Large Language Models (LLMs) often exhibit slash attention patterns, where attention scores concentrate along the $Δ$-th sub-diagonal for some offset $Δ$. These patterns play a key role in passing information across tokens. But why do they emerge? In this paper, we demystify the emergence of these Slash-Dominant Heads (SDHs) from both empirical and theoretical perspectives. First, by analyzing open-source LLMs, we find that SDHs are intrinsic to models and generalize to out-of-distribution prompts. To explain the intrinsic emergence, we analyze the queries, keys, and Rotary Position Embedding (RoPE), which jointly determine attention scores. Our empirical analysis reveals two characteristic conditions of SDHs: (1) Queries and keys are almost rank-one, and (2) RoPE is dominated by medium- and high-frequency components. Under these conditions, queries and keys are nearly identical across tokens, and interactions between medium- and high-frequency components of RoPE give rise to SDHs. Beyond empirical evidence, we theoretically show that these conditions are sufficient to ensure the emergence of SDHs by formalizing them as our modeling assumptions. Particularly, we analyze the trainin
We present the notion of multilevel slashing, where proof-of-stake blockchain validators can obtain gradual levels of assurance that a certain block is bound to be finalized in a global consensus procedure, unless an increasing and optimally large number of Byzantine processes have their staked assets slashed -- that is, deducted -- due to provably incorrect behavior. Our construction is a highly parameterized generalization of combinatorial intersection systems based on finite projective spaces, with asymptotic high availability and optimal slashing properties. Even under weak conditions, we show that our construction has asymptotically optimal slashing properties with respect to message complexity and validator load; this result also illustrates a fundamental trade-off between message complexity, load, and slashing. In addition, we show that any intersection system whose ground elements are disjoint subsets of nodes (e.g., "committees" in committee-based consensus protocols) has asymptotic high availability under similarly weak conditions. Finally, our multilevel construction gives blockchain validators the flexibility to decide how many "levels" of finalization assurance they
Recent advances in satellite technology have opened a new frontier in wireless networking: Low Earth Orbit (LEO) satellite networks that connect difficult-to-reach areas and improve global connectivity. These networks lack robust open-source simulation models that can highlight potential bottlenecks or wasted resources, which costs terrestrial users and the companies that provide these networks time and money. To that end, we propose SLASh, a highly customizable satellite network simulator that allows users to design a simulated network with specific characteristics and constructs it to be analogous to real-world conditions. Additionally, SLASh can generate abstract telemetry whose movement through the network can be simulated, allowing users to compare network capabilities across a variety of frameworks.
We present SLASH, a pitch estimation method of speech signals based on self-supervised learning (SSL). To enhance the performance of conventional SSL-based approaches that primarily depend on the relative pitch difference derived from pitch shifting, our method incorporates absolute pitch values by 1) introducing a prior pitch distribution derived from digital signal processing (DSP), and 2) optimizing absolute pitch through gradient descent with a loss between the target and differentiable DSP-derived spectrograms. To stabilize the optimization, a novel spectrogram generation method is used that skips complicated waveform generation. In addition, the aperiodic components in speech are accurately predicted through differentiable DSP, enhancing the method's applicability to speech signal processing. Experimental results showed that the proposed method outperformed both baseline DSP and SSL-based pitch estimation methods, attributed to the effective integration of SSL and DSP.
The upcoming electron-positron collider provides an ideal place to probe deviations from the Standard Model predictions with its clean environment, beam polarization, and significant luminosity. We studied anomalous quartic gauge boson couplings~($VVW^-W^+, V\in\{γ,Z\}$), triple gauge couplings~($W^-W^+γ/Z$), and Higgs-gauge couplings~($HVV, V\in\{W^\pm,Z,γ\}$) induced by $SU(2)_L \times U(1)_Y$ gauge-invariant dimension-6 operators in $3l2j\slashed{E}$ final events with initial beam polarization. The phase-space regions of two prominent amplitudes, i.e., triple gauge boson production~($WWV$) and vector boson scattering sub-processes, are selected with boosted decision trees. We employ the asymmetries related to polarization and spin correlation observables along with the cross~section to constrain the anomalous couplings. The parity-odd polarizations and spin correlations of jets from the $W$ boson require flavor tagging, which is done using artificial neural networks. We provide one-parameter limits at $95\%$ confidence level combining cross~section and spin-related observables. Finally, marginalized limits on all nine anomalous couplings are obtained with MCMC analysis. The limits are found to be in
Researchers have recently devised tools for debloating software and detecting configuration errors. Several of these tools rely on the observation that programs are composed of an initialization phase followed by a main-computation phase. Users of these tools are required to manually annotate the boundary that separates these phases, a task that can be time-consuming and error-prone (typically, the user has to read and understand the source code or trace executions with a debugger). Because errors can impair the tool's accuracy and functionality, the manual-annotation requirement hinders the ability to apply the tools on a large scale. In this paper, we present a field study of 24 widely-used C/C++ programs, identifying common boundary properties in 96\% of them. We then introduce \textit{slash}, an automated tool that locates the boundary based on the identified properties. \textit{slash} successfully identifies the boundary in 87.5\% of the studied programs within 8.5\ minutes, using up to 4.4\ GB memory. In an independent test, carried out after \textit{slash} was developed, \textit{slash} identified the boundary in 85.7\% of a dataset of 21 popular C/C++ GitHub repositories. Fi
End-to-end (E2E) delay is critical for interactive video streaming (IVS) experiences, but remains unsatisfactory for its long-tail distribution caused by periodic large keyframes. Conventional optimization strategies, such as jitter buffer, bitrate adaptation, and customized encoding, either sacrifice clarity, average delay, or compatibility. To address this issue, we propose PDStream, a novel pseudo-dual streaming algorithm, aimed at minimizing E2E delay while maintaining video clarity. The core idea is to split the two functions, delay-sensitive playback and delay-tolerant reference, on keyframes through dual streaming. Specifically, the playback function is held by a second parallel stream, which comprises much smaller non-keyframes and is allocated more immediate bandwidth for real-time performance. The reference function is ensured by the first stream with keyframe preservation, allocated more subsequent bandwidth to smooth out bursty traffic. Additionally, ``pseudo'' minimizes computational and transmission overheads by restricting dual streams to brief activation only when keyframes appear, supported by corresponding dual-stream bitrate allocation and adaptation to ensure de
We show that a bimodule of two block algebras of finite groups which has an endopermutation module as a source and which induces a stable equivalence of Morita type gives rise, via slash functors, to a family of bimodules of local block algebras with endopermutation source which induce Morita equivalences. As an application, we show that Morita (resp. stable) equivalences with endopermutation source imply functorial (resp. stable functorial) equivalences defined by Bouc and Yılmaz.
We introduce the class of strong cocomparability graphs, as the class of reflexive graphs whose adjacency matrix can be rearranged by a simultaneous row and column permutation to avoid the submatrix with rows 01, 10, which we call Slash. We provide an ordering characterization, a forbidden structure characterization, and a polynomial-time recognition algorithm for the class. These results complete the picture in which, in addition to or instead of the Slash matrix, one forbids the Gamma matrix (which has rows 11, 10). It is well known that in these two cases one obtains the class of interval graphs and the class of strongly chordal graphs, respectively. By complementation, we obtain the class of strong comparability graphs, whose adjacency matrix can be rearranged by a simultaneous row and column permutation to avoid the two-by-two identity submatrix. Thus our results give characterizations and algorithms for this class of irreflexive graphs as well. In other words, our results may be interpreted as solving the following problem: given a symmetric 0,1-matrix with 0-diagonal, can the rows and columns of the matrix be simultaneously permuted to avoid the two-by-two identity submatrix?
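To make the forbidden-submatrix condition in the abstract above concrete, here is a minimal Python sketch. It is a brute-force illustration, not the polynomial-time recognition algorithm the abstract refers to; the names `has_slash`, `permute`, and `find_slash_free_order` are hypothetical, and the code assumes that "submatrix" means any choice of two rows i < j and two columns k < l (not necessarily the same index pairs).

```python
from itertools import combinations, permutations
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[int]]


def has_slash(matrix: Matrix) -> bool:
    """True if some rows i < j and columns k < l give the 2x2 pattern
    with rows 01 and 10 (the Slash matrix)."""
    rows = range(len(matrix))
    cols = range(len(matrix[0])) if matrix else range(0)
    for i, j in combinations(rows, 2):
        for k, l in combinations(cols, 2):
            if (matrix[i][k], matrix[i][l], matrix[j][k], matrix[j][l]) == (0, 1, 1, 0):
                return True
    return False


def permute(matrix: Matrix, order: Sequence[int]) -> List[List[int]]:
    """Apply the same permutation to the rows and to the columns."""
    return [[matrix[r][c] for c in order] for r in order]


def find_slash_free_order(matrix: Matrix) -> Optional[Tuple[int, ...]]:
    """Exhaustive search over vertex orderings (exponential time, only
    suitable for tiny examples)."""
    for order in permutations(range(len(matrix))):
        if not has_slash(permute(matrix, order)):
            return order
    return None


# Reflexive graph on vertices 0..3 with edges 0-3 and 1-2 (loops on the diagonal).
EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
```

In this example the given ordering contains a Slash (rows 0 and 1, columns 2 and 3), while reordering the vertices so that each edge's endpoints are adjacent, for instance `permute(EXAMPLE, [0, 3, 1, 2])`, yields a block-diagonal matrix with no Slash.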
The goal of combining the robustness of neural networks and the expressivity of symbolic methods has rekindled the interest in neuro-symbolic AI. Recent advancements in neuro-symbolic AI often consider specifically-tailored architectures consisting of disjoint neural and symbolic components, and thus do not exhibit desired gains that can be achieved by integrating them into a unifying framework. We introduce SLASH -- a novel deep probabilistic programming language (DPPL). At its core, SLASH consists of Neural-Probabilistic Predicates (NPPs) and logical programs which are united via answer set programming. The probability estimates resulting from NPPs act as the binding element between the logical program and raw input data, thereby allowing SLASH to answer task-dependent logical queries. This allows SLASH to elegantly integrate the symbolic and neural components in a unified framework. We evaluate SLASH on the benchmark data of MNIST addition as well as novel tasks for DPPLs such as missing data prediction and set prediction with state-of-the-art performance, thereby showing the effectiveness and generality of our method.
This paper introduces the notion of Brauer-friendly modules, a generalisation of endo-p-permutation modules. A module over a block algebra OGe is said to be Brauer-friendly if it is a direct sum of indecomposable modules with compatible fusion-stable endopermutation sources. We obtain, for these modules, a functorial version of Dade's slash construction, also known as deflation-restriction. We prove that our slash functors, defined over Brauer-friendly categories, share most of the very useful properties that are satisfied by the Brauer functor over the category of p-permutation OGe-modules. In particular, we give a parametrisation of indecomposable Brauer-friendly modules, which opens the way to a complete classification whenever the fusion-stable sources are classified. Those tools have been used to prove the existence of a stable equivalence between non-principal blocks in the context of a minimal counter-example to the odd Z*p-theorem.
Choose a topos $E$. There are several different "notions of sheafness" on $E$. How do we visualize them? Let's refer to the classifier object of $E$ as $Ω$, and to its Heyting Algebra of truth-values, $Sub(1_E)$, as $H$; we will sometimes call $H$ the "logic" of the topos. There is a well-known way of representing notions of sheafness as morphisms $j:Ω\to Ω$, but these `$j$'s yield big diagrams when we draw them explicitly; here we will see a way to represent these `$j$'s as maps $J:H\to H$ in a way that is much more manageable. In the previous paper of this series we showed how certain toy models of Heyting Algebras, called "ZHAs", can be used to develop visual intuition for how Heyting Algebras and Intuitionistic Propositional Logic work; here we will extend that to sheaves. The full idea is this: notions of sheafness correspond to local operators and vice-versa; local operators correspond to J-operators and vice-versa; if our Heyting Algebra $H$ is a ZHA then J-operators correspond to slashings on $H$, and vice-versa; slashings on $H$ correspond to "sets of question marks" and vice-versa, and each set of question marks induces a notion of erasing and reconstructing, which induce
This work presents COmPOSER, an open-source, end-to-end framework for RF/mm-wave design automation that translates target specifications into optimized circuits with layouts. It unifies schematic synthesis, layout generation for actives and passives, and placement/routing, incorporating physics-based equations and machine-learning-driven electromagnetic models. Based on post-layout validation on multiple LNAs and PAs operating at up to 60GHz in a commercial 65nm process-kit, COmPOSER meets performance targets, comparable to expert manual designs, while delivering a 100-300x productivity gain. Github repo github[dot]com[slash]UMN-EDA[slash]COmPOSER
Long-context modeling is a pivotal capability for Large Language Models, yet the quadratic complexity of attention remains a critical bottleneck, particularly during the compute-intensive prefilling phase. While various sparse attention mechanisms have been explored, they typically suffer from either significant search latency or insufficient sparsity. In this paper, we propose FlashPrefill, a framework enabling ultra-fast prefilling via instantaneous pattern discovery and thresholding. FlashPrefill leverages a fast block-searching technique to simultaneously locate dynamic vertical, slash, and block-sparse attention patterns. Crucially, it introduces a dynamic thresholding mechanism that bypasses the prohibitive overhead of sorting or accumulating attention scores while effectively eliminating the long-tail distribution to enhance sparsity. Extensive evaluations demonstrate that FlashPrefill achieves a substantial leap in efficiency, delivering an unprecedented 27.78x speedup on 256K sequences. Notably, unlike existing methods that incur efficiency degradation on shorter contexts, FlashPrefill maintains a 1.71x speedup even at a 4K context length, demonstrating its robustness and
We study uniform spanning trees (USTs) on the cylindrical graph $G = C_n \times P_m$. Fix a trunk $L$ as a designated simple path in the tree connecting the two boundary rings of the cylinder. We prove an exponential tail bound for the length of branches emanating from the trunk: there exist constants $C>0$ and $θ=θ(n)\in(0,1)$, depending only on $n$, such that for all $m\in\mathbb{N}$ and $l\geq 0$, $$ \mathbb{P}\left(\text{UST has a branch off the trunk }L \,\text{ of length }\geq l \right) \leq Cm(n-1)θ^{l}. $$ Our work is motivated by the Abelian sandpile model on cylinders and, in particular, by the step-like (ladder) avalanche size distributions observed numerically in [Eckmann--Nagnibeda--Perriard, Abelian sandpiles on cylinders]. Via Dhar's burning algorithm, recurrent sandpile configurations correspond to spanning trees, so the geometry of a typical UST should influence how avalanches propagate along the cylinder. The trunk-with-short-branches structure and slash estimates proved here are intended as a first step towards a geometric explanation of these plateau phenomena for sandpile avalanches.
Block-sparse attention is promising for accelerating long-context LLM pre-filling, yet identifying relevant blocks efficiently remains a bottleneck. Existing methods typically employ coarse-grained attention as a proxy for block importance estimation, but often resort to expensive token-level searching or scoring, resulting in significant selection overhead. In this work, we trace the inaccuracy of standard coarse-grained attention via mean pooling to a theoretical root cause: the interaction between mean pooling and Rotary Positional Embeddings (RoPE). We prove that mean pooling acts as a low-pass filter that induces destructive interference in high-frequency dimensions, effectively creating a "blind spot" for local positional information (e.g., slash patterns). To address this, we introduce Prism, a training-free spectral-aware approach that decomposes block selection into high-frequency and low-frequency branches. By applying energy-based temperature calibration, Prism restores the attenuated positional signals directly from pooled representations, enabling block importance estimation using purely block-level operations, thereby improving efficiency. Extensive evaluations confir
In this work we demonstrate that a distorted, double-hump-like missing energy ($\slashed{E}$), missing transverse momentum ($\slashed{E}_T$), or missing mass ($\slashed{M}$) distribution at $e^+e^-$ colliders may hint at the presence of a multipartite dark sector. We illustrate the phenomenon using a two-component dark matter (DM) model involving an inert scalar doublet stabilised under a $\mathcal{Z}_2$ symmetry, providing a scalar DM, and one vector-like fermion doublet and a right-handed fermion singlet, both stabilised under a different $\mathcal{Z}^{'}_2$, providing a fermion DM. We indicate the region of parameter space where the production of the heavy charged particles and their subsequent decay to DM yield double-peak behaviour in the $\slashed{E}$ spectrum after satisfying DM constraints. Importantly, we illustrate why and how $\slashed{E}$ serves as a better variable than $\slashed{E}_T$ in distinguishing two-component DM frameworks, and therefore how the International Linear Collider (ILC) does better than the ongoing Large Hadron Collider (LHC). We also chalk out a set of criteria to identify and segregate the second peak in the $\slashed{E}$ spectrum, after a careful analysis of the co
A threshold autoregressive (TAR) model is a powerful tool for analyzing nonlinear multivariate time series, which includes special cases like self-exciting threshold autoregressive (SETAR) models and vector autoregressive (VAR) models. In this paper, estimation, inference, and forecasting using the Bayesian approach are developed for multivariate TAR (MTAR) models considering a flexible setup, under which the noise process behavior can be described using not only the Gaussian distribution but also other distributions that belong to the class of Gaussian variance mixtures, which includes Student-t, Slash, symmetric hyperbolic, and contaminated normal distributions, which are also symmetric but are more flexible and with heavier tails than the Gaussian one. Inferences from MTAR models based on that kind of distribution may be less affected by extreme or outlying observations than those based on the Gaussian one. All parameters in the MTAR model are included in the proposed MCMC-type algorithm, except the number of regimes and the autoregressive orders, which can be chosen using the Deviance Information Criterion (DIC) and/or the Watanabe-Akaike Information Criterion (WAIC). A library
The concept of determinant for a linear operator in an infinite-dimensional space is addressed, by using the derivative of the operator's zeta-function (following Ray and Singer) and, eventually, through its zeta-function trace. A little play with operators as simple as $\pm I$ ($I$ being the identity operator) and variations thereof shows that the presence of a non-commutative anomaly (i.e., the fact that $\det (AB) \neq \det A \det B$) is unavoidable, even for commuting and, remarkably, also for almost constant operators. In the case of Dirac-type operators, similarly basic arguments lead to the conclusion ---contradicting common lore--- that even though $\det (\slash D +im) = \det (\slash D -im)$ (as follows from the symmetry condition of the $\slash D$-spectrum), these determinants may {\it not} be equal to $\sqrt{\det (\slash D^2 +m^2)}$, simply because $\det [(\slash D +im) (\slash D -im)] \neq \det (\slash D +im) \det (\slash D -im)$. A proof of this fact is given, by way of a very simple example, using operators with a harmonic-oscillator spectrum and fulfilling the symmetry condition. This anomaly can be physically relevant if, in addition to a ma