We analyze the scaling of quantum Fisher information with the number of system particles in the limit of a large number of particles, as a function of the number of parties interacting with each other, for encoding Hamiltonians having arbitrary-body interactions. We find that estimation of the coupling strength of such arbitrary-body encoding Hamiltonians provides a super-Heisenberg scaling that increases monotonically with an increase in the number of interacting particles, in the limit of a large number of system particles. Moreover, we also find that the optimal probes corresponding to Hamiltonians that contain even-body interaction terms may be entangled, but certainly not so in all bipartitions, and particularly, it is possible to attain optimal precision using asymmetric probes. Thereby we find a complementarity in the requirement of asymmetry and genuine entanglement in optimal probes for estimating the strength of odd- and even-body interactions respectively. Additionally, we provide an upper bound on the number of parties up to which one can always obtain an asymmetric product state that gives the best metrological precision for even-body interactions. En route, we find the quantum Fi
This is the second paper of a two-part work that establishes a definitive quantitative nonlinear scattering theory for asymptotically de Sitter vacuum solutions $(M,g)$ in $(n+1)$ dimensions with $n\geq4$ even, which are determined by small scattering data at $\mathscr{I}^{\pm}$. In this paper we prove quantitative estimates for systems of wave equations on the $(M,g)$ backgrounds. The systems considered include the Einstein vacuum equations commuted with suitable time-dependent vector fields, where we treat the nonlinear terms as general inhomogeneous factors. The estimates obtained are essential in establishing sharp top order estimates for the scattering map of the Einstein vacuum equations, taking asymptotic data at $\mathscr{I}^-$ to asymptotic data at $\mathscr{I}^+$.
We investigate the norm maps of algebraic even $K$-groups of finite extensions of number fields. Namely, we show that they are surjective in most situations. In the event that they are not surjective, we give a criterion for determining when an element in the even $K$-group of the base field is the norm of an element of the even $K$-group of the extension field. This latter criterion depends only on the real primes of the base field.
In this article we propose a new class of even entire functions connected with products and series with real coefficients. We give a sufficient condition for all of their zeros to be real. As a typical example, we give an answer to the problem of Lagarias and Montague. We suggest open problems for this class of even entire functions.
We present a different approach to the fractional quantum Hall effect (FQHE), treating it as a consequence of the change in the symmetry of the Hamiltonian of every electron in a two-dimensional electron gas (2DEG) under the application of a magnetic field and in the presence of an electrostatic potential due to the ionized impurities, which leads to a breaking of the degeneracy of the Landau levels. As the magnetic field increases, the effect of that electrostatic potential evolves, changing in turn the spatial symmetry of the Hamiltonian from a continuous to a discrete one. The aim of both works is to give not only a different picture of the FQHE phenomenon, but one that is coherent with the integer quantum Hall effect (IQHE) and consistent with the model already described in Hidalgo [7, 8, 9]. The model therefore gives a global view of both effects, showing that they are aspects of the same phenomenon, and justifying not only the appearance of the odd denominator plateaux but also the even ones; and giving some physical reasons for the experimental fact that there are many more odd than even denominator plateaux, hardly observed
We determine the possible intersection sizes of a Hermitian surface $\mathcal{H}$ with an irreducible quadric of $\mathrm{PG}(3,q^2)$ sharing at least a tangent plane at a common non-singular point when $q$ is even.
We study a random even subgraph of a finite graph $G$ with a general edge-weight $p\in(0,1)$. We demonstrate how it may be obtained from a certain random-cluster measure on $G$, and we propose a sampling algorithm based on coupling from the past. A random even subgraph of a planar lattice undergoes a phase transition at the parameter-value $\frac{1}{2} p_{\mathrm{c}}$, where $p_{\mathrm{c}}$ is the critical point of the $q=2$ random-cluster model on the dual lattice. The properties of such a graph are discussed, and are related to Schramm--Löwner evolutions (SLE).
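As a rough illustration of the object studied above, the following sketch enumerates the even subgraphs of a small graph by brute force and samples one of them. It assumes that an even edge set $F \subseteq E$ is weighted proportionally to $p^{|F|}(1-p)^{|E \setminus F|}$; the abstract does not spell out this formula, so it is an assumption here. The function names are hypothetical, and the brute-force approach is not the coupling-from-the-past algorithm the abstract proposes; it is feasible only for very small graphs.

```python
import itertools
import random


def even_subgraphs(vertices, edges):
    """Yield every edge subset in which all vertices have even degree."""
    for mask in itertools.product([0, 1], repeat=len(edges)):
        degree = {v: 0 for v in vertices}
        for keep, (u, v) in zip(mask, edges):
            if keep:
                degree[u] += 1
                degree[v] += 1
        if all(d % 2 == 0 for d in degree.values()):
            yield tuple(e for keep, e in zip(mask, edges) if keep)


def sample_even_subgraph(vertices, edges, p, rng):
    """Sample an even subgraph with weight p^|F| (1-p)^|E minus F| (assumed law)."""
    subgraphs = list(even_subgraphs(vertices, edges))
    weights = [
        p ** len(F) * (1 - p) ** (len(edges) - len(F)) for F in subgraphs
    ]
    return rng.choices(subgraphs, weights=weights, k=1)[0]


# Example on the complete graph K4 (illustrative choice, not from the source).
vertices = [0, 1, 2, 3]
edges = list(itertools.combinations(vertices, 2))
all_even = list(even_subgraphs(vertices, edges))
sample = sample_even_subgraph(vertices, edges, 0.3, random.Random(0))
```

For a connected graph the even subgraphs form a vector space over GF(2) of dimension $|E|-|V|+1$, so the enumeration for $K_4$ has $2^3$ elements.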
The strong performance of vision transformers on image classification and other vision tasks is often attributed to the design of their multi-head attention layers. However, the extent to which attention is responsible for this strong performance remains unclear. In this short report, we ask: is the attention layer even necessary? Specifically, we replace the attention layer in a vision transformer with a feed-forward layer applied over the patch dimension. The resulting architecture is simply a series of feed-forward layers applied over the patch and feature dimensions in an alternating fashion. In experiments on ImageNet, this architecture performs surprisingly well: a ViT/DeiT-base-sized model obtains 74.9\% top-1 accuracy, compared to 77.9\% and 79.9\% for ViT and DeiT respectively. These results indicate that aspects of vision transformers other than attention, such as the patch embedding, may be more responsible for their strong performance than previously thought. We hope these results prompt the community to spend more time trying to understand why our current models are as effective as they are.
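The architecture described above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the residual connections, the GELU nonlinearity, the hidden sizes, the weight initialisation, and all function names are illustrative choices that the abstract does not specify, and normalisation layers are omitted.

```python
import numpy as np


def gelu(x):
    # tanh approximation of GELU (the choice of nonlinearity is an assumption)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def init_ff(rng, dim, hidden):
    # Parameters of a two-layer feed-forward network mapping dim -> hidden -> dim.
    return (
        rng.normal(0.0, 0.02, (dim, hidden)),
        np.zeros(hidden),
        rng.normal(0.0, 0.02, (hidden, dim)),
        np.zeros(dim),
    )


def feed_forward(x, w1, b1, w2, b2):
    # Two-layer feed-forward network applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2


def block(x, patch_params, feature_params):
    # x has shape (num_patches, num_features).
    # Feed-forward over the patch dimension: transpose so patches are the last axis.
    x = x + feed_forward(x.T, *patch_params).T
    # Feed-forward over the feature dimension.
    x = x + feed_forward(x, *feature_params)
    return x


rng = np.random.default_rng(0)
num_patches, num_features = 16, 32  # small illustrative sizes
x = rng.normal(size=(num_patches, num_features))
patch_params = init_ff(rng, num_patches, 4 * num_patches)
feature_params = init_ff(rng, num_features, 4 * num_features)
out = block(x, patch_params, feature_params)
```

The patch-dimension layer takes the place of attention as the only step that mixes information across patches; stacking several such blocks gives the alternating series of feed-forward layers the abstract describes.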
Atomic masses of the neutron-rich isotopes $^{121-128}$Cd, $^{129,131}$In, $^{130-135}$Sn, $^{131-136}$Sb, and $^{132-140}$Te have been measured with high precision (10 ppb) using the Penning trap mass spectrometer JYFLTRAP. Among these, the masses of four r-process nuclei $^{135}$Sn, $^{136}$Sb, and $^{139,140}$Te were measured for the first time. The data reveal a strong $N$=82 shell gap at $Z$=50 but indicate the importance of correlations for $Z>50$. An empirical neutron pairing gap expressed as the odd-even staggering of isotopic masses shows a strong quenching across $N$=82 for Sn, with a $Z$-dependence that cannot be explained by current theoretical models.
Colour is a key component in the successful dissemination of information. Since many real-world concepts are associated with colour, for example danger with red, linguistic information is often complemented with the use of appropriate colours in information visualization and product marketing. Yet, there is no comprehensive resource that captures concept-colour associations. We present a method to create a large word-colour association lexicon by crowdsourcing. A word-choice question was used to obtain sense-level annotations and to ensure data quality. We focus especially on abstract concepts and emotions to show that even they tend to have strong colour associations. Thus, using the right colours can not only improve semantic coherence, but also inspire the desired emotional response.
Membership Inference Attacks (MIAs) are widely used to quantify training data memorization and assess privacy risks. Standard evaluation requires repeated retraining, which is computationally costly for large models. One-run methods (single training with randomized data inclusion) and zero-run methods (post hoc evaluation) are often used instead, though their statistical validity remains unclear. To address this gap, we frame MIA evaluation as a causal inference problem, defining memorization as the causal effect of including a data point in the training set. This novel formulation reveals and formalizes key sources of bias in existing protocols: one-run methods suffer from interference between jointly included points, while zero-run evaluations popular for LLMs are confounded by non-random membership assignment. We derive causal analogues of standard MIA metrics and propose practical estimators for multi-run, one-run, and zero-run regimes with non-asymptotic consistency guarantees. Experiments on real-world data show that our approach enables reliable memorization measurement even when retraining is impractical and under distribution shift, providing a principled foundation for pr
Model distillation enables efficient emulation of frontier large language models (LLMs), creating a need for robust mechanisms to detect when a third-party student model has trained on a teacher model's outputs. However, existing fingerprinting techniques that could be used to detect such distillation rely on heuristic perturbations that impose a steep trade-off between generation quality and fingerprinting strength, often requiring significant degradation of utility to ensure the fingerprint is effectively internalized by the student. We introduce antidistillation fingerprinting (ADFP), a principled approach that aligns the fingerprinting objective with the student's learning dynamics. Building upon the gradient-based framework of antidistillation sampling, ADFP utilizes a proxy model to identify and sample tokens that directly maximize the expected detectability of the fingerprint in the student after fine-tuning, rather than relying on the incidental absorption of the un-targeted biases of a more naive watermark. Experiments on GSM8K and OASST1 benchmarks demonstrate that ADFP achieves a significant Pareto improvement over state-of-the-art baselines, yielding stronger detection
Document retrieval is an important task for search and Retrieval-Augmented Generation (RAG) applications. Large Language Models (LLMs) have contributed to improving the accuracy of text-based document retrieval. However, documents with complex layouts and visual elements like tables, charts and infographics are not perfectly represented in textual format. Recently, image-based document retrieval pipelines have become popular, which use visual large language models (VLMs) to retrieve relevant page images given a query. Current evaluation benchmarks on visual document retrieval are limited, as they focus primarily on the English language, rely on synthetically generated questions and offer a small corpus size. Therefore, we introduce MIRACL-VISION, a multilingual visual document retrieval evaluation benchmark. MIRACL-VISION covers 18 languages, and is an extension of the MIRACL dataset, a popular benchmark to evaluate text-based multilingual retrieval pipelines. MIRACL was built using a human-intensive annotation process to generate high-quality questions. In order to reduce MIRACL-VISION corpus size to make evaluation more compute friendly while keeping the datasets challenging, we hav
High-dimensional covariance estimation is notoriously sensitive to outliers. While statistically optimal estimators exist for general heavy-tailed distributions, they often rely on computationally expensive techniques like semidefinite programming or iterative M-estimation ($O(d^3)$). In this work, we target the specific regime of \textbf{Sub-Weibull distributions} (characterized by stretched exponential tails $\exp(-t^α)$). We investigate a computationally efficient alternative: the \textbf{Cross-Fitted Norm-Truncated Estimator}. Unlike element-wise truncation, our approach preserves the spectral geometry while requiring $O(Nd^2)$ operations, which represents the theoretical lower bound for constructing a full covariance matrix. Although spherical truncation is geometrically suboptimal for anisotropic data, we prove that within the Sub-Weibull class, the exponential tail decay compensates for this mismatch. Leveraging weighted Hanson-Wright inequalities, we derive non-asymptotic error bounds showing that our estimator recovers the optimal sub-Gaussian rate $\tilde{O}(\sqrt{r(Σ)/N})$ with high probability. This provides a scalable solution for high-dimensional data that exhibits ta
Dell, Lapinskas and Meeks [DLM SICOMP 2022] presented a general reduction from approximate counting to decision for a class of fine-grained problems that can be viewed as hyperedge counting or detection problems in an implicit hypergraph, thus obtaining tight equivalences between approximate counting and decision for many key problems such as $k$-clique, $k$-sum and more. Their result is a reduction from approximately counting the number of hyperedges in an implicit $k$-partite hypergraph to a polylogarithmic number of calls to a hyperedge oracle that returns whether a given subhypergraph contains an edge. The main result of this paper is a generalization of the DLM result for {\em output-sensitive} approximate counting, where the running time of the desired counting algorithm is inversely proportional to the number of witnesses. Our theorem is a reduction from approximately counting the (unknown) number of hyperedges in an implicit $k$-partite hypergraph to a polylogarithmic number of calls to a hyperedge oracle called only on subhypergraphs with a small ``measure''. If a subhypergraph has $u_i$ nodes in the $i$th node partition of the $k$-partite hypergraph, then its measure is $
Two of the most fundamental distributed symmetry-breaking problems are that of finding a maximal independent set (MIS) and a maximal matching (MM) in a graph. It is a major open question whether these problems can be solved in constant rounds of the all-to-all communication model of \textsf{Congested\ Clique}, with $O(\log\log Δ)$ being the best upper bound known (where $Δ$ is the maximum degree). We explore in this paper the boundary of the feasible, asking for \emph{which graphs} we can solve the problems in constant rounds. We find that for several graph parameters, ranging from sparse to highly dense graphs, the problems do have a constant-round solution. In particular, we give algorithms that run in constant rounds when: (1) the average degree is at most $d(G) \le 2^{O(\sqrt{\log n})}$, (2) the neighborhood independence number is at most $β(G) \le 2^{O(\sqrt{\log n})}$, or (3) the independence number is at most $α(G) \le |V(G)|/d(G)^μ$, for any constant $μ> 0$. Further, we establish that these are tight bounds for the known methods, for all three parameters, suggesting that new ideas are needed for further progress.
Designs of data structures for approximate membership queries with false-positive errors that support both insertions and deletions stipulate the following two conditions: (1) Duplicate insertions are prohibited, i.e., it is prohibited to insert an element $x$ if $x$ is currently a member of the dataset. (2) Deletions of nonelements are prohibited, i.e., it is prohibited to delete $x$ if $x$ is not currently a member of the dataset. Under these conditions, the space required for the approximate representation of datasets of cardinality $n$ with a false-positive probability of $ε^{+}$ is at most $(1+o(1))n\cdot\log_2 (1/ε^{+}) + O(n)$ bits [Bender et al., 2018; Bercea and Even, 2019]. We prove that if these conditions are lifted, then the space required for the approximate representation of datasets of cardinality $n$ from a universe of cardinality $u$ is at least $\frac 12 \cdot (1-ε^{+} -\frac 1n)\cdot \log \binom{u}{n} -O(n)$ bits.
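To get a feel for the two bounds quoted above, the following sketch evaluates only their leading terms. It drops the $o(1)$ factor and the $O(n)$ terms, and it assumes that the unspecified logarithm in the lower bound is base 2; both simplifications are assumptions, and the function names and example parameters are illustrative.

```python
import math


def log2_binom(u, n):
    # log2 of the binomial coefficient C(u, n), using
    # C(u, n) = prod_{i=0}^{n-1} (u - i) / (n - i).
    return sum(math.log2((u - i) / (n - i)) for i in range(n))


def upper_leading_bits(n, eps):
    # Leading term n * log2(1/eps) of the upper bound (restricted setting).
    return n * math.log2(1 / eps)


def lower_leading_bits(n, u, eps):
    # Leading term (1/2) * (1 - eps - 1/n) * log2 C(u, n) of the lower bound
    # (setting where both conditions are lifted); log base 2 is assumed.
    return 0.5 * (1 - eps - 1 / n) * log2_binom(u, n)


# Illustrative parameters (not from the source).
n, u, eps = 10**4, 2**64, 2**-8
upper = upper_leading_bits(n, eps)
lower = lower_leading_bits(n, u, eps)
print(f"upper-bound leading term: {upper:.0f} bits")
print(f"lower-bound leading term: {lower:.0f} bits")
```

Because the lower-order terms are omitted, the printed numbers compare only the dominant contributions and should not be read as exact space requirements.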
Quantifying causal effects in the presence of complex and multivariate outcomes remains a key challenge in treatment evaluation. For hierarchical multivariate outcomes, the FDA recommends the Win Ratio and Generalized Pairwise Comparisons approaches \citep{Pocock2011winratio,Buyse2010}. However, commonly used estimators can yield treatment recommendations that target a population-level estimand (the probability that a randomly sampled patient under treatment fares better than another randomly sampled patient under control), which can contradict conclusions drawn from an ideal estimand (the probability that an individual would fare better with treatment than without), especially in heterogeneous populations. This discrepancy arises from the non-identifiability of the latter estimand and underscores both the influence of the chosen causal measure on the resulting conclusions and the necessity of articulating the underlying causal framework with clarity. We propose a novel, individual-level yet identifiable causal effect measure that more closely approximates the ideal individual-level estimand. We show that computing the Win Ratio or Net Benefit via nearest-neighbor pairing between t
Lead halide perovskite quantum dots (QDs), the latest generation of the colloidal QD family, exhibit outstanding optical properties which are now exploited as both classical and quantum light sources. Most of their rather exceptional properties are related to the peculiar exciton fine-structure of band-edge states which can support unique bright triplet excitons. The degeneracy of the bright triplet excitons is lifted with an energetic splitting on the order of millielectronvolts, which can be resolved by photoluminescence (PL) measurements of single QDs at cryogenic temperatures. Each bright exciton fine-structure-state (FSS) exhibits a dominantly linear polarization, in line with several theoretical models based on the sole crystal field, exchange interaction and shape anisotropy. Here, we show that in addition to a high degree of linear polarization, the individual exciton FSS can exhibit a non-negligible degree of circular polarization even without external magnetic fields by investigating the four Stokes parameters of the exciton fine-structure in individual CsPbBr$_3$ QDs through Stokes polarimetric measurements. We observe a degree of circular polarization up to ~38%, which could
Motivated by multi-task and meta-learning approaches, we consider the problem of learning structure shared by tasks or users, such as shared low-rank representations or clustered structures. While all previous works focus on well-specified linear regression, we consider more general convex objectives, where the structural low-rank and cluster assumptions are expressed on the optima of each function. We show that under mild assumptions such as \textit{Hessian concentration} and \textit{noise concentration at the optimum}, rank and clustered regularized estimators recover such structure, provided the number of samples per task and the number of tasks are large enough. We then study the problem of recovering the subspace in which all the solutions lie, in the setting where there is only a single sample per task: we show that in that case, the rank-constrained estimator can recover the subspace, but that the number of tasks needs to scale exponentially large with the dimension of the subspace. Finally, we provide a polynomial-time algorithm via nuclear norm constraints for learning a shared linear representation in the context of convex learning objectives.