Sequence and channel mixers, the core mechanism in sequence models, have become the de facto standard in time series analysis (TSA). However, recent studies have questioned the necessity of complex sequence mixers, such as attention mechanisms, demonstrating that simpler architectures can achieve comparable or even superior performance. This suggests that the benefits attributed to complex sequence mixers might instead emerge from other architectural or optimization factors. Based on this observation, we pose a central question: Are common sequence mixers necessary for time-series analysis? To answer it, we propose JustDense, an empirical study that systematically replaces sequence mixers in various well-established TSA models with dense layers. Grounded in the MatrixMixer framework, JustDense treats any sequence mixer as a mixing matrix and replaces it with a dense layer. This substitution isolates the mixing operation, enabling a clear theoretical foundation for understanding its role. We conducted extensive experiments on 29 benchmarks covering five representative TSA tasks using seven state-of-the-art TSA models to address our research question. The results show that rep
Large Language Models (LLMs) have achieved remarkable results on a range of standardized tests originally designed to assess human cognitive and psychological traits, such as intelligence and personality. While these results are often interpreted as strong evidence of human-like characteristics in LLMs, this paper argues that such interpretations constitute an ontological error. Human psychological and educational tests are theory-driven measurement instruments, calibrated to a specific human population. Applying these tests to non-human subjects without empirical validation risks mischaracterizing what is being measured. Furthermore, a growing trend frames AI performance on benchmarks as measurements of traits such as ``intelligence'', despite known issues with validity, data contamination, cultural bias and sensitivity to superficial prompt changes. We argue that interpreting benchmark performance as measurements of human-like traits lacks sufficient theoretical and empirical justification. This leads to our position: Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead. We call for the development of principled, AI-specific evaluation frameworks t
Fossil fuels, which meet most of humanity's energy needs, cause climate change due to their high carbon emissions. There are two types of energy sources that can replace fossil fuels: renewable and nuclear. Nuclear energy sources are more advantageous in terms of efficiency and sustainability. The use of thorium as nuclear fuel in fusion reactors will contribute to the reduction of radioactive waste, due to the much lower production of transuranics. Fusion reactors, which are considered promising, are still in the R&D phase. In this respect, hybrid fusion-fission reactors seem more promising, and the recently proposed combination of muon-catalyzed DD fusion with a cascade thorium reactor is worthy of appreciation. In this study, we show that using a DD collider instead of muonic fusion has significant advantages.
The widely discussed ``black-bounce'' mechanism of removing a singularity at $r=0$ in a spherically symmetric space-time, proposed by Simpson and Visser, consists in removing the point $r=0$ and its close neighborhood, resulting in emergence of a regular minimum of the spherical radius that can be a wormhole throat or a regular bounce. Instead, it has been recently proposed to make $r=0$ a regular center by properly modifying the metric, still preserving its form in regions far from $r=0$. Different algorithms of such modifications have been formulated for a few classes of singularities. The previous paper considered space-times whose Ricci tensor satisfies the condition $R^t_t =R^r_r$, and regular modifications were obtained for the Schwarzschild, Reissner-Nordström metrics, and two examples of solutions with magnetic fields obeying nonlinear electrodynamics (NED). The present paper considers regular modifications of more general space-times, and as examples, modifications with a regular center have been obtained for the Fisher (also known as JNW) solution with a naked singularity and a family of dilatonic black holes. Possible field sources of the new regular metrics are consider
Federated Learning (FL) enables collaborative optimization of machine learning models across decentralized data by aggregating model parameters. Our approach extends this concept by aggregating "knowledge" derived from models, instead of model parameters. We present a novel framework called CoDream, where clients collaboratively optimize randomly initialized data using federated optimization in the input data space, similar to how randomly initialized model parameters are optimized in FL. Our key insight is that jointly optimizing this data can effectively capture the properties of the global data distribution. Sharing knowledge in data space offers numerous benefits: (1) model-agnostic collaborative learning, i.e., different clients can have different model architectures; (2) communication that is independent of the model size, eliminating scalability concerns with model parameters; (3) compatibility with secure aggregation, thus preserving the privacy benefits of federated learning; (4) adaptive optimization of the shared knowledge for personalized learning. We empirically validate CoDream on standard FL tasks, demonstrating competitive performance despite not sharing mod
There is an increasing trend towards evaluating NLP models with LLMs instead of human judgments, raising questions about the validity of these evaluations, as well as their reproducibility in the case of proprietary models. We provide JUDGE-BENCH, an extensible collection of 20 NLP datasets with human annotations covering a broad range of evaluated properties and types of data, and comprehensively evaluate 11 current LLMs, covering both open-weight and proprietary models, for their ability to replicate the annotations. Our evaluations show substantial variance across models and datasets. Models are reliable evaluators on some tasks, but overall display substantial variability depending on the property being evaluated, the expertise level of the human judges, and whether the language is human or model-generated. We conclude that LLMs should be carefully validated against human judgments before being used as evaluators.
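As a rough illustration of what "replicating the annotations" can mean for categorical labels, the sketch below computes Cohen's kappa between human and model judgments. The choice of kappa, the function name `cohen_kappa`, and the toy labels are assumptions made here for illustration; the abstract does not state which agreement measures JUDGE-BENCH uses.

```python
from collections import Counter

def cohen_kappa(human, model):
    """Chance-corrected agreement between two equally long label sequences.

    Hypothetical helper for illustration; undefined (division by zero)
    when both raters use a single identical label throughout.
    """
    n = len(human)
    # Observed agreement: fraction of items with identical labels.
    p_observed = sum(h == m for h, m in zip(human, model)) / n
    # Expected agreement if both raters labeled independently at their own base rates.
    h_counts, m_counts = Counter(human), Counter(model)
    labels = set(human) | set(model)
    p_expected = sum(h_counts[l] * m_counts[l] for l in labels) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Toy data (invented): four items judged by a human and by a model.
human_labels = ["good", "good", "bad", "bad"]
model_labels = ["good", "bad", "bad", "bad"]
kappa = cohen_kappa(human_labels, model_labels)
```

A kappa near 1 indicates the model reproduces the human labels well beyond chance, while a value near 0 indicates agreement no better than chance.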
Currently, prompting techniques can be mainly divided into two categories: 1) The shot method implicitly inspires the model to answer the question by mimicking the steps in the given example, e.g., few-shot CoT. 2) The guideline method explicitly instructs the model to reason by following guidelines, which contain succinct and concise task-specific knowledge. The shot method is prone to difficulties in the selection of shot types, the number of shots, and the design of the reasoning steps, so a question arises: can we use only guidelines instead of shots in the prompt? To this end, we propose the FGT framework, consisting of Feedback, Guideline, and Tree-gather agents, to automatically learn task-specific guidelines from a dataset. First, the feedback agent is designed to evaluate the outcomes, both right and wrong, of each Q&A to gather insights guiding more effective optimization strategies. Next, the guideline agent is tasked with deriving guidelines from each piece of feedback and storing them in local memory. Lastly, the tree-gather agent aggregates all guidelines hierarchically through a tree structure, ultimately obtaining all unduplicated guidelines from a global perspective. In a
Disordering of solids typically leads to amorphization, but polymorph transitions, facilitated by favorable atomic rearrangements, may temporarily help to maintain long-range periodicity in the solid state. In far-from-equilibrium situations, such as atomic collision cascades, these rearrangements may not necessarily follow a thermodynamically gainful path, but may be kinetically limited. In this Letter, we focused on such crystallization instead of amorphization in collision cascades in gallium oxide (\ce{Ga2O3}). We determined the disorder threshold for irreversible $β$-to-$γ$ polymorph transition and explained why it results in elevating energy to that of the $γ$-polymorph, which exhibits the highest polymorph energy in the system below the amorphous state. Specifically, we demonstrate that upon reaching the disorder transition threshold, the \ce{Ga}-sublattice kinetically favors transitioning to the $γ$-like configuration, requiring significantly less migration for \ce{Ga} atoms to reach the lattice sites during post-cascade processes. As such, our data provide a consistent explanation of this remarkable phenomenon and can serve as a toolbox for predictive multi-polymorph fabri
Over the past decade and a half, adoption of Bayesian inference in pulsar timing analysis has led to increasingly sophisticated models. The recent announcement of evidence for a stochastic background of gravitational waves by various pulsar timing array projects highlighted Bayesian inference as a central tool for parameter estimation and model selection. Despite its success, Bayesian inference is occasionally misused in the pulsar timing community. A common workflow is that the data is analyzed in multiple steps: a first analysis of single pulsars individually, and a subsequent analysis of the whole array of pulsars. A mistake that is then sometimes introduced stems from using the posterior distribution to craft the prior for the analysis of the same data in a second step, a practice referred to in the statistics literature as ``circular analysis.'' This is done to prune the model for computational efficiency. Multiple recent high-profile searches for gravitational waves by pulsar timing array (PTA) projects have this workflow. This letter highlights this error and suggests that Spike and Slab priors can be used to carry out model averaging instead of model selection in a single p
Dramatic advances in artificial intelligence over the past decade (for narrow-purpose AI) and the last several years (for general-purpose AI) have transformed AI from a niche academic field to the core business strategy of many of the world's largest companies, with hundreds of billions of dollars in annual investment in the techniques and technologies for advancing AI's capabilities. We now come to a critical juncture. As the capabilities of new AI systems begin to match and exceed those of humans across many cognitive domains, humanity must decide: how far do we go, and in what direction? This essay argues that we should keep the future human by closing the "gates" to smarter-than-human, autonomous, general-purpose AI -- sometimes called "AGI" -- and especially to the highly-superhuman version sometimes called "superintelligence." Instead, we should focus on powerful, trustworthy AI tools that can empower individuals and transformatively improve human societies' abilities to do what they do best.
Explicit calibration-based methods have dominated RAW image denoising under extremely low-light environments. However, these methods are impeded by several critical limitations: a) the explicit calibration process is both labor- and time-intensive, b) denoisers are difficult to transfer across different camera models, and c) the disparity between synthetic and real noise is exacerbated by digital gain. To address these issues, we introduce a groundbreaking pipeline named Lighting Every Darkness (LED), which is effective regardless of the digital gain or the camera sensor. LED eliminates the need for explicit noise model calibration, instead utilizing an implicit fine-tuning process that allows quick deployment and requires minimal data. Structural modifications are also included to reduce the discrepancy between synthetic and real noise without extra computational demands. Our method surpasses existing methods on various camera models, including new ones not in public datasets, with just a few pairs per digital gain and only 0.5% of the typical iterations. Furthermore, LED also allows researchers to focus more on deep learning advancements while still utilizing sensor engine
The conventional reverse fill/flush flow modulation for comprehensive two-dimensional gas chromatography requires a bleed capillary column to be connected to the outlet of the modulator channel. The purpose of this capillary, which does not contain the stationary phase, is to provide a pressure resistance to the modulator channel flow. In this way, the desired modulator flow can be achieved, and channel over-filling can be avoided. Normally, the length and the internal diameter of the bleed capillary are chosen so as to obtain a modulator flow that is close to the flow of the first separation column. Thus, for any chosen set of chromatographic conditions, the required dimensions of the bleed capillary can be completely different, making GCxGC method development tedious and generating additional costs in consumables and analyst time. In this work, a tunable pressure source generating a suitable backpressure was used instead of the fixed bleed capillary, which makes it possible to freely adapt the pressure resistance and generate the required modulator channel flow for any conditions. This set-up has been evaluated and compared in terms of the impact on the modu
Quick global aggregation of effective distributed parameters is crucial to federated learning (FL), which requires adequate bandwidth for parameter communication and sufficient user data for local training. Otherwise, FL may require excessive training time to converge and produce inaccurate models. In this paper, we propose a brand-new FL framework, PromptFL, that replaces federated model training with federated prompt training, i.e., it lets federated participants train prompts instead of a shared model, to simultaneously achieve efficient global aggregation and local training on insufficient data by exploiting the power of foundation models (FM) in a distributed way. PromptFL ships an off-the-shelf FM, i.e., CLIP, to distributed clients who cooperatively train shared soft prompts based on very little local data. Since PromptFL only needs to update the prompts instead of the whole model, both the local training and the global aggregation can be significantly accelerated. Moreover, an FM trained over large-scale data can provide strong adaptation capability to distributed users' tasks with the trained soft prompts. We empirically analyze PromptFL via extensive experiments, a
Providing meaningful recommendations in a content marketplace is challenging because users are not the final content consumers. Instead, most users are creatives whose interests, linked to the projects they work on, change rapidly and abruptly. To address the challenging task of recommending images to content creators, we design a RecSys that learns visual style preferences transversal to the semantics of the projects users work on. We analyze the challenges of the task compared to content-based recommendations driven by semantics, propose an evaluation setup, and explain its applications in a global image marketplace. This technical report is an extension of the paper "Learning Users' Preferred Visual Styles in an Image Marketplace", presented at ACM RecSys '22.
A major challenge when using k-means clustering is often how to choose the parameter k, the number of clusters. In this letter, we want to point out that it is very easy to draw poor conclusions from a common heuristic, the "elbow method". Better alternatives have been known in the literature for a long time, and we want to draw attention to some of these easy-to-use options that often perform better. This letter is a call to stop using the elbow method altogether, because it severely lacks theoretical support, and we want to encourage educators to discuss the problems of the method -- if introducing it in class at all -- and teach alternatives instead, while researchers and reviewers should reject conclusions drawn from the elbow method.
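The pitfall can be illustrated with a minimal sketch, written for this note and not taken from the letter: k-means is run on uniformly distributed 1-D data, which has no cluster structure, and the within-cluster sum of squared errors (SSE) is recorded for each k. The helper `kmeans_sse_1d`, the quantile initialization, and the data are all assumptions made for illustration.

```python
import random

def kmeans_sse_1d(xs, k, iters=50):
    """Run Lloyd's algorithm on 1-D data and return the final SSE.

    Hypothetical helper: centers are initialized deterministically at
    evenly spaced quantiles of the data.
    """
    ordered = sorted(xs)
    n = len(xs)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in xs]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [x for x, l in zip(xs, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    # SSE of every point to its nearest final center.
    return sum(min((x - c) ** 2 for c in centers) for x in xs)

rng = random.Random(0)
data = [rng.random() for _ in range(200)]  # uniform on [0, 1): no real clusters

sses = [kmeans_sse_1d(data, k) for k in range(1, 7)]
drops = [a - b for a, b in zip(sses, sses[1:])]
for k, sse in enumerate(sses, start=1):
    print(k, round(sse, 3))
```

In this sketch the SSE falls at every step and the largest drop occurs between k = 1 and k = 2, so a plot of the curve shows a visual "elbow" even though the data contain no clusters at all.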
Learning from interaction is the primary way that biological agents acquire knowledge about their environment and themselves. Modern deep reinforcement learning (DRL) explores a computational approach to learning from interaction and has made significant progress in solving various tasks. However, despite its power, DRL still falls short of biological agents in terms of energy efficiency. Although the underlying mechanisms are not fully understood, we believe that the integration of spiking communication between neurons and biologically-plausible synaptic plasticity plays a prominent role in achieving greater energy efficiency. Following this biological intuition, we optimized a spiking policy network (SPN) using a genetic algorithm as an energy-efficient alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects and communicates through event-based spikes. Inspired by biological research showing that the brain forms memories by creating new synaptic connections and rewiring these connections based on new experiences, we tuned the synaptic connections instead of weights in the SPN to solve given tasks. Experimental results on several robotic control tasks demonst
To suggest a new direction for experiments on H-based superconductors, we present a theoretical framework for understanding the impact of an applied electric field on pressurized hydride superconductors. We study a material at a pressure $p$ where it exhibits an insulator-superconductor transition, at the respective superconducting critical temperature $T_{cr}$. The theory shows that the applied electric field penetrates the material and forces the Cooper pairs to Bose condense. If one applies an electric field and then increases the temperature, the theory predicts a novel critical temperature $T^{el}_{cr}$ higher than $T_{cr}$. Therefore, the system has a higher superconducting critical temperature if we apply an electric field instead of increasing the pressure. The result shows that in the case of carbonaceous sulfur hydride at $234\,\mathrm{GPa}$ and near but below the critical temperature $T_c=283\,\mathrm{K}$, applying a sufficiently strong electric field can bring the superconducting critical temperature close to $300\,\mathrm{K}$.
We give a first account of our new parallel SAT solver Gimsatul. Its key feature is to share clauses physically in memory instead of copying them, which is the method of other state-of-the-art multi-threaded SAT solvers to exchange clauses logically. Our approach keeps information about which literals are watched in a clause local to a solving thread but shares the actual immutable literals of a clause globally among all solving threads. This design gives quite remarkable parallel scalability, allows aggressive clause sharing while keeping memory usage low and produces more compact proofs.
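The split between shared and thread-local data described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Gimsatul's actual data layout: the class names `SharedClause` and `SolverThread` and the `attach` method are hypothetical, and threads are modeled as plain objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class SharedClause:
    """Immutable literals; a single object referenced by all solver threads."""
    literals: Tuple[int, ...]

@dataclass
class SolverThread:
    """Per-thread state: which literals of each shared clause are watched."""
    name: str
    # Maps a watched literal to (clause, other watched literal) pairs.
    watches: Dict[int, List[Tuple[SharedClause, int]]] = field(default_factory=dict)

    def attach(self, clause: SharedClause, first: int = 0, second: int = 1) -> None:
        # Record the two watched positions locally; the clause is referenced, not copied.
        for pos, other in ((first, second), (second, first)):
            lit = clause.literals[pos]
            self.watches.setdefault(lit, []).append((clause, clause.literals[other]))

# One clause (x1 or not x2 or x3), shared by two threads with different watches.
clause = SharedClause((1, -2, 3))
t1 = SolverThread("t1")
t2 = SolverThread("t2")
t1.attach(clause, 0, 1)  # t1 watches literals 1 and -2
t2.attach(clause, 1, 2)  # t2 watches literals -2 and 3
```

Because the literals are immutable, both threads can hold a reference to the same clause object without synchronization on its contents, while each keeps its own choice of watched literals.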
We study the Representative Volume Element (RVE) method, which is a method to approximately infer the effective behavior $a_{\text{hom}}$ of a stationary random medium. The latter is described by a coefficient field $a(x)$ generated from a given ensemble $\langle\cdot\rangle$ and the corresponding linear elliptic operator $-\nabla\cdot a\nabla$. In line with the theory of homogenization, the method proceeds by computing $d = 3$ correctors ($d$ denoting the space dimension). To be numerically tractable, this computation has to be done on a finite domain: the so-called "representative" volume element, i.e. a large box with, say, periodic boundary conditions. The main message of this article is: Periodize the ensemble instead of its realizations. By this we mean that it is better to sample from a suitably periodized ensemble than to periodically extend the restriction of a realization $a(x)$ from the whole-space ensemble $\langle\cdot\rangle$. We make this point by investigating the bias (or systematic error), i.e. the difference between $a_{\text{hom}}$ and the expected value of the RVE method, in terms of its scaling w.r.t. the lateral size $L$ of the box. In case of periodizing $a(x
There are two major generalizations of the standard ordinal analysis: One is Girard's $Π^1_2$-proof theory in which dilators are assigned to theories instead of ordinals. The other is Pohlers' generalized ordinal analysis with Spector classes, where ordinals greater than $ω_1^{\mathsf{CK}}$ are assigned to theories. In this paper, we show that these two are systematically entangled, and $Σ^1_2$-proof theoretic analysis has a critical role in connecting these two.