Loss curves are smooth during most of model training, so visible discontinuities stand out as possible conceptual breakthroughs. Studying these breakthroughs enables a deeper understanding of learning dynamics, but only when they are properly identified. This paper argues that similar breakthroughs occur frequently throughout training but they are obscured by a loss metric that collapses all variation into a single scalar. To find these hidden transitions, we introduce POLCA, a method for decomposing changes in loss along arbitrary bases of the low-rank training subspace. We use our method to identify clusters of samples that share similar changes in loss during training, disaggregating the overall loss into that of smaller groups of conceptually similar data. We validate our method on synthetic arithmetic and natural language tasks, showing that POLCA recovers clusters that represent interpretable breakthroughs in the model's capabilities. We demonstrate the promise of these hidden phase transitions as a tool for unsupervised interpretability.
Adsorption breakthrough modeling often requires complex software environments and scripting, limiting accessibility for many practitioners. We present AIM, a MATLAB-based graphical user interface (GUI) application that streamlines fixed-bed adsorption modeling and analysis through an integrated workflow, which includes isotherm fitting, estimation of the enthalpy of adsorption, prediction of mixture behavior, and multicomponent breakthrough simulations. AIM supports 13 isotherm models for isotherm fitting and includes implementations of Ideal Adsorbed Solution Theory (IAST) (FastIAS) and extended Langmuir models for predicting mixture isotherms. Moreover, the isotherm models can be used to run non-isothermal breakthrough simulations along with isosteric enthalpies of adsorption from the Clausius-Clapeyron and Virial equations. Users can export detailed column and outlet profiles (e.g., composition, temperature) in multiple formats, enhancing reproducibility and data sharing among practitioners. We compared the breakthrough simulation results from the AIM workflow with experimental data in the literature for a ternary gas mixture (CO2/H2/N2) and found excell
The seminal 2009 paper by Bernard, Krauth, and Wilson marked a paradigm shift in Monte Carlo sampling. By abandoning the restrictive condition of detailed balance in favor of the more fundamental principle of global balance, they introduced the Event-Chain Monte Carlo (ECMC) algorithm, which achieves rejection-free, deterministic sampling for hard spheres. This breakthrough demonstrated that persistent, directional dynamics could dramatically accelerate equilibration in dense particle systems. In this commentary, we review this foundational work and elucidate its underlying mechanism using the broader Event-Driven Monte Carlo (EDMC) framework developed in subsequent years. We show how the original hard-sphere concept naturally generalizes to continuous potentials and modern lifted Markov chain formalisms, transforming a surprising specific result into a powerful general class of sampling algorithms.
Progress in science and technology is punctuated by disruptive innovation and breakthroughs. Researchers have characterized these disruptions to explore the factors that spark such innovations and to assess their long-term trends. However, although understanding disruptive breakthroughs and their drivers hinges upon accurately quantifying disruptiveness, the core metric used in previous studies -- the disruption index -- remains insufficiently understood and tested. Here, after demonstrating the critical shortcomings of the disruption index, including its conflicting evaluations for simultaneous discoveries, we propose a new, continuous measure of disruptiveness based on a neural embedding framework that addresses these limitations. Our measure not only better distinguishes disruptive works, such as Nobel Prize-winning papers, from others, but also reveals simultaneous disruptions by allowing us to identify the "twins" that have the most similar future context. By offering a more robust and precise lens for identifying disruptive innovations and simultaneous discoveries, our study provides a foundation for deepening insights into the mechanisms driving scientific breakthroughs whil
Ferguson's 1973 introduction of the Dirichlet process marked a breakthrough in Bayesian nonparametric statistics. For the first time, a prior on the space of probability measures fulfilled two key desiderata: large support and analytical tractability. In this paper, we review three complementary constructions of the Dirichlet process, whose roots can be traced back to Ferguson: through finite-dimensional distributions, via normalization of a gamma process, and through predictive distributions. Each perspective not only deepens the understanding of the Dirichlet process but also provides a template for generalizations, from normalized random measures with independent increments to Gibbs--type priors and beyond. Over the past fifty years, the Dirichlet process has become the cornerstone of Bayesian nonparametric methodology and applications, while simultaneously inspiring the expansion of the landscape of nonparametric priors. Since de Finetti laid out the Bayesian nonparametric framework in the 1930s, the key obstacle had been the absence of a tractable nonparametric prior. Ferguson's contribution overcame this challenge, providing a solution to a decades-long open problem. In recog
Science is driven by community endeavors across diverse fields and specializations, forming a complex structure that renders conventional performance evaluation methods inadequate. Using established indicators, the network-based normalized citation score, and the disruptive index, combined with the GENEPY algorithm, we evaluate the complexity rank of countries based on their breakthrough performance across 89 subfields of physical sciences, drawing on nearly 60 million articles (1900-2023). This quality-focused integrated approach reveals pronounced asymmetries: while countries such as the United States, Israel, and several in Europe sustain long-term structural advantages, emerging nations show rapid gains in later decades. A power-law relationship between aggregated breakthrough performance and countries' R&D expenditure underscores the unequal and scale-dependent nature of global science. These results demonstrate that scientific advancement arises not from uniform growth but from asymmetric complexity, offering actionable insights for policymakers and funding agencies aiming to foster sustainable, high-quality research ecosystems.
Despite the usefulness of machine learning approaches for the early screening of potential breakthrough technologies, their practicality is often hindered by opaque models. To address this, we propose an interpretable machine learning approach to predicting future citation counts from patent texts using a patent-specific hierarchical attention network (PatentHAN) model. Central to this approach are (1) a patent-specific pre-trained language model, capturing the meanings of technical words in patent claims, (2) a hierarchical network structure, enabling detailed analysis at the claim level, and (3) a claim-wise self-attention mechanism, revealing pivotal claims during the screening process. A case study of 35,376 pharmaceutical patents demonstrates the effectiveness of our approach in early screening of potential breakthrough technologies while ensuring interpretability. Furthermore, we conduct additional analyses using different language models and claim types to examine the robustness of the approach. It is expected that the proposed approach will enhance expert-machine collaboration in identifying breakthrough technologies, providing new insight derived from text mining into tech
3I/ATLAS, an interstellar object, made its closest approach to Earth on 2025 December 19. On 2025 December 18, the Breakthrough Listen program conducted a technosignature search toward 3I/ATLAS using the 100 m Robert C. Byrd Green Bank Telescope at 1-12 GHz. We report a nondetection of candidate signals down to the 100 mW level.
The article provides a brief description of the MathPartner service. This freely available cloud-based mathematics service is a universal system for symbolic-numeric calculations. Its Mathpar language is a subset of the LaTeX language, but it allows users to create mathematical texts that contain "computable" mathematical operators. This opens up new opportunities for improving the educational process in all natural science disciplines and for the use of mathematics in scientific and engineering calculations. To save and freely exchange educational and other texts in the Mathpar language, a GitHub repository has been created. It is concluded that the cloud mathematics service MathPartner is a new breakthrough technology for school and university natural science education and for scientific and engineering applications.
We implement a machine learning algorithm to search for extra-terrestrial technosignatures in radio observations of several hundred nearby stars, obtained with the Parkes and Green Bank Telescopes by the Breakthrough Listen collaboration. Advances in detection technology have led to an exponential growth in data, necessitating innovative and efficient analysis methods. This problem is exacerbated by the large variety of possible forms an extraterrestrial signal might take, and the size of the multidimensional parameter space that must be searched. It is then made markedly worse by the fact that our best guess at the properties of such a signal is that it might resemble the signals emitted by human technology and communications, the main (yet diverse) contaminant in radio observations. We address this challenge by using a combination of simulations and machine learning methods for anomaly detection. We rank candidates by how unusual they are in frequency, and how persistent they are in time, by measuring the similarity between consecutive spectrograms of the same star. We validate that our filters significantly improve the quality of the candidates that are selected for human vettin
When conflicting images are presented to each eye, binocular fusion is disrupted. Rather than experiencing a blend of both percepts, often only one eye's image is experienced, whilst the other is suppressed from awareness. Importantly, suppression is transient: the two rival images compete for dominance, with stochastic switches between mutually exclusive percepts occurring every few seconds with law-like regularity. From the perspective of dynamical systems theory, visual rivalry offers an experimentally tractable window into the dynamical mechanisms governing perceptual awareness. In a recently developed visual rivalry paradigm, tracking continuous flash suppression (tCFS), it was shown that the transition between awareness and suppression is hysteretic, with a higher contrast threshold required for a stimulus to break through suppression into awareness than to be suppressed from awareness. Here, we present an analytically tractable model of visual rivalry that quantitatively explains the hysteretic transition between periods of awareness and suppression in tCFS. Grounded in the theory of neural dynamics, we derive closed-form expressions for the duration of perceptual domina
Reactive flows in porous media play an important role in our life and are crucial for many industrial, environmental and biomedical applications. Very often the concentration of the species at the inlet is known, and the so-called breakthrough curves, measured at the outlet, are the quantities which could be measured or computed numerically. The measurements and the simulations could be time-consuming and expensive, and machine learning and Big Data approaches can help to predict breakthrough curves at lower costs. Machine learning (ML) methods, such as Gaussian processes and fully-connected neural networks, and a tensor method, cross approximation, are well suited for predicting breakthrough curves. In this paper, we demonstrate their performance in the case of pore scale reactive flow in catalytic filters.
Climate models must simulate hundreds of future scenarios for hundreds of years at coarse resolutions, and a handful of high-resolution decadal simulations to resolve localized extreme events. Using Oceananigans.jl, written from scratch in Julia, we report several achievements: First, a global ocean simulation with breakthrough horizontal resolution -- 488m -- reaching 15 simulated days per day (0.04 simulated years per day; SYPD). Second, Oceananigans simulates the global ocean at 488m with breakthrough memory efficiency on just 768 Nvidia A100 GPUs, a fraction of the resources available on current and upcoming exascale supercomputers. Third, and arguably most significant for climate modeling, Oceananigans achieves breakthrough energy efficiency reaching 0.95 SYPD at 1.7 km on 576 A100s and 9.9 SYPD at 10 km on 68 A100s -- the latter representing the highest horizontal resolutions employed by current IPCC-class ocean models. Routine climate simulations with 10 km ocean components are within reach.
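The throughput figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below converts simulated days per wall-clock day into simulated years per day (SYPD) and derives a rough "GPU-days per simulated year" figure for each configuration. The 365-day year and the GPU-days metric are assumptions made here for illustration; the abstract reports neither, and GPU-days is only a crude proxy for energy use.

```python
# Rough cross-check of the throughput numbers quoted in the abstract.
# Assumption (not from the source): one simulated year = 365 days.
DAYS_PER_YEAR = 365.0


def sypd(simulated_days_per_wall_day: float) -> float:
    """Convert simulated days per wall-clock day to simulated years per day."""
    return simulated_days_per_wall_day / DAYS_PER_YEAR


def gpu_days_per_simulated_year(n_gpus: int, sypd_rate: float) -> float:
    """Hypothetical cost metric: GPUs occupied times wall-clock days per simulated year."""
    return n_gpus / sypd_rate


# (resolution label, number of A100 GPUs, SYPD) as stated in the abstract
configs = [
    ("488 m", 768, sypd(15.0)),
    ("1.7 km", 576, 0.95),
    ("10 km", 68, 9.9),
]

for label, n_gpus, rate in configs:
    cost = gpu_days_per_simulated_year(n_gpus, rate)
    print(f"{label:>7}: {rate:.3f} SYPD on {n_gpus} GPUs, about {cost:.0f} GPU-days per simulated year")
```

Under these assumptions, 15 simulated days per day rounds to the quoted 0.04 SYPD, and a century-long run at 9.9 SYPD would take roughly ten wall-clock days.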
The Breakthrough Listen search for intelligent life is, to date, the most extensive technosignature search of nearby celestial objects. We present a radio technosignature search of the centers of 97 nearby galaxies, observed by Breakthrough Listen at the Robert C. Byrd Green Bank Telescope. We performed a narrowband Doppler drift search using the turboSETI pipeline with a minimum signal-to-noise parameter threshold of 10, across a drift rate range of $\pm$ 4 Hz\ $s^{-1}$, with a spectral resolution of 3 Hz and a time resolution of $\sim$ 18.25 s. We removed radio frequency interference by using an on-source/off-source cadence pattern of six observations and discarding signals with Doppler drift rates of 0. We assess factors affecting the sensitivity of the Breakthrough Listen data reduction and search pipeline using signal injection and recovery techniques and apply new methods for the investigation of the RFI environment. We present results in four frequency bands covering 1 -- 11 GHz, and place constraints on the presence of transmitters with equivalent isotropic radiated power on the order of $10^{26}$ W, corresponding to the theoretical power consumption of Kardashev Type II ci
Real world data is an increasingly utilized resource for post-market monitoring of vaccines and provides insight into real world effectiveness. However, outside of the setting of a clinical trial, heterogeneous mechanisms may drive observed breakthrough infection rates among vaccinated individuals; for instance, waning vaccine-induced immunity as time passes and the emergence of a new strain against which the vaccine has reduced protection. Analyses of infection incidence rates are typically predicated on a presumed mechanism in their choice of an "analytic time zero" after which infection rates are modeled. In this work, we propose an explicit test for driving mechanism situated in a standard Cox proportional hazards framework. We explore the test's performance in simulation studies and in an illustrative application to real world data. We additionally introduce subgroup differences in infection incidence and evaluate the impact of time zero misspecification on bias and coverage of model estimates. In this study we observe strong power and controlled type I error of the test to detect the correct infection-driving mechanism under various settings. Similar to previous studies, we f
Theories of innovation emphasize the role of social networks and teams as facilitators of breakthrough discoveries. Around the world, scientists and inventors today are more plentiful and interconnected than ever before. But while there are more people making discoveries, and more ideas that can be reconfigured in novel ways, research suggests that new ideas are getting harder to find, contradicting recombinant growth theory. In this paper, we shed new light on this apparent puzzle. Analyzing 20 million research articles and 4 million patent applications across the globe over the past half-century, we begin by documenting the rise of remote collaboration across cities, underlining the growing interconnectedness of scientists and inventors globally. We further show that across all fields, periods, and team sizes, researchers in these remote teams are consistently less likely to make breakthrough discoveries relative to their onsite counterparts. Creating a dataset that allows us to explore the division of labor in knowledge production within teams and across space, we find that among distributed team members, collaboration centers on late-stage, technical tasks involving more codifie
Transport networks, such as vasculature or river networks, provide key functions in organisms and the environment. They usually contain loops whose significance for the stability and robustness of the network is well documented. However, the dynamics of their formation is usually not considered. Such structures often grow in response to the gradient of an external field. During evolution, extending branches compete for the available flux of the field, which leads to effective repulsion between them and screening of the shorter ones. Yet, in remarkably diverse processes, from unstable fluid flows to the canal system of jellyfish, loops suddenly form near the breakthrough when the longest branch reaches the boundary of the system. We provide a physical explanation for this universal behavior. Using a 1D model, we explain that the appearance of effective attractive forces results from the field drop inside the leading finger as it approaches the outlet. Furthermore, we numerically study the interactions between two fingers, including screening in the system and its disappearance near the breakthrough. Finally, we perform simulations of the temporal evolution of the fingers to show how
This work aims to understand the basic principles of the adsorption process in detail, as adsorptive separation has broad applications in industry. To this end, a simple mathematical model is used to describe the transient fixed-bed physical adsorption process. The governing equations are solved numerically to obtain breakthrough curves for single-component and multicomponent monolayer adsorption. Desorption of a saturated bed by an inert fluid is also considered. A full parametric study is performed to analyze the effects of parameters such as bed length, velocity, diffusivity, particle radius, and isotherm properties on the nature of the breakthrough curve. Analysis of these results led to the development of a generic breakthrough curve for single-component monolayer adsorption, which makes it possible to predict the nature of the breakthrough curve for different process parameters without recourse to numerical simulation or experiment. This study should therefore be of interest for industrial separation processes.
The Breakthrough Listen Initiative is conducting a program using multiple telescopes around the world to search for "technosignatures": artificial transmitters of extraterrestrial origin from beyond our solar system. The VERITAS Collaboration joined this program in 2018, and provides the capability to search for one particular technosignature: optical pulses of a few nanoseconds duration detectable over interstellar distances. We report here on the analysis and results of dedicated VERITAS observations of Breakthrough Listen targets conducted in 2019 and 2020 and of archival VERITAS data collected since 2012. Thirty hours of dedicated observations of 136 targets and 249 archival observations of 140 targets were analyzed and did not reveal any signals consistent with a technosignature. The results are used to place limits on the fraction of stars hosting transmitting civilizations. We also discuss the minimum-pulse sensitivity of our observations and present VERITAS observations of CALIOP: a space-based pulsed laser onboard the CALIPSO satellite. The detection of these pulses with VERITAS, using the analysis techniques developed for our technosignature search, allows a test of our a
Reactive flows are an important part of numerous technical and environmental processes. Monitoring the flow and species concentrations within the domain is often impossible or expensive; in contrast, the outlet concentration is straightforward to measure. In connection with reactive flows in porous media, the term breakthrough curve denotes the time dependence of the outlet concentration under prescribed conditions at the inlet. In this work, we apply several machine learning methods to predict breakthrough curves from a given set of parameters. In our case, the parameters are the Damköhler and Péclet numbers. We perform a thorough analysis for the one-dimensional case and also provide results for the three-dimensional case.
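To make the notion of a breakthrough curve concrete, the following minimal sketch computes the outlet concentration over time for a one-dimensional advection-diffusion-reaction problem parameterized by the Péclet number (`pe`) and the Damköhler number (`da`). Everything model-specific here is an assumption for illustration, not the setup of the paper: the dimensionless form dc/dt + dc/dx = (1/Pe) d2c/dx2 - Da c on the unit interval, a first-order reaction, a unit step at the inlet, a zero-gradient outlet, and an explicit upwind finite-difference scheme. A forward model of this kind could produce the (parameters, curve) pairs on which a machine learning surrogate is trained.

```python
import numpy as np


def breakthrough_curve(pe: float, da: float, nx: int = 101, t_end: float = 3.0):
    """Return (times, outlet concentration) for an assumed 1D model.

    Assumed dimensionless model (illustrative, not from the source):
        dc/dt + dc/dx = (1/pe) * d2c/dx2 - da * c,   0 < x < 1
        c(0, t) = 1 (step inlet),  dc/dx = 0 at x = 1,  c(x, 0) = 0
    """
    dx = 1.0 / (nx - 1)
    diff = 1.0 / pe
    # Time step chosen so every coefficient of the explicit update stays non-negative.
    dt = 0.9 / (1.0 / dx + 2.0 * diff / dx**2 + da)
    n_steps = int(np.ceil(t_end / dt))

    c = np.zeros(nx)
    c[0] = 1.0  # inlet concentration
    times = np.empty(n_steps + 1)
    outlet = np.empty(n_steps + 1)
    times[0], outlet[0] = 0.0, c[-1]

    for n in range(1, n_steps + 1):
        advection = -(c[1:-1] - c[:-2]) / dx                       # first-order upwind
        diffusion = diff * (c[2:] - 2.0 * c[1:-1] + c[:-2]) / dx**2  # central difference
        new = c.copy()
        new[1:-1] = c[1:-1] + dt * (advection + diffusion - da * c[1:-1])
        new[0] = 1.0        # Dirichlet inlet
        new[-1] = new[-2]   # zero-gradient outlet
        c = new
        times[n] = n * dt
        outlet[n] = c[-1]
    return times, outlet


if __name__ == "__main__":
    for da in (0.0, 1.0):
        t, c_out = breakthrough_curve(pe=50.0, da=da)
        print(f"Da = {da}: outlet concentration at t = {t[-1]:.2f} is {c_out[-1]:.3f}")
```

Without reaction (`da = 0`) the outlet concentration approaches the inlet value once the front has passed, while a positive Damköhler number lowers the plateau because part of the species is consumed inside the domain. The upwind scheme adds some numerical diffusion, so the curves are qualitative.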