Achieving chemical accuracy for molecular simulations remains a central challenge in computational chemistry. Here, we present an embedded correlated wavefunction transfer learning (ECW-TL) framework for accurately simulating molecular dynamics in the condensed phase. ECW-TL incorporates high-level electron exchange and correlation effects in ECW theory while preserving the training and computational efficiency of machine-learned interatomic potentials. We demonstrate the framework on Ca²⁺-CO₃²⁻ ion pairing in aqueous solution, a key process underlying CO₂ mineralization in seawater. As proof of principle, we first show that fine-tuning a DFT-revPBE-D3(BJ) baseline model with embedded-DFT-SCAN data reproduces the DFT-SCAN free-energy surface within 1 kcal/mol across all solvation states. Extending the framework to embedded MP2 and localized natural-orbital CCSD(T) further refines the free-energy profile, revealing the crucial role of exact electron exchange and correlation in determining ion-pair stability and structure. ECW-TL thus provides a general, data-efficient route for transferring CW accuracy to large-scale simulations of complex aqueous and interfacial chemical processes.
This scientometric study analyzes Avian Influenza research from 2014 to 2023 using bibliographic data from the Web of Science database. We examined publication trends, sources, authorship, collaborative networks, document types, and geographical distribution to gain insights into the global research landscape. Results reveal a steady increase in publications, with high contributions from Chinese and American institutions. Journals such as PLoS One and the Journal of Virology published the highest number of studies, indicating their influence in this field. The most prolific institutions include the Chinese Academy of Sciences and the University of Hong Kong, while the College of Veterinary Medicine at South China Agricultural University emerged as the most productive department. China and the USA lead in publication volume, though developed nations like the United Kingdom and Germany exhibit a higher rate of international collaboration. "Articles" are the most common document type, constituting 84.6% of the total, while "Reviews" account for 7.6%. This study provides a comprehensive view of global trends in Avian Influenza research, emphasizing the need for collaborative efforts ac
Demographic data collection is essential in education research, as demographic data allow researchers to better describe the participant population they study and to contextualize findings. However, current research practices for neurodiversity demographics often rely on prescriptive methods (e.g., requiring participants to report official diagnoses) rather than allowing participants to self-identify. This approach can (a) prevent participants from expressing their intersecting identities in ways that are authentic and (b) limit the trustworthiness and reliability of the data and their interpretation. In addition, inconsistent dissemination and representation of demographic data across studies hinder the accessibility and usability of this work. Through a literature review of neurodivergent student experiences with learning and performing STEM, we identified widespread discrepancies in how demographic information is collected and reported. This paper explores how neurodivergent identities can be more accurately and inclusively represented in education research. We present findings of a thematic analysis on the ways neurodivergent demographic data collection is done in the literature using data
Developing accurate models for chemical reactors is often challenging due to the complexity of reaction kinetics and process dynamics. Traditional approaches require retraining models for each new system, limiting generalizability and efficiency. In this work, we take a step toward foundation models for chemical reactor modeling by introducing a neural network framework that generalizes across diverse reactor types and rapidly adapts to new chemical processes. Our approach leverages meta-learning to pretrain the model on a broad set of reactor dynamics, enabling efficient adaptation to unseen reactions with minimal data. To further enhance generalizability, we incorporate physics-informed fine-tuning, ensuring physically consistent adaptation to new reactor conditions. Our framework is evaluated across three integer-order fundamental reactor types - continuous stirred tank reactors, batch reactors, and plug flow reactors - demonstrating superior few-shot adaptation compared to conventional data-driven, physics-informed, and transfer learning approaches. By combining meta-learning with physics-informed adaptation, this work lays the foundation for a generalizable modeling framework,
This paper presents a scientometric analysis of research output from the University of Lagos, focusing on the two decades spanning 2004 to 2023. Using bibliometric data retrieved from the Web of Science, we examine trends in publication volume, collaboration patterns, citation impact, and the most prolific authors, departments, and research domains at the university. The study reveals a consistent increase in research productivity, with the highest publication output recorded in 2023. Health Sciences, Engineering, and Social Sciences are identified as dominant fields, reflecting the university's interdisciplinary research strengths. Collaborative efforts, both locally and internationally, show a positive correlation with higher citation impact, with the United States and the United Kingdom being the leading international collaborators. Notably, open-access publications account for a significant portion of the university's research output, enhancing visibility and citation rates. The findings offer valuable insights into the university's research performance over the past two decades, providing a foundation for strategic planning and policy formulation to foster research excellence
[Abridged] This review paper discusses which chemical effects may be at play in a planet-forming disk midplane, which effects are relevant under different conditions, and which tools are available for modelling chemical kinetics in a disk midplane. The review goes on to discuss some important efforts in the planet formation modelling community to treat chemical evolution and, vice versa, efforts in the chemical modelling community to implement more physical effects related to planet formation into chemical models. The aim of this review is both to outline some concepts related to planet formation chemistry and to encourage not just collaboration between the planet formation modelling community and the astrochemical community, but also assistance and guidance from one community to the other: guidance regarding which effects, out of many, might be more relevant than others under certain planet formation conditions, and regarding why certain included effects lead to certain important modelling outcomes. As the research fields of exoplanet atmospheres and protoplanetary disks near new frontiers in observational insights with upcoming facilities, developing appropriate m
In this work, we study an integrated fault detection and classification framework called FARM for fast, accurate, and robust online chemical process monitoring. The FARM framework integrates the latest advancements in statistical process control (SPC) for monitoring nonparametric and heterogeneous data streams with novel data analysis approaches based on Riemannian geometry in a hierarchical framework for online process monitoring. We conduct a systematic evaluation of the FARM monitoring framework using the Tennessee Eastman Process (TEP) dataset. Results show that FARM performs competitively against state-of-the-art process monitoring algorithms by achieving a good balance among fault detection rate (FDR), fault detection speed (FDS), and false alarm rate (FAR). Specifically, FARM achieved an average FDR of 96.97% while also outperforming benchmark methods on known hard-to-detect faults, including Faults 3, 9, and 15, with FDRs of 97.08%, 96.30%, and 95.99%, respectively. In terms of FAR, our FARM framework allows practitioners to customize their choice of FAR, thereby offering great flexibility. Moreover, we report a significa
As the volume and complexity of nonclinical toxicology studies continue to increase, toxicologic pathology reporting faces persistent challenges, including fragmented sources of data (e.g., histopathology images, clinical pathology and other study data, adverse effects database, mechanistic literature), variable reporting timelines and heightened regulatory expectations. This white paper examines the emerging role of agentic artificial intelligence (AI) in addressing these issues through coordinated workflow orchestration, data integration, and pathologist-in-the-loop report generation. Based on a closed-door roundtable held during the 2025 Society of Toxicologic Pathology (STP) Annual Meeting and follow-on discussions, this paper synthesizes the perspectives of leading toxicologic pathologists, toxicologists, and AI developers. It outlines the key pain points in current reporting workflows, identifies realistic near-term use cases for agentic AI, and describes major adoption barriers including requirements for transparency, validation, and organizational readiness. A phased adoption roadmap and pilot design considerations are proposed to help support responsible evaluation and dep
A physically motivated equation that determines the number of electrons of a molecule is proposed based on chemical common sense. It shows that all molecules are entangled in the number of electrons and results in the fundamental assumption of molecular energy convexity that underpins molecular quantum mechanics. The proposed physical principle includes the molecular size consistency principle as a special case. Application of wavefunction theory to the principle shows that an individual molecule with a noninteger number of electrons is locally physical albeit locally unreal. The energy of a molecule is piecewise linear with respect to its continuous number of electrons. The continuity of the number of electrons allows the definition of an electronic chemical potential of a single molecule. A state function equivalent to the energy of a molecule can be defined using the chemical potential as a variable. The aforementioned physical principle can alternatively be expressed as a simple additivity with the new state function. The latter also shows that the quantum entanglement in the number of electrons can be viewed as all molecules sharing the same chemical potential.
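For readers who want the piecewise-linearity statement above in symbols, a common textbook way of writing it is sketched below. The notation is an assumption introduced here for illustration (integer electron number N, fractional part ω, continuous electron number n) and is not taken from the paper itself, whose own formulation may differ.

```latex
% Assumed notation: E(N) is the ground-state energy at integer electron
% number N, and 0 \le \omega \le 1 is the fractional part.
E(N + \omega) = (1 - \omega)\, E(N) + \omega\, E(N + 1)

% Convexity of the energy in the electron number:
E(N + 1) - E(N) \;\ge\; E(N) - E(N - 1)

% Electronic chemical potential on the open segment N < n < N + 1:
\mu = \frac{\partial E}{\partial n} = E(N + 1) - E(N)
```

Under these assumptions the chemical potential is constant on each segment between integers and jumps at integer electron numbers, which is what makes it usable as a variable for a state function of a single molecule.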
Optical detection of magnetic resonance enables spin-based quantum sensing with high spatial resolution and sensitivity, even at room temperature, as exemplified by solid-state defects. Molecular systems provide a complementary, chemically tunable platform for room-temperature optically detected magnetic resonance (ODMR)-based quantum sensing. A critical parameter governing sensing sensitivity is the optical contrast, i.e., the difference in emission between two spin states. In state-of-the-art solid-state defects such as the nitrogen-vacancy center in diamond, this contrast is approximately 30%. Here, capitalizing on chemical tunability, we show that room-temperature ODMR contrasts of 40% can be achieved in molecules. Using a nitrogen-substituted analogue of pentacene (6,13-diazapentacene), we enhance contrast compared to pentacene and, by determining the triplet kinetics through time-dependent pulsed ODMR, show how this arises from accelerated anisotropic intersystem crossing. Furthermore, we translate high-contrast room-temperature pulsed ODMR to self-assembled nanocrystals. Overall, our findings highlight the synthetic handles available to optically readable molecular spins and t
Efforts to alter chemical reactivity and material structure in confined optical environments are on the rise, and yet a conclusive understanding of the microscopic mechanisms remains elusive. This originates mostly from the fact that accurately predicting vibrational and reactive dynamics for solvated ensembles of realistic molecules is no small endeavor, and adding (collective) strong light-matter interaction does not simplify matters. Here, we establish a framework based on a combination of machine learning (ML) models, trained using density-functional theory calculations, and molecular dynamics to accelerate such simulations. We then apply this approach to evaluate strong coupling, changes in reaction rate constant, and their influence on enthalpy and entropy for the deprotection reaction of 1-phenyl-2-trimethylsilylacetylene, which has been studied previously both experimentally and using ab initio simulations. While we find qualitative agreement with critical experimental observations, especially with regard to the changes in kinetics, we also find differences in comparison with previous theoretical predictions. The features for which the ML-accelerated and ab initio simulations agree
This paper presents a quasi-sequential optimal design framework for toxicology experiments, specifically applied to sea urchin embryos. We propose a novel approach combining robust optimal design with adaptive, stage-based testing to improve efficiency in toxicological studies, particularly where traditional uniform designs fall short. The methodology uses statistical models to refine dose levels across experimental phases, aiming for increased precision while reducing costs and complexity. Key components include selecting an initial design, iteratively optimizing doses based on preliminary results, and assessing various model fits to ensure robust, data-driven adjustments. Through case studies, we demonstrate improved statistical efficiency and adaptability in toxicology, with potential applications in other experimental domains.
The study demonstrates the capabilities of a vector-based approach for calculating stoichiometric coefficients in chemical equations, using black powder as an illustrative example. A method is proposed for selecting and constraining intermediate interactions between reactants, as well as for identifying final products. It is shown that even a small number of components can lead to a large number of final and intermediate products. Through concrete calculations, a correlation is established between the number of possible chemical equations and the number of reactants. A methodology is proposed for computing all possible chemical equations within a reaction system for arbitrary component ratios, enabling the derivation of all feasible chemical reactions. Additionally, a method is developed for calculating the chemical composition for a fixed set of reactants, allowing for the evaluation of the set of products resulting from all possible chemical interactions given a specified initial composition.
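A vector-based view of stoichiometry can be illustrated with a small null-space calculation. The sketch below is not the study's own method: it assumes the simplified textbook black powder reaction (KNO3, S, and C giving K2S, N2, and CO2), and the helper names `null_space` and `to_smallest_integers` are hypothetical. Each species is a column of element counts, and any vector in the null space of that matrix is a balanced equation, with positive entries on one side and negative entries on the other.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def null_space(matrix):
    """Return a basis of the rational null space of `matrix` (a list of rows)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def to_smallest_integers(vec):
    """Scale a rational vector to coprime integers with a positive leading entry."""
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (v.denominator for v in vec), 1)
    ints = [int(v * common_den) for v in vec]
    divisor = reduce(gcd, (abs(i) for i in ints))
    ints = [i // divisor for i in ints]
    first = next(i for i in ints if i != 0)
    return ints if first > 0 else [-i for i in ints]


# Assumed species order (columns): KNO3, S, C, K2S, N2, CO2.
species = ["KNO3", "S", "C", "K2S", "N2", "CO2"]
composition = [
    [1, 0, 0, 2, 0, 0],  # K
    [1, 0, 0, 0, 2, 0],  # N
    [3, 0, 0, 0, 0, 2],  # O
    [0, 1, 0, 1, 0, 0],  # S
    [0, 0, 1, 0, 0, 1],  # C
]

basis = null_space(composition)
coefficients = to_smallest_integers(basis[0])
print(dict(zip(species, coefficients)))
# {'KNO3': 2, 'S': 1, 'C': 3, 'K2S': -1, 'N2': -1, 'CO2': -3}
```

Here the null space is one-dimensional, so there is a single balanced equation, 2 KNO3 + S + 3 C → K2S + N2 + 3 CO2. Adding more candidate products adds columns and generally enlarges the null space, which is one way to see why the number of possible equations grows quickly with the number of components.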
This chapter provides a pedagogical introduction and overview of spatial and temporal correlation and fluctuation effects resulting from the fundamentally stochastic kinetics underlying chemical reactions and the dynamics of populations or epidemics. After reviewing the assumptions and mean-field type approximations involved in the construction of chemical rate equations for uniform reactant densities, we first discuss spatial clustering in birth-death systems, where non-linearities are introduced through either density-limiting pair reactions, or equivalently via local imposition of finite carrying capacities. The competition of offspring production, death, and non-linear inhibition induces a population extinction threshold, which represents a non-equilibrium phase transition that separates active from absorbing states. This continuous transition is characterized by the universal scaling exponents of critical directed percolation clusters. Next we focus on the emergence of depletion zones in single-species annihilation processes and spatial population segregation with the associated reaction fronts in two-species pair annihilation. These strong (anti-)correlation effects are dynam
Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. In this survey, we focus on ten prominent tasks that integrate language and vision, discussing their problem formulation, methods, existing datasets, and evaluation measures, and comparing the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys, which are either task-specific or concentrate on only one type of visual content, i.e., image or video. Furthermore, we also provide some potential future directions in this field of research, in the hope that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications.
Traditional chemical kinetics may be inappropriate to describe chemical reactions in micro-domains involving only a small number of substrate and reactant molecules. Starting with the stochastic dynamics of the molecules, we derive a master-diffusion equation for the joint probability density of a mobile reactant and the number of bound substrate molecules in a confined domain. We use the equation to calculate the fluctuations in the number of bound substrate molecules as a function of the initial reactant distribution. A second model is presented based on a Markov description of the binding and unbinding and on the mean first passage time of a molecule to a small portion of the boundary. These models can be used to describe noise due to gating of ionic channels by random binding and unbinding of ligands in biological sensor cells, such as olfactory cilia, photoreceptors, and hair cells in the cochlea.
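To give a feel for fluctuations in the number of bound substrate molecules when only a few molecules are present, the following is a minimal sketch of a Markov binding and unbinding process simulated with the Gillespie algorithm. It is a well-mixed simplification, not the master-diffusion equation or the first-passage model described above: diffusion and geometry are ignored, and the function names and the rate constants `k_on` and `k_off` are hypothetical values chosen only for illustration.

```python
import random


def simulate_binding(n_reactant, n_substrate, k_on, k_off, t_end, seed=0):
    """Gillespie simulation of R + S <-> RS; returns a list of (time, bound)."""
    rng = random.Random(seed)
    t, bound = 0.0, 0
    trajectory = [(t, bound)]
    while True:
        # Propensities depend on how many free molecules remain.
        rate_bind = k_on * (n_reactant - bound) * (n_substrate - bound)
        rate_unbind = k_off * bound
        total = rate_bind + rate_unbind
        if total == 0:
            break
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_end:
            break
        if rng.random() * total < rate_bind:
            bound += 1
        else:
            bound -= 1
        trajectory.append((t, bound))
    return trajectory


def time_averaged_moments(trajectory, t_end):
    """Time-weighted mean and variance of the bound count over [0, t_end]."""
    mean = second = 0.0
    next_times = [t for t, _ in trajectory[1:]] + [t_end]
    for (t0, b), t1 in zip(trajectory, next_times):
        dt = t1 - t0
        mean += b * dt
        second += b * b * dt
    mean /= t_end
    second /= t_end
    return mean, second - mean ** 2


# Hypothetical parameters: 5 reactant molecules, 10 binding sites.
traj = simulate_binding(n_reactant=5, n_substrate=10,
                        k_on=0.1, k_off=1.0, t_end=200.0, seed=1)
mean_bound, var_bound = time_averaged_moments(traj, t_end=200.0)
print(f"mean bound = {mean_bound:.2f}, variance = {var_bound:.2f}")
```

With so few molecules, the variance is comparable to the mean, which is the regime where deterministic rate equations stop being a good description.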
Normalised citation counts are routinely used to assess the average impact of research groups or nations. There is controversy over whether confidence intervals for them are theoretically valid or practically useful. In response, this article introduces the concept of a group's underlying research capability to produce impactful research. It then investigates whether confidence intervals could delimit the underlying capability of a group in practice. From 123120 confidence interval comparisons for the average citation impact of the national outputs of ten countries within 36 individual large monodisciplinary journals, moderately fewer than 95% of subsequent indicator values fall within 95% confidence intervals from prior years, with the percentage declining over time. This is consistent with confidence intervals effectively delimiting the research capability of a group, although it does not prove that this is the cause of the results. The results are unaffected by whether internationally collaborative articles are included.
In most countries, basic research is supported by research councils that select, after peer review, the individuals or teams that are to receive funding. Unfortunately, the number of grants these research councils can allocate is limited and, in most cases, a minority of the researchers receive the majority of the funds. However, evidence as to whether this is an optimal way of distributing available funds is mixed. The purpose of this study is to measure the relation between the amount of funding provided to 12,720 researchers in Quebec over a fifteen-year period (1998-2012) and their scientific output and impact from 2000 to 2013. Our results show that, in terms of both the quantity of papers produced and their scientific impact, the concentration of research funding in the hands of a so-called "elite" of researchers generally produces diminishing marginal returns. We also find that the most funded researchers do not stand out in terms of output and scientific impact.
Large Language Models (LLMs) have substantially driven scientific progress in various domains, and many papers have demonstrated their ability to tackle complex problems with creative solutions. Our paper introduces a new foundation model, nach0, capable of solving various chemical and biological tasks: biomedical question answering, named entity recognition, molecular generation, molecular synthesis, attributes prediction, and others. nach0 is a multi-domain and multi-task encoder-decoder LLM pre-trained on unlabeled text from scientific literature, patents, and molecule strings to incorporate a range of chemical and linguistic knowledge. We employed instruction tuning, where specific task-related instructions are utilized to fine-tune nach0 for the final set of tasks. To train nach0 effectively, we leverage the NeMo framework, enabling efficient parallel optimization of both base and large model versions. Extensive experiments demonstrate that our model outperforms state-of-the-art baselines on single-domain and cross-domain tasks. Furthermore, it can generate high-quality outputs in molecular and textual formats, showcasing its effectiveness in multi-domain setups.
We developed a multiscale approach (MultiSCAAL) that integrates the potential of mean force (PMF) obtained from all-atomistic molecular dynamics simulations with a knowledge-based energy function for coarse-grained molecular simulations to better explore the energy landscape of a small protein under chemical interference such as chemical denaturation. An excessive number of water molecules in all-atomistic molecular dynamics simulations often reduces the sampling efficiency of advanced sampling techniques such as the replica exchange method, which makes it difficult to investigate the effects of chemical interference on protein dynamics. Thus, there is a need to develop an effective strategy that focuses on sampling structural changes in protein conformations rather than solvent molecule fluctuations. In this work, we address this issue by devising a multiscale simulation scheme (MultiSCAAL) that bridges the gap between all-atomistic molecular dynamics simulation and coarse-grained molecular simulation. The two key features of this scheme are the Boltzmann inversion and a protein atomistic reconstruction method we previously developed (SCAAL). Using MultiSCAAL, we were able