We introduce ChemPro, a progressive benchmark with 4100 natural language question-answer pairs in Chemistry, across 4 coherent sections of difficulty designed to assess the proficiency of Large Language Models (LLMs) in a broad spectrum of general chemistry topics. We include Multiple Choice Questions and Numerical Questions spread across fine-grained information recall, long-horizon reasoning, multi-concept questions, problem-solving with nuanced articulation, and straightforward questions in a balanced ratio, effectively covering Bio-Chemistry, Inorganic-Chemistry, Organic-Chemistry and Physical-Chemistry. ChemPro is carefully designed to be analogous to a student's academic evaluation for basic to high-school chemistry. A gradual increase in question difficulty rigorously tests the ability of LLMs to progress from solving basic problems to solving more sophisticated challenges. We evaluate 45+7 state-of-the-art LLMs, spanning both open-source and proprietary variants, and our analysis reveals that while LLMs perform well on basic chemistry questions, their accuracy declines with different types and levels of complexity. These findings highlight the critical limitations of LLMs in
We present QDK/Chemistry, a software toolkit for quantum chemistry workflows targeting quantum computers. The toolkit addresses a key challenge in the field: while quantum algorithms for chemistry have matured considerably, the infrastructure connecting classical electronic structure calculations to quantum circuit execution remains fragmented. QDK/Chemistry provides this infrastructure through a modular architecture that separates data representations from computational methods, enabling researchers to compose workflows from interchangeable components. In addition to providing native implementations of targeted algorithms in the quantum-classical pipeline, the toolkit builds upon and integrates with widely used open-source quantum chemistry packages and quantum computing frameworks through a plugin system, allowing users to combine methods from different sources without modifying workflow logic. This paper describes the design philosophy, current capabilities, and role of QDK/Chemistry as a foundation for reproducible quantum chemistry experiments.
To enhance large language models (LLMs) for chemistry problem solving, several LLM-based agents augmented with tools have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemToolAgent, an enhanced chemistry agent over ChemCrow, and conduct a comprehensive evaluation of its performance on both specialized chemistry tasks and general chemistry questions. Surprisingly, ChemToolAgent does not consistently outperform its base LLMs without tools. Our error analysis with a chemistry expert suggests that, for specialized chemistry tasks such as synthesis prediction, we should augment agents with specialized tools; for general chemistry questions like those in exams, however, agents' ability to reason correctly with chemistry knowledge matters more, and tool augmentation does not always help.
This review presents the covalent chemistry of carbon within the spin-radical concept of electron interaction. Using the language of valence bond trimodality, the regions of classical spinless covalence and its spin counterpart are defined. Carbon is the only element exhibiting spin covalent chemistry. Classical covalent chemistry of carbon concerns molecular substances whose valence bond structure includes segregate or chained single sp3C-C bonds. Substances with double sp2C-C and triple sp1C-C bonds are the subject of spin covalent chemistry of carbon. The mathematical apparatus of spin covalence forms the basis of algorithms governing the chemical modification of carbon substances, polymerization processes, and catalysis involving them, making it possible to supplement the empirical spin covalent chemistry of carbon with its virtual analog.
Multimodal scientific reasoning remains a significant challenge for large language models (LLMs), particularly in chemistry, where problem-solving relies on symbolic diagrams, molecular structures, and structured visual data. Here, we systematically evaluate 40 proprietary and open-source multimodal LLMs, including GPT-5, o3, Gemini-2.5-Pro, and Qwen2.5-VL, on a curated benchmark of Olympiad-style chemistry questions drawn from over two decades of U.S. National Chemistry Olympiad (USNCO) exams. These questions require integrated visual and textual reasoning across diverse modalities. We find that many models struggle with modality fusion: in some cases, removing the image even improves accuracy, indicating misalignment in vision-language integration. Chain-of-Thought prompting consistently enhances both accuracy and visual grounding, as demonstrated through ablation studies and occlusion-based interpretability. Our results reveal critical limitations in the scientific reasoning abilities of current MLLMs, providing actionable strategies for developing more robust and interpretable multimodal systems in chemistry. This work provides a timely benchmark for measuring progress in
Accelerated materials discovery is critical for addressing global challenges. However, developing new laboratory workflows relies heavily on real-world experimental trials, and this can hinder scalability because of the need for numerous physical make-and-test iterations. Here we present MATTERIX, a multiscale, graphics processing unit-accelerated robotic simulation framework designed to create high-fidelity digital twins of chemistry laboratories, thus accelerating workflow development. This multiscale digital twin simulates robotic physical manipulation, powder and liquid dynamics, device functionalities, heat transfer and basic chemical reaction kinetics. This is enabled by integrating realistic physics simulation and photorealistic rendering with a modular graphics processing unit-accelerated semantics engine, which models logical states and continuous behaviors to simulate chemistry workflows across different levels of abstraction. MATTERIX streamlines the creation of digital twin environments through open-source asset libraries and interfaces, while enabling flexible workflow design via hierarchical plan definition and a modular skill library that incorporates learning-based
The spatial distribution of the chemical reservoirs in protoplanetary disks is key to elucidating the composition of planets, especially habitable ones. However, the partitioning of the main elements among the refractory and volatile phases is still elusive. Key parameters such as the carbon-to-oxygen (C/O) elemental ratio and the ionization fraction remain poorly constrained, with the latter potentially orders of magnitude lower than in the interstellar medium. Moreover, the thermal structure of the gas is also poorly known, despite its deep influence on gas-phase chemistry. In this context, ortho-to-para ratios could provide selective and sensitive probes. Recent ALMA observations have measured the spatially resolved column density of ortho- and para-H2CO in the transition disk orbiting TW Hya and derived the radial profile of the ortho-to-para ratio. Yet, current disk models do not include the nuclear-spin-resolved chemistry required to interpret these observations. The present work aims to fill this gap by combining a parametric disk physical model of TW Hya with the UGAN network, updated to include a comprehensive description of the nuclear-spin-resolved chemistry of formaldehyde.
The hydration of magnesium oxide (MgO) to magnesium hydroxide (Mg(OH)$_2$) is a fundamental solid-surface chemical reaction with significant implications for materials science. Yet its molecular-level mechanism from water adsorption to Mg(OH)$_2$ nucleation and growth remains elusive due to its complex and multi-step nature. Here, we elucidate the molecular process of MgO hydration based on structures of the MgO/water interface obtained by a combined computational chemistry approach of potential-scaling molecular dynamics simulations and first-principles calculations without any a priori assumptions about reaction pathways. The result shows that the Mg$^{2+}$ dissolution follows the dissociative water adsorption. We find that this initial dissolution can proceed exothermically even from the defect-free surface with an average activation barrier of $\sim$12 kcal/mol. This exothermicity depends crucially on the stabilization of the resulting surface vacancy, achieved by proton adsorption onto neighboring surface oxygen atoms. Further Mg$^{2+}$ dissolution then occurs in correlation with proton penetration into the solid. Moreover, we find that the Mg(OH)$_2$ nucleation and growth pro
Planets form in disks of gas and dust around young stars. The disk molecular reservoirs and their chemical evolution affect all aspects of planet formation, from the coagulation of dust grains into pebbles, to the elemental and molecular compositions of the mature planet. Disk chemistry also enables unique probes of disk structures and dynamics, including those directly linked to ongoing planet formation. Here we review the protoplanetary disk chemistry of the volatile elements HOCNSP, the associated observational and theoretical methods, and the links between disk and planet chemical compositions. Three takeaways from this review are: (1) The disk chemical composition, including the organic reservoirs, is set by both inheritance and in situ chemistry. (2) Disk gas and solid O/C/N/H elemental ratios often deviate from stellar values due to a combination of condensation of molecular carriers, chemistry, and dynamics. (3) Chemical, physical, and dynamical processes in disks are closely linked, which complicates disk chemistry modeling, but these links also present an opportunity to develop chemical probes of different aspects of disk evolution and planet formation.
The NSF Workshop in Quantum Information and Computation for Chemistry assembled experts from directly quantum-oriented fields such as algorithms, chemistry, machine learning, optics, simulation, and metrology, as well as experts in related fields such as condensed matter physics, biochemistry, physical chemistry, inorganic and organic chemistry, and spectroscopy. The goal of the workshop was to summarize recent progress in research at the interface of quantum information science and chemistry as well as to discuss the promising research challenges and opportunities in the field. Furthermore, the workshop hoped to identify target areas where cross-fertilization among these fields would result in the largest payoff for developments in theory, algorithms, and experimental techniques. The ideas can be broadly categorized into two areas of research that interact and are not cleanly separated. The first area is quantum information for chemistry, or how quantum information tools, both experimental and theoretical, can aid in our understanding of a wide range of problems pertaining to chemistry. The second area is chemistry for quantum information, which aims to di
Stratospheric aerosol injection (SAI) has been proposed as a geoengineering strategy to mitigate global warming by increasing Earth's albedo. Silica-based materials, such as diamond-doped silica aerogels, have shown promising optical properties, but their impact on stratospheric chemistry, ozone in particular, remains largely unknown. Here, we present first-principles molecular dynamics (MD) simulations of the heterogeneous reaction between hydrogen chloride ($\mathrm{HCl}$) and chlorine nitrate ($\mathrm{ClONO_2}$), two main reservoirs of stratospheric chlorine and nitrogen species, on a dry, hydroxylated $\alpha$-quartz silica interface. Our results reveal a barrierless reaction pathway toward the formation of chlorine gas ($\mathrm{Cl_2}$), a major contributor to stratospheric ozone loss. We design a heterogeneous kinetic model informed by our MD simulation and available experimental data: despite the barrierless formation of $\mathrm{Cl_2}$, the higher surface affinities and partial pressures of $\mathrm{HNO_3}$ and $\mathrm{HCl}$ compared to those of $\mathrm{ClONO_2}$ result in a negligible reaction probability, $\gamma_\mathrm{ClONO_2}$, upon chlorine nitrate collision with the si
Efficient chemical kinetic model inference and application in combustion are challenging due to large ODE systems and widely separated time scales. Machine learning techniques have been proposed to streamline these models, though strong nonlinearity and numerical stiffness combined with noisy data sources make their application challenging. Here, we introduce ChemKANs, a novel neural network framework with applications both in model inference and simulation acceleration for combustion chemistry. ChemKAN's novel structure augments the generic Kolmogorov Arnold Network Ordinary Differential Equations (KAN-ODEs) with knowledge of the information flow through the relevant kinetic and thermodynamic laws. This chemistry-specific structure combined with the expressivity and rapid neural scaling of the underlying KAN-ODE algorithm instills in ChemKANs a strong inductive bias, streamlined training, and higher accuracy predictions compared to standard benchmarks, while facilitating parameter sparsity through shared information across all inputs and outputs. In a model inference investigation, we benchmark the robustness of ChemKANs to sparse data containing up to 15% added noise, and superfl
Cyanopolyynes, a family of nitrogen-containing carbon chains, are common in the interstellar medium and possibly form the backbone of species relevant to prebiotic chemistry. Following their gas-phase formation, they are expected to freeze out on ice grains in cold interstellar regions. In this work we present the hydrogenation reaction network of isolated HC$_3$N, the smallest cyanopolyyne, which consists of over-a-barrier radical-neutral reactions and barrierless radical-radical reactions. We employ density functional theory, coupled cluster and multiconfigurational methods to obtain activation and reaction energies for the hydrogenation network of HC$_3$N. This work explores the reaction network of the isolated molecule and constitutes a preview of the reactions occurring on the ice grain surface. We find that the reactions in which the hydrogen atom adds to the carbon chain at the carbon atom opposite the cyano group give the lowest and narrowest barriers. Subsequent hydrogenation leads to the astrochemically relevant vinyl cyanide and ethyl cyanide. Alternatively, the cyano group can hydrogenate via radical-radical reactions, leading to the fully saturated propylamine. These results
Quantum computers provide new opportunities for quantum chemistry. In this article, we present a versatile, extensible, and efficient software package, named Q$^2$Chemistry, for developing quantum algorithms and quantum-inspired classical algorithms in the field of quantum chemistry. In Q$^2$Chemistry, the wave function and Hamiltonian can be conveniently mapped into the qubit space; quantum circuits can then be generated according to a specific quantum algorithm, either one already implemented in the package or one newly developed by the user. The generated circuits can be dispatched either to a physical quantum computer, if available, or to the internal virtual quantum computer realized by simulating quantum circuits on classical supercomputers. As demonstrated by our benchmark simulations with up to 72 qubits, Q$^2$Chemistry achieves excellent performance in simulating medium-scale quantum circuits. Applications of Q$^2$Chemistry to the simulation of molecules and periodic systems are presented with performance analysis.
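As an illustration of what "mapping the Hamiltonian into the qubit space" can mean, the following is a minimal sketch using the Jordan-Wigner transformation, one common fermion-to-qubit mapping. The abstract does not say which mapping Q$^2$Chemistry uses, and the function names `jw_annihilation` and `jw_hamiltonian` are hypothetical helpers written for this example, not part of the package. The sketch builds dense matrices with NumPy, so it is only practical for a handful of modes.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
# Single-qubit lowering operator |0><1| = (X + iY)/2, with |1> meaning "occupied".
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])


def jw_annihilation(j, n_modes):
    """Jordan-Wigner image of the fermionic annihilation operator a_j.

    A string of Z operators on qubits 0..j-1 enforces the fermionic sign.
    """
    factors = [Z] * j + [LOWER] + [I2] * (n_modes - j - 1)
    return reduce(np.kron, factors)


def jw_hamiltonian(h):
    """Map H = sum_pq h[p, q] a_p^dagger a_q to a 2^n x 2^n qubit-space matrix.

    Assumes a real symmetric one-body matrix h (an illustrative restriction).
    """
    n = h.shape[0]
    a = [jw_annihilation(j, n) for j in range(n)]
    H = np.zeros((2**n, 2**n))
    for p in range(n):
        for q in range(n):
            H += h[p, q] * a[p].T @ a[q]
    return H


# Illustrative two-mode one-body Hamiltonian (values are made up for the example).
h_example = np.array([[-1.0, 0.5], [0.5, 1.0]])
H_qubit = jw_hamiltonian(h_example)
```

For a one-body Hamiltonian, the qubit-space spectrum should consist of all sums over subsets of the eigenvalues of `h`, which gives a simple consistency check on the mapping.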
Three-body recombination, or ternary association, is a termolecular reaction in which three particles collide, forming a bound state between two, whereas the third escapes freely. Three-body recombination reactions play a significant role in many systems relevant to physics and chemistry. In particular, they are relevant in cold and ultracold chemistry, quantum gases, astrochemistry, atmospheric physics, physical chemistry, and plasma physics. As a result, three-body recombination has been the subject of extensive work during the last 50 years, although primarily from an experimental perspective. Indeed, a general theory for three-body recombination remains elusive despite the available experimental information. To remedy this situation, our group recently developed a direct approach based on classical trajectory calculations in hyperspherical coordinates for three-body recombination, leading to a first-principles explanation of ion-atom-atom and atom-atom-atom three-body recombination processes. This review aims to summarize our findings on three-body recombination reactions and identify the remaining challenges in the field.
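To make the termolecular character of the reaction concrete, here is a minimal sketch of the simplest rate law for three identical particles, assuming the density obeys dn/dt = -k n^3, with a single effective rate coefficient k that absorbs the number of particles lost per event. This rate law and the numerical values are illustrative assumptions, not results from the review. The closed-form solution is n(t) = n0 / sqrt(1 + 2 k n0^2 t), and the sketch compares it with a fourth-order Runge-Kutta integration.

```python
import math


def density_analytic(n0, rate, t):
    """Closed-form solution of dn/dt = -rate * n**3 with n(0) = n0."""
    return n0 / math.sqrt(1.0 + 2.0 * rate * n0**2 * t)


def density_rk4(n0, rate, t_end, steps=1000):
    """Integrate dn/dt = -rate * n**3 with classical fourth-order Runge-Kutta."""

    def f(n):
        return -rate * n**3

    n = n0
    dt = t_end / steps
    for _ in range(steps):
        k1 = f(n)
        k2 = f(n + 0.5 * dt * k1)
        k3 = f(n + 0.5 * dt * k2)
        k4 = f(n + dt * k3)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


# Hypothetical values chosen only for illustration:
# n0 in cm^-3, rate in cm^6 s^-1, time in s.
N0 = 1.0e12
RATE = 1.0e-25
T_END = 10.0

n_exact = density_analytic(N0, RATE, T_END)
n_numeric = density_rk4(N0, RATE, T_END)
```

Because the loss rate scales with the cube of the density, the decay is much faster at high density than at low density, which is the practical signature that separates three-body loss from one- and two-body loss in this simple picture.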
We present two open-source implementations of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) algorithm to find a few eigenvalues and eigenvectors of large, possibly sparse matrices. We then test LOBPCG on various quantum chemistry problems, ranging from medium to large, dense to sparse, and well-behaved to ill-conditioned, where the standard method is Davidson's diagonalization. Numerical tests show that, while Davidson's method remains the best choice for most applications in quantum chemistry, LOBPCG represents a competitive alternative, especially when memory is an issue, and can even outperform Davidson for ill-conditioned, non-diagonally-dominant problems.
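For readers unfamiliar with LOBPCG, the following minimal sketch shows the kind of problem it solves, using the existing `scipy.sparse.linalg.lobpcg` routine rather than either of the implementations described in the abstract. The test matrix (a diagonally dominant tridiagonal matrix), the diagonal preconditioner, and the helper names `build_test_matrix` and `lowest_eigenpairs` are assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg


def build_test_matrix(n=200, coupling=0.1):
    """Sparse symmetric tridiagonal matrix: diagonal 1..n, constant off-diagonal."""
    main = np.arange(1.0, n + 1.0)
    off = np.full(n - 1, coupling)
    return diags([off, main, off], offsets=[-1, 0, 1], format="csr")


def lowest_eigenpairs(A, k=3, seed=0):
    """Return the k lowest eigenvalues and eigenvectors of A using LOBPCG."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k))  # random initial block of k guess vectors
    M = diags(1.0 / A.diagonal())    # diagonal (Jacobi) preconditioner
    vals, vecs = lobpcg(A, X, M=M, tol=1e-6, maxiter=200, largest=False)
    order = np.argsort(vals)
    return vals[order], vecs[:, order]


A_test = build_test_matrix()
eigvals, eigvecs = lowest_eigenpairs(A_test)
```

The diagonal preconditioner is similar in spirit to the one traditionally used in Davidson's method. LOBPCG only keeps a small, fixed number of blocks of vectors at any time, which is consistent with the abstract's remark about its appeal when memory is limited.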
Plasma Assisted Combustion (PAC) is a promising technology to enhance the combustion of lean mixtures prone to instabilities and flame blow-off. Although many PAC experiments have demonstrated combustion enhancement, several studies report an increase in NOx emissions. The aim of this study is to determine the kinetic pathways leading to NOx formation in the second stage of a sequential combustor assisted by Nanosecond Repetitively Pulsed Discharges (NRPDs). For this purpose, Large Eddy Simulation (LES) is used, combined with an accurate description of the combustion/NOx chemistry and a phenomenological model of the plasma kinetics. Zero-dimensional reactor simulations with detailed kinetics complement the study. First, the LES setup is validated by comparison with experiments. Then, the NOx chemistry is analyzed. For the operating conditions studied, it is shown that the production of atomic nitrogen in the plasma by direct electron impact on nitrogen molecules increases the formation of NO. The NO molecules are then transported through the turbulent flame without being strongly affected. This study illustrates the need to limit the diatomic nitrogen dissociation process in order to mitigate harmful
In this work, the teaching content of a theoretical-chemistry (TC) course is reformed, establishing theoretical content that spans micro- to macro-systems and comprehensively introducing the theory of chemical reactions to undergraduate students in chemistry. To develop such a TC course on the basis of the general physical-chemistry course, we focus on the last-mile problem between the physics and chemistry courses in order to train the critical thinking of undergraduate students in chemistry. To illustrate this, a reduction scheme of polymer molecular dynamics is discussed as an example, which shows a different theoretical content in polymer chemistry. Moreover, we propose a series of experiences and dependent measures that can provide information regarding students' levels of knowledge and understanding. This assessment quiz was designed to test students on the fundamental concepts and applications of TC, such as dynamics, statistical ensembles, and kinetics. From actual teaching of 36 students, it was found that these students showed significant improvement with the present TC content. Further analysis of each individual question revealed that approximately two-thirds of th
In their recent article, Derrien et al. (2023) study the anisotropy of microdosimetric quantities for spherical sites of several sizes placed around spherical gold nanoparticles of several diameters irradiated by monoenergetic photons. This comment points out that (1) the reported single-event distributions of specific energy may be biased due to overcounting, and (2) by considering only energy imparted by electrons produced in photon interactions in the nanoparticle, the magnitude of the anisotropy is overestimated by up to orders of magnitude with respect to an irradiation under conditions of secondary particle equilibrium.
Kohn-Sham density functional theory is in principle an exact formulation of quantum mechanical electronic structure theory, but in practice we have to rely on approximate exchange-correlation (xc) functionals. The objective of our work has been to design an xc functional with broad accuracy across as wide an expanse of chemistry and physics as possible, leading, as a long-range goal, to a functional with good accuracy for all problems, i.e., a universal functional. To guide our path toward that goal and to measure our progress, we have developed, building on earlier work in our group, a set of databases of reference data for a variety of energetic and structural properties in chemistry and physics. These databases include energies of molecular processes such as atomization, complexation, proton addition, and ionization; they also include molecular geometries and solid-state lattice constants, chemical reaction barrier heights, and cohesive energies and band gaps of solids. For the present paper we gather many of these databases into four comprehensive databases, two with 384 energetic data for chemistry and solid-state physics and another two with 68 structural data for chemistry and s