Activity cliff prediction - identifying positions where small structural changes cause large potency shifts - has been a persistent challenge in computational medicinal chemistry. This work focuses on a parsimonious definition: which small modifications, at which positions, confer the highest probability of an outcome change. Position-level sensitivity is calculated using 25 million matched molecular pairs from 50 ChEMBL targets across six protein families, revealing that two questions have fundamentally different answers. "Which positions vary most?" is answered by scaffold size alone (NDCG@3 = 0.966), requiring no machine learning. "Which are true activity cliffs?" - where small modifications cause disproportionately large effects, as captured by SALI normalization - requires an 11-feature model with 3D pharmacophore context (NDCG@3 = 0.910 vs. 0.839 random), generalizing across all six protein families, novel scaffolds (0.913), and temporal splits (0.878). The model identifies the cliff-prone position first 53% of the time (vs. 27% random - 2x lift), reducing positions a chemist must explore from 3.1 to 2.1 - a 31% reduction in first-round experiments. Predicting which modificat
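For readers unfamiliar with the two quantities quoted above, the following is a minimal sketch of how NDCG@k and a pairwise SALI value are commonly computed. It is not the authors' pipeline: the linear gain in the DCG sum, the function names, and the guard for identical structures are all assumptions made for illustration.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k entries of a ranked list.

    Assumes a linear gain (the relevance itself), discounted by log2(rank + 1)
    with ranks starting at 1.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(scores, relevances, k=3):
    """NDCG@k for one scaffold.

    scores     : predicted sensitivity per position (higher = ranked earlier)
    relevances : observed sensitivity per position (non-negative)
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def sali(potency_a, potency_b, similarity):
    """Structure-Activity Landscape Index of one pair: |dA| / (1 - similarity).

    Returns infinity for identical structures (similarity == 1), which is an
    illustrative choice rather than a convention taken from the source.
    """
    if similarity >= 1.0:
        return math.inf
    return abs(potency_a - potency_b) / (1.0 - similarity)
```

A ranking that puts the most sensitive positions first scores 1.0, and any other ordering scores lower, which is why a random baseline can still reach a high NDCG@3 when a scaffold has only a few positions.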
The spatial distribution of the chemical reservoirs in protoplanetary disks is key to elucidating the composition of planets, especially habitable ones. However, the partitioning of the main elements between the refractory and volatile phases is still elusive. Key parameters such as the carbon-to-oxygen (C/O) elemental ratio and the ionization fraction remain poorly constrained, with the latter potentially orders of magnitude lower than in the interstellar medium. Moreover, the thermal structure of the gas is also poorly known, despite its deep influence on gas-phase chemistry. In this context, ortho-to-para ratios could provide selective and sensitive probes. Recent ALMA observations have measured the spatially resolved column density of ortho- and para-H2CO in the transition disk orbiting TW Hya and derived the radial profile of the ortho-to-para ratio. Yet current disk models do not include the nuclear-spin-resolved chemistry required to interpret these observations. The present work aims to fill this gap by combining a parametric physical model of the TW Hya disk with the UGAN network, updated to include a comprehensive description of the nuclear-spin-resolved chemistry of formaldehyde.
We present a new approach to identify satellite trails (or other linear artifacts) in ACS/WFC imaging data using a modified Radon Transform. We demonstrate that this approach is sensitive to features with mean brightness significantly below the background noise level, and it is resistant to the influence of bright astronomical sources (e.g., stars, galaxies) in most cases. Comparing with a set of satellite trails identified by eye, we find a trail recovery rate of 85\% and a false detection rate (after removing diffraction spikes that are easily filtered) of 2.5\%. By performing an analysis using a much larger ACS/WFC data set where false trails are identified by their persistence across multiple images of the same field, we identify the Radon Transform parameter space and image properties where our algorithm is unreliable, and estimate a false detection rate of $\sim10\%$ elsewhere. We apply our method to ACS/WFC data taken between 2002 and 2022 to determine both the frequency of satellite trail contamination in science data and also the typical trail brightness as a function of time. We find the rate of satellite trail contamination has increased by approximately a factor of two
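To illustrate why a Radon-type transform can recover a trail fainter than the per-pixel noise, here is a minimal NumPy sketch. It is a plain binned Radon transform normalized to a signal-to-noise ratio, not the modified transform of the report; the function name `radon_snr`, the nearest-bin assignment, and the global noise estimate are assumptions.

```python
import numpy as np

def radon_snr(image, n_angles=180):
    """Signal-to-noise of the summed flux along straight lines through an image.

    Each pixel is assigned to the line (theta, rho) with
    rho = x*cos(theta) + y*sin(theta), measured from the image center and
    rounded to the nearest integer bin. The sum in each bin is divided by
    sqrt(number of pixels), so pure unit-variance noise gives roughly unit
    variance everywhere in the output.
    """
    image = np.asarray(image, dtype=float)
    ny, nx = image.shape
    y, x = np.indices(image.shape)
    x = (x - (nx - 1) / 2.0).ravel()
    y = (y - (ny - 1) / 2.0).ravel()

    # Crude normalization (assumption): median background, global std as noise.
    values = ((image - np.median(image)) / image.std()).ravel()

    half_diag = 0.5 * np.hypot(nx, ny)
    n_bins = int(np.ceil(2 * half_diag)) + 1
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)

    snr = np.zeros((n_angles, n_bins))
    for i, theta in enumerate(angles):
        rho = x * np.cos(theta) + y * np.sin(theta)
        idx = np.round(rho + half_diag).astype(int)
        sums = np.bincount(idx, weights=values, minlength=n_bins)
        counts = np.bincount(idx, minlength=n_bins)
        snr[i] = sums / np.sqrt(np.maximum(counts, 1))
    return angles, snr
```

Because a trail of per-pixel brightness b crossing N pixels accumulates a signal-to-noise of about b*sqrt(N), a trail well below the noise in any single pixel can still stand out as a clear peak in the (angle, offset) plane.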
Multimodal scientific reasoning remains a significant challenge for large language models (LLMs), particularly in chemistry, where problem-solving relies on symbolic diagrams, molecular structures, and structured visual data. Here, we systematically evaluate 40 proprietary and open-source multimodal LLMs (MLLMs), including GPT-5, o3, Gemini-2.5-Pro, and Qwen2.5-VL, on a curated benchmark of Olympiad-style chemistry questions drawn from over two decades of U.S. National Chemistry Olympiad (USNCO) exams. These questions require integrated visual and textual reasoning across diverse modalities. We find that many models struggle with modality fusion: in some cases, removing the image even improves accuracy, indicating misalignment in vision-language integration. Chain-of-Thought prompting consistently enhances both accuracy and visual grounding, as demonstrated through ablation studies and occlusion-based interpretability. Our results reveal critical limitations in the scientific reasoning abilities of current MLLMs and point to actionable strategies for developing more robust and interpretable multimodal systems in chemistry. This work provides a timely benchmark for measuring progress in
Accelerated materials discovery is critical for addressing global challenges. However, developing new laboratory workflows relies heavily on real-world experimental trials, and this can hinder scalability because of the need for numerous physical make-and-test iterations. Here we present MATTERIX, a multiscale, graphics processing unit-accelerated robotic simulation framework designed to create high-fidelity digital twins of chemistry laboratories, thus accelerating workflow development. This multiscale digital twin simulates robotic physical manipulation, powder and liquid dynamics, device functionalities, heat transfer and basic chemical reaction kinetics. This is enabled by integrating realistic physics simulation and photorealistic rendering with a modular graphics processing unit-accelerated semantics engine, which models logical states and continuous behaviors to simulate chemistry workflows across different levels of abstraction. MATTERIX streamlines the creation of digital twin environments through open-source asset libraries and interfaces, while enabling flexible workflow design via hierarchical plan definition and a modular skill library that incorporates learning-based
The noise in bias frames for all four readout amplifiers in the Advanced Camera for Surveys (ACS) Wide Field Channel (WFC) is dependent on row number. This is because dark current accumulated during readout increases across the detector, influencing and increasing the read noise as a function of row number. In this report, we investigate bias frames taken with the ACS/WFC to explore the column dependence of read noise for each of the amplifiers for different anneal periods. Analyzing the data, we find that there is no column dependence of read noise and that the read noise values for the physical pre-scans are approximately 0.5 e$^-$ lower than in the science arrays because there is no readout dark accumulated in this area. We further investigate 1) the evolution of read noise over an anneal period, 2) a linear decrease in read noise within the initial columns per amplifier, and 3) pixels in elevated read noise columns. We conclude that 1) there is no visual trend of read noise over an anneal period, 2) amplifiers A and C have an initial linear decrease of read noise in the science arrays, and 3) masking unstable hot pixels in a column will decrease its read noise values.
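As an illustration of how a row or column read-noise profile can be measured from repeated bias frames, the sketch below takes the per-pixel scatter across a stack and collapses it with a median. The function name, the `along` and `gain` parameters, and the use of a median are assumptions for this example, not the procedure of the report.

```python
import numpy as np

def read_noise_profile(bias_stack, along="column", gain=1.0):
    """Read-noise profile from a stack of bias frames.

    bias_stack : array of shape (n_frames, n_rows, n_cols), in DN
    along      : "column" returns one value per column,
                 "row" returns one value per row
    gain       : e-/DN conversion (hypothetical default of 1.0)
    """
    stack = np.asarray(bias_stack, dtype=float) * gain
    # Scatter of each pixel across frames; fixed bias structure cancels
    # because the per-pixel mean is removed.
    pixel_sigma = stack.std(axis=0, ddof=1)
    # Median along the other axis limits the influence of unstable hot pixels.
    collapse_axis = 0 if along == "column" else 1
    return np.median(pixel_sigma, axis=collapse_axis)
```

Calling it once with `along="row"` and once with `along="column"` separates the two dependences discussed above; a flat column profile alongside a rising row profile would be consistent with noise that grows only with row number.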
Using repeat imaging of a galaxy cluster taken over a seventeen-year baseline, we examine the impact that degraded Charge Transfer Efficiency (CTE) has on photometric measurements of extended sources using the ACS/WFC on HST. We examine how measured brightnesses depend on time since ACS installation, source location on the WFC detectors, source brightness, and local background level in individual exposures. We find that global brightness measurements using large apertures are generally reliable within $\sim$0.05 magnitudes across the WFC detectors if exposure backgrounds are above $20e^-/{pixel}$ and sources are brighter than $\sim300e^-$ in a single exposure. However, brightness measurements on smaller scales can suffer deficiencies in excess of 0.1 mags (sometimes, significantly more) in recent data unless sources are very close to the CCD serial registers ($\lesssim 512$ pixels), or brighter than $\sim3000\,e^-$ in a single exposure. We also show how degraded CTE can result in artificial asymmetries in galaxy light distributions, which are largely mitigated if backgrounds are $>20e^-/{pixel}$ and targets are not far ($>1536$ pixels) from the serial registers. As expected,
We present a dedicated study of CCD serial ($x$-direction) charge transfer efficiency (CTE) in ACS/WFC. Following past studies of parallel ($y$-direction) CTE, we use the serial CTE trails behind hot pixels in calibration dark frames to characterize charge trapping and release in the serial registers of the WFC detectors. Serial CTE trails are sharper and longer than parallel CTE trails. Many fewer charge traps come into play during serial pixel transfers than parallel transfers, which explains why parallel CTE is much worse than serial CTE. We find that serial CTE can cause losses of $\sim$0.005-0.02~mag in stellar photometry and shift stellar centroids by $\sim$0.01-0.035 pixels. The pixel-based algorithm in CALACS that corrects for parallel CTE losses in WFC data has been modified to include a correction for serial CTE losses. The PCTETAB reference file has also been updated to include serial CTE parameters. The pixel-based correction for serial CTE currently runs only on full-frame WFC images obtained after SM4 (May 2009). Shortly following the publication of this report, science data corrected for both parallel and serial CTE will be available in the MAST archive.
Recently, the ACS team applied an Ubercal framework to assess the photometric repeatability of stars observed across the WFC detector using 15 years of post-SM4 calibration data in the globular cluster 47 Tuc (Ryan et al., 2024). A surprising finding was an apparent 0.05 mag global difference in sensitivity between the WFC1 and WFC2 chips, which had not been seen in prior tests of sensitivity variations around the field-of-view. Given the many degenerate variables within the Ubercal framework such as CTE losses, time-dependent sensitivity, and flat-field corrections, we obtained new calibration data to perform a straightforward test of the reported $\sim$5$\%$ flux offset between detectors. We observed three white dwarf standards with three filters at four positions on the detector (each on a different amplifier), but with the same number of x and y pixel transfers to mitigate differential CTE-related effects. For the F606W and F814W filters, the agreements are good to 0.4$\%$ on average, and always 1$\%$ or better in individual cases. The consistency of these two filters over all three stars and the four dither positions provides very strong evidence against the large global sensi
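The text moves between magnitude offsets (0.05 mag) and percentage flux offsets (about 5%, 0.4%, 1%). The conversion follows from the standard definition of the magnitude scale and can be sketched as follows; the function names are illustrative.

```python
import math

def mag_offset_to_flux_ratio(delta_mag):
    """Flux ratio corresponding to a magnitude difference (brighter / fainter)."""
    return 10.0 ** (0.4 * delta_mag)

def flux_ratio_to_mag_offset(flux_ratio):
    """Magnitude difference corresponding to a flux ratio (brighter / fainter)."""
    return 2.5 * math.log10(flux_ratio)

# A 0.05 mag offset is a flux ratio of about 1.047, i.e. roughly 5%.
ratio = mag_offset_to_flux_ratio(0.05)

# A 0.4% flux agreement is a magnitude difference of about 0.004 mag.
delta = flux_ratio_to_mag_offset(1.004)
```

For small offsets the two scales are nearly proportional, with 1% in flux corresponding to about 0.011 mag.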
We examined the long-term behavior of the superbias calibration frames for the Advanced Camera for Surveys Wide Field Channel (ACS/WFC) aboard the Hubble Space Telescope (HST). Superbias frames are used to remove detector-level bias structure from science images and are currently generated after an anneal and delivered monthly. The primary goal of this study was to determine whether the frequency of superbias generation could be reduced without compromising calibration quality, potentially aligning with the Wide Field Camera 3 UVIS (WFC3/UVIS) approach of generating only one superbias per year. We analyzed superbias frames produced from 2007 through 2024 to investigate whether these calibration products have changed significantly over time, and whether the frequency of superbias generation and delivery could be safely reduced without loss of calibration accuracy. In addition to visual inspections and pixel-level comparisons, we employed Principal Component Analysis (PCA) to evaluate whether any long-term, global structure exists beneath the apparent noise in these frames. Our findings show that the superbias structure has remained fairly stable post-Servicing Mission 4 (SM4), a 15-
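A minimal sketch of how PCA can expose slow, global structure in a time series of calibration frames is given below. It flattens each frame to a vector and uses a singular value decomposition; the function name and the return layout are assumptions, and the study's actual preprocessing is not reproduced here.

```python
import numpy as np

def pca_frames(frames, n_components=3):
    """PCA of a stack of frames with shape (n_frames, n_rows, n_cols).

    Returns
    -------
    components : (k, n_rows, n_cols) spatial patterns, strongest first
    scores     : (n_frames, k) amplitude of each pattern in each frame
    explained  : (k,) fraction of total variance carried by each pattern
    """
    stack = np.asarray(frames, dtype=float)
    n_frames = stack.shape[0]
    flat = stack.reshape(n_frames, -1)
    centered = flat - flat.mean(axis=0)  # remove the time-averaged frame

    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, len(s))
    explained = s**2 / np.sum(s**2)

    components = vt[:k].reshape((k,) + stack.shape[1:])
    scores = u[:, :k] * s[:k]
    return components, scores, explained[:k]
```

If the frames differ only by noise, the explained-variance fractions are all small and comparable; a leading component that carries most of the variance and whose scores drift with time would indicate a genuine long-term change in structure.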
Efficient chemical kinetic model inference and application in combustion are challenging due to large ODE systems and widely separated time scales. Machine learning techniques have been proposed to streamline these models, though strong nonlinearity and numerical stiffness combined with noisy data sources make their application challenging. Here, we introduce ChemKANs, a novel neural network framework with applications both in model inference and simulation acceleration for combustion chemistry. The ChemKAN structure augments the generic Kolmogorov-Arnold Network Ordinary Differential Equations (KAN-ODEs) with knowledge of the information flow through the relevant kinetic and thermodynamic laws. This chemistry-specific structure, combined with the expressivity and rapid neural scaling of the underlying KAN-ODE algorithm, instills in ChemKANs a strong inductive bias, streamlined training, and higher-accuracy predictions compared to standard benchmarks, while facilitating parameter sparsity through shared information across all inputs and outputs. In a model inference investigation, we benchmark the robustness of ChemKANs to sparse data containing up to 15% added noise, and superfl
Cyanopolyynes, a family of nitrogen-containing carbon chains, are common in the interstellar medium and possibly form the backbone of species relevant to prebiotic chemistry. Following their gas-phase formation, they are expected to freeze out on ice grains in cold interstellar regions. In this work we present the hydrogenation reaction network of isolated HC$_{3}$N, the smallest cyanopolyyne, which consists of over-a-barrier radical-neutral reactions and barrierless radical-radical reactions. We employ density functional theory, coupled cluster and multiconfigurational methods to obtain activation and reaction energies for the hydrogenation network of HC$_{3}$N. This work explores the reaction network of the isolated molecule and constitutes a preview of the reactions occurring on the ice grain surface. We find that the reactions in which the hydrogen atom adds to the carbon chain at the carbon atom opposite the cyano group give the lowest and narrowest barriers. Subsequent hydrogenation leads to the astrochemically relevant vinyl cyanide and ethyl cyanide. Alternatively, the cyano group can hydrogenate via radical-radical reactions, leading to the fully saturated propylamine. These results
Three-body recombination, or ternary association, is a termolecular reaction in which three particles collide, two of them forming a bound state while the third escapes freely. Three-body recombination reactions play a significant role in many systems relevant to physics and chemistry. In particular, they are relevant in cold and ultracold chemistry, quantum gases, astrochemistry, atmospheric physics, physical chemistry, and plasma physics. As a result, three-body recombination has been the subject of extensive work during the last 50 years, although primarily from an experimental perspective. Indeed, a general theory for three-body recombination remains elusive despite the available experimental information. To remedy this situation, our group recently developed a direct approach to three-body recombination based on classical trajectory calculations in hyperspherical coordinates, leading to a first-principles explanation of ion-atom-atom and atom-atom-atom three-body recombination processes. This review aims to summarize our findings on three-body recombination reactions and identify the remaining challenges in the field.
In the current era of new drug design, medicinal plants have become an attractive subject for further research. Pharmacological screening of the active compounds of medicinal plants is time-consuming and costly. Molecular docking is an in silico method that is more efficient than in vitro or in vivo methods at finding the active compounds in medicinal plants. Because molecular docking depends on three-dimensional structures, a database is needed that provides the three-dimensional structures of chemical compounds from medicinal plants in Indonesia. This study will therefore prepare such a database. The database will be built in MySQL and is designed to be hosted on the http://herbaldb.farmasi.ui.ac.id website so that users can eventually access it quickly and easily via the Internet.
CO is an important component in many N2/CH4 atmospheres including Titan, Triton, and Pluto, and has also been detected in the atmosphere of a number of exoplanets. Numerous experimental simulations have been carried out in the laboratory to understand the chemistry in N2/CH4 atmospheres, but very few simulations have included CO in the initial gas mixtures. The effect of CO on the chemistry occurring in these atmospheres is still poorly understood. We have investigated the effect of CO on both gas and solid phase chemistry in a series of planetary atmosphere simulation experiments using gas mixtures of CO, CH4, and N2 with a range of CO mixing ratios from 0.05% to 5% at low temperature (~100 K). We find that CO affects the gas phase chemistry, the density, and the composition of the solids. Specifically, with the increase of CO in the initial gases, there is less H2 but more H2O, HCN, C2H5N/HCNO and CO2 produced in the gas phase, while the density, oxygen content, and degree of unsaturation of the solids increase. The results indicate that CO has an important impact on the chemistry occurring in our experiments and accordingly in planetary atmospheres.