Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs propose can often be challenging to make, and are almost never compatible with automated synthesis approaches. To better enable the discovery of functional small molecules, LLMs need to learn a new molecular language that is more effective in predicting properties and inherently synced with automated synthesis technology. Current molecule LLMs are limited by representing molecules based on atoms. In this paper, we argue that just like tokenizing texts into meaning-bearing (sub-)word tokens instead of characters, molecules should be tokenized at the level of functional building blocks, i.e., parts of molecules that bring unique functions and serve as effective building blocks for real-world automated laboratory synthesis. This motivates us to propose mCLM, a modular Chemical-Language Model that comprises a bilingual language model that understands both natural language descriptions of functions and molecular blocks. mCLM front-loads synthesizability co
Language-molecule models have emerged as an exciting direction for molecular discovery and understanding. However, training these models is challenging due to the scarcity of molecule-language pair datasets. At this point, datasets have been released which are 1) small and scraped from existing databases, 2) large but noisy and constructed by performing entity linking on the scientific literature, and 3) built by converting property prediction datasets to natural language using templates. In this document, we detail the $\textit{L+M-24}$ dataset, which has been created for the Language + Molecules Workshop shared task at ACL 2024. In particular, $\textit{L+M-24}$ is designed to focus on three key benefits of natural language in molecule design: compositionality, functionality, and abstraction.
This work takes inspiration from chemistry, where the spectral characteristics of molecules are determined by the hybridization of electronic states evolving from the individual atomic orbitals. Based on the analogy between quantum mechanics and classical electrodynamics, we sorted dielectric microspheres with almost identical positions of their whispering gallery mode (WGM) resonances. Using these microspheres as classical photonic atoms, we assembled them into a wide range of structures, including linear chains and planar photonic molecules. We studied WGM hybridization effects in such structures using side coupling by tapered microfibers as well as finite-difference time-domain modeling. We demonstrated that the patterns of WGM spectral splitting are representative of the symmetry, number of constituent atoms, and topology of the photonic molecules, and can in principle be viewed as "spectral signatures" of various molecules. We also show new ways of controlling WGM coupling constants in such molecules. Excellent agreement was found between measured transmission spectra and the spectral signatures of photonic molecules predicted by simulation.
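The link between topology and splitting pattern can be illustrated with a toy nearest-neighbour coupled-mode (tight-binding) model. This is only a sketch under stated assumptions, not the side-coupling measurement or the finite-difference time-domain modeling described in the abstract: the resonators are taken to be identical and lossless, `kappa` is a hypothetical coupling constant, its sign convention is arbitrary, and the function names are illustrative.

```python
import math


def chain_shifts(n, kappa):
    """Supermode frequency shifts (relative to the uncoupled resonance)
    for n identical resonators in an open linear chain.

    Assumes nearest-neighbour coupling only; kappa is a hypothetical
    coupling constant in the same units as the returned shifts.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return sorted(
        2.0 * kappa * math.cos(m * math.pi / (n + 1)) for m in range(1, n + 1)
    )


def ring_shifts(n, kappa):
    """Supermode frequency shifts for n identical resonators in a closed ring.

    Assumes nearest-neighbour coupling only and n >= 3.
    """
    if n < 3:
        raise ValueError("a ring needs at least 3 resonators")
    return sorted(2.0 * kappa * math.cos(2.0 * math.pi * m / n) for m in range(n))


def count_distinct(shifts, tol=1e-9):
    """Count spectrally distinct lines, merging degenerate modes within tol."""
    distinct = 0
    previous = None
    for value in sorted(shifts):
        if previous is None or abs(value - previous) > tol:
            distinct += 1
        previous = value
    return distinct


if __name__ == "__main__":
    kappa = 1.0  # illustrative value, arbitrary units
    for label, shifts in (
        ("chain of 3", chain_shifts(3, kappa)),
        ("ring of 3", ring_shifts(3, kappa)),
    ):
        print(label, [round(s, 6) for s in shifts], "distinct:", count_distinct(shifts))
```

In this toy model, three resonators in a chain give three distinct lines, while the same three resonators in a ring give only two because one pair of modes is degenerate. This is the sense in which a splitting pattern can act as a signature of the arrangement.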
We survey results on the creation of heteronuclear Fermi molecules by tuning a degenerate Bose-Fermi mixture into the neighborhood of an association resonance, either photoassociation or Feshbach, as well as the subsequent prospects for Cooper-like pairing between atoms and molecules. In the simplest case of only one molecular state, corresponding to either a Feshbach resonance or one-color photoassociation, the system displays Rabi oscillations and rapid adiabatic passage between a Bose-Fermi mixture of atoms and fermionic molecules. For two-color photoassociation, the system admits stimulated Raman adiabatic passage (STIRAP) from a Bose-Fermi mixture of atoms to stable Fermi molecules, even in the presence of particle-particle interactions. By tailoring the STIRAP sequence it is possible to deliberately convert only a fraction of the initial atoms, leaving a finite fraction of bosons behind to induce atom-molecule Cooper pairing via density fluctuations; unfortunately, this enhancement is insufficient to achieve a superfluid transition with present ultracold technology. We therefore propose the use of an association resonance that converts atoms and diatomic molecules (dimers) in
The collisions between linear polar molecules trapped in a circularly polarized microwave field are analyzed theoretically. The microwave trap suggested by DeMille \cite{DeMille} appears advantageous in comparison with other traps. We demonstrate that the microwave trap can provide successful evaporative cooling of polar molecules over a rather broad range of frequencies of the AC field. We suggest that not only ground-state polar molecules but also molecules in some other states can be safely trapped; however, the state in which molecules can be safely loaded and trapped depends on the frequency of the AC field.
The goal of the present article is to review the major developments that have led to the current understanding of molecule-field interactions and experimental methods for manipulating molecules with electromagnetic fields. Molecule-field interactions are at the core of several, seemingly distinct, areas of molecular physics. This is reflected in the organization of this article, which includes sections on Field control of molecular beams, External field traps for cold molecules, Control of molecular orientation and molecular alignment, Manipulation of molecules by non-conservative forces, Ultracold molecules and ultracold chemistry, Controlled many-body phenomena, Entanglement of molecules and dipole arrays, and Stability of molecular systems in high-frequency super-intense laser fields. The article contains 853 references.
There are views prevalent in the noncovalent chemistry literature that i) the O atom in molecules cannot form a chalcogen bond, and ii) if formed, this bond is very weak. We show in this study that these views are not necessarily true, since the attractive energy between the oxygen atom of some molecules and several electron-rich anionic bases, examined in a series of 34 ion-molecule complexes, varied from weak (ca. -2.30 kcal/mol) to ultrastrong (-90.10 kcal/mol). The [MP2/aug-cc-pVTZ] binding energies for several of these complexes were found to be comparable to or significantly larger than that of the well-known hydrogen-bonded complex [FH...F]- (roughly -40 kcal/mol). The nature of the intermolecular interactions was examined using the quantum theory of atoms in molecules, second-order natural bond orbital, and symmetry-adapted perturbation theory energy decomposition analyses. It was found that many of these interactions have mixed bonding character (ionic and covalent), especially in the moderately to strongly bound complexes. All of these can be explained by donor-acceptor charge-transfer delocalization from a bonding to an anti-bonding orbital. This study,
The spectrum of hadronic molecules composed of heavy-antiheavy charmed hadrons was obtained in our previous work. The potentials are constants at leading order and are estimated from resonance saturation. The experimental candidates for hadronic molecules, such as $X(3872)$, $Y(4260)$, the three $P_c$ states, and $P_{cs}(4459)$, fit the spectrum well. The success in describing the pattern of heavy-antiheavy hadronic molecules motivates us to give more predictions for the heavy-heavy cases, which are less discussed in the literature than the heavy-antiheavy ones. Given that the heavy-antiheavy hadronic molecules, several of which have strong experimental evidence, emerge from the dominant constant interaction from resonance saturation, we find that the existence of many heavy-heavy hadronic molecules is natural. Among these predicted heavy-heavy states we highlight the $DD^*$ molecule and the $D^{(*)}\Sigma_c^{(*)}$ molecules, which are the partners of the famous $X(3872)$ and $P_c$ states. Quite recently, the LHCb Collaboration reported a doubly charmed tetraquark state, $T_{cc}$, which is in line with our results for the $DD^*$ molecule. With the first experimental signal of this new kind o
Positrons bind to molecules leading to vibrational excitation and spectacularly enhanced annihilation. Whilst positron binding energies have been measured via resonant annihilation spectra for $\sim$90 molecules in the past two decades, an accurate \emph{ab initio} theoretical description has remained elusive. Of the molecules studied experimentally, calculations exist for only 6, and for these, standard quantum chemistry approaches have proved severely deficient, agreeing with experiment to at best 25% accuracy for polar molecules, and failing to predict binding in nonpolar molecules. The mechanisms of binding are not understood. Here, we develop a many-body theory of positron-molecule interactions and uncover the role of strong many-body correlations including polarization of the electron cloud, screening of the positron-electron Coulomb interaction by molecular electrons, and crucially, the unique non-perturbative process of virtual-positronium formation (where a molecular electron temporarily tunnels to the positron): they dramatically enhance binding in polar molecules and enable binding in nonpolars. We also elucidate the role of individual molecular orbitals, highlighting th
We study the effects of quantum statistics on the counting statistics of ultracold heteronuclear molecules formed by Feshbach-assisted photoassociation [Phys. Rev. Lett. {\bf 93}, 140405 (2004)]. Exploiting the formal similarities with sum frequency generation and using quantum optics methods we consider the cases where the molecules are formed from atoms out of two Bose-Einstein condensates, out of a Bose-Einstein condensate and a gas of degenerate fermions, and out of two degenerate Fermi gases with and without superfluidity. Bosons are treated in a single mode approximation and fermions in a degenerate model. In these approximations we can numerically solve the master equations describing the system's dynamics and thus we find the full counting statistics of the molecular modes. The full quantum dynamics calculations are complemented by mean field calculations and short time perturbative expansions. While the molecule production rates are very similar in all three cases at this level of approximation, differences show up in the counting statistics of the molecular fields. The intermediate field of closed-channel molecules is for short times second-order coherent if the molecules
Many important tasks in chemistry revolve around molecules during reactions. This requires predictions far from the equilibrium, while most recent work in machine learning for molecules has been focused on equilibrium or near-equilibrium states. In this paper we aim to extend this scope in three ways. First, we propose the DimeNet++ model, which is 8x faster and 10% more accurate than the original DimeNet on the QM9 benchmark of equilibrium molecules. Second, we validate DimeNet++ on highly reactive molecules by developing the challenging COLL dataset, which contains distorted configurations of small molecules during collisions. Finally, we investigate ensembling and mean-variance estimation for uncertainty quantification with the goal of accelerating the exploration of the vast space of non-equilibrium structures. Our DimeNet++ implementation as well as the COLL dataset are available online.
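The two uncertainty-quantification ingredients named above, ensembling and mean-variance estimation, can be sketched in a few lines. This is a generic illustration and not the DimeNet++ implementation: it assumes each ensemble member predicts a mean and a variance for a scalar target, combines the members as an equally weighted Gaussian mixture, and uses the hypothetical helper names `gaussian_nll` and `combine_ensemble`.

```python
import math


def gaussian_nll(y, mu, var):
    """Negative log-likelihood of target y under N(mu, var).

    This is the per-sample loss commonly used to train a model that
    outputs both a mean and a variance (mean-variance estimation).
    """
    if var <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def combine_ensemble(mus, variances):
    """Combine per-member (mean, variance) predictions for one input.

    Treats the ensemble as an equally weighted mixture of Gaussians and
    returns the mixture mean and variance. The variance includes both the
    average predicted noise and the disagreement between members.
    """
    n = len(mus)
    if n == 0 or n != len(variances):
        raise ValueError("need equally many means and variances, at least one")
    mean = sum(mus) / n
    second_moment = sum(v + m * m for m, v in zip(mus, variances)) / n
    return mean, second_moment - mean * mean


if __name__ == "__main__":
    # Illustrative numbers only: two members that disagree on the mean.
    mean, var = combine_ensemble([1.0, 3.0], [0.5, 0.5])
    print("ensemble mean:", mean, "ensemble variance:", var)
    print("NLL of y=2.0:", gaussian_nll(2.0, mean, var))
```

A large combined variance flags inputs where the members disagree or each member is itself uncertain, which is the kind of signal that could be used to prioritize non-equilibrium structures for further exploration.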
In this chapter, we give an introduction to experiments with Feshbach molecules and their applications. In particular, we discuss the various creation and detection methods and the internal-state manipulation of such molecules. We highlight two topics, namely Feshbach molecules in the halo regime and the application of Feshbach molecules to achieve ultracold gases of molecules in the rovibrational ground state. Our illustrative examples are mainly based on work performed at Innsbruck University.
With the rapid increase of compound databases available in medicinal and materials science, there is a growing need for learning representations of molecules in a semi-supervised manner. In this paper, we propose an unsupervised hierarchical feature extraction algorithm for molecules (or, more generally, graph-structured objects with a fixed number of node and edge types), which is applicable to both unsupervised and semi-supervised tasks. Our method extends the recently proposed Paragraph Vector algorithm and incorporates neural message passing to obtain hierarchical representations of subgraphs. We applied our method to an unsupervised task and demonstrated that it outperforms existing methods on several benchmark datasets. We also showed experimentally that semi-supervised tasks enhanced predictive performance compared with supervised ones using labeled molecules only.
We demonstrate efficient transfer of ultracold molecules into a deeply bound rovibrational level of the singlet ground state potential in the presence of an optical lattice. The overall molecule creation efficiency is 25%, and the transfer efficiency to the rovibrational level |v=73,J=2> is above 80%. We find that the molecules in |v=73,J=2> are trapped in the optical lattice, limited by optical excitation by the lattice light. The molecule trapping time for a lattice depth of 15 atomic recoil energies is about 20 ms. We determine the trapping frequency by the lattice phase and amplitude modulation technique. It will now be possible to transfer the molecules to the rovibrational ground state |v=0,J=0> in the presence of the optical lattice.
Current methods for investigating receptor-ligand interactions in drug discovery are based on the three-dimensional complementarity of receptor and ligand surfaces, and they include pharmacophore modelling, QSAR, molecular docking, etc. These methods consider only short-range molecular interactions (distances <5 Å) and do not include long-range interactions (distances >5 Å), which are essential for the kinetics of biochemical reactions because they influence the number of productive collisions between interacting molecules. It was previously shown that the electron-ion interaction potential (EIIP) represents the physical property that determines the long-range properties of biological molecules. This molecular descriptor served as a basis for the development of the informational spectrum method (ISM), a virtual spectroscopy method for investigating protein-protein interactions. In this paper, we propose a new approach that treats small molecules as linear entities, allowing the study of small molecule-protein interactions by ISM. We analyzed 21 sets of KEGG drug-protein interactions and showed that this new approach allows an efficient discrimination between biologically active and ina
This article presents a review of the current state of the art in the research field of cold and ultracold molecules. It serves as an introduction to the Special Issue of the New Journal of Physics on Cold and Ultracold Molecules and describes new prospects for fundamental research and technological development. Cold and ultracold molecules may revolutionize physical chemistry and few body physics, provide techniques for probing new states of quantum matter, allow for precision measurements of both fundamental and applied interest, and enable quantum simulations of condensed-matter phenomena. Ultracold molecules offer promising applications such as new platforms for quantum computing, precise control of molecular dynamics, nanolithography, and Bose-enhanced chemistry. The discussion is based on recent experimental and theoretical work and concludes with a summary of anticipated future directions and open questions in this rapidly expanding research field.
Curiosity has detected a surprising variety of organic molecules on Mars, including compounds tied to the chemistry of life. Some of these molecules may be billions of years old, preserved in ancient clay-rich rocks that once held water. One standout find resembles building blocks of DNA, raising exciting questions about Mars’ past.
We theoretically investigate the magnetic properties and nonequilibrium dynamics of two interacting ultracold polar and paramagnetic molecules in a one-dimensional harmonic trap in external electric and magnetic fields. The molecules interact via a multichannel two-body contact potential, incorporating the short-range anisotropy of intermolecular interactions. We show that various magnetization states arise from the interplay of the molecular interactions, electronic spins, dipole moments, rotational structures, external fields, and spin-rotation coupling. The rich magnetization diagrams depend primarily on the anisotropy of the intermolecular interaction and the spin-rotation coupling. These specific molecular properties are challenging to calculate or measure. Therefore, we propose quench-dynamics experiments for extracting them by observing the time evolution of the analyzed system. Our results indicate the possibility of controlling the molecular few-body magnetization with the external electric field and pave the way towards studying the magnetization of ultracold molecules trapped in optical tweezers or optical lattices and their application in quantum simulation of mol
We present an astonishingly simple and elegant proof of the celebrated Basel problem.
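For reference, the identity usually meant by the Basel problem is the closed-form value of the sum of reciprocal squares. The abstract does not reproduce the proof, so only the standard statement is given here, with the numerical value rounded to four decimal places.

```latex
\[
  \sum_{n=1}^{\infty} \frac{1}{n^{2}} \;=\; \frac{\pi^{2}}{6} \;=\; \zeta(2) \;\approx\; 1.6449
\]
```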
In this review chapter we focus on the many-body dynamics of cold polar molecules in the strongly interacting regime. In particular, we discuss a toolbox for engineering many-body Hamiltonians based on the manipulation of the electric dipole moments of the molecules, and thus of molecular interactions, using external static and microwave fields. This forms the basis for the realization of novel quantum phases in these systems.