Molecular systems involve interactions across multiple spatial scales, from local coordination and short-range perturbations to long-range electrostatic and solvent-mediated effects. However, most molecular representation learning methods rely on manually predefined scales, and the task-optimal modeling scale may not coincide with these fixed levels. This study introduces a loss-guided adaptive scale refinement framework for molecular force prediction, treating predefined scales as initial anchors and discovering task-effective resolutions through interpolation, routing, differentiable scale updates, and scale pool refinement. Using a NaCl aqueous ionic system as a minimal testbed, this study constructs short-scale and long-range force prediction branches and analyzes their complementarity. Oracle hard routing reduces the overall force MAE from 399.65 to 382.67, while continuous oracle interpolation further reduces it to 380.96. In close-contact regimes with nearest-ion distance below 0.6 nm, the close-contact MAE decreases from 327.22 to 260.51. A minimal scale pool update experiment shows that starting from endpoint anchors {0,1}, loss-guided updates automatically generate interm
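The difference between oracle hard routing and continuous oracle interpolation can be illustrated with a minimal sketch. It assumes scalar per-sample predictions from a short-scale branch and a long-range branch, and it uses made-up numbers; the function names (`oracle_alpha`, `mae`) and the toy data are hypothetical and are not taken from the study, whose exact routing and interpolation definitions are not given here.

```python
def oracle_alpha(p_short, p_long, y):
    """Best interpolation weight in [0, 1] for one sample (0 = short, 1 = long)."""
    if p_long == p_short:
        return 0.0
    a = (y - p_short) / (p_long - p_short)
    return min(1.0, max(0.0, a))  # clip to the segment between the two anchors

def mae(preds, targets):
    """Mean absolute error over samples."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)

# Toy data (assumed, for illustration only)
p_short = [1.0, 4.0, 2.0]
p_long = [3.0, 5.0, 0.0]
y = [2.0, 6.0, 2.0]

# Oracle hard routing: per sample, pick whichever branch has the smaller error.
hard = [s if abs(s - t) <= abs(l - t) else l for s, l, t in zip(p_short, p_long, y)]

# Continuous oracle interpolation: per sample, pick the best mixture of the branches.
cont = []
for s, l, t in zip(p_short, p_long, y):
    a = oracle_alpha(s, l, t)
    cont.append((1.0 - a) * s + a * l)

short_mae, long_mae = mae(p_short, y), mae(p_long, y)
hard_mae, cont_mae = mae(hard, y), mae(cont, y)
```

Because the endpoints 0 and 1 are themselves admissible interpolation weights, the continuous oracle can never do worse than the hard oracle on the same predictions, which matches the ordering of the MAE values reported above.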
FIR and submm observations have established the fundamental role of dust-obscured star formation in the assembly of stellar mass over the past 12 billion years. At z between 2 and 4, the bulk of star formation is enshrouded in dust, and dusty star-forming galaxies (DSFGs) contain about half of the total stellar mass density. Star formation develops in dense molecular clouds, and is regulated by a complex interplay between all the ISM components that contribute to the energy budget of a galaxy: gas, dust, cosmic rays, interstellar electromagnetic fields, gravitational field, dark matter. Molecular gas is the actual link between star-forming gas and its complex environment, providing by far the richest amount of information about the star formation process. However, the interpretation of molecular lines requires complex modeling of astrochemical networks, which regulate molecular formation and establish molecular abundances in a cloud, as well as modeling of the physical conditions of the gas in which molecular energy levels become populated. This paper critically reviews the main astrochemical parameters needed to get predictions about molecular signals in DSFGs. We review the current kno
Summary: VTX is a molecular visualization software capable of handling most molecular structure and dynamics trajectory file formats. It features a real-time high-performance molecular graphics engine, based on modern OpenGL, optimized for the visualization of massive molecular systems and molecular dynamics trajectories. VTX includes multiple interactive camera and user interaction features, notably free-fly navigation and a fully modular graphical user interface designed for increased usability. It allows the production of high-resolution images for presentations and posters with custom backgrounds. The design of VTX is focused on performance and usability for research, teaching and educational purposes. Availability and implementation: VTX is open source and free for non-commercial use. Builds for Windows and Ubuntu Linux are available at http://vtx.drugdesign.fr. The source code is available at https://github.com/VTX-Molecular-Visualization. Supplementary Information: A video displaying free-fly navigation in a whole-cell model is available
AI-assisted molecular property prediction has become a promising technique in early-stage drug discovery and materials design in recent years. However, because wet-lab experiments are costly and complex, real-world molecules usually suffer from scarce annotations, leaving limited labeled data for effective supervised learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous predi
Molecular communication (MC) provides a foundational framework for information transmission in the Internet of Bio-Nano Things (IoBNT), where efficiency and reliability are crucial. However, the inherent limitations of molecular channels, such as low transmission rates, noise, and intersymbol interference (ISI), limit their ability to support complex data transmission. This paper proposes an end-to-end semantic learning framework designed to optimize task-oriented molecular communication, with a focus on biomedical diagnostic tasks under resource-constrained conditions. The proposed framework employs a deep encoder-decoder architecture to efficiently extract, quantize, and decode semantic features, prioritizing task-relevant semantic information to enhance diagnostic classification performance. Additionally, a probabilistic channel network is introduced to approximate molecular propagation dynamics, enabling gradient-based optimization for end-to-end learning. Experimental results demonstrate that the proposed semantic framework improves diagnostic accuracy by at least 25% compared to conventional JPEG compression with LDPC coding methods under resource-constrained communication sce
Generative models for molecules have shown considerable promise for use in computational chemistry, but remain difficult for non-experts to use. For this reason, we introduce open-source infrastructure for easily building generative molecular models into the widely used DeepChem [Ramsundar et al., 2019] library with the aim of creating a robust and reusable molecular generation pipeline. In particular, we add high-quality PyTorch [Paszke et al., 2019] implementations of the Molecular Generative Adversarial Networks (MolGAN) [Cao and Kipf, 2022] and Normalizing Flows [Papamakarios et al., 2021]. Our implementations show strong performance comparable with past work [Kuznetsov and Polykovskiy, 2021, Cao and Kipf, 2022].
Information molecules play a crucial role in molecular communication (MC), acting as carriers for information transfer. A common approach to obtaining information molecules in MC involves harvesting them from the environment; however, the harvested molecules are often a mixture of various environmental molecules, and the initial concentration ratios in the reservoirs are identical, which hampers high-fidelity transmission techniques such as molecular shift keying (MoSK). This paper presents a transmitter design that harvests molecules from the surrounding environment and stores them in two reservoirs. To separate the mixed molecules, energy is consumed to transfer them between reservoirs. Given limited energy resources, this work explores energy-efficient strategies to optimize transmitter performance. Through theoretical analysis and simulations, we investigate different methods for moving molecules between reservoirs. The results demonstrate that transferring higher initial concentration molecules enhances transmitter performance, while using fewer molecules per transfer further improves efficiency. These findings provide valuable insights for optimizing MC systems through energy-effic
The function of the organism hinges on the performance of its information-processing networks, which convey information via molecular recognition. Many paths within these networks utilize molecular codebooks, such as the genetic code, to translate information written in one class of molecules into another molecular "language". The present paper examines the emergence and evolution of molecular codes in terms of rate-distortion theory and reviews recent results of this approach. We discuss how the biological problem of maximizing the fitness of an organism by optimizing its molecular coding machinery is equivalent to the communication engineering problem of designing an optimal information channel. The fitness of a molecular code takes into account the interplay between the quality of the channel and the cost of resources which the organism needs to invest in its construction and maintenance. We analyze the dynamics of a population of organisms that compete according to the fitness of their codes. The model suggests a generic mechanism for the emergence of molecular codes as a phase transition in an information channel. This mechanism is put into biological context and demonstrated
Graph Neural Networks (GNNs) have revolutionized molecular discovery, helping to uncover patterns and identify unknown features that can aid in predicting biophysical properties and protein-ligand interactions. However, current models typically rely on 2-dimensional molecular representations as input, and while the use of 2- and 3-dimensional structural data has gained deserved traction in recent years, many of these models are still limited to static graph representations. We propose a novel approach based on the transformer model utilizing GNNs for characterizing dynamic features of protein-ligand interactions. Our message passing transformer pre-trains on a set of molecular dynamics data from physics-based simulations to learn coordinate construction and make binding probability and affinity predictions as a downstream task. Through extensive testing, we compare our results with existing models; our MDA-PLI model outperformed the molecular interaction prediction models with an RMSE of 1.2958. The geometric encodings enabled by our transformer architecture and the addition of time series data add a new dimensionality to this form of research.
Existing molecular communication systems, both theoretical and experimental, are characterized by low information rates. In this paper, inspired by time-of-flight mass spectrometry (TOFMS), we consider the design of a molecular communication system in which the channel is a vacuum and demonstrate that this method has the potential to increase achievable information rates by many orders of magnitude. We use modelling results from TOFMS to obtain arrival time distributions for accelerated ions and use them to analyze several species of ions, including hydrogen, nitrogen, argon, and benzene. We show that the achievable information rates can be increased using a velocity (Wien) filter, which reduces uncertainty in the velocity of the ions. Using a simplified communication model, we show that data rates well above 1 Gbit/s/molecule are achievable.
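As a rough illustration of why ion mass matters in a time-of-flight channel, the sketch below computes the nominal flight time of an ion accelerated from rest through a potential difference and then drifting over a field-free length, using the standard relation qV = (1/2)mv^2. The acceleration voltage (1000 V), drift length (1 m), single charge state, and the choice of atomic versus molecular species are assumptions made for this example and are not taken from the paper; the masses are approximate. The sketch gives only the mean arrival time and does not model the arrival time distribution or the effect of a velocity filter.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C (exact SI value)
AMU = 1.66053906660e-27      # atomic mass constant in kg (CODATA 2018)

def flight_time(mass_u, voltage=1000.0, length=1.0, charge_state=1):
    """Nominal time of flight (s) for an ion accelerated from rest through
    `voltage` volts and drifting over `length` metres in vacuum."""
    speed = math.sqrt(2.0 * charge_state * E_CHARGE * voltage / (mass_u * AMU))
    return length / speed

# Approximate masses in atomic mass units (assumed singly charged species)
ION_MASSES_U = {"H+": 1.008, "N+": 14.007, "Ar+": 39.948, "C6H6+": 78.11}

times = {ion: flight_time(m) for ion, m in ION_MASSES_U.items()}
for ion, t in times.items():
    print(f"{ion:6s} {t * 1e6:8.3f} microseconds")
```

Under these assumptions the flight time scales with the square root of the mass, so lighter ions arrive sooner and allow shorter symbol intervals for a fixed drift length.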
The estimation of molecular abundances in interstellar clouds from spectroscopic observations requires radiative transfer calculations, which depend on basic molecular input data. This paper reviews recent developments in the fields of molecular data and radiative transfer. The first part is an overview of radiative transfer techniques, along with a "road map" showing which technique should be used in which situation. The second part is a review of measurements and calculations of molecular spectroscopic and collisional data, with a summary of recent collisional calculations and suggested modeling strategies if collision data are unavailable. The paper concludes with an overview of future developments and needs in the areas of radiative transfer and molecular data.
This contribution exploits the duality between a viral infection process and macroscopic air-based molecular communication. Airborne aerosol and droplet transmission through human respiratory processes is modeled as an instance of a multiuser molecular communication scenario employing respiratory-event-driven molecular variable-concentration shift keying. Modeling is aided by experiments that are motivated by a macroscopic air-based molecular communication testbed. In artificially induced coughs, a saturated aqueous solution containing a fluorescent dye mixed with saliva is released by an adult test person. The emitted particles are made visible by means of optical detection exploiting the fluorescent dye. The number of particles recorded is significantly higher in test series without mouth and nose protection than in those with a well-fitting medical mask. A simulation tool for macroscopic molecular communication processes is extended and used for estimating the transmission of infectious aerosols in different environments. To this end, parameters obtained through self-experiments are used. The work is inspired by the recent outbreak of the coronavirus pandemic.
Molecular recognition, which is essential in processing information in biological systems, takes place in a crowded noisy biochemical environment and requires the recognition of a specific target within a background of various similar competing molecules. We consider molecular recognition as a transmission of information via a noisy channel and use this analogy to gain insights on the optimal, or fittest, molecular recognizer. We focus on the optimal structural properties of the molecules such as flexibility and conformation. We show that conformational changes upon binding, which often occur during molecular recognition, may optimize the detection performance of the recognizer. We thus suggest a generic design principle termed 'conformational proofreading' in which deformation enhances detection. We evaluate the optimal flexibility of the molecular recognizer, which is analogous to the stochasticity in a decision unit. In some scenarios, a flexible recognizer, i.e., a stochastic decision unit, performs better than a rigid, deterministic one. As a biological example, we discuss conformational changes during homologous recombination, the process of genetic exchange between two DNA s
G-Protein Coupled Receptors (GPCRs) are a large family of eukaryotic cell transmembrane proteins, responsible for numerous biological processes. From a practical viewpoint, around 34\% of the drugs approved by the US Food and Drug Administration target these receptors. They can be analyzed from their simulated molecular dynamics, including the prediction of their behavior in the presence of drugs. In this paper, the capability of Long Short-Term Memory Networks (LSTMs) to learn and predict the molecular dynamics trajectories of a receptor is evaluated. Several models were trained with the 3D position of the amino acids of the receptor considering different transformations on the position of the amino acid, such as their centers of mass, the geometric centers and the position of the $α$-carbon for each amino acid. The error of the prediction of the position was evaluated by the mean absolute error (MAE) and root-mean-square deviation (RMSD). The LSTM models show a robust performance, with results comparable to the state-of-the-art in non-dynamic 3D predictions. The best MAE and RMSD values were found for the mass center of the amino acids with 0.078 Å and 0.156 Å respectively. This wor
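The two error measures named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code of the paper: MAE is taken here as the mean absolute error over all individual coordinates, RMSD as the root-mean-square of per-residue Euclidean deviations without any superposition step, and the coordinates are invented toy values in Å. The paper's exact averaging conventions are not specified in the abstract.

```python
import math

def mae(pred, true):
    """Mean absolute error over all x, y, z coordinates of all residues."""
    diffs = [abs(p - t) for a, b in zip(pred, true) for p, t in zip(a, b)]
    return sum(diffs) / len(diffs)

def rmsd(pred, true):
    """Root-mean-square deviation over residues (no structural alignment)."""
    sq = [sum((p - t) ** 2 for p, t in zip(a, b)) for a, b in zip(pred, true)]
    return math.sqrt(sum(sq) / len(sq))

# Toy reference and predicted positions for two residues (assumed values, in Å)
true_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred_xyz = [(0.3, 0.0, 0.4), (1.0, 0.0, 0.0)]

mae_value = mae(pred_xyz, true_xyz)
rmsd_value = rmsd(pred_xyz, true_xyz)
```

With these definitions the RMSD weights large deviations more heavily than the MAE, which is consistent with the RMSD being the larger of the two values reported above.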
The CDMS was founded in 1998 to provide in its catalog section line lists of molecular species which may be observed in various astronomical sources using radio astronomy. The line lists contain transition frequencies with qualified accuracies, intensities, quantum numbers, as well as further auxiliary information. They have been generated from critically evaluated experimental line lists, mostly from laboratory experiments, employing established Hamiltonian models. Separate entries exist for different isotopic species and usually also for different vibrational states. As of December 2015, the number of entries is 792. They are available online as ASCII tables with additional files documenting information on the entries. The Virtual Atomic and Molecular Data Centre was founded more than 5 years ago as a common platform for atomic and molecular data. This platform facilitates exchange not only between spectroscopic databases related to astrophysics or astrochemistry, but also with collisional and kinetic databases. A dedicated infrastructure was developed to provide a common data format in the various databases enabling queries to a large variety of databases on atomic and molecular dat
Centaurus A, the nearest AGN, shows molecular absorption in the millimeter and radio regime. By observing the absorption with VLBI, we try to constrain the distribution of the gas, in particular whether it resides in the circumnuclear region. Analysis of VLBA observations in four OH and two H2CO transitions is presented here, as well as molecular excitation models parameterized with distance from the AGN. We conclude that the gas is most likely associated with the tilted molecular ring structure observed before in molecular emission and IR continuum. The formaldehyde absorption shows small-scale structure, which requires a different distribution than that of the hydroxyl.
Molecular codes translate information written in one type of molecule into another molecular language. We introduce a simple model that treats molecular codes as noisy information channels. An optimal code is a channel that conveys information accurately and efficiently while keeping down the impact of errors. The equipoise of the three conflicting needs, for minimal error-load, minimal cost of resources and maximal diversity of vocabulary, defines the fitness of the code. The model suggests a mechanism for the emergence of a code when evolution varies the parameters that control this equipoise and the mapping between the two molecular languages becomes non-random. This mechanism is demonstrated by a simple toy model that is formally equivalent to a mean-field Ising magnet.
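The kind of transition exhibited by a mean-field Ising magnet can be sketched by iterating its textbook self-consistency equation m = tanh(m / t), where t is the temperature in units of the critical temperature. This is only the generic magnet; the correspondence between the magnetization and the non-randomness of the code mapping, and between the temperature and the parameters controlling the equipoise, is an interpretive assumption here and is not spelled out in the abstract. The function name and the starting value `m0` are hypothetical.

```python
import math

def mean_field_magnetization(t, m0=0.5, iters=5000):
    """Fixed-point iteration of m <- tanh(m / t) for reduced temperature t."""
    m = m0
    for _ in range(iters):
        m = math.tanh(m / t)
    return m

# Above the critical temperature the order parameter decays to zero (random mapping);
# below it a non-zero solution appears (ordered, non-random state).
m_disordered = mean_field_magnetization(2.0)
m_ordered = mean_field_magnetization(0.5)
```

The abrupt appearance of a non-zero order parameter as the control parameter crosses its critical value is the sense in which such a model describes emergence as a phase transition.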
The reversal of the magnetization of crystals of molecular magnets that have a large spin and high anisotropy barrier generally proceeds below the blocking temperature by quantum tunneling. This is manifested as a series of controlled steps in the hysteresis loops at resonant values of the magnetic field where energy levels on opposite sides of the barrier cross. An abrupt reversal of the magnetic moment of the entire crystal can occur instead by a process commonly referred to as a magnetic avalanche, where the molecular spins reverse along a deflagration front that travels through the sample at subsonic speed. In this chapter, we review experimental results obtained to date for magnetic deflagration in molecular nanomagnets.
Methods for experimental reconstruction of molecular frame (MF) photoionization dynamics, and related properties - specifically MF photoelectron angular distributions (PADs) and continuum density matrices - are outlined and discussed. General concepts are introduced for the non-expert reader, and experimental and theoretical techniques are further outlined in some depth. Particular focus is placed on a detailed example of numerical reconstruction techniques for matrix-element retrieval from time-domain experimental measurements making use of rotational wavepackets (i.e. aligned frame measurements) - the ``bootstrapping to the MF'' methodology - and a matrix-inversion technique for direct MF-PAD recovery. Ongoing resources for interested researchers are also introduced, including sample data, reconstruction codes (the \textit{Photoelectron Metrology Toolkit}, written in Python, and associated \textit{Quantum Metrology with Photoelectrons} platform/ecosystem), and literature via online repositories; it is hoped these resources will be of ongoing use to the community.
In order to function reliably, synthetic molecular circuits require mechanisms that allow them to adapt to environmental disturbances. Least mean squares (LMS) schemes, such as those commonly encountered in signal processing and control, provide a powerful means to accomplish that goal. In this paper we show how the traditional LMS algorithm can be implemented at the molecular level using only a few elementary biomolecular reactions. We demonstrate our approach using several simulation studies and discuss its relevance to synthetic biology.
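For orientation, the traditional LMS update referred to above can be sketched in a few lines. This is the textbook discrete-time algorithm as used in signal processing, not the biomolecular reaction network proposed in the paper; the function name `lms_identify`, the step size `mu`, the two-tap system, and the noise-free test signal are all assumptions made for this example.

```python
import random

def lms_identify(inputs, desired, n_taps=2, mu=0.05):
    """Textbook LMS: adapt weights w so that the filter output w . x tracks `desired`."""
    w = [0.0] * n_taps
    for k in range(n_taps - 1, len(inputs)):
        x = [inputs[k - i] for i in range(n_taps)]       # most recent sample first
        y = sum(wi * xi for wi, xi in zip(w, x))         # current filter output
        e = desired[k] - y                               # instantaneous error
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]   # stochastic-gradient step
    return w

# Identify an assumed two-tap system from noise-free input/output data.
random.seed(0)
true_w = [0.7, -0.3]
u = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [0.0] + [true_w[0] * u[k] + true_w[1] * u[k - 1] for k in range(1, len(u))]

w_hat = lms_identify(u, d)
```

Each update uses only the current input, the current error, and a multiplication by a fixed gain, which is the kind of simple, local operation that makes the scheme a plausible candidate for realization with elementary reactions.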