Summary: VTX is molecular visualization software capable of handling most molecular structure and dynamics trajectory file formats. It features a real-time high-performance molecular graphics engine, based on modern OpenGL, optimized for the visualization of massive molecular systems and molecular dynamics trajectories. VTX includes multiple interactive camera and user interaction features, notably free-fly navigation and a fully modular graphical user interface designed for increased usability. It allows the production of high-resolution images for presentations and posters with custom backgrounds. VTX design is focused on performance and usability for research, teaching and educational purposes. Availability and implementation: VTX is open source and free for non-commercial use. Builds for Windows and Ubuntu Linux are available at http://vtx.drugdesign.fr. The source code is available at https://github.com/VTX-Molecular-Visualization. Supplementary Information: A video displaying free-fly navigation in a whole-cell model is available
FIR and submm observations have established the fundamental role of dust-obscured star formation in the assembly of stellar mass over the past 12 billion years. At z between 2 and 4, the bulk of star formation is enshrouded in dust, and dusty star forming galaxies (DSFGs) contain about half of the total stellar mass density. Star formation develops in dense molecular clouds, and is regulated by a complex interplay between all the ISM components that contribute to the energy budget of a galaxy: gas, dust, cosmic rays, interstellar electromagnetic fields, gravitational field, and dark matter. Molecular gas is the actual link between star forming gas and its complex environment, providing by far the richest amount of information about the star formation process. However, interpreting molecular lines requires complex modeling of the astrochemical networks that regulate molecular formation and establish molecular abundances in a cloud, as well as modeling of the physical conditions of the gas in which molecular energy levels become populated. This paper critically reviews the main astrochemical parameters needed to get predictions about molecular signals in DSFGs. We review the current kno
Molecular communication (MC) provides a foundational framework for information transmission in the Internet of Bio-Nano Things (IoBNT), where efficiency and reliability are crucial. However, the inherent limitations of molecular channels, such as low transmission rates, noise, and intersymbol interference (ISI), limit their ability to support complex data transmission. This paper proposes an end-to-end semantic learning framework designed to optimize task-oriented molecular communication, with a focus on biomedical diagnostic tasks under resource-constrained conditions. The proposed framework employs a deep encoder-decoder architecture to efficiently extract, quantize, and decode semantic features, prioritizing task-relevant semantic information to enhance diagnostic classification performance. Additionally, a probabilistic channel network is introduced to approximate molecular propagation dynamics, enabling gradient-based optimization for end-to-end learning. Experimental results demonstrate that the proposed semantic framework improves diagnostic accuracy by at least 25% compared to conventional JPEG compression with LDPC coding methods under resource-constrained communication sce
AI-assisted molecular property prediction has become a promising technique in early-stage drug discovery and materials design in recent years. However, because wet-lab experiments are costly and complex, real-world molecules usually suffer from scarce annotations, leaving limited labeled data for effective supervised AI model learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous predi
Molecular systems involve interactions across multiple spatial scales, from local coordination and short-range perturbations to long-range electrostatic and solvent-mediated effects. However, most molecular representation learning methods rely on manually predefined scales, and the task-optimal modeling scale may not coincide with these fixed levels. This study introduces a loss-guided adaptive scale refinement framework for molecular force prediction, treating predefined scales as initial anchors and discovering task-effective resolutions through interpolation, routing, differentiable scale updates, and scale pool refinement. Using a NaCl aqueous ionic system as a minimal testbed, this study constructs short-scale and long-range force prediction branches and analyzes their complementarity. Oracle hard routing reduces the overall force MAE from 399.65 to 382.67, while continuous oracle interpolation further reduces it to 380.96. In close-contact regimes with nearest-ion distance below 0.6 nm, the close-contact MAE decreases from 327.22 to 260.51. A minimal scale pool update experiment shows that starting from endpoint anchors {0,1}, loss-guided updates automatically generate interm
The study of microorganisms, or microbiology, has developed significantly since its inception and is currently a key field of the biological sciences with a huge impact on modern society and scientific research. Over the centuries, this discipline has undergone significant changes, shaping our understanding of infectious diseases and food safety, starting from the simplest observations of microscopic organisms such as bacteria, viruses, fungi and protozoa, and ending with modern molecular and genomic research methods. This article briefly describes the historical path of microbiology development. The heuristic, morphological, physiological, immunological, and molecular genetic stages are the main periods into which the development of this science is traditionally divided, despite the lack of precise boundaries between them.
Microorganisms are ubiquitous in nature, and microbial activities are closely intertwined with the entire life cycle system and human life. Developing novel technologies for the detection, characterization and manipulation of microorganisms promotes their applications in clinical, environmental and industrial areas. Over the last two decades, terahertz (THz) technology has emerged as a new optical tool for microbiology. Its great potential originates from the unique advantages of THz waves, including their high sensitivity to water and inter-/intra-molecular motions, their non-invasive and label-free detection scheme, and their low photon energy. THz waves have been utilized as a stimulus to alter microbial functions, or as a sensing approach for quantitative measurement and qualitative differentiation. This review focuses on recent research progress in THz technology applied to microbiology, covering two major parts: THz biological effects and microbial detection applications. At the end of this paper, we summarize the research progress and discuss the challenges currently faced by THz technology in microbiology, along with potential solutions. We also
Advancements in artificial intelligence (AI) have transformed many scientific fields, with microbiology and microbiome research now experiencing significant breakthroughs through machine learning applications. This review provides a comprehensive overview of AI-driven approaches tailored for microbiology and microbiome studies, emphasizing both technical advancements and biological insights. We begin with an introduction to foundational AI techniques, including primary machine learning paradigms and various deep learning architectures, and offer guidance on choosing between traditional machine learning and sophisticated deep learning methods based on specific research goals. The primary section on application scenarios spans diverse research areas, from taxonomic profiling, functional annotation and prediction, microbe-X interactions, microbial ecology, metabolic modeling, precision nutrition, clinical microbiology, to prevention and therapeutics. Finally, we discuss challenges in this field and highlight some recent breakthroughs. Together, this review underscores AI's transformative role in microbiology and microbiome research, paving the way for innovative methodologies an
Mathematical models are increasingly a part of microbiological research. Here, we share our perspective on how modeling advances the discipline by: (i) enforcing logical consistency, (ii) enabling quantitative prediction, (iii) extracting hidden parameters from data, and (iv) generating intuitive understanding. We map a spectrum of modeling frameworks, from whole-cell simulations to minimal logistic growth equations, and provide interactive examples for some common frameworks. Building on this overview, we outline pragmatic criteria for choosing an appropriate level of description to capture phenomena of interest. Finally, we present a case study in modeling of microbial ecosystems from our own work to illustrate how mechanistic modeling can yield generalizable intuition. This perspective aims to be an introductory roadmap for integrating mathematical modeling into experimental microbiology.
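As an illustration of the simplest end of the modeling spectrum mentioned above, the logistic growth equation dN/dt = r N (1 - N/K) can be sketched in a few lines. This is a minimal sketch and not the authors' interactive example; the function names and the parameter values used in the usage note are assumptions chosen for illustration.

```python
import math


def logistic_growth(t, n0, r, k):
    """Closed-form solution of dN/dt = r * N * (1 - N / K) with N(0) = n0."""
    return k / (1.0 + (k - n0) / n0 * math.exp(-r * t))


def simulate_euler(n0, r, k, dt, steps):
    """Forward-Euler integration of the same equation (illustrative only)."""
    n = n0
    trajectory = [n]
    for _ in range(steps):
        n += dt * r * n * (1.0 - n / k)  # growth slows as N approaches K
        trajectory.append(n)
    return trajectory


if __name__ == "__main__":
    # Hypothetical values: 10 cells initially, rate 0.5 per hour, capacity 1000.
    for hour in (0, 5, 10, 20, 40):
        print(hour, round(logistic_growth(hour, 10.0, 0.5, 1000.0), 1))
```

The closed form and the numerical integration should agree closely for a small time step, which is a quick consistency check of the kind the abstract attributes to modeling.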
From the perspective of Optimal Experimental Design, this study addresses the use of isothermal experiments to precisely estimate the parameters of models used in predictive microbiology. Starting from a case study set out in the literature, and taking the Baranyi model as the primary model and the Ratkowsky square-root model as the secondary model, D- and c-optimal designs are provided for isothermal experiments, treating the temperature both as a value fixed by the experimenter and as a variable to be designed. The designs calculated show that those commonly used in practice are not efficient enough to estimate the parameters of the secondary model, leading to greater uncertainty in the predictions made via these models. Finally, an analysis is carried out to determine the effect on efficiency of a possible reduction in the final experimental time.
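The D-optimality idea can be illustrated on the secondary model alone. The sketch below assumes the basic Ratkowsky square-root form, sqrt(mu_max) = b (T - T_min), with homoscedastic errors on the square-root scale and equally weighted design points. It ignores the Baranyi primary model entirely, so it is a simplified illustration and not the design procedure of the study. All parameter values are hypothetical.

```python
def ratkowsky_sqrt_rate(temp, b, t_min):
    """Square-root model: returns sqrt(mu_max) = b * (T - T_min)."""
    return b * (temp - t_min)


def information_matrix(temps, b, t_min):
    """Entries (m11, m12, m22) of the 2x2 information matrix for (b, T_min).

    Each design temperature gets equal weight; the gradient of the model
    with respect to (b, T_min) is (T - T_min, -b).
    """
    weight = 1.0 / len(temps)
    m11 = m12 = m22 = 0.0
    for temp in temps:
        g_b = temp - t_min
        g_tmin = -b
        m11 += weight * g_b * g_b
        m12 += weight * g_b * g_tmin
        m22 += weight * g_tmin * g_tmin
    return m11, m12, m22


def d_criterion(temps, b, t_min):
    """Determinant of the information matrix; larger is better for D-optimality."""
    m11, m12, m22 = information_matrix(temps, b, t_min)
    return m11 * m22 - m12 * m12


if __name__ == "__main__":
    b, t_min = 0.03, 2.0  # hypothetical nominal parameter values
    for design in ([10.0, 30.0], [15.0, 25.0], [10.0, 20.0, 30.0]):
        print(design, d_criterion(design, b, t_min))
```

Under these assumptions the determinant reduces to b squared times the variance of the design temperatures, so spreading points toward the ends of the admissible range scores higher than clustering them in the middle.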
The Antibiotic Resistance Microbiology Dataset (ARMD) is a de-identified resource derived from electronic health records (EHR) that facilitates research in antimicrobial resistance (AMR). ARMD encompasses large-scale data from adult patients collected over more than 15 years at two academic-affiliated hospitals, focusing on microbiological cultures, antibiotic susceptibilities, and associated clinical and demographic features. Key attributes include organism identification, susceptibility patterns for 55 antibiotics, implied susceptibility rules, and de-identified patient information. This dataset supports studies on antimicrobial stewardship, causal inference, and clinical decision-making. ARMD is designed to be reusable and interoperable, promoting collaboration and innovation in combating AMR. This paper describes the dataset's acquisition, structure, and utility while detailing its de-identification process.
The SSPACE Astrobiology Payload (SAP) series, starting with the SAP-1 project, is designed to conduct in-situ microbiology experiments in low Earth orbit. This payload series aims to understand the behaviour of microbial organisms in space, particularly those critical for human health, and the corresponding effects of microgravity and solar/galactic radiation. SAP-1 focuses on studying Bacillus clausii and Bacillus coagulans, bacteria beneficial to humans. It aims to provide a space laboratory for astrobiology experiments under microgravity conditions. The hardware developed for these experiments is indigenous and tailored to meet the unique requirements of autonomous microbiology experiments by controlling pressure, temperature, and nutrient flow to bacteria. A rotating platform, which forms the core of the design, is used to regulate the flow and mixing of nutrients with dormant bacteria. The technology demonstration models developed at SSPACE have yielded promising results, with ongoing efforts to refine, adapt for space conditions, and prepare for integration with nanosatellites or space modules. The anticipated payload will be compact, approximately 1U in size (1
The function of the organism hinges on the performance of its information-processing networks, which convey information via molecular recognition. Many paths within these networks utilize molecular codebooks, such as the genetic code, to translate information written in one class of molecules into another molecular "language". The present paper examines the emergence and evolution of molecular codes in terms of rate-distortion theory and reviews recent results of this approach. We discuss how the biological problem of maximizing the fitness of an organism by optimizing its molecular coding machinery is equivalent to the communication engineering problem of designing an optimal information channel. The fitness of a molecular code takes into account the interplay between the quality of the channel and the cost of resources which the organism needs to invest in its construction and maintenance. We analyze the dynamics of a population of organisms that compete according to the fitness of their codes. The model suggests a generic mechanism for the emergence of molecular codes as a phase transition in an information channel. This mechanism is put into biological context and demonstrated
Existing molecular communication systems, both theoretical and experimental, are characterized by low information rates. In this paper, inspired by time-of-flight mass spectrometry (TOFMS), we consider the design of a molecular communication system in which the channel is a vacuum and demonstrate that this method has the potential to increase achievable information rates by many orders of magnitude. We use modelling results from TOFMS to obtain arrival time distributions for accelerated ions and use them to analyze several species of ions, including hydrogen, nitrogen, argon, and benzene. We show that the achievable information rates can be increased using a velocity (Wien) filter, which reduces uncertainty in the velocity of the ions. Using a simplified communication model, we show that data rates well above 1 Gbit/s/molecule are achievable.
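To make the time-of-flight idea concrete, the ideal flight time of an ion accelerated from rest through a potential difference V and then drifting over a length L follows from energy conservation: v = sqrt(2 q V / m) and t = L / v. The sketch below computes only this deterministic value. It does not model the arrival time distributions or the velocity filter analyzed in the paper, and the ion masses, voltage, and drift length are assumed illustrative values.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
ATOMIC_MASS_UNIT_KG = 1.66053906660e-27  # kilograms


def ion_velocity(mass_u, charge_e, voltage_v):
    """Speed after acceleration from rest through voltage_v (non-relativistic)."""
    mass_kg = mass_u * ATOMIC_MASS_UNIT_KG
    charge_c = charge_e * ELEMENTARY_CHARGE_C
    return math.sqrt(2.0 * charge_c * voltage_v / mass_kg)


def flight_time(mass_u, charge_e, voltage_v, length_m):
    """Ideal drift time over length_m in vacuum, ignoring any initial velocity spread."""
    return length_m / ion_velocity(mass_u, charge_e, voltage_v)


if __name__ == "__main__":
    # Assumed singly charged ions with approximate masses in atomic mass units.
    ions = {"H+": 1.008, "N2+": 28.014, "Ar+": 39.948, "C6H6+": 78.11}
    for name, mass_u in ions.items():
        t = flight_time(mass_u, 1, 1000.0, 1.0)  # 1 kV, 1 m drift (hypothetical)
        print(f"{name}: {t * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of the mass-to-charge ratio, lighter ions arrive sooner, which is one reason the choice of ion species matters for the achievable rate.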
The estimation of molecular abundances in interstellar clouds from spectroscopic observations requires radiative transfer calculations, which depend on basic molecular input data. This paper reviews recent developments in the fields of molecular data and radiative transfer. The first part is an overview of radiative transfer techniques, along with a "road map" showing which technique should be used in which situation. The second part is a review of measurements and calculations of molecular spectroscopic and collisional data, with a summary of recent collisional calculations and suggested modeling strategies if collision data are unavailable. The paper concludes with an overview of future developments and needs in the areas of radiative transfer and molecular data.
Molecular recognition, which is essential in processing information in biological systems, takes place in a crowded, noisy biochemical environment and requires the recognition of a specific target within a background of various similar competing molecules. We consider molecular recognition as a transmission of information via a noisy channel and use this analogy to gain insight into the optimal, or fittest, molecular recognizer. We focus on the optimal structural properties of the molecules such as flexibility and conformation. We show that conformational changes upon binding, which often occur during molecular recognition, may optimize the detection performance of the recognizer. We thus suggest a generic design principle termed 'conformational proofreading' in which deformation enhances detection. We evaluate the optimal flexibility of the molecular recognizer, which is analogous to the stochasticity in a decision unit. In some scenarios, a flexible recognizer, i.e., a stochastic decision unit, performs better than a rigid, deterministic one. As a biological example, we discuss conformational changes during homologous recombination, the process of genetic exchange between two DNA s
Molecular codes translate information written in one type of molecule into another molecular language. We introduce a simple model that treats molecular codes as noisy information channels. An optimal code is a channel that conveys information accurately and efficiently while keeping down the impact of errors. The equipoise of the three conflicting needs, for minimal error-load, minimal cost of resources and maximal diversity of vocabulary, defines the fitness of the code. The model suggests a mechanism for the emergence of a code when evolution varies the parameters that control this equipoise and the mapping between the two molecular languages becomes non-random. This mechanism is demonstrated by a simple toy model that is formally equivalent to a mean-field Ising magnet.
Microbiology culture reports contain critical information for important clinical and public health applications. However, microbiology reports often have complex, semi-structured, free-text data that present a barrier for secondary use. Here we present the development and validation of an open-source package designed to ingest free-text microbiology reports, determine whether the culture is positive, and return a list of SNOMED-CT mapped bacteria. Our rule-based natural language processing algorithm was developed using microbiology reports from two different electronic health record systems in a large healthcare organization, and then externally validated on the reports of two other institutions with manually extracted results as a benchmark. Our algorithm achieved F1 scores >0.95 on all classification tasks across both validation sets. Our concept extraction Python package, MicrobEx, is designed to be reused and adapted to individual institutions as an upstream process for other clinical applications, such as machine learning studies, clinical decision support, and disease surveillance systems.
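The general shape of a rule-based approach to this task can be sketched as follows. This is a hypothetical toy example and does not reproduce MicrobEx or its API: the organism lexicon, the negation rule, and the function name `parse_report` are all assumptions, and no SNOMED-CT mapping is attempted.

```python
import re

# Hypothetical, deliberately tiny lexicon; a real system would need a curated vocabulary.
ORGANISMS = ["Escherichia coli", "Staphylococcus aureus", "Klebsiella pneumoniae"]

# A mention is treated as negated if one of these cues immediately precedes it.
NEGATION_BEFORE = re.compile(r"\b(?:no|not|negative for)\s+$")


def parse_report(text):
    """Return (is_positive, organisms) for a free-text culture report."""
    lowered = text.lower()
    found = []
    for name in ORGANISMS:
        for match in re.finditer(re.escape(name.lower()), lowered):
            preceding = lowered[max(0, match.start() - 20):match.start()]
            if NEGATION_BEFORE.search(preceding):
                continue  # e.g. "no Staphylococcus aureus isolated"
            found.append(name)
            break  # one non-negated mention is enough for this organism
    return (len(found) > 0, found)


if __name__ == "__main__":
    print(parse_report("Urine culture: >100,000 CFU/mL Escherichia coli"))
    print(parse_report("No growth at 48 hours."))
```

Real reports contain abbreviations, misspellings, and institution-specific templates, which is why the abstract stresses external validation and adaptation to individual institutions.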
G-Protein Coupled Receptors (GPCRs) are a large family of eukaryotic cell transmembrane proteins, responsible for numerous biological processes. From a practical viewpoint, around 34% of the drugs approved by the US Food and Drug Administration target these receptors. They can be analyzed through their simulated molecular dynamics, including the prediction of their behavior in the presence of drugs. In this paper, the capability of Long Short-Term Memory Networks (LSTMs) to learn and predict the molecular dynamics trajectories of a receptor is evaluated. Several models were trained with the 3D positions of the amino acids of the receptor, considering different representations of the position of each amino acid: its center of mass, its geometric center and the position of its α-carbon. The prediction error of the position was evaluated by the mean absolute error (MAE) and root-mean-square deviation (RMSD). The LSTM models show a robust performance, with results comparable to the state-of-the-art in non-dynamic 3D predictions. The best MAE and RMSD values were found for the mass center of the amino acids with 0.078 Å and 0.156 Å respectively. This wor
Molecular Communication (MC) is a communication strategy that uses molecules as carriers of information, and is widely used by biological cells. As an interdisciplinary topic, it has been studied by biologists, communication theorists and a growing number of information theorists. This paper aims to specifically bring MC to the attention of information theorists. To do this, we first highlight the unique mathematical challenges of studying the capacity of molecular channels. Addressing these problems requires the use of known mathematical tools or the development of new ones. Toward this goal, we review a subjective selection of the existing literature on information-theoretic aspects of molecular communication. The emphasis here is on the mathematical techniques used, rather than on the setup or modeling of a specific paper. Finally, as an example, we propose a concrete information-theoretic problem that was motivated by our study of molecular communication.