Summary: VTX is molecular visualization software capable of handling most molecular structure and dynamics trajectory file formats. It features a real-time high-performance molecular graphics engine, based on modern OpenGL, optimized for the visualization of massive molecular systems and molecular dynamics trajectories. VTX includes multiple interactive camera and user interaction features, notably free-fly navigation and a fully modular graphical user interface designed for increased usability. It allows the production of high-resolution images for presentations and posters with custom backgrounds. VTX's design is focused on performance and usability for research, teaching and educational purposes. Availability and implementation: VTX is open source and free for non-commercial use. Builds for Windows and Ubuntu Linux are available at http://vtx.drugdesign.fr. The source code is available at https://github.com/VTX-Molecular-Visualization. Supplementary Information: A video displaying free-fly navigation in a whole-cell model is available
FIR and submm observations have established the fundamental role of dust-obscured star formation in the assembly of stellar mass over the past 12 billion years. At z between 2 and 4, the bulk of star formation is enshrouded in dust, and dusty star-forming galaxies (DSFGs) contain about half of the total stellar mass density. Star formation develops in dense molecular clouds, and is regulated by a complex interplay between all the ISM components that contribute to the energy budget of a galaxy: gas, dust, cosmic rays, interstellar electromagnetic fields, gravitational field, dark matter. Molecular gas is the actual link between star-forming gas and its complex environment, providing by far the richest amount of information about the star formation process. However, the interpretation of molecular lines requires complex modeling of astrochemical networks, which regulate molecular formation and establish molecular abundances in a cloud, and modeling of the physical conditions of the gas in which molecular energy levels become populated. This paper critically reviews the main astrochemical parameters needed to get predictions about molecular signals in DSFGs. We review the current kno
Molecular systems involve interactions across multiple spatial scales, from local coordination and short-range perturbations to long-range electrostatic and solvent-mediated effects. However, most molecular representation learning methods rely on manually predefined scales, and the task-optimal modeling scale may not coincide with these fixed levels. This study introduces a loss-guided adaptive scale refinement framework for molecular force prediction, treating predefined scales as initial anchors and discovering task-effective resolutions through interpolation, routing, differentiable scale updates, and scale pool refinement. Using a NaCl aqueous ionic system as a minimal testbed, this study constructs short-scale and long-range force prediction branches and analyzes their complementarity. Oracle hard routing reduces the overall force MAE from 399.65 to 382.67, while continuous oracle interpolation further reduces it to 380.96. In close-contact regimes with nearest-ion distance below 0.6 nm, the close-contact MAE decreases from 327.22 to 260.51. A minimal scale pool update experiment shows that starting from endpoint anchors {0,1}, loss-guided updates automatically generate interm
Molecular communication (MC) provides a foundational framework for information transmission in the Internet of Bio-Nano Things (IoBNT), where efficiency and reliability are crucial. However, the inherent limitations of molecular channels, such as low transmission rates, noise, and intersymbol interference (ISI), limit their ability to support complex data transmission. This paper proposes an end-to-end semantic learning framework designed to optimize task-oriented molecular communication, with a focus on biomedical diagnostic tasks under resource-constrained conditions. The proposed framework employs a deep encoder-decoder architecture to efficiently extract, quantize, and decode semantic features, prioritizing task-relevant semantic information to enhance diagnostic classification performance. Additionally, a probabilistic channel network is introduced to approximate molecular propagation dynamics, enabling gradient-based optimization for end-to-end learning. Experimental results demonstrate that the proposed semantic framework improves diagnostic accuracy by at least 25% compared to conventional JPEG compression with LDPC coding methods under resource-constrained communication sce
Microorganisms are ubiquitous in nature, and microbial activities are closely intertwined with the entire life cycle system and human life. Developing novel technologies for the detection, characterization and manipulation of microorganisms promotes their applications in clinical, environmental and industrial areas. Over the last two decades, terahertz (THz) technology has emerged as a new optical tool for microbiology. Its great potential originates from the unique advantages of THz waves, including their high sensitivity to water and inter-/intra-molecular motions, their non-invasive and label-free detection scheme, and their low photon energy. THz waves have been utilized as a stimulus to alter microbial functions, or as a sensing approach for quantitative measurement and qualitative differentiation. This review focuses on recent research progress in THz technology applied to microbiology, covering two major parts: THz biological effects and microbial detection applications. At the end of this paper, we summarize the research progress and discuss the challenges currently faced by THz technology in microbiology, along with potential solutions. We also
AI-assisted molecular property prediction has become a promising technique in early-stage drug discovery and materials design in recent years. However, due to high-cost and complex wet-lab experiments, real-world molecules usually suffer from scarce annotations, leading to limited labeled data for effective supervised AI model learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous predi
The study of microorganisms, or microbiology, has developed significantly since its inception and is currently a key field of the biological sciences with a huge impact on modern society and scientific research. Over the centuries, this discipline has undergone significant changes, shaping our understanding of infectious diseases and food safety, starting from the simplest observations of microscopic organisms such as bacteria, viruses, fungi and protozoa, and ending with modern molecular and genomic research methods. This article describes a brief historical path of microbiology development. The heuristic, morphological, physiological, immunological, and molecular genetic stages are the main periods into which the development of this science is traditionally divided, despite the lack of full-fledged and precise boundaries between them.
Mathematical models are increasingly a part of microbiological research. Here, we share our perspective on how modeling advances the discipline by: (i) enforcing logical consistency, (ii) enabling quantitative prediction, (iii) extracting hidden parameters from data, and (iv) generating intuitive understanding. We map a spectrum of modeling frameworks, from whole-cell simulations to minimal logistic growth equations, and provide interactive examples for some common frameworks. Building on this overview, we outline pragmatic criteria for choosing an appropriate level of description to capture phenomena of interest. Finally, we present a case study in modeling of microbial ecosystems from our own work to illustrate how mechanistic modeling can yield generalizable intuition. This perspective aims to be an introductory roadmap for integrating mathematical modeling into experimental microbiology.
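As a minimal illustration of the simplest end of that spectrum, the logistic growth equation dN/dt = r N (1 - N/K) can be sketched in a few lines of Python. The parameter names (n0, r, k) and the numerical values used in the usage note are illustrative assumptions and are not taken from the paper.

```python
import math


def logistic_growth(t, n0, r, k):
    """Closed-form solution of dN/dt = r*N*(1 - N/K) with N(0) = n0."""
    return k / (1.0 + (k - n0) / n0 * math.exp(-r * t))


def simulate_euler(n0, r, k, t_end, dt=0.001):
    """Forward-Euler integration of the same equation, for comparison."""
    n = n0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        n += dt * r * n * (1.0 - n / k)
    return n
```

For example, with a hypothetical inoculum of n0 = 1e3 cells, growth rate r = 0.8 per hour and carrying capacity k = 1e9, both functions describe exponential growth at early times and saturation at k at late times.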
The Antibiotic Resistance Microbiology Dataset (ARMD) is a de-identified resource derived from electronic health records (EHR) that facilitates research in antimicrobial resistance (AMR). ARMD encompasses big data from adult patients collected over more than 15 years at two academic-affiliated hospitals, focusing on microbiological cultures, antibiotic susceptibilities, and associated clinical and demographic features. Key attributes include organism identification, susceptibility patterns for 55 antibiotics, implied susceptibility rules, and de-identified patient information. This dataset supports studies on antimicrobial stewardship, causal inference, and clinical decision-making. ARMD is designed to be reusable and interoperable, promoting collaboration and innovation in combating AMR. This paper describes the dataset's acquisition, structure, and utility while detailing its de-identification process.
Advancements in artificial intelligence (AI) have transformed many scientific fields, with microbiology and microbiome research now experiencing significant breakthroughs through machine learning applications. This review provides a comprehensive overview of AI-driven approaches tailored for microbiology and microbiome studies, emphasizing both technical advancements and biological insights. We begin with an introduction to foundational AI techniques, including primary machine learning paradigms and various deep learning architectures, and offer guidance on choosing between traditional machine learning and sophisticated deep learning methods based on specific research goals. The primary section on application scenarios spans diverse research areas, from taxonomic profiling, functional annotation & prediction, microbe-X interactions, microbial ecology, metabolic modeling, precision nutrition, clinical microbiology, to prevention & therapeutics. Finally, we discuss challenges in this field and highlight some recent breakthroughs. Together, this review underscores AI's transformative role in microbiology and microbiome research, paving the way for innovative methodologies an
This study addresses, from the Optimal Experimental Design perspective, the use of isothermal experiments to precisely estimate the parameters of models used in predictive microbiology. Starting from a case study set out in the literature, and taking the Baranyi model as the primary model and the Ratkowsky square-root model as the secondary model, D- and c-optimal designs are provided for isothermal experiments, taking the temperature both as a value fixed by the experimenter and as a variable to be designed. The designs calculated show that those commonly used in practice are not efficient enough to estimate the parameters of the secondary model, leading to greater uncertainty in the predictions made via these models. Finally, an analysis is carried out to determine the effect on the efficiency of a possible reduction in the final experimental time.
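To make the D-optimality idea concrete, the following minimal sketch evaluates a local D-criterion for the Ratkowsky square-root model alone, sqrt(mu_max) = b (T - T_min). It is a deliberately simplified assumption and not the design computation of the study: it ignores the Baranyi primary model, assumes constant-variance noise on sqrt(mu_max), gives every temperature equal weight, and uses hypothetical nominal values for b and T_min.

```python
def ratkowsky_mu_max(temp, b, t_min):
    """Maximum specific growth rate from sqrt(mu_max) = b * (T - T_min)."""
    if temp <= t_min:
        return 0.0  # no growth at or below the notional minimum temperature
    return (b * (temp - t_min)) ** 2


def d_criterion(temps, b, t_min):
    """Determinant of the 2x2 information matrix for (b, T_min).

    Assumes the observed response is eta = b * (T - T_min) with
    constant-variance noise and equal weight on each design temperature.
    """
    n = len(temps)
    m11 = m12 = m22 = 0.0
    for temp in temps:
        g_b = temp - t_min  # d eta / d b
        g_t = -b            # d eta / d T_min
        m11 += g_b * g_b / n
        m12 += g_b * g_t / n
        m22 += g_t * g_t / n
    return m11 * m22 - m12 * m12
```

Under these assumptions the determinant reduces to b squared times the variance of the design temperatures, so a design such as [10, 30] scores higher than one clustered around 20. A larger determinant corresponds to a smaller joint confidence region for the two parameters.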
The SSPACE Astrobiology Payload (SAP) series, starting with the SAP-1 project, is designed to conduct in-situ microbiology experiments in low Earth orbit. This payload series aims to understand the behaviour of microbial organisms in space, particularly those critical for human health, and the corresponding effects due to microgravity and solar/galactic radiation. SAP-1 focuses on studying Bacillus clausii and Bacillus coagulans, bacteria beneficial to humans. It aims to provide a space laboratory for astrobiology experiments under microgravity conditions. The hardware developed for these experiments is indigenous and tailored to meet the unique requirements of autonomous microbiology experiments by controlling pressure, temperature, and nutrition flow to bacteria. A rotating platform, which forms the core design, is innovatively utilised to regulate the flow and mixing of nutrients with dormant bacteria. The technology demonstration models developed at SSPACE have yielded promising results, with ongoing efforts to refine, adapt for space conditions, and prepare for integration with nanosatellites or space modules. The anticipated payload will be compact, approximately 1U in size (1
The function of the organism hinges on the performance of its information-processing networks, which convey information via molecular recognition. Many paths within these networks utilize molecular codebooks, such as the genetic code, to translate information written in one class of molecules into another molecular "language". The present paper examines the emergence and evolution of molecular codes in terms of rate-distortion theory and reviews recent results of this approach. We discuss how the biological problem of maximizing the fitness of an organism by optimizing its molecular coding machinery is equivalent to the communication engineering problem of designing an optimal information channel. The fitness of a molecular code takes into account the interplay between the quality of the channel and the cost of resources which the organism needs to invest in its construction and maintenance. We analyze the dynamics of a population of organisms that compete according to the fitness of their codes. The model suggests a generic mechanism for the emergence of molecular codes as a phase transition in an information channel. This mechanism is put into biological context and demonstrated
This contribution exploits the duality between a viral infection process and macroscopic air-based molecular communication. Airborne aerosol and droplet transmission through human respiratory processes is modeled as an instance of a multiuser molecular communication scenario employing respiratory-event-driven molecular variable-concentration shift keying. Modeling is aided by experiments that are motivated by a macroscopic air-based molecular communication testbed. In artificially induced coughs, a saturated aqueous solution containing a fluorescent dye mixed with saliva is released by an adult test person. The emitted particles are made visible by means of optical detection exploiting the fluorescent dye. The number of particles recorded is significantly higher in test series without mouth and nose protection than in those with a well-fitting medical mask. A simulation tool for macroscopic molecular communication processes is extended and used for estimating the transmission of infectious aerosols in different environments. To this end, parameters obtained through self-experiments are used. The work is inspired by the recent outbreak of the coronavirus pandemic.
Existing molecular communication systems, both theoretical and experimental, are characterized by low information rates. In this paper, inspired by time-of-flight mass spectrometry (TOFMS), we consider the design of a molecular communication system in which the channel is a vacuum and demonstrate that this method has the potential to increase achievable information rates by many orders of magnitude. We use modelling results from TOFMS to obtain arrival time distributions for accelerated ions and use them to analyze several species of ions, including hydrogen, nitrogen, argon, and benzene. We show that the achievable information rates can be increased using a velocity (Wien) filter, which reduces uncertainty in the velocity of the ions. Using a simplified communication model, we show that data rates well above 1 Gbit/s/molecule are achievable.
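The dependence of arrival time on ion species can be illustrated with the idealized time-of-flight relation t = L sqrt(m / (2 q V)), which follows from equating the kinetic energy of an ion with the energy q V gained in the accelerating potential. The sketch below is an assumption-laden illustration and not the model of the paper: it treats ions as singly charged, starting from rest, with no velocity spread. The drift length, voltage and approximate ion masses are hypothetical inputs.

```python
import math

AMU = 1.66053906660e-27     # atomic mass unit in kg
E_CHARGE = 1.602176634e-19  # elementary charge in C

# Approximate masses in atomic mass units; singly charged ions are assumed.
ION_MASSES_AMU = {"H+": 1.008, "N+": 14.007, "Ar+": 39.948, "C6H6+": 78.11}


def flight_time(mass_amu, charge_state, accel_voltage, drift_length):
    """Ideal flight time (s) of an ion accelerated from rest through
    accel_voltage (V) and then drifting over drift_length (m) in vacuum."""
    mass = mass_amu * AMU
    velocity = math.sqrt(2.0 * charge_state * E_CHARGE * accel_voltage / mass)
    return drift_length / velocity


if __name__ == "__main__":
    for name, mass_amu in ION_MASSES_AMU.items():
        t = flight_time(mass_amu, 1, 1000.0, 1.0)
        print(f"{name}: {t * 1e6:.2f} microseconds")
```

In this idealized picture the flight time scales with the square root of the mass-to-charge ratio, so lighter ions arrive sooner over the same drift length.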
Molecular recognition, which is essential in processing information in biological systems, takes place in a crowded noisy biochemical environment and requires the recognition of a specific target within a background of various similar competing molecules. We consider molecular recognition as a transmission of information via a noisy channel and use this analogy to gain insights on the optimal, or fittest, molecular recognizer. We focus on the optimal structural properties of the molecules such as flexibility and conformation. We show that conformational changes upon binding, which often occur during molecular recognition, may optimize the detection performance of the recognizer. We thus suggest a generic design principle termed 'conformational proofreading' in which deformation enhances detection. We evaluate the optimal flexibility of the molecular recognizer, which is analogous to the stochasticity in a decision unit. In some scenarios, a flexible recognizer, i.e., a stochastic decision unit, performs better than a rigid, deterministic one. As a biological example, we discuss conformational changes during homologous recombination, the process of genetic exchange between two DNA s
The estimation of molecular abundances in interstellar clouds from spectroscopic observations requires radiative transfer calculations, which depend on basic molecular input data. This paper reviews recent developments in the fields of molecular data and radiative transfer. The first part is an overview of radiative transfer techniques, along with a "road map" showing which technique should be used in which situation. The second part is a review of measurements and calculations of molecular spectroscopic and collisional data, with a summary of recent collisional calculations and suggested modeling strategies if collision data are unavailable. The paper concludes with an overview of future developments and needs in the areas of radiative transfer and molecular data.
Molecular codes translate information written in one type of molecule into another molecular language. We introduce a simple model that treats molecular codes as noisy information channels. An optimal code is a channel that conveys information accurately and efficiently while keeping down the impact of errors. The equipoise of the three conflicting needs, for minimal error-load, minimal cost of resources and maximal diversity of vocabulary, defines the fitness of the code. The model suggests a mechanism for the emergence of a code when evolution varies the parameters that control this equipoise and the mapping between the two molecular languages becomes non-random. This mechanism is demonstrated by a simple toy model that is formally equivalent to a mean-field Ising magnet.
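For readers unfamiliar with the analogy, the kind of transition found in a mean-field Ising magnet can be sketched with the textbook self-consistency equation m = tanh(m / T), where T is the temperature in units of the critical temperature. This is the generic magnet only and an assumption on our part: the mapping from code parameters to m and T is not reproduced here, and the function name and defaults are hypothetical.

```python
import math


def mean_field_magnetization(temperature, m0=0.5, iters=10000):
    """Solve m = tanh(m / T) by fixed-point iteration.

    temperature is measured in units of the critical temperature, so the
    ordered (non-zero m) solution exists only for temperature < 1.
    """
    m = m0
    for _ in range(iters):
        m = math.tanh(m / temperature)
    return m
```

Above the critical point the iteration collapses to m = 0, the analogue of a random mapping, while below it a non-zero order parameter appears, the analogue of a non-random code.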
The CDMS was founded in 1998 to provide in its catalog section line lists of molecular species which may be observed in various astronomical sources using radio astronomy. The line lists contain transition frequencies with qualified accuracies, intensities, quantum numbers, as well as further auxiliary information. They have been generated from critically evaluated experimental line lists, mostly from laboratory experiments, employing established Hamiltonian models. Separate entries exist for different isotopic species and usually also for different vibrational states. As of December 2015, the number of entries is 792. They are available online as ASCII tables with additional files documenting information on the entries. The Virtual Atomic and Molecular Data Centre was founded more than 5 years ago as a common platform for atomic and molecular data. This platform facilitates exchange not only between spectroscopic databases related to astrophysics or astrochemistry, but also with collisional and kinetic databases. A dedicated infrastructure was developed to provide a common data format in the various databases enabling queries to a large variety of databases on atomic and molecular dat
Microbiology culture reports contain critical information for important clinical and public health applications. However, microbiology reports often have complex, semi-structured, free-text data that present a barrier for secondary use. Here we present the development and validation of an open-source package designed to ingest free-text microbiology reports, determine whether the culture is positive, and return a list of SNOMED-CT mapped bacteria. Our rule-based natural language processing algorithm was developed using microbiology reports from two different electronic health record systems in a large healthcare organization, and then externally validated on the reports of two other institutions with manually extracted results as a benchmark. Our algorithm achieved F1 scores >0.95 on all classification tasks across both validation sets. Our concept extraction Python package, MicrobEx, is designed to be reused and adapted to individual institutions as an upstream process for other clinical applications, such as machine learning studies, clinical decision support, and disease surveillance systems.