Infinitely many distinct trait values may arise in populations bearing quantitative traits, and modeling their population dynamics is thus a formidable task. While classical models assume fixed or infinite population size, models in which the total population size fluctuates due to demographic noise in births and deaths can behave qualitatively differently from constant or infinite population models due to density-dependent dynamics. In this paper, I present a stochastic field theory for the eco-evolutionary dynamics of finite populations bearing one-dimensional quantitative traits. I derive stochastic field equations that describe the evolution of population densities, trait frequencies, and the mean value of any trait in the population. These equations recover well-known results such as the replicator-mutator equation, Price equation, and gradient dynamics in the infinite population limit. For finite populations, the equations describe the intricate interplay between natural selection, noise-induced selection, eco-evolutionary feedback, and neutral genetic drift in determining evolutionary trajectories. My work uses ideas from statistical physics, calculus of variations, and SPDE
Single-species population models and discrete stochastic gene frequency models are two standard tools of mathematical biology for studying the evolution of populations. An agent-based model (ABM) is presented which reproduces these models and then explores where they agree and disagree under relaxed specifications. For the population models, the requirement of homogeneous mixing prevents prediction of extinctions due to local resource depletion. These models also suggest equilibrium based on attainment of constant population levels, though underlying population characteristics may be nowhere close to equilibrium. The discrete stochastic gene frequency models assume well-mixed populations at constant levels. The models' predictions for non-constant populations in strongly oscillating and chaotic regimes are surprisingly good, only diverging from the ABM at the most chaotic levels.
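As a rough illustration of the coupling described above, the sketch below drives a discrete stochastic gene frequency model with a non-constant population size. It is not the ABM of the abstract. The choice of the discrete logistic map for the population model, Wright-Fisher binomial resampling for the gene frequency model, and all names and parameters (`logistic_step`, `wright_fisher_step`, `simulate`, `capacity`) are assumptions made for illustration.

```python
import random


def logistic_step(x, r):
    """Discrete logistic map on a scaled density x in [0, 1] (assumed population model)."""
    return r * x * (1.0 - x)


def wright_fisher_step(p, n, rng):
    """Resample n gene copies from allele frequency p (assumed gene frequency model)."""
    copies = sum(1 for _ in range(n) if rng.random() < p)
    return copies / n


def simulate(r, capacity, x0, p0, generations, rng):
    """Track (population size, allele frequency) when size follows the logistic map.

    capacity is a hypothetical scale factor turning the density x into a head count.
    """
    x, p = x0, p0
    trajectory = []
    for _ in range(generations):
        x = logistic_step(x, r)
        n = max(1, round(capacity * x))  # keep at least one gene copy
        p = wright_fisher_step(p, n, rng)
        trajectory.append((n, p))
    return trajectory


if __name__ == "__main__":
    # r = 3.9 puts the logistic map in its chaotic regime.
    for n, p in simulate(3.9, 200, 0.3, 0.5, 10, random.Random(0)):
        print(n, round(p, 3))
```

Varying `r` between the stable, oscillating, and chaotic regimes of the logistic map gives a simple way to see how strongly fluctuating population sizes affect drift in the allele frequency.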
Advances in biology have mostly relied on theories that were subsequently revised, expanded or eventually refuted using experimental and other means. Theoretical biology used to primarily provide a basis to rationally examine the frameworks within which biological experiments were carried out and to shed light on overlooked gaps in understanding. Today, however, theoretical biology has generally become synonymous with computational and mathematical biology. This could in part be explained by a relatively recent tendency in which a "data first", rather than a "theory first", approach is preferred. Moreover, generating hypotheses has at times become procedural rather than theoretical. This situation leaves our understanding enmeshed in data, which should be disentangled from much noise. Given the many unresolved questions in biology and medicine, it seems apt to revive the role of pure theory in the biological sciences. This paper makes the case for a "philosophical biology" (philbiology), distinct from but quite complementary to philosophy of biology (philobiology), which would entail biological investigation through philosophical approaches. Philbiology would thus be a reincarnatio
This technical monograph provides a comprehensive overview of the field of quantum biology. It approaches quantum biology from a physical perspective with core quantum mechanical concepts presented foremost to provide a theoretical foundation for the field. An extensive body of research is covered to clarify the significance of quantum biology as a scientific field, outlining the field's long-standing importance in the historical development of quantum theory. This lays the essential groundwork to enable further advances in nanomedicine and biotechnology. Written for academics, biological science researchers, physicists, biochemists, medical technologists, and students of quantum mechanics, this text brings clarity to fundamental advances being made in the emerging science of quantum biology.
A population is considered stationary if its growth rate is zero and its age structure is constant. It follows that a population is non-stationary if its growth rate is non-zero or its age structure is non-constant (or both). We propose three properties related to the stationary population identity (SPI) of population biology, connecting it with stationary populations and with non-stationary populations that are approaching stationarity. One of these properties is that the SPI can be applied to partition a population into stationary and non-stationary components. These properties provide deeper insights into cohort formation in real-world populations and into the length of time for which stationary and non-stationary conditions hold. The new concepts are based on the time gap between the occurrence of stationary and non-stationary populations within the SPI framework, which we refer to as the Oscillatory SPI and the Amplitude of SPI. This article will appear in Bulletin of Mathematical Biology (Springer)
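The definition of stationarity above can be checked numerically in a simple setting. The sketch below assumes a discrete-time, age-structured projection with a Leslie matrix. That modelling choice, the example matrix, and the helper names `project` and `is_stationary` are illustrative assumptions and are not taken from the article or from the SPI itself.

```python
def project(leslie, n):
    """One time step of an age-structured projection: n' = L n."""
    return [sum(l_ij * n_j for l_ij, n_j in zip(row, n)) for row in leslie]


def is_stationary(leslie, n, tol=1e-9):
    """True if one step leaves every age class unchanged.

    Unchanged age classes imply both zero growth and a constant age structure.
    """
    nxt = project(leslie, n)
    return all(abs(a - b) <= tol for a, b in zip(nxt, n))


# Hypothetical Leslie matrix: first row holds fertilities, sub-diagonal holds survival.
# The net reproductive rate is 0 + 0.5*1 + 0.5*0.5*2 = 1, so a stationary state exists.
LESLIE = [
    [0.0, 1.0, 2.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0],
]

if __name__ == "__main__":
    print(is_stationary(LESLIE, [4.0, 2.0, 1.0]))  # age structure follows survivorship
    print(is_stationary(LESLIE, [4.0, 4.0, 4.0]))  # same matrix, different age structure
```

The second population is non-stationary only because its age structure differs from the survivorship profile, which illustrates why both conditions in the definition are needed.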
This article frames the relation between biology and physics by characterizing the former as a subdiscipline rather than a special case of the latter. To do this, we posit biological physics as the science of living matter in contrast to classic biophysics, the study of organismal properties by physical techniques. At the scale of the individual cell, living matter is nonunitary, i.e., not composed of aggregated subunits, and has features (e.g., intracellular organizational arrangements and biomolecular condensates) that are unlike any materials of the nonliving world. In transiently or constitutively multicellular forms (social microorganisms, animals, plants), living matter sustains physical processes that are generic (shared with nonliving matter, e.g., subunit communication by molecular diffusion in cellular slime molds), biogeneric (analogous to nonliving matter but realized through cellular activities, e.g., subunit demixing in animal embryos) or nongeneric (pertaining to sui generis materials, e.g., budding of active solids in plants). This "forms of matter" perspective is philosophically situated in the dialectical materialism of Engels and Hessen and the multilevel physica
Identification of the dynamics underlying biochemical pathways of interest in oncology is a primary goal in current systems biology. Understanding the structures and interactions that govern the evolution of such systems is believed to be a cornerstone of this research. Systems theory and systems identification theory are primary resources for this task, since both provide a self-consistent framework for modelling and manipulating models of dynamical systems that are best suited for the problem under investigation. We address herein the issue of obtaining an informative dataset ZN to be used as a starting point for identification of EGFR pathway dynamics. In order to match experimental identifiability criteria, we propose a theoretical framework for input stimulus design based on dynamical properties of the system under investigation. Following the theoretical studies, a feasible optofluidic setup has been designed on the basis of the spectral properties of the driving inputs that maximize information content.
We developed a theory showing that under appropriate normalizations and rescalings, temperature response curves show a remarkably regular behavior and follow a general, universal law. The impressive universality of temperature response curves remained hidden due to various curve-fitting models not well-grounded in first principles. In addition, this framework has the potential to explain the origin of different scaling relationships in thermal performance in biology, from molecules to ecosystems. Here, we summarize the background, principles and assumptions, predictions, implications, and possible extensions of this theory.
AlphaFold 3 represents a transformative advancement in computational biology, enhancing protein structure prediction through novel multi-scale transformer architectures, biologically informed cross-attention mechanisms, and geometry-aware optimization strategies. These innovations dramatically improve predictive accuracy and generalization across diverse protein families, surpassing previous methods. Crucially, AlphaFold 3 embodies a paradigm shift toward differentiable simulation, bridging traditional static structural modeling with dynamic molecular simulations. By reframing protein folding predictions as a differentiable process, AlphaFold 3 serves as a foundational framework for integrating deep learning with physics-based molecular
Systems biology relies on mathematical models that often involve complex and intractable likelihood functions, posing challenges for efficient inference and model selection. Generative models, such as normalizing flows, have shown remarkable ability in approximating complex distributions in various domains. However, their application in systems biology for approximating intractable likelihood functions remains unexplored. Here, we present a framework for leveraging normalizing flows to approximate complex likelihood functions inherent to systems biology models. By using normalizing flows in the simulation-based inference setting, we demonstrate a method that not only approximates a likelihood function but also allows for inference in the model selection setting. We showcase the effectiveness of this approach on real-world systems biology problems, providing practical guidance for implementation and highlighting its advantages over traditional computational methods.
The immune response to a pathogen has two basic features. The first is the expansion of a few pathogen-specific cells to form a population large enough to control the pathogen. The second is the process of differentiation of cells from an initial naive phenotype to an effector phenotype which controls the pathogen, and subsequently to a memory phenotype that is maintained and responsible for long-term protection. The expansion and the differentiation have been considered largely independently. Changes in cell populations are typically described using ecologically based ordinary differential equation models. In contrast, differentiation of single cells is studied within systems biology and is frequently modeled by considering changes in gene and protein expression in individual cells. Recent advances in experimental systems biology make available for the first time data to allow the coupling of population and high dimensional expression data of immune cells during infections. Here we describe and develop population-expression models which integrate these two processes into systems biology on the multicellular level. When translated into mathematical equations, these models result in
A number of models in mathematical epidemiology have been developed to account for control measures such as vaccination or quarantine. However, COVID-19 has brought unprecedented social distancing measures, raising the challenge of how to include these in a manner that can explain the data but avoid overfitting in parameter inference. We here develop a simple time-dependent model, in which social distancing effects are introduced analogously to coarse-grained models of gene expression control in systems biology. We apply our approach to understand drastic differences in COVID-19 infection and fatality counts observed between Hubei (Wuhan) and other Mainland China provinces. We find that these unintuitive data may be explained through an interplay of differences in transmissibility, effective protection, and detection efficiencies between Hubei and other provinces. More generally, our results demonstrate that regional differences may drastically shape infection outbursts. The obtained results demonstrate the applicability of our developed method to extract key infection parameters directly from publicly available data so that it can be globally applied to outbreaks of COVID-19 in a number
Understanding the biological mechanisms of disease is crucial for medicine, and in particular, for drug discovery. AI-powered analysis of genome-scale biological data holds great potential in this regard. The increasing availability of single-cell RNA sequencing data has enabled the development of large foundation models for disease biology. However, existing foundation models only modestly improve over task-specific models in downstream applications. Here, we explored two avenues for improving single-cell foundation models. First, we scaled the pre-training data to a diverse collection of 116 million cells, which is larger than those used by previous models. Second, we leveraged the availability of large-scale biological annotations as a form of supervision during pre-training. We trained the \model family of models comprising six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters. We vetted our models on several downstream evaluation tasks, including identifying the underlying disease state of held-out donors not seen during training, distinguishing between diseased and healthy cells for disease conditions and
In this paper, we propose and study several inverse problems of determining unknown parameters in nonlocal nonlinear coupled PDE systems, including the potentials, nonlinear interaction functions and time-fractional orders. In these coupled systems, we enforce non-negativity of the solutions, aligning with realistic scenarios in biology and ecology. There are several salient features of our inverse problem study: the drastic reduction in measurement/observation data due to averaging effects, the nonlinear coupling between multiple equations, and the nonlocality arising from fractional-type derivatives. These factors present significant challenges to our inverse problem, and such inverse problems have never been explored in previous literature. To address these challenges, we develop new and effective schemes. Our approach involves properly controlling the injection of different source terms to obtain multiple sets of mean flux data. This allows us to achieve unique identifiability results and accurately determine the unknown parameters. Finally, we establish a connection between our study and practical applications in biology, further highlighting the relevance of our work in real-
The coalescent is a stochastic process representing ancestral lineages in a population undergoing neutral genetic drift. Originally defined for a well-mixed population, the coalescent has been adapted in various ways to accommodate spatial, age, and class structure, along with other features of real-world populations. To further extend the range of population structures to which coalescent theory applies, we formulate a coalescent process for a broad class of neutral drift models with arbitrary -- but fixed -- spatial, age, sex, and class structure, haploid or diploid genetics, and any fixed mating pattern. Here, the coalescent is represented as a random sequence of mappings $\mathcal{C} = \left(C_t\right)_{t=0}^\infty$ from a finite set $G$ to itself. The set $G$ represents the ``sites'' (in individuals, in particular locations and/or classes) at which these alleles can live. The state of the coalescent, $C_t:G \to G$, maps each site $g \in G$ to the site containing $g$'s ancestor, $t$ time-steps into the past. Using this representation, we define and analyze coalescence time, coalescence branch length, mutations prior to coalescence, and stationary probabilities of identity-by-de
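The representation of the coalescent as a sequence of mappings $C_t: G \to G$ can be sketched in a few lines. The code below is a minimal illustration that assumes the simplest possible case: a haploid, well-mixed population in which each site picks its parent site uniformly at random every time step. That sampling rule, and the names `sample_parent_map`, `step`, and `coalescence_time`, are assumptions for illustration. The abstract's framework covers far more general fixed population structures.

```python
import random


def sample_parent_map(n_sites, rng):
    """Parent site of each site one step into the past (assumed uniform, well-mixed)."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]


def step(c_t, parent_map):
    """Extend the ancestry by one step: C_{t+1}(g) = parent_map[C_t(g)]."""
    return [parent_map[h] for h in c_t]


def coalescence_time(n_sites, rng, max_steps=100_000):
    """First t at which C_t maps every site to one common ancestral site."""
    c_t = list(range(n_sites))  # C_0 is the identity map on G = {0, ..., n_sites - 1}
    for t in range(max_steps + 1):
        if len(set(c_t)) == 1:
            return t
        c_t = step(c_t, sample_parent_map(n_sites, rng))
    return None  # no coalescence within max_steps


if __name__ == "__main__":
    rng = random.Random(1)
    times = [coalescence_time(10, rng) for _ in range(1000)]
    print(sum(times) / len(times))  # empirical mean coalescence time
```

Storing $C_t$ as a list indexed by site makes composition a single list comprehension. A structured population would replace only `sample_parent_map`, leaving the composition step unchanged.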
Evolutionary biology shares many concepts with statistical physics: both deal with populations, whether of molecules or organisms, and both seek to simplify evolution in very many dimensions. Often, methodologies have undergone parallel and independent development, as with stochastic methods in population genetics. We discuss aspects of population genetics that have embraced methods from physics: among others, non-equilibrium statistical mechanics, travelling waves, and Monte Carlo methods have been used to study polygenic evolution, rates of adaptation, and range expansions. These applications indicate that evolutionary biology can further benefit from interactions with other areas of statistical physics, for example, by following the distribution of paths taken by a population through time.
Increasing evidence of the effects of changing climate on physical ocean conditions and long-term changes in fish populations adds to the need to understand the effects of stochastic forcing on marine populations. Cohort resonance is of particular interest because it involves selective sensitivity to specific time scales of environmental variability, including that of mean age of reproduction, and, more importantly, very low frequencies (i.e., trends). We present an age-structured model for two Pacific salmon species with environmental variability in survival rate and in individual growth rate, hence spawning age distribution. We use computed frequency response curves and analysis of the linearized dynamics to obtain two main results. First, the frequency response of the population is affected by the life history stage at which variability affects the population; varying growth rate tends to excite periodic resonance in age structure, while varying survival tends to excite low-frequency fluctuation with more effect on total population size. Second, decreasing adult survival strengthens the cohort resonance effect at all frequencies, a finding that addresses the question of how fish
The last decade has witnessed rapid growth in the understanding of the pivotal roles of mechanical stresses and physical forces in cell biology. As a result, an integrated view of cell biology is evolving, in which genetic and molecular features are scrutinized hand in hand with physical and mechanical characteristics of cells. The physics of liquid crystals has emerged as a burgeoning new frontier in cell biology over the past few years, fueled by the increasing identification of orientational order and topological defects in cell biology, spanning scales from subcellular filaments to individual cells and multicellular tissues. Here, we provide an account of the most recent findings and developments, together with future promises and challenges in this rapidly evolving interdisciplinary research direction.
Synthetic biologists have made great progress over the past decade in developing methods for modular assembly of genetic sequences and in engineering biological systems with a wide variety of functions in various contexts and organisms. However, current paradigms in the field entangle sequence and functionality in a manner that makes abstraction difficult, reduces engineering flexibility, and impairs predictability and design reuse. Functional Synthetic Biology aims to overcome these impediments by focusing the design of biological systems on function, rather than on sequence. This reorientation will decouple the engineering of biological devices from the specifics of how those devices are put to use, requiring both conceptual and organizational change, as well as supporting software tooling. Realizing this vision of Functional Synthetic Biology will allow more flexibility in how devices are used, more opportunity for reuse of devices and data, improvements in predictability, and reductions in technical risk and cost.
We study fixation probabilities and times as a consequence of neutral genetic drift in subdivided populations, motivated by a model of the cultural evolutionary process of language change that is described by the same mathematics as the biological process. We focus on the growth of fixation times with the number of subpopulations, and variation of fixation probabilities and times with initial distributions of mutants. A general formula for the fixation probability for arbitrary initial condition is derived by extending a duality relation between forwards- and backwards-time properties of the model from a panmictic to a subdivided population. From this we obtain new formulae, formally exact in the limit of extremely weak migration, for the mean fixation time from an arbitrary initial condition for Wright's island model, presenting two cases as examples. For more general models of population subdivision, formulae are introduced for an arbitrary number of mutants that are randomly located, and a single mutant whose position is known. These formulae contain parameters that typically have to be obtained numerically, a procedure we follow for two contrasting clustered models. These data