Artificial intelligence (AI), propelled by advancements in machine learning, has made significant strides in solving complex tasks. However, the current neural network-based paradigm, while effective, is heavily constrained by inherent limitations, primarily a lack of structural organization and a progression of learning that displays undesirable properties. As AI research progresses without a unifying framework, it either tries to patch weaknesses heuristically or draws loosely from biological mechanisms without strong theoretical foundations. Meanwhile, the recent paradigm shift in evolutionary understanding -- driven primarily by evolutionary developmental biology (EDB) -- has been largely overlooked in AI literature, despite a striking analogy between the Modern Synthesis and contemporary machine learning, evident in their shared assumptions, approaches, and limitations upon careful analysis. Consequently, the principles of adaptation from EDB that reshaped our understanding of the evolutionary process can also form the foundation of a unifying conceptual framework for the next design philosophy in AI, going beyond mere inspiration and grounded firmly in biology's first princip
How can we use AI to discover a new state of the art for a scientific problem? Prior work in test-time scaling, such as AlphaEvolve, performs search by prompting a frozen LLM. We perform reinforcement learning at test time, so the LLM can continue to train, but now with experience specific to the test problem. This form of continual learning is quite special, because its goal is to produce one great solution rather than many good ones on average, and to solve this very problem rather than generalize to other problems. Therefore, our learning objective and search subroutine are designed to prioritize the most promising solutions. We call this method Test-Time Training to Discover (TTT-Discover). Following prior work, we focus on problems with continuous rewards. We report results for every problem we attempted, across mathematics, GPU kernel engineering, algorithm design, and biology. TTT-Discover sets the new state of the art in almost all of them: (i) Erdős' minimum overlap problem and an autocorrelation inequality; (ii) a GPUMode kernel competition (up to $2\times$ faster than prior art); (iii) past AtCoder algorithm competitions; and (iv) a denoising problem in single-cell analysi
The discovery of general principles underlying the complexity and diversity of cellular and developmental systems is a central and long-standing aim of biology. Whilst new technologies collect data at an ever-accelerating rate, there is growing concern that conceptual progress is not keeping pace. We contend that this is due to a paucity of appropriate conceptual frameworks to serve as a basis for general theories of mesoscale biological phenomena. In exploring this issue, we have developed a foundation for one such framework, termed the Core and Periphery (C&P) hypothesis, which reveals hidden generality across the diverse and complex behaviors exhibited by cells and tissues. Here, we present the C&P concept, provide examples of its applicability across multiple scales, argue its consistency with evolution, and discuss key implications and open questions. We propose that the C&P hypothesis could unlock new avenues of conceptual progress in cell and developmental biology.
Artificial agents that support human group interactions hold great promise, especially in sensitive contexts such as well-being promotion and therapeutic interventions. However, current systems struggle to mediate group interactions involving people who are not neurotypical. This limitation arises because most AI detection models (e.g., for turn-taking) are trained on data from neurotypical populations. This work takes a step toward inclusive AI by addressing the challenge of eye contact detection, a core component of non-verbal communication, with and for people with Intellectual and Developmental Disabilities. First, we introduce a new dataset, Multi-party Interaction with Intellectual and Developmental Disabilities (MIDD), capturing atypical gaze and engagement patterns. Second, we present the results of a comparative analysis with neurotypical datasets, highlighting differences in class imbalance, speaking activity, gaze distribution, and interaction dynamics. Then, we evaluate classifiers ranging from SVMs to FSFNet, showing that fine-tuning on MIDD improves performance, though notable limitations remain. Finally, we present the insights gathered through a focus group with six
Recent work suggests that Large Language Models (LLMs) are sensitive to the belief states of agents described by text, as measured by the false belief task (FBT), yet persistent concerns of construct validity remain. We adopt a **developmental perspective**, tracing the pattern of mental state reasoning behavior -- and likely **preconditions** for this behavior -- across multiple training stages in the Olmo2 and Pythia language model suites. We find that above-chance FBT performance depends both on model size and sufficient training volume, emerges relatively late in pretraining, and is most improved by post-training interventions (SFT, DPO) in the condition most diagnostic of mentalizing (False Belief, Implicit). However, FBT performance is fragile: consistent with past work, the use of non-factive verbs (e.g., thinks) increases false belief attributions even in the True Belief condition. To contextualize these findings, we track the emergence of **situation modeling**: the ability to report on basic factual properties of a described scene. Situation modeling accuracy generally precedes and exceeds FBT accuracy, yet situational representations also prove surprisingly incoherent in
Cancer, characterized by uncontrolled cell growth, is related to many branches of biology. In this review, we discuss three mathematical approaches for studying cancer biology: population dynamics, gene regulation, and developmental biology. If we understood all the biochemical mechanisms of cancer cells, we could directly calculate how the cancer cell population behaves. Conversely, from cell count data alone, we can use population dynamics to infer the mechanisms. Cancer cells emerge from certain genetic mutations, which affect the expression of other genes through gene regulation. Therefore, knowledge of gene regulation can help with cancer prevention and treatment. Developmental biology studies the acquisition and maintenance of normal cellular function, so it informs cancer biology from the opposite direction. Moreover, cancer cells implanted into an embryo can differentiate into normal tissues, which suggests a possible approach to curing cancer. This review illustrates the role of mathematics in these three fields: what mathematical models are used, what data analysis tools are applied, and what mathematical theorems need to be proved. We hope that applied mathematicians and even
This technical monograph provides a comprehensive overview of the field of quantum biology. It approaches quantum biology from a physical perspective with core quantum mechanical concepts presented foremost to provide a theoretical foundation for the field. An extensive body of research is covered to clarify the significance of quantum biology as a scientific field, outlining the field's long-standing importance in the historical development of quantum theory. This lays the essential groundwork to enable further advances in nanomedicine and biotechnology. Written for academics, biological science researchers, physicists, biochemists, medical technologists, and students of quantum mechanics, this text brings clarity to fundamental advances being made in the emerging science of quantum biology.
The development of multicellular organisms entails a deep connection between time-dependent biochemical processes taking place at the subcellular level, and the resulting macroscopic phenotypes that arise in populations of up to trillions of cells. A statistical mechanics of developmental processes would help to understand how microscopic genotypes map onto macroscopic phenotypes, a general goal across biology. Here we follow this approach, hypothesizing that development should be understood as a thermodynamic transition between non-equilibrium states. We test this hypothesis in the context of the fruit fly, Drosophila melanogaster, a model organism used widely in genetics and developmental biology for over a century. Applying a variety of information-theoretic measures to public transcriptomics datasets of whole fly embryos during development, we show that the global temporal dynamics of gene expression can be understood as a process that probabilistically guides embryonic dynamics across macroscopic phenotypic stages. In particular, we demonstrate signatures of irreversibility in the information complexity of transcriptomic dynamics, as measured mainly by the permutation entropy
This article frames the relation between biology and physics by characterizing the former as a subdiscipline rather than a special case of the latter. To do this, we posit biological physics as the science of living matter in contrast to classic biophysics, the study of organismal properties by physical techniques. At the scale of the individual cell, living matter is nonunitary, i.e., not composed of aggregated subunits, and has features (e.g., intracellular organizational arrangements and biomolecular condensates) that are unlike any materials of the nonliving world. In transiently or constitutively multicellular forms (social microorganisms, animals, plants), living matter sustains physical processes that are generic (shared with nonliving matter, e.g., subunit communication by molecular diffusion in cellular slime molds), biogeneric (analogous to nonliving matter but realized through cellular activities, e.g., subunit demixing in animal embryos) or nongeneric (pertaining to sui generis materials, e.g., budding of active solids in plants). This "forms of matter" perspective is philosophically situated in the dialectical materialism of Engels and Hessen and the multilevel physica
There is much to learn through synthesis of Developmental Biology, Cognitive Science and Computational Modeling. Our path forward involves a design for developmentally-inspired learning agents based on Braitenberg Vehicles. Continual developmental neurosimulation allows us to consider the role of developmental trajectories in bridging the related phenomena of nervous system morphogenesis, developmental learning, and plasticity. Being closely tied to continual learning, our approach is tightly integrated with developmental embodiment, and can be implemented using a type of agent called developmental Braitenberg Vehicles (dBVs). dBVs begin their lives as a set of undefined structures that transform into agent-based systems including a body, sensors, effectors, and nervous system. This phenotype is characterized in terms of developmental timing: with distinct morphogenetic, critical, and acquisition (developmental learning) periods. We further propose that network morphogenesis can be accomplished using a genetic algorithmic approach, while developmental learning can be implemented using a number of computational methodologies. This approach provides a framework for adaptive agent beh
Biological systems can be treated as complicated, as complex, or as both. When a system is treated as complicated, one sets up a model with a large number of parameters to describe it in detail. When it is treated as complex, the focus is on understanding the universal aspects of biological systems. In this case, an appropriate simple model represents a universality class. The extraction of universal properties is supported by evolutionary robustness and the reduction of dimensionality in high-dimensional states. Integrating the data-driven omics approach with the universality approach is an important step in systems biology.
Spatially annotated single-cell datasets provide unprecedented opportunities to dissect cell-cell communication in development and disease. Heterotypic signaling includes interactions between different cell types and is well established in tissue development and spatial organization. Epithelial organization requires several different programs that are tightly regulated. Planar cell polarity is the organization of epithelial cells along the planar axis orthogonal to the apical-basal axis. In this study, we investigate planar cell polarity factors and explore the implications of developmental regulators as malignant drivers. Using cancer systems biology analysis, we derive a gene expression network for WNT ligands (WNT) and their cognate frizzled (FZD) receptors in skin cutaneous melanoma. The profiles supported by unsupervised clustering of multiple-sequence alignments identify ligand-independent signaling and implications for metastatic progression based on the underpinning developmental spatial program. Omics studies and spatial biology connect developmental programs with oncological events and explain key spatial features of metastatic aggressiveness. Dysregulation of prominent
Systems biology relies on mathematical models that often involve complex and intractable likelihood functions, posing challenges for efficient inference and model selection. Generative models, such as normalizing flows, have shown remarkable ability in approximating complex distributions in various domains. However, their application in systems biology for approximating intractable likelihood functions remains unexplored. Here, we present a framework for leveraging normalizing flows to approximate complex likelihood functions inherent to systems biology models. By using normalizing flows in the simulation-based inference setting, we demonstrate a method that not only approximates a likelihood function but also allows for model inference in the model selection setting. We showcase the effectiveness of this approach on real-world systems biology problems, providing practical guidance for implementation and highlighting its advantages over traditional computational methods.
Understanding the biological mechanisms of disease is crucial for medicine, and in particular, for drug discovery. AI-powered analysis of genome-scale biological data holds great potential in this regard. The increasing availability of single-cell RNA sequencing data has enabled the development of large foundation models for disease biology. However, existing foundation models only modestly improve over task-specific models in downstream applications. Here, we explored two avenues for improving single-cell foundation models. First, we scaled the pre-training data to a diverse collection of 116 million cells, which is larger than those used by previous models. Second, we leveraged the availability of large-scale biological annotations as a form of supervision during pre-training. We trained the \model family of models comprising six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters. We vetted our models on several downstream evaluation tasks, including identifying the underlying disease state of held-out donors not seen during training, distinguishing between diseased and healthy cells for disease conditions and
We developed a theory showing that under appropriate normalizations and rescalings, temperature response curves show a remarkably regular behavior and follow a general, universal law. The impressive universality of temperature response curves remained hidden due to various curve-fitting models not well-grounded in first principles. In addition, this framework has the potential to explain the origin of different scaling relationships in thermal performance in biology, from molecules to ecosystems. Here, we summarize the background, principles and assumptions, predictions, implications, and possible extensions of this theory.
AlphaFold 3 represents a transformative advancement in computational biology, enhancing protein structure prediction through novel multi-scale transformer architectures, biologically informed cross-attention mechanisms, and geometry-aware optimization strategies. These innovations dramatically improve predictive accuracy and generalization across diverse protein families, surpassing previous methods. Crucially, AlphaFold 3 embodies a paradigm shift toward differentiable simulation, bridging traditional static structural modeling with dynamic molecular simulations. By reframing protein folding predictions as a differentiable process, AlphaFold 3 serves as a foundational framework for integrating deep learning with physics-based molecular
Systems Biology has emerged in recent years as a holistic approach that seeks a global understanding of cells, rather than focusing only on their individual parts (genes or proteins), to better capture the complexity of human cells. Because Systems Biology still does not provide the most accurate answers to our questions, owing to the complexity of cells and the limited quality of the information available for gene/protein map analysis, we have created simpler models that make the map representing the human cell easier to analyze. To this end, a virtual organism has been designed according to the main physiological rules for humans in order to replicate the human organism and its vital functions. This toy model was constructed by defining the topology of its genes/proteins and the biological functions associated with them. There are several examples of such toy models that emulate natural processes, allowing analysis of virtual life in order to design the best strategy for understanding real life. The strategy applied in this study combines topological and functional analysis integrating the knowledge about the relative position of a node among the others in th
In this paper, we propose and study several inverse problems of determining unknown parameters in nonlocal nonlinear coupled PDE systems, including the potentials, nonlinear interaction functions and time-fractional orders. In these coupled systems, we enforce non-negativity of the solutions, aligning with realistic scenarios in biology and ecology. There are several salient features of our inverse problem study: the drastic reduction in measurement/observation data due to averaging effects, the nonlinear coupling between multiple equations, and the nonlocality arising from fractional-type derivatives. These factors present significant challenges to our inverse problem, and such inverse problems have never been explored in previous literature. To address these challenges, we develop new and effective schemes. Our approach involves properly controlling the injection of different source terms to obtain multiple sets of mean flux data. This allows us to achieve unique identifiability results and accurately determine the unknown parameters. Finally, we establish a connection between our study and practical applications in biology, further highlighting the relevance of our work in real-
In classical evolutionary theory, genetic variation provides the source of heritable phenotypic variation on which natural selection acts. Against this classical view, several theories have emphasized that developmental variability and learning enhance nonheritable phenotypic variation, which in turn can accelerate evolutionary response. In this paper, I show how developmental variability alters evolutionary dynamics by smoothing the landscape that relates genotype to fitness. In a fitness landscape with multiple peaks and valleys, developmental variability can smooth the landscape to provide a directly increasing path of fitness to the highest peak. Developmental variability also allows initial survival of a genotype in response to novel or extreme environmental challenge, providing an opportunity for subsequent adaptation. This initial survival advantage arises from the way in which developmental variability smooths and broadens the fitness landscape. Ultimately, the synergism between developmental processes and genetic variation sets evolutionary rate.
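The smoothing effect described in this abstract can be illustrated with a minimal numerical sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-peaked `fitness` function, the Gaussian model of developmental noise around the genotypic value, and the chosen noise level `dev_sd=2.5`. The sketch averages fitness over the phenotypes a genotype can develop into and counts the local peaks of the resulting landscape.

```python
import numpy as np

def fitness(phenotype):
    """Hypothetical landscape: a low peak at 0 and a higher peak at 4."""
    low_peak = 0.5 * np.exp(-(phenotype ** 2) / 0.1)
    high_peak = 1.0 * np.exp(-((phenotype - 4.0) ** 2) / 0.1)
    return low_peak + high_peak

def expected_fitness(genotypes, dev_sd, n_nodes=2001):
    """Average fitness when phenotype ~ Normal(genotype, dev_sd) (assumed noise model)."""
    genotypes = np.asarray(genotypes, dtype=float)
    if dev_sd == 0:
        return fitness(genotypes)  # no developmental variability
    # Discretize a standard normal on [-6, 6] and normalize the weights.
    z = np.linspace(-6.0, 6.0, n_nodes)
    weights = np.exp(-0.5 * z ** 2)
    weights /= weights.sum()
    # Rows index genotypes, columns index developmental deviations.
    phenotypes = genotypes[:, None] + dev_sd * z[None, :]
    return fitness(phenotypes) @ weights

def count_peaks(values):
    """Number of interior grid points strictly higher than both neighbors."""
    interior = values[1:-1]
    return int(np.sum((interior > values[:-2]) & (interior > values[2:])))

genotypes = np.linspace(-2.0, 6.0, 401)
rugged = expected_fitness(genotypes, dev_sd=0.0)
smoothed = expected_fitness(genotypes, dev_sd=2.5)

print("peaks without developmental variability:", count_peaks(rugged))
print("peaks with developmental variability:", count_peaks(smoothed))
```

In this toy setting, a population starting near the lower peak faces a fitness valley when development is deterministic, whereas the averaged landscape rises continuously toward a single optimum near the higher peak. The averaging also lowers the maximum attainable fitness, so the sketch captures only the smoothing, not the full evolutionary dynamics.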
Though it goes without saying that linear algebra is fundamental to mathematical biology, polynomial algebra is less visible. In this article, we will give a brief tour of four diverse biological problems where multivariate polynomials play a central role -- a subfield that is sometimes called "algebraic biology." Namely, these topics include biochemical reaction networks, Boolean models of gene regulatory networks, algebraic statistics and genomics, and place fields in neuroscience. After that, we will summarize the history of discrete and algebraic structures in mathematical biology, from their early appearances in the late 1960s to the current day. Finally, we will discuss the role of algebraic biology in the modern classroom and curriculum, including resources in the literature and relevant software. Our goal is to make this article widely accessible, reaching the mathematical biologist who knows no algebra, the algebraist who knows no biology, and especially the interested student who is curious about the synergy between these two seemingly unrelated fields.
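As a small illustration of how Boolean models of gene regulatory networks become polynomial systems, the sketch below rewrites Boolean update rules as polynomials over the two-element field, using the standard identities AND = xy, OR = x + y + xy, and NOT = 1 + x (all modulo 2). The three-gene network is hypothetical and chosen only for illustration; it does not come from the article.

```python
from itertools import product

def boolean_update(state):
    """Hypothetical three-gene network written with logical operators."""
    x1, x2, x3 = state
    return (
        int(x2 and not x3),  # gene 1: activated by gene 2, repressed by gene 3
        int(x1 or x3),       # gene 2: activated by gene 1 or gene 3
        int(not x1),         # gene 3: repressed by gene 1
    )

def polynomial_update(state):
    """The same rules as polynomials over the field with two elements."""
    x1, x2, x3 = state
    return (
        (x2 * (1 + x3)) % 2,       # x2 AND (NOT x3)
        (x1 + x3 + x1 * x3) % 2,   # x1 OR x3
        (1 + x1) % 2,              # NOT x1
    )

states = list(product((0, 1), repeat=3))

# The two formulations agree on every one of the 8 states.
agree = all(boolean_update(s) == polynomial_update(s) for s in states)

# Fixed points are the solutions of the polynomial system f(x) = x.
fixed_points = [s for s in states if polynomial_update(s) == s]

print("formulations agree:", agree)
print("fixed points:", fixed_points)
```

Once the rules are polynomials, questions about steady states become questions about the solutions of a polynomial system. For larger networks these are usually handled with computer algebra tools rather than the brute-force enumeration used here.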