20 results found
Biological systems can be viewed as complicated, complex, or both. In the former view, one sets up a model with a large number of parameters to describe the system in detail. The latter view focuses on understanding the universal aspects of biological systems; here, an appropriately simple model represents a universality class. The extraction of universal properties is supported by evolutionary robustness and the reduction of dimensionality in high-dimensional states. Integrating the data-driven omics approach with the universality approach is an important step in systems biology.
Advances in biology have mostly relied on theories that were subsequently revised, expanded or eventually refuted using experimental and other means. Theoretical biology used to primarily provide a basis to rationally examine the frameworks within which biological experiments were carried out and to shed light on overlooked gaps in understanding. Today, however, theoretical biology has generally become synonymous with computational and mathematical biology. This could in part be explained by a relatively recent tendency in which a "data first", rather than a "theory first", approach is preferred. Moreover, generating hypotheses has at times become procedural rather than theoretical. This situation leaves our understanding enmeshed in data, which should be disentangled from much noise. Given the many unresolved questions in biology and medicine, it seems apt to revive the role of pure theory in the biological sciences. This paper makes the case for a "philosophical biology" (philbiology), distinct from but quite complementary to philosophy of biology (philobiology), which would entail biological investigation through philosophical approaches. Philbiology would thus be a reincarnatio
Recent studies have demonstrated the feasibility of modeling single-cell data as natural languages and the potential of leveraging powerful large language models (LLMs) for understanding cell biology. However, a comprehensive evaluation of LLMs' performance on language-driven single-cell analysis tasks still remains unexplored. Motivated by this challenge, we introduce CellVerse, a unified language-centric question-answering benchmark that integrates four types of single-cell multi-omics data and encompasses three hierarchical levels of single-cell analysis tasks: cell type annotation (cell-level), drug response prediction (drug-level), and perturbation analysis (gene-level). Going beyond this, we systematically evaluate the performance across 14 open-source and closed-source LLMs ranging from 160M to 671B on CellVerse. Remarkably, the experimental results reveal: (1) Existing specialist models (C2S-Pythia) fail to make reasonable decisions across all sub-tasks within CellVerse, while generalist models such as Qwen, Llama, GPT, and DeepSeek family models exhibit preliminary understanding capabilities within the realm of cell biology. (2) The performance of current LLMs falls short
This article frames the relation between biology and physics by characterizing the former as a subdiscipline rather than a special case of the latter. To do this, we posit biological physics as the science of living matter in contrast to classic biophysics, the study of organismal properties by physical techniques. At the scale of the individual cell, living matter is nonunitary, i.e., not composed of aggregated subunits, and has features (e.g., intracellular organizational arrangements and biomolecular condensates) that are unlike any materials of the nonliving world. In transiently or constitutively multicellular forms (social microorganisms, animals, plants), living matter sustains physical processes that are generic (shared with nonliving matter, e.g., subunit communication by molecular diffusion in cellular slime molds), biogeneric (analogous to nonliving matter but realized through cellular activities, e.g., subunit demixing in animal embryos) or nongeneric (pertaining to sui generis materials, e.g., budding of active solids in plants). This "forms of matter" perspective is philosophically situated in the dialectical materialism of Engels and Hessen and the multilevel physica
Cancer, as uncontrolled cell growth, is related to many branches of biology. In this review, we discuss three mathematical approaches for studying cancer biology: population dynamics, gene regulation, and developmental biology. If we understood all the biochemical mechanisms of cancer cells, we could directly calculate how the cancer cell population behaves. Conversely, from cell count data alone, we can use population dynamics to infer the mechanisms. Cancer cells emerge from certain genetic mutations, which affect the expression of other genes through gene regulation. Therefore, knowledge of gene regulation can help with cancer prevention and treatment. Developmental biology studies the acquisition and maintenance of normal cellular function, which informs cancer biology from the opposite direction. In addition, cancer cells implanted into an embryo can differentiate into normal tissues, which suggests a possible approach to curing cancer. This review illustrates the role of mathematics in these three fields: what mathematical models are used, what data analysis tools are applied, and what mathematical theorems need to be proved. We hope that applied mathematicians and even
AlphaFold 3 represents a transformative advancement in computational biology, enhancing protein structure prediction through novel multi-scale transformer architectures, biologically informed cross-attention mechanisms, and geometry-aware optimization strategies. These innovations dramatically improve predictive accuracy and generalization across diverse protein families, surpassing previous methods. Crucially, AlphaFold 3 embodies a paradigm shift toward differentiable simulation, bridging traditional static structural modeling with dynamic molecular simulations. By reframing protein folding predictions as a differentiable process, AlphaFold 3 serves as a foundational framework for integrating deep learning with physics-based molecular
We developed a theory showing that under appropriate normalizations and rescalings, temperature response curves show a remarkably regular behavior and follow a general, universal law. The impressive universality of temperature response curves remained hidden due to various curve-fitting models not well-grounded in first principles. In addition, this framework has the potential to explain the origin of different scaling relationships in thermal performance in biology, from molecules to ecosystems. Here, we summarize the background, principles and assumptions, predictions, implications, and possible extensions of this theory.
Systems biology relies on mathematical models that often involve complex and intractable likelihood functions, posing challenges for efficient inference and model selection. Generative models, such as normalizing flows, have shown remarkable ability in approximating complex distributions in various domains. However, their application in systems biology for approximating intractable likelihood functions remains unexplored. Here, we elucidate a framework for leveraging normalizing flows to approximate complex likelihood functions inherent to systems biology models. By using normalizing flows in the Simulation-based inference setting, we demonstrate a method that not only approximates a likelihood function but also allows for model inference in the model selection setting. We showcase the effectiveness of this approach on real-world systems biology problems, providing practical guidance for implementation and highlighting its advantages over traditional computational methods.
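The idea of approximating an intractable likelihood from simulations can be sketched with the simplest possible flow. The example below is an illustration under stated assumptions, not the method of the paper: `simulate` is a hypothetical stochastic simulator, and the "flow" is a single affine map z = (x - mu) / sigma onto a standard normal base density, far simpler than the normalizing flows used in practice. It shows only the change-of-variables formula, log p(x) = log N(z; 0, 1) + log |dz/dx|, and how the fitted density can be used to compare candidate parameter values.

```python
import math
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical simulator (an assumption): level theta plus Gaussian noise."""
    return [theta + rng.gauss(0.0, 0.5) for _ in range(n)]


def fit_affine_flow(samples):
    """Fit z = (x - mu) / sigma by maximum likelihood (closed form)."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu)
    return mu, sigma


def flow_log_likelihood(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|."""
    z = (x - mu) / sigma
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    log_det = -math.log(sigma)  # dz/dx = 1 / sigma
    return log_base + log_det


def approx_log_likelihood(x_obs, theta, n=5000, seed=0):
    """Approximate log p(x_obs | theta) from simulated samples only."""
    rng = random.Random(seed)
    mu, sigma = fit_affine_flow(simulate(theta, n, rng))
    return flow_log_likelihood(x_obs, mu, sigma)


if __name__ == "__main__":
    x_obs = 1.0
    for theta in (1.0, 3.0):
        print(theta, approx_log_likelihood(x_obs, theta))
```

Comparing `approx_log_likelihood(x_obs, theta)` across candidate values of `theta` (or across candidate simulators) is the toy analogue of the inference and model-selection setting the abstract describes.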
Understanding the biological mechanisms of disease is crucial for medicine, and in particular, for drug discovery. AI-powered analysis of genome-scale biological data holds great potential in this regard. The increasing availability of single-cell RNA sequencing data has enabled the development of large foundation models for disease biology. However, existing foundation models only modestly improve over task-specific models in downstream applications. Here, we explored two avenues for improving single-cell foundation models. First, we scaled the pre-training data to a diverse collection of 116 million cells, which is larger than those used by previous models. Second, we leveraged the availability of large-scale biological annotations as a form of supervision during pre-training. We trained a family of six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters. We vetted our models on several downstream evaluation tasks, including identifying the underlying disease state of held-out donors not seen during training, distinguishing between diseased and healthy cells for disease conditions and
In this paper, we propose and study several inverse problems of determining unknown parameters in nonlocal nonlinear coupled PDE systems, including the potentials, nonlinear interaction functions and time-fractional orders. In these coupled systems, we enforce non-negativity of the solutions, aligning with realistic scenarios in biology and ecology. There are several salient features of our inverse problem study: the drastic reduction in measurement/observation data due to averaging effects, the nonlinear coupling between multiple equations, and the nonlocality arising from fractional-type derivatives. These factors present significant challenges to our inverse problem, and such inverse problems have never been explored in previous literature. To address these challenges, we develop new and effective schemes. Our approach involves properly controlling the injection of different source terms to obtain multiple sets of mean flux data. This allows us to achieve unique identifiability results and accurately determine the unknown parameters. Finally, we establish a connection between our study and practical applications in biology, further highlighting the relevance of our work in real-
The discovery of general principles underlying the complexity and diversity of cellular and developmental systems is a central and long-standing aim of biology. Whilst new technologies collect data at an ever-accelerating rate, there is growing concern that conceptual progress is not keeping pace. We contend that this is due to a paucity of appropriate conceptual frameworks to serve as a basis for general theories of mesoscale biological phenomena. In exploring this issue, we have developed a foundation for one such framework, termed the Core and Periphery (C&P) hypothesis, which reveals hidden generality across the diverse and complex behaviors exhibited by cells and tissues. Here, we present the C&P concept, provide examples of its applicability across multiple scales, argue its consistency with evolution, and discuss key implications and open questions. We propose that the C&P hypothesis could unlock new avenues of conceptual progress in cell and developmental biology.
Synthetic biology is the engineering of cellular networks. It combines principles of engineering and the knowledge of biological networks to program the behavior of cells. Computational modeling techniques in conjunction with molecular biology techniques have been successful in constructing biological devices such as switches, oscillators, and gates. The ambition of synthetic biology is to construct complex systems from such fundamental devices, much in the same way electronic circuits are built from basic parts. As this ambition becomes a reality, engineering concepts such as interchangeable parts and encapsulation will find their way into biology. We realize that there is a need for computational tools that would support such engineering concepts in biology. As a solution, we have developed the software Athena that allows biological models to be constructed as modules. Modules can be connected to one another without altering the modules themselves. In addition, Athena houses various tools useful for designing synthetic networks including tools to perform simulations, automatically derive transcription rate expressions, and view and edit synthetic DNA sequences. New tools can be i
I believe an atomic biology is needed to supplement present day molecular biology, if we are to design and understand proteins, as well as define, make, and use them. Topics in the paper are molecular biology and atomic biology. Electrodiffusion in the open channel. Electrodiffusion in mixed electrolytes. Models of permeation. State Models of Permeation are Inconsistent with the Electric Field. Making models in atomic biology. Molecular dynamics. Temporal Limitations; Spatial Limitations; Periodic boundary conditions. Hierarchy of models of the open channel. Stochastic Motion of the Channel. Langevin Dynamics. Simulations of the Reaction Path: the Permion. Chemical reactions. What was wrong? Back to the hierarchy: Occam's razor can slit your throat. Poisson-Nernst-Planck PNP Models Flux Ratios; Pumping by Field Coupling. Gating in channels of one conformation. Gating by Field Switching; Gating Current; Gating in Branched Channels; Blocking. Back to the hierarchy: Linking levels. Is there a theory? At what level will the adaptation be found? Simplicity, evolution, and natural function.
It is often stated that there are no laws in biology, where everything is contingent and could have been otherwise, being solely the result of historical accidents. Furthermore, the customary introduction of fundamental biological entities such as individual organisms, cells, genes, catalysts and motors remains largely descriptive; constructive approaches involving deductive reasoning appear, in comparison, almost absent. As a consequence, both the logical content and principles of biology need to be reconsidered. The present article describes an inquiry into the foundations of biology. The foundations of biology are built in terms of elements, logic and principles, using both the language and the general methods employed in other disciplines. This approach assumes the existence of a certain unity of human knowledge that transcends discipline boundaries. Leibniz's principle of sufficient reason is revised through the introduction of the complementary concepts of symmetry and asymmetry and of necessity and contingency. This is used to explain how these four concepts are involved in the elaboration of theories or laws of nature. Four fundamental theories of biology are then identifie
Though it goes without saying that linear algebra is fundamental to mathematical biology, polynomial algebra is less visible. In this article, we will give a brief tour of four diverse biological problems where multivariate polynomials play a central role -- a subfield that is sometimes called "algebraic biology." Namely, these topics include biochemical reaction networks, Boolean models of gene regulatory networks, algebraic statistics and genomics, and place fields in neuroscience. After that, we will summarize the history of discrete and algebraic structures in mathematical biology, from their early appearances in the late 1960s to the current day. Finally, we will discuss the role of algebraic biology in the modern classroom and curriculum, including resources in the literature and relevant software. Our goal is to make this article widely accessible, reaching the mathematical biologist who knows no algebra, the algebraist who knows no biology, and especially the interested student who is curious about the synergy between these two seemingly unrelated fields.
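To make the link between Boolean network models and polynomials concrete, the sketch below encodes the logical operators as polynomials over the two-element field F_2 (x AND y = xy, x OR y = x + y + xy, NOT x = 1 + x) and uses them to update a small network. The three-gene network and its update rules are hypothetical, chosen only for illustration, and are not taken from the article.

```python
from itertools import product


def p_and(x, y):
    """x AND y as the polynomial x*y over F_2."""
    return (x * y) % 2


def p_or(x, y):
    """x OR y as the polynomial x + y + x*y over F_2."""
    return (x + y + x * y) % 2


def p_not(x):
    """NOT x as the polynomial 1 + x over F_2."""
    return (1 + x) % 2


def step(state):
    """Synchronous update of a hypothetical 3-gene network:
    a' = b OR c,  b' = a,  c' = a AND (NOT b).
    """
    a, b, c = state
    return (p_or(b, c), a, p_and(a, p_not(b)))


def fixed_points():
    """States s with step(s) == s, i.e. solutions of the polynomial system."""
    return [s for s in product((0, 1), repeat=3) if step(s) == s]


if __name__ == "__main__":
    print(fixed_points())
```

Finding the steady states of such a network amounts to solving a system of polynomial equations over F_2. Brute-force enumeration is feasible only for small networks, which is one reason algebraic tools become useful as the number of genes grows.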
Although reproducibility is a core tenet of the scientific method, it remains challenging to reproduce many results. Surprisingly, this also holds true for computational results in domains such as systems biology where there have been extensive standardization efforts. For example, Tiwari et al. recently found that they could only repeat 50% of published simulation results in systems biology. Toward improving the reproducibility of computational systems research, we identified several resources that investigators can leverage to make their research more accessible, executable, and comprehensible by others. In particular, we identified several domain standards and curation services, as well as powerful approaches pioneered by the software engineering industry that we believe many investigators could adopt. Together, we believe these approaches could substantially enhance the reproducibility of systems biology research. In turn, we believe enhanced reproducibility would accelerate the development of more sophisticated models that could inform precision medicine and synthetic biology.
The last decade has witnessed rapid growth in our understanding of the pivotal roles of mechanical stresses and physical forces in cell biology. As a result, an integrated view of cell biology is evolving, in which genetic and molecular features are scrutinized hand in hand with the physical and mechanical characteristics of cells. The physics of liquid crystals has emerged as a burgeoning new frontier in cell biology over the past few years, fueled by the increasing identification of orientational order and topological defects in cell biology, spanning scales from subcellular filaments to individual cells and multicellular tissues. Here, we provide an account of the most recent findings and developments, together with future promises and challenges in this rapidly evolving interdisciplinary research direction.
Background: Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Results: Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embedded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources
Quantum Biology is emerging as a new field at the intersection between fundamental physics and biology, promising novel insights into the nature and origin of biological order. We discuss several elements of QBCL (Quantum Biology at Cellular Level), a research program designed to extend the reach of quantum concepts to higher than molecular levels of biological organization. Key words. decoherence, macroscopic superpositions, basis-dependence, formal superposition, non-classical correlations, Basis-Dependent Selection (BDS), synthetic biology, evolvability mechanism loophole.
I argue here that the well-known folk law in biology, that there is no general law in biology because of exceptions, is false. Quantitative systems biology offers the potential to solve the Borges Dilemma by transcending it. There are already plenty of indications of this trend.