Guidelines for managing scientific data have been established under the FAIR principles, requiring that data be Findable, Accessible, Interoperable, and Reusable. In many scientific disciplines, especially computational biology, both data and models are key to progress. For this reason, and recognizing that such models are a very special type of "data", we argue that computational models, especially mechanistic models prevalent in medicine, physiology and systems biology, deserve a complementary set of guidelines. We propose the CURE principles, emphasizing that models should be Credible, Understandable, Reproducible, and Extensible. We delve into each principle, discussing verification, validation, and uncertainty quantification for model credibility; the clarity of model descriptions and annotations for understandability; adherence to standards and open science practices for reproducibility; and the use of open standards and modular code for extensibility and reuse. We outline recommended and baseline requirements for each aspect of CURE, aiming to enhance the impact and trustworthiness of computational models, particularly in biomedical applications where credibility is paramount. Our perspective underscores the need for a more disciplined approach to modeling, aligning with emerging trends such as Digital Twins and emphasizing the importance of data and modeling standards for interoperability and reuse. Finally, we emphasize that given the non-trivial effort required to implement the guidelines, the community should strive to automate as many of the guidelines as possible.
Lecanemab, an anti-amyloid antibody, has demonstrated a significant clinical benefit in slowing cognitive decline in early Alzheimer's disease (AD). A mechanistic Neuro-Dynamic Quantitative Systems Pharmacology (QSP) model was developed to capture the temporal and biological complexity of AD progression. This QSP model incorporates three interlinked modules reflecting core aspects of AD pathology: Aβ accumulation, tau pathology, and cognitive decline, where Aβ accumulation promotes tau pathology, which leads to neuronal damage and cognitive impairment. A large multivariate dataset was assembled from 4056 subjects participating in lecanemab studies and the Alzheimer's Disease Neuroimaging Initiative (ADNI) to inform and validate the model. Virtual population-based model simulations successfully reproduced the hallmark cascade of AD pathology, consistent with the well-known Jack curve, from amyloid buildup to tau spread and cognitive decline over decades. Simulations accurately predicted all endpoints evaluated from the lecanemab trials and were further validated against data from other anti-Aβ therapies. Importantly, the model revealed that Aβ protofibrils are more potent drivers of tau pathology than plaques. In summary, the Neuro-Dynamic QSP model is the first of its kind to mechanistically link amyloid accumulation, tau pathology, and cognitive decline in AD, providing a powerful framework for simulating clinical scenarios and understanding disease mechanisms.
Planar cell polarity (PCP) is a fundamental mechanism by which cells within epithelial sheets align their orientation, enabling coordinated tissue morphogenesis and function. Disruption of PCP leads to developmental defects and disease, highlighting the importance of understanding its establishment and maintenance. While experimental studies have identified key protein molecules that drive PCP, mathematical and computational modeling have become indispensable in connecting molecular interactions to tissue-level outcomes. Over the last couple of decades, diverse approaches, such as agent-based models, Cellular Potts frameworks, Petri nets, continuum theories, and phenomenological models, have been developed to capture distinct aspects of PCP dynamics. These frameworks allow systematic exploration of nonlinear feedback, intracellular and intercellular signaling, and the influence of geometry, mechanics, and noise on polarization. This review summarizes these mathematical and computational developments in PCP modeling, emphasizing methodological assumptions, insights gained, and open challenges. By bridging experiment and theory, PCP modeling advances both mechanistic understanding and predictive capacity for tissue-scale organization.
Morphogenesis in early development involves complex and extreme deformations in response to intra- and intercellular forces. Zebrafish epiboly, the spreading of the blastoderm to cover and engulf the large yolk cell, is a key early event that sets the stage for the establishment of the body plan, but the way the forces driving expansion are generated and mediated is poorly understood. The enveloping layer (EVL), the thin squamous outer epithelium of the blastoderm, plays a central role. Forces generated in the yolk cell are transmitted through tight junctions to the marginal EVL cells, and then propagate through the rest of the EVL. To understand mechanisms of force generation and transduction during epiboly, we first need a mechanical model of the EVL capable of responding to such forces and undergoing the drastic deformation of epiboly. The expanding EVL more than doubles its surface area and experiences significant shear as it deforms from a thin cap at one pole to become a complete sphere, necessarily requiring extensive internal rearrangement. We constructed an agent-based model of the EVL and its response to exogenous forces using the center-based simulation framework, Tissue Forge. Our model captures the large viscoelastoplastic deformation of the EVL by cell rearrangement, and accommodates the required cell neighbor exchanges without losing mechanical integration. Features observed in living embryos, such as the straightening of the initially ragged leading edge, also emerge in the model. We identified two key components required for realistic epiboly in the model: first, a mechanism to enable tissue remodeling by cell rearrangement without tearing the tissue, and second, a negative feedback on the forces driving EVL expansion, to regulate and synchronize the advancement of the EVL margin. We discuss the implications of these findings for the behavior of living EVL and the mechanisms that drive epiboly.
Mild traumatic brain injury (mTBI) disproportionately affects children and adolescents and has been associated with poorer neurocognitive performance, but the biological mechanisms driving symptom variability and severity remain understudied. In accordance with the omnigenic disease model, we integrated gene-by-mTBI interaction genome-wide association studies on neurocognition from the Adolescent Brain Cognitive Development (ABCD) cohort with single-cell RNA sequencing gene regulatory networks to elucidate the cell type-specific key regulators and molecular mechanisms governing the neurocognitive outcome of mTBI, specifically learning and memory performance. Our analysis revealed distinct network regulators in neuronal and glial cell types across hippocampal and cortical brain regions that orchestrate key neurodevelopmental pathways. Examples in the hippocampus include APP for synaptic signaling in excitatory neurons, COX5A for mitochondrial function in inhibitory neurons, and MOG for myelination in oligodendrocytes; examples in the cortex include GRM7 for synaptic signaling in excitatory neurons, SV2A for synaptic signaling in inhibitory neurons, and MOG for myelination in oligodendrocytes. These mechanisms also associate with learning and memory through pathway-based polygenic risk score modeling in ABCD. Our findings provide brain region- and cell type-specific insights into the complex regulatory network landscape of mTBI pathology and potential therapeutic candidates at the pathway and network levels.
Monoclonal antibody N-glycosylation is a critical quality attribute influencing therapeutic safety and efficacy, and is strongly influenced by bioprocess design. NISTCHO, a publicly available Chinese hamster ovary producer cell line, is increasingly encouraged for use as a reference system. However, the impact of feeding strategies on cellular performance and N-glycosylation has not been assessed. Here, we applied multivariate analysis of compositional N-glycan data to assess how feeding strategies influence N-glycan composition of cNISTmAb. We varied feeding strategies in frequency, glucose supply, and galactose/manganese supplementation. Feeding frequency had minimal impact on quality attributes but strongly affected culture performance, with every-other-day feeding improving titers and cell-specific productivity. High glucose availability supported growth and productivity. Low glucose strategies reduced titers and shifted N-glycosylation towards non-galactosylated and fucosylated species, despite lactate accumulation remaining within favorable ranges. Galactose and manganese consistently increased antibody galactosylation, with galactose additionally serving as an auxiliary carbon source, extending cell viability. Importantly, mAb glycation remained stable across all feeding strategies at harvest. Overall, these results demonstrate that feed composition and timing can be used to tune both cellular performance and mAb glycosylation, establishing NISTCHO as a robust benchmark for standardized process-quality studies.
Biological networks exhibit complex structural organization, from motifs, which are recurring patterns of gene, protein, and molecular interactions, to higher order assemblies such as hypermotifs. Hypermotifs are statistically significant higher order assemblies formed through interactions and combinations of motifs, giving rise to collective structural and functional properties. This review discusses hypermotifs in biological networks, focusing on their structure, dynamics, functions, and patterns such as motif clustering and motif generalizations.
Many cell models deal with constraints for life's persistence, yet we lack principles for how dynamics interact with them and their origins in actual biology. Computational biology needs a theory of viability that confronts the life-death boundary to overcome this. We explore how geometric structures in a model's state space offer organizing principles for cell fate, and how idealized models of emergent individuals may help explain life's intrinsically generated limits.
Comprehensive analysis of the dynamics of Boolean models of biological systems is hampered by the exponentially large state space. Here we introduce the succession-diagram-based Markov chain (SD Markov chain), a coarse-grained representation that uses trap spaces (unescapable state subspaces) of the Boolean model as the states of a Markov chain. These trap spaces and their succession diagram can be efficiently identified, and constitute a dramatic reduction compared to the full state space. The SD Markov chain preserves the decisions that trap the system's dynamics while making the state space computationally tractable. Using an ensemble of random Boolean networks with known state transition matrices, we show that the SD Markov chain accurately reproduces attractors, basins of attraction, convergence probabilities, decision transitions, and sequences of events. We illustrate the insights and predictions that arise from the SD Markov chain by analyzing a published model of cancer cell metastasis. By combining the interpretability of the succession diagram with the probabilistic rigor of Markov analysis, the SD Markov chain offers a compact quantitative description of the attractor landscape and provides a new avenue for studying control and stability in complex biological systems.
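To make the vocabulary of the abstract above concrete, the following is a minimal sketch of the full-state-space baseline that a coarse-grained chain is compared against. It is not the SD Markov chain itself. The two-node toggle switch, the uniform asynchronous update scheme, and all function names are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical two-node toggle switch: each node is repressed by the other.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}
NODES = sorted(RULES)


def successors(state):
    """Asynchronous update (assumed): one uniformly chosen node is updated per step."""
    as_dict = dict(zip(NODES, state))
    probs = {}
    for i, node in enumerate(NODES):
        nxt = list(state)
        nxt[i] = bool(RULES[node](as_dict))
        nxt = tuple(nxt)
        probs[nxt] = probs.get(nxt, 0.0) + 1.0 / len(NODES)
    return probs


# Full state transition matrix, stored sparsely as state -> {next_state: probability}.
STATES = list(product([False, True], repeat=len(NODES)))
TRANSITIONS = {s: successors(s) for s in STATES}


def is_trap_space(fixed):
    """True if no transition leaves the subspace in which the given nodes are fixed."""
    inside = [
        s for s in STATES
        if all(s[NODES.index(n)] == v for n, v in fixed.items())
    ]
    return all(t in inside for s in inside for t in TRANSITIONS[s])


def convergence_probabilities(start, steps=200):
    """Propagate a point distribution through the chain for a fixed number of steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        new = {}
        for s, p in dist.items():
            for t, q in TRANSITIONS[s].items():
                new[t] = new.get(t, 0.0) + p * q
        dist = new
    return dist
```

In this toy network the subspaces with A=0, B=1 and with A=1, B=0 are trap spaces, whereas fixing A=0 alone is not, because the state with both nodes off can still switch A on. The full chain has 2^n states, which is the growth a trap-space-based coarse-graining is meant to avoid.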
Kinetic parameters such as the turnover number (kcat) and Michaelis constant (KM) are essential for modelling enzymatic activity but experimental data remains limited in scale and diversity. Previous methods for predicting enzyme kinetics typically use mean-pooled residue embeddings from a single protein language model to represent the protein. We present KinForm, a machine learning framework designed to improve predictive accuracy and generalisation for kinetic parameters by optimising protein feature representations. KinForm combines several residue-level embeddings (Evolutionary Scale Modelling Cambrian, Evolutionary Scale Modelling 2, and ProtT5-XL-UniRef50), taken from empirically selected intermediate transformer layers, and applies weighted pooling based on per-residue binding-site probability. To counter the resulting high dimensionality, we apply dimensionality reduction using principal-component analysis (PCA) on concatenated protein features, and rebalance the training data via a similarity-based oversampling strategy. KinForm outperforms baseline methods on two benchmark datasets. Improvements are most pronounced in low sequence similarity bins. We observe improvements from binding-site probability pooling, intermediate-layer selection, PCA, and oversampling of low-identity proteins. We also find that removing sequence overlap between folds provides a more realistic evaluation of generalisation and should be the standard over random splitting when benchmarking kinetic prediction models.
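The contrast between mean pooling and binding-site-weighted pooling described above can be sketched in a few lines. This is an illustrative assumption, not KinForm's actual implementation: the abstract does not state how the weights are normalised, so the sketch simply rescales the per-residue probabilities to sum to one, and the function names are hypothetical.

```python
def mean_pool(residue_embeddings):
    """Average each embedding dimension over all residues."""
    n = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(row[d] for row in residue_embeddings) / n for d in range(dim)]


def weighted_pool(residue_embeddings, site_probs):
    """Pool residue embeddings, weighting each residue by its binding-site probability.

    Assumption: weights are the probabilities rescaled to sum to 1. If every
    probability is zero, fall back to plain mean pooling.
    """
    if len(residue_embeddings) != len(site_probs):
        raise ValueError("one probability is required per residue")
    total = sum(site_probs)
    if total <= 0:
        return mean_pool(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [
        sum(p * row[d] for p, row in zip(site_probs, residue_embeddings)) / total
        for d in range(dim)
    ]


# Toy protein with three residues and two-dimensional embeddings.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(mean_pool(toy_embeddings))
print(weighted_pool(toy_embeddings, [0.0, 0.0, 1.0]))
```

With all the weight on the third residue, the weighted vector equals that residue's embedding, whereas mean pooling dilutes it with the other two.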
In the past decades, a wide suite of design tools for biological systems has been developed, but using these to create biotechnologies that achieve reliable and predictable behaviour remains challenging. Modelling approaches have enabled researchers to traverse the vast search space of genetic circuits more efficiently, while machine learning has proven useful for designing parts and predicting their function or evolutionary properties. Generative algorithms have the potential to leverage these features to design entire genetic circuits from the sequence level, but have only recently begun to be applied to synthetic biological applications. Here, we show that even simple generative models like the conditional variational autoencoder (CVAE) can produce novel genetic circuits that match complex dynamic functions such as signal adaptation. Using in silico RNA simulation, we construct a dataset of RNA sequences and convert them to circuits via RNA interaction predictors, allowing us to estimate functional features alongside evolutionary stability and interpret model-learned features. Our model generates diverse distributions of circuits that match their target adaptation specification well, even when limited to small training data sets. Structures in the embedding space correspond to motifs previously identified as crucial for adaptation and reflect the design rules for adaptable circuits. Framing adaptation as a single design objective outperforms other input representations, reflecting the importance of choosing the correct data encoding for generating genetic circuits. Finally, we show that functional and evolutionary properties can be prompted simultaneously, providing a proof-of-concept for the combined design of phenotype and evotype.
Cuproptosis is a recently identified copper-dependent cell death pathway with growing relevance in tumor biology, yet its involvement in thyroid carcinoma (TC) remains poorly understood. In this study, we integrated multi-omics datasets to characterize the functional roles of cuproptosis-related genes (CRGs) in TC progression. Bulk and single-cell transcriptomic datasets from public repositories were analyzed to classify TC into two CRG-based molecular subtypes that showed significant associations with clinicopathological features and immune cell infiltration. A prognostic model derived from CRG-related differentially expressed genes exhibited high predictive accuracy for patient survival. CDKN2A emerged as the only consistently upregulated CRG in TC and correlated with adverse prognosis. Single-cell analyses further revealed distinct cellular distributions of CRGs within the tumor microenvironment, with notable enrichment in immune cell populations. In addition, a previously unrecognized competing endogenous RNA network, the GAS5/miR-128-3p/CDKN2A axis, was identified and experimentally validated. Functional assays demonstrated that this regulatory circuit modulates TC cell proliferation, invasion, and metastasis in vitro and in vivo, with GAS5 acting as a sponge for miR-128-3p to regulate CDKN2A expression. These findings provide a comprehensive systems-level perspective on cuproptosis-related mechanisms in TC and highlight the therapeutic promise of targeting the cuproptosis pathway and its regulatory networks in thyroid cancer management.
Protein-protein interactions (PPIs) are essential for cellular processes and play central roles in disease mechanisms, making them important therapeutic targets. This review explores how integrating artificial intelligence (AI) and deep learning (DL) with traditional experimental approaches has advanced the mapping and analysis of PPIs. We discuss how systems biology leverages PPI networks to model diseases and identify novel therapeutic targets, and highlight challenges and future directions in multi-omics integration, AI-driven innovation, and network-based drug development.
Triple-negative breast cancer (TNBC) presents a major clinical challenge owing to its immunosuppressive tumor microenvironment, target scarcity, and poor therapeutic response. Recently, combination therapy with immune checkpoint blockade and CCL19 has shown significant efficacy in TNBC. To systematically unravel the synergistic mechanisms between CCL19 and anti-PD-1, we developed a mathematical model that integrates cellular and molecular scales to capture essential tumor-immune interactions and predict the dynamics of tumor evolution under various therapies. We proposed three quantitative indicators: (1) the tumor relative volume index (TRVI), (2) the therapeutic efficacy discrepancy index (TEDI), and (3) the immune heterogeneity treatment response index (IHTRI). Our model validated the immunostimulatory effect of CCL19 in synergizing with anti-PD-1 and revealed that this synergy is strongly modulated by individual baseline immune heterogeneity. Notably, our analysis identified (CTLs × CCL19)/PD-L1 as a novel dynamic biomarker combination with significant predictive (AUC = 0.86) and prognostic (log-rank p = 0.019) value. Finally, virtual clinical trials revealed that administering anti-PD-1 therapy prior to CCL19 injection yields greater clinical benefit in TNBC. Collectively, this study provides a theoretical foundation for elucidating the synergistic mechanism between CCL19-mediated immunostimulation and anti-PD-1 therapy.
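The composite biomarker (CTLs × CCL19)/PD-L1 and its evaluation by AUC can be illustrated as follows. This is a sketch only: the function names are hypothetical, the inputs are assumed to be positive model readouts in consistent units, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than whatever procedure the study used. No values from the study are reproduced.

```python
def composite_biomarker(ctl, ccl19, pdl1):
    """(CTLs x CCL19) / PD-L1 for one subject; PD-L1 must be positive."""
    if pdl1 <= 0:
        raise ValueError("PD-L1 level must be positive")
    return ctl * ccl19 / pdl1


def auc(responder_scores, nonresponder_scores):
    """Probability that a random responder scores above a random non-responder.

    Ties count as half a win (pairwise Mann-Whitney formulation of the ROC AUC).
    """
    wins = 0.0
    for r in responder_scores:
        for n in nonresponder_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responder_scores) * len(nonresponder_scores))
```

An AUC of 0.5 corresponds to a marker with no discriminative value and 1.0 to perfect separation of responders from non-responders.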
Head and neck squamous cell carcinoma (HNSCC) is a complex multivariable disease posing a significant challenge in therapeutics. Oxidative phosphorylation (OXPHOS) and the pentose phosphate pathway (PPP) are centrally implicated in the metabolic heterogeneity that influences tumor behavior, treatment response, and patient outcomes. This study aims to stratify HNSCC based on OXPHOS and PPP gene expression and to construct a predictive risk model for clinical outcome. HNSCC patients were stratified into four metabolic subtypes: mixed, OXPHOS-leaning, PPP-leaning, and quiescent. The quiescent subtype showed the longest survival with lower proliferation scores, whereas the mixed subtype showed the worst survival with higher proliferation scores. Metabolic protein data from the CPTAC and ICPC cohorts affirmed the existence of metabolic heterogeneity among tumor samples. A prognostic risk model based on thirteen genes performed better than both the OXPHOS-PPP-glycolysis model and another predictive model based on forty-seven metabolic genes. Metabolic heterogeneity was thus demonstrated in HNSCC. The enrichment of OXPHOS and PPP genes in the mixed metabolic subtype, which has the worst clinical outcome, suggests higher metastatic potential and might inform personalized therapeutics.
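One simple way to obtain four subtypes from two pathway scores is a median split on each axis, sketched below. The abstract does not state the stratification rule, so the median cut-offs, the mapping of quadrants to subtype names, and the function name are all assumptions made for illustration.

```python
from statistics import median


def stratify(scores):
    """Assign each sample to one of four metabolic subtypes.

    scores: dict mapping sample id -> (oxphos_score, ppp_score).
    Assumption: a sample is "high" on an axis if it lies above the cohort median.
    """
    ox_cut = median(v[0] for v in scores.values())
    ppp_cut = median(v[1] for v in scores.values())
    labels = {}
    for sample, (ox, ppp) in scores.items():
        high_ox = ox > ox_cut
        high_ppp = ppp > ppp_cut
        if high_ox and high_ppp:
            labels[sample] = "mixed"
        elif high_ox:
            labels[sample] = "OXPHOS-leaning"
        elif high_ppp:
            labels[sample] = "PPP-leaning"
        else:
            labels[sample] = "quiescent"
    return labels
```

A median split forces roughly balanced axes but is sensitive to cohort composition. Clustering or fixed thresholds are equally plausible alternatives.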
Periodontitis (PD) and metabolic dysfunction-associated steatotic liver disease (MASLD) are highly prevalent inflammatory disorders that frequently coexist, yet the molecular basis of their comorbidity remains poorly defined. Here, we applied an integrative multi-omics strategy that combined bulk RNA sequencing, weighted gene co-expression network analysis, spatial transcriptomics, single-cell profiling, and machine learning. This approach identified 11 hub genes bridging immune activation and metabolic remodeling, among which Annexin A6 (ANXA6) emerged as a key cross-disease candidate. Spatial transcriptomics supported tissue-specific localization of the hub-gene signature, while single-cell analysis revealed selective enrichment of ANXA6 expression in γδ T cells. Notably, ANXA6-high γδ T cells exhibited enhanced signaling interactions with endothelial cells, suggesting immune-vascular crosstalk in PD-MASLD comorbidity. Upstream regulatory analysis further highlighted transcription factors associated with ANXA6 expression, and Connectivity Map-based prediction suggested candidate compounds with the potential to reverse ANXA6-associated transcriptional programs. Collectively, these findings support the hypothesis that ANXA6-high γδ T cell-endothelial communication represents a shared immunological feature of PD-MASLD comorbidity, offering novel insights into common immune programs and potential therapeutic opportunities.
Acute myeloid leukemia (AML) is a hematologic malignancy originating in the bone marrow and often progressing to extramedullary sites. Despite advances in molecularly targeted therapies and hematopoietic stem cell transplantation, clinical outcomes remain poor. Tyrosine kinase inhibitors (TKIs) provide benefit to a subset of AML patients harboring FLT3-ITD mutations; however, relapse and resistance remain common. These therapeutic failures are driven both by intrinsic properties of leukemic stem cells (LSCs), a quiescent, self-renewing population, and by extrinsic cues from the tumor microenvironment. We previously demonstrated that arteriolar endothelial cells (ECs) produce miR-126, which is transferred to LSCs, promoting quiescence, treatment resistance, and niche retention. During disease progression, TNF-α secreted by expanding blasts suppresses EC miR-126 production. Following TKI administration, blast reduction lowers TNF-α levels and restores EC miR-126 production; this miR-126 enables LSCs to re-enter quiescence, thereby escaping therapy and facilitating relapse. To explore this dynamic, we developed an agent-based computational model of the AML bone marrow microenvironment, parameterized with in vitro and in vivo data. The model captures vascular niche remodeling and feedback between leukemic populations and endothelial signaling. Simulations reveal that LSC protection mediated by miR-126 can be disrupted by combining TKIs with miRisten, a miR-126 inhibitor. When administered on a defined schedule, this combination dismantles the protective niche and enhances LSC eradication. These findings underscore the therapeutic potential of targeting microenvironmental feedback to overcome resistance and prevent AML relapse.
Phenotypic drug discovery (PDD) identifies new drugs by observing the effects of compounds on living systems without prior knowledge of their targets. Advances in biological data and machine learning have made PDD more systematic and data-driven. This review outlines a computational framework, including phenotype representations, key tools, and public datasets. It also discusses major challenges and strategies to improve PDD's efficiency and translational potential, offering a practical guide for researchers in the field.