The number and types of chemical compounds are expanding at an unprecedented rate. To model existing chemicals and aid in the design of novel chemical structures, appropriate computational approaches, tailored to the goals of specific projects, have evolved over time. This review analyzes the expansion of "chemical space" by tracing the historical milestones that have shaped molecular modeling from its inception to the present day. Utilizing data from public compound databases and a systematic bibliometric analysis of peer-reviewed literature, including specialized sources like the Journal of Computer-Aided Molecular Design, we mapped the co-evolution of chemical data and the algorithms designed to process it. While drug discovery has historically been the primary driver of this growth, our discussion extends to other domains, including macromolecular structural space. Due to the nature of public data, this analysis focuses mostly on open-access repositories with a few mentions of proprietary industrial libraries. By examining the trends in publications, this article provides a perspective on the current state of the field and the future trajectories of molecular modeling in an era of big data and artificial intelligence.
Gastric cancer (GC) is one of the main causes of cancer-related mortality worldwide. The emergence of drug resistance and toxicity in current therapies highlights the need for novel treatment strategies. Quercetin, a natural flavonoid, has demonstrated anticancer activity; however, its molecular mechanism, particularly its effect on key targets in GC, remains underexplored. A comprehensive combination of in silico and in vitro methods was used to elucidate the anticancer potential of Quercetin. Network pharmacology analysis was used to identify potential GC-related targets, followed by molecular docking and 200 ns molecular dynamics (MD) simulations to evaluate the binding affinity and stability of the Quercetin-target complex. In vitro experiments, including gene expression analysis and fluorescence binding assays, were conducted using AGS gastric cancer cells to validate the computational findings. Insulin-like growth factor 1 (IGF1) emerged as a key hub gene associated with GC progression. Molecular docking predicted a favorable interaction between Quercetin and IGF1, with a docking score of -6.3 kcal/mol and multiple hydrogen-bond interactions. MD simulations confirmed the stability of the Quercetin-IGF1 complex, with reduced RMSD values (0.48 nm vs. 0.63 nm for unbound IGF1), favorable free energy profiles, and stable hydrogen bonding. In vitro studies demonstrated a significant downregulation of IGF1 mRNA expression (p < 0.001) and a dose-dependent inhibition of IGF1 activity by Quercetin. The integration of network pharmacology, computational modeling, and experimental validation suggests that Quercetin may modulate the IGF1 signaling axis and influence IGF1-associated pathways in gastric cancer. The stable binding and significant inhibitory effect observed suggest that Quercetin may interrupt IGF1-mediated signaling pathways involved in tumor growth and survival.
This study identifies Quercetin as a potential modulator of IGF1-associated signaling pathways with significant therapeutic promise for gastric cancer. The findings provide mechanistic insights supporting the further development of Quercetin as a targeted therapy for IGF1-driven malignancies.
Carbonic anhydrase isoforms I and II (hCA-I and hCA-II) are metalloenzymes involved in essential physiological processes and represent relevant therapeutic targets for disorders such as glaucoma and osteoporosis. Chalcones have emerged as promising scaffolds for carbonic anhydrase inhibition; however, their structure-activity relationships, particularly for non-sulfonamide derivatives, remain insufficiently explored from a computational point of view. In this study, a dataset of 118 chalcone derivatives has been analyzed using three-dimensional quantitative structure-activity relationship (3D-QSAR) modeling, comprising Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Index Analysis (CoMSIA). The developed models exhibited strong internal consistency and predictive capability for both isoforms. For hCA-I, steric, electrostatic, hydrophobic, and hydrogen bond acceptor fields have been identified as key contributors to inhibitory activity, whereas for hCA-II, hydrogen bond donor features played a more prominent role. Molecular docking and molecular dynamics simulations have been employed as complementary approaches to analyze ligand-protein interactions and binding stability. In addition, quantum chemical descriptors derived from density functional theory have been integrated with the 3D-QSAR analysis, revealing a consistent correspondence between contour map features and the distribution of frontier molecular orbitals and molecular electrostatic potential. Furthermore, ADME-based pharmacokinetic properties of the proposed compounds have been evaluated to assess their potential drug-likeness. Based on the integrated computational analysis, six new chalcone derivatives, with predicted inhibitory activity in the nanomolar range, are proposed.
Overall, this study provides a consistent physicochemical framework for understanding the inhibitory activity of chalcone derivatives and highlights key molecular features that may guide the modulation of activity across hCA-I and hCA-II isoforms.
Lysophosphatidic acid receptor 2 (LPAR2), a G protein-coupled receptor, has been implicated in the progression of fibrosis and is therefore a promising novel drug target for the treatment of fibrosis and related diseases. In this paper, a reliable homology model of LPAR2 was obtained using three templates (PDB IDs: 4Z34, 7TD0, and 7VIE) and subsequent model evaluations. A new binding site for a series of selective LPAR2 inhibitors was identified through molecular docking with the reference compound 50. Subsequently, a three-dimensional quantitative structure-activity relationship (3D QSAR) analysis was conducted on a series of N-sulfonyl heterocyclic antagonists of LPAR2. The derived optimal CoMFA model (q2 = 0.792, r2 = 0.999, [Formula: see text] = 0.998, [Formula: see text] = 0.978) and CoMSIA model (q2 = 0.713, r2 = 0.996, [Formula: see text] = 0.978, [Formula: see text] = 0.958) demonstrated strong statistical robustness and high external predictability. The 3D contour maps generated from these models were analyzed and compared with the binding mode of the reference compound, which provided insights into the structural requirements of these LPAR2-selective inhibitors. Furthermore, the predictive capability of these models was validated by accurately predicting the antagonistic activities of other types of LPAR2-selective inhibitors (CoMFA-SE, [Formula: see text] = 0.862; CoMSIA-SEHDA, [Formula: see text] = 0.934), confirming the robustness of the optimal 3D QSAR models. The new binding site and the optimal 3D QSAR models will be helpful for designing novel molecules and predicting their inhibitory activity against LPAR2.
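The q2 and r2 statistics quoted for the CoMFA and CoMSIA models can be illustrated with a minimal sketch. It assumes the common definitions r2 = 1 - RSS/SS_tot and leave-one-out q2 = 1 - PRESS/SS_tot, and it swaps in a one-descriptor least-squares line for the partial least squares regression that CoMFA and CoMSIA normally use. The function names and the toy data are hypothetical and do not come from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_q2(xs, ys):
    """Return (r2, leave-one-out q2) for a single-descriptor linear model.

    xs and ys must be lists of floats with at least three points.
    """
    n = len(ys)
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    # Fitted r2: residuals of the model trained on all compounds.
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

    # Cross-validated q2: each compound is predicted by a model
    # trained on the remaining n - 1 compounds (PRESS).
    press = 0.0
    for i in range(n):
        s, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (s * xs[i] + b)) ** 2

    return 1.0 - rss / ss_tot, 1.0 - press / ss_tot


if __name__ == "__main__":
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical field value
    activity = [1.0, 2.1, 2.9, 4.2, 4.8]     # hypothetical pIC50 values
    r2, q2 = r2_and_q2(descriptor, activity)
    print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

Because each left-out residual is at least as large in magnitude as the corresponding fitted residual, q2 cannot exceed r2 under these definitions, which is why a large gap between the two is usually read as a sign of overfitting.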
Active transport of small molecules out of the cells mediated by efflux pumps of the ABC superfamily is an important biochemical phenomenon precluding delivery of drug molecules to the site of action, contributing to loss of efficacy and potential drug-drug interactions. Evaluating the efflux potential of new drug candidates is therefore crucial in the earliest stages of drug discovery. Due to its ubiquitous presence in the tissues and extremely broad substrate range, human P-glycoprotein (P-gp) has one of the largest impacts on drug distribution between tissues. A variety of computational approaches have been proposed to predict P-gp efflux, but their utility is mostly restricted to the classification of drugs into substrates and non-substrates, while the actual quantitative effect of efflux on other ADME processes remains elusive. The primary goal of the current study was to address this shortcoming by utilizing a censored-regression based statistical methodology to develop predictive models for the identification of P-gp substrates that would produce quantitative output in the form of P-gp efflux ratio (ER). The models were trained on a data set of about 3,500 compounds with ER values in exact or censored representation (i.e., only known to fall above or below a certain threshold) and a minimal selection of fundamental physicochemical descriptors, such as LogP, pKa, or McGowan Volume, enabling easy interpretation and offering mechanistic insight into the interplay between passive diffusion and active efflux processes. Moreover, we demonstrated how the described models can be used in practice to evaluate how P-gp efflux affects blood-brain barrier penetration and oral bioavailability.
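The idea of training on exact and censored efflux ratios can be sketched as a Gaussian censored-regression (Tobit-style) likelihood. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a normal error model on a log-transformed ER, and the function names and the censoring codes (0, -1, +1) are hypothetical.

```python
import math


def norm_logpdf(x, mu, sigma):
    """Log density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def censored_nll(y, censor, mu, sigma):
    """Negative log-likelihood for exact and censored observations.

    y      : observed value or censoring threshold (e.g. log ER)
    censor : 0 = exact value, -1 = true value is below y,
             +1 = true value is above y (hypothetical coding)
    mu     : model prediction for each compound
    sigma  : shared residual standard deviation
    """
    tiny = 1e-300  # guards against log(0) for extreme predictions
    total = 0.0
    for yi, ci, mi in zip(y, censor, mu):
        if ci == 0:
            total -= norm_logpdf(yi, mi, sigma)
        elif ci < 0:
            total -= math.log(max(norm_cdf(yi, mi, sigma), tiny))
        else:
            total -= math.log(max(1.0 - norm_cdf(yi, mi, sigma), tiny))
    return total


if __name__ == "__main__":
    # One exact value, one "greater than" value, one "less than" value.
    print(censored_nll([0.5, 1.0, 0.0], [0, 1, -1], [0.4, 1.5, -0.5], 0.3))
```

Minimizing this quantity over the parameters that produce `mu` (for example, coefficients on LogP, pKa, and McGowan volume) lets a compound reported only as "ER above a threshold" contribute information without being assigned an invented exact value.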
Enterovirus A71 (EV-A71), the primary causative agent of hand, foot, and mouth disease (HFMD), can cause severe neurological complications and even death, particularly in young children. Despite the availability of inactivated vaccines, their protective efficacy has been compromised by frequent intra- and intertypic recombination events and ongoing mutations among circulating EV-A71 strains. To address this, we employed immunoinformatic approaches to identify conserved epitopes and construct a multi-epitope vaccine (MEV) candidate against EV-A71. A total of 1,627 structural protein sequences from EV-A71 strains encompassing all major circulating subtypes were retrieved and aligned to generate a consensus sequence. With this consensus sequence, 11 conserved, antigenic, and non-allergenic epitopes capable of eliciting B-cell, T-cell, and interferon-gamma (IFN-γ) responses were identified. The constructed MEV demonstrated superior immunological potential with a high antigenicity score of 0.94 and was predicted to be non-allergenic and non-toxic. Structural characterization via AlphaFold 3 and 300 ns molecular dynamics (MD) simulations confirmed the formation of a stable β-strand framework. Molecular docking followed by trajectory-stabilized interaction analysis revealed that the MEV maintains a high-affinity and stable binding profile with Toll-like receptor 3 (TLR-3). To ensure optimal translational efficiency, the vaccine gene was codon-optimized with a GC content of 52.8%, and the protein was successfully expressed in a bacterial system. Collectively, this study provides a high-performance MEV candidate with robust structural stability and potent immunogenicity, offering a promising and cost-effective strategy for broad-spectrum protection against EV-A71.
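The GC content reported for the codon-optimized gene is a simple ratio, sketched below. The function name and the example sequence are hypothetical; the actual vaccine gene sequence is not given in the text.

```python
def gc_content(sequence):
    """Percentage of G and C among the A/C/G/T bases of a DNA sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Hypothetical fragment, for illustration only.
    print(f"{gc_content('ATGGCCAAGCTGACC'):.1f}% GC")
```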
Pseudomonas aeruginosa is increasingly becoming resistant to multiple drugs and is responsible for high rates of mortality and morbidity across the globe. This study aims to examine outer membrane proteins, such as OprE, OprF, OprC, and OprG, for multi-epitope vaccine design. Epitopes for helper T cells, cytotoxic T cells, and B cells were predicted using various immunoinformatics tools. All the predicted epitopes were aligned and assembled into a peptide sequence along with linkers and adjuvant sequences. Further, secondary structure and three-dimensional structure were predicted for the multiepitope construct. The vaccine construct designs were evaluated and validated for allergenicity, toxicity, and antigenicity. The predicted structure was validated by determining its physicochemical properties with ProtParam. Protein docking and molecular dynamics simulation confirmed strong and stable interaction between the vaccine construct and Toll-like receptor-4. The vaccine construct was cloned into the pET-29b vector and expressed in Escherichia coli. The designed multiepitope vaccine construct was overexpressed, purified by Ni-NTA affinity chromatography, and subjected to SDS-PAGE analysis. Circular dichroism spectroscopy revealed it to be stable and structured. The hemolysis assay demonstrated minimal RBC toxicity, suggesting that it was safe to use. The designed vaccine construct could activate both humoral and cellular immune responses, as demonstrated by the advanced immunoinformatic approach, making it a promising vaccine construct for protection against P. aeruginosa.
Metabolic dysfunction-associated steatotic liver disease (MASLD), previously known as non-alcoholic fatty liver disease, is one of the most prevalent liver diseases globally, contributing to both economic and health-related challenges. We aimed to evaluate the global, regional, and national burden of MASLD from 1990 to 2023, quantify the contribution of identified modifiable risk factors, and project future prevalence up to the year 2050. Estimates of MASLD prevalence and disability-adjusted life-years (DALYs) were produced by age, sex, region, Socio-demographic Index (SDI), and Healthcare Access and Quality (HAQ) index across 204 countries and territories from 1990 to 2023 as part of the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2023. The MASLD burden attributable to three risk factors (smoking, high BMI, and high fasting plasma glucose) was assessed as part of the GBD comparative risk assessment. As a secondary analysis, we used these estimates to forecast MASLD prevalence up to 2050 using fasting plasma glucose and mean BMI as predictors. Furthermore, to examine the relative contributions of population ageing, population growth, and changes in MASLD prevalence rate to the forecasted changes in case counts from 2023 to 2050, we conducted a decomposition analysis. In 2023, approximately 1·3 billion (95% uncertainty interval [UI] 1·2 to 1·4) individuals were estimated to be living with MASLD (ie, 16·1% of the global population), with an age-standardised prevalence rate of 14 429·3 (95% UI 13 268·3 to 15 990·6) per 100 000 population, representing a percentage increase of 142·7% (95% UI 139·2 to 146·7) in crude numbers from 1990 (0·5 billion [0·5 to 0·6]) and of 28·6% (27·8 to 29·5) in the rate (11 217·2 [10 276·8 to 12 467·0] per 100 000 in 1990). An estimated 3·6 million (2·8 to 4·5) total DALYs were attributable to MASLD worldwide in 2023, corresponding to an age-standardised DALY rate of 39·6 (31·2 to 49·9) per 100 000 population. 
Despite a 116·3% (93·3 to 139·4) increase in crude DALYs (from 1·7 million [1·3 to 2·1] in 1990), its age-standardised estimate remained consistent (1·8% [-8·6 to 12·8]) from 1990 (38·9 [30·1 to 49·8] per 100 000) to 2023. There was substantial variation in age-standardised estimates across regions. North Africa and the Middle East had the highest prevalence rate (29 246·1 [26 848·3 to 32 048·7] per 100 000) and Andean Latin America showed the highest DALY rate (152·3 [114·1 to 194·7] per 100 000). By contrast, the high-income Asia Pacific region had the lowest prevalence rate (8653·5 [7923·7 to 9592·8] per 100 000) and east Asia had the lowest DALY rate (16·3 [13·5 to 19·9] per 100 000) among all GBD regions. North Africa and the Middle East showed disproportionately higher prevalence rates relative to other regions with similar SDIs. Lower SDIs and HAQs were associated with higher age-standardised DALY rates. The age-standardised prevalence rate was consistently higher in males (15 616·4 [14 349·2 to 17 263·3] per 100 000 people in 2023) than in females (13 245·2 [12 132·0 to 14 692·6] per 100 000 people), and peaked at age 80-84 years in both sexes. The number of MASLD prevalent cases was the highest in younger adults, peaking at age 35-39 years for males and age 55-59 years for females. Among the risk factors for MASLD, high fasting plasma glucose presented the largest contribution to the age-standardised DALY rate of total MASLD in 2023 (2·2 [95% UI 1·6 to 3·1] per 100 000 people), followed by high BMI (1·4 [0·6 to 2·4] per 100 000 people) and smoking (1·0 [0·3 to 1·8] per 100 000 people). Our forecasting model estimates that 1·8 billion (95% UI 1·6 to 2·0) individuals are likely to have MASLD by 2050, representing a 42·0% increase from 2023. The age-standardised prevalence rate is expected to increase to 15 774·9 (95% UI 14 613·9 to 17 336·2) per 100 000 people in 2050, representing an average annual percentage change of 0·3% (95% UI 0·3-0·3). 
According to our decomposition analysis, this change will be primarily due to population growth, particularly in sub-Saharan Africa and North Africa and the Middle East, and less due to population ageing or epidemiological change. With a global prevalence of 16·1% and approximately 1·3 billion people already living with MASLD in 2023, the condition has had and will continue to have substantial health and economic impacts worldwide. An inverse association between the HAQ Index and age-standardised DALY rates suggests that countries with lower health-care access and quality might be less well positioned to manage the growing MASLD burden, underscoring the need for strengthened health-system capacity in these settings. Funding: Gates Foundation.
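The percentage changes quoted in the MASLD findings follow from simple arithmetic on the reported rates, as the sketch below shows. The compound-growth form used for the average annual change is an assumption; the study may have used a different estimator, and the function names are hypothetical.

```python
import math


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return 100.0 * (end - start) / start


def average_annual_percent_change(start, end, years):
    """Constant yearly growth rate (percent) that takes start to end.

    Assumes compound growth; this is an illustrative choice.
    """
    return 100.0 * math.expm1(math.log(end / start) / years)


if __name__ == "__main__":
    # Age-standardised prevalence per 100 000: 1990 -> 2023.
    print(round(percent_change(11217.2, 14429.3), 1))
    # Age-standardised DALY rate per 100 000: 1990 -> 2023.
    print(round(percent_change(38.9, 39.6), 1))
    # Forecast prevalence rate: 2023 -> 2050 (27 years).
    print(round(average_annual_percent_change(14429.3, 15774.9, 27), 1))
```

Applied to the point estimates in the text, these functions give 28·6% for the prevalence rate, 1·8% for the DALY rate, and roughly 0·3% per year for the forecast. The changes in crude counts cannot be reproduced this way from the rounded billions shown, because the reported percentages were computed from unrounded estimates.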
Balancing molecular flexibility and rigidity remains a key challenge in rational drug design. Here, the Daam1 formin homology 2 (FH2) domain, a regulator of tumor metastasis, was used as a model to investigate the role of flexible side chains in targeted inhibition. Molecular docking, micro-scale thermophoresis, and molecular dynamics simulations showed that the flexible peptide thymopentin binds Daam1 more stably than etoposide through sustained interactions with Arg692 and Asn695. Guided by these insights, the 4' position of etoposide was optimized using free energy perturbation combined with enhanced sampling MD. Two derivatives, V2 and V3, exhibited improved binding free energies. In vitro assays demonstrated that V2 and V3 enhanced inhibition of breast cancer cell migration and reduced Daam1 expression to approximately half of the level observed with etoposide. This study establishes a flexibility-driven optimization strategy and highlights FEP-enhanced MD as a robust framework for rational drug design.
The development of effective antitubercular drugs is necessary because tuberculosis remains a major global healthcare burden. To examine the antimycobacterial properties of fisetin, the current study used an integrated method. The in vitro test against Mycobacterium tuberculosis H37Ra demonstrated notable inhibitory effects, with an MIC of 100 µg/mL and an MBC of 200 µg/mL. In silico ADMET analysis was used to assess the drug-like characteristics and advantageous pharmacokinetic parameters to explore its molecular mechanism. Significant binding affinities were observed in docking studies against eight core proteins of Mtb, with protein kinase B (PknB), a crucial Mtb regulator of cell division and survival, showing the highest affinity. Fisetin’s electron stability and reactivity were revealed by DFT research, indicating that it would be able to bind to biological targets. Specifically, the stability of the fisetin-PknB complex was verified by 100-ns-scale molecular dynamics analysis, which maintained its structure and dynamics. Fisetin was shown to have similar or even better binding qualities than the anti-tubercular medication isoniazid when those were compared using docking and MD analysis. Overall, fisetin is a potential multitarget antitubercular drug that merits additional experimental validation using an integrative approach. The online version contains supplementary material available at 10.1007/s10822-026-00786-6.
In this work, novel quinoline-2-carboxamide-phenylacetamide hybrids 23a-d were designed, synthesised, and evaluated through antimicrobial and breast cancer-cell cytotoxicity screening. The compounds showed moderate antimicrobial activity against methicillin-resistant Staphylococcus aureus (MRSA), Pseudomonas aeruginosa, and Candida albicans, with compound 23b showing the lowest MIC values of 0.125 and 0.039 mg/mL against MRSA and P. aeruginosa, respectively. Cytotoxicity screening against MCF-7 and MDA-MB-231 breast cancer cell lines identified compound 23d as the most active member of the series against MCF-7 cells, with an IC₅₀ of 6.249 ± 0.30 µg/mL, corresponding to 14.55 ± 0.70 µM using the molecular weight applied in this study. Molar comparison showed that the activity profile was cell-line dependent: 23d displayed a lower molar IC₅₀ than cisplatin against MCF-7 under the assay conditions used, whereas cisplatin remained more potent against MDA-MB-231 cells. Molecular docking against activated CDC42-associated kinase 1 (ACK1) suggested that 23d can adopt a favourable predicted binding pose within the ACK1 ATP-binding pocket, with a GlideScore of -8.112 kcal/mol compared with -9.651 kcal/mol for the co-crystallised inhibitor. However, docking alone cannot confirm ACK1 inhibition or establish the mechanism of cytotoxicity. Because non-cancerous-cell cytotoxicity was not performed, cancer-cell selectivity remains unresolved and selectivity indices could not be calculated. Accordingly, compounds 23a-d are presented as preliminary antimicrobial and cytotoxic hits requiring further optimisation, analytical validation, non-cancerous-cell cytotoxicity testing, and mechanistic confirmation.
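The mass-to-molar conversion behind the IC₅₀ of 23d can be checked with a short sketch. The molecular weight is not stated in the text; the value used below is back-calculated from the reported pair (6.249 µg/mL and 14.55 µM) and should be read as an inferred assumption, not a measured property. The function names are hypothetical.

```python
def ug_per_ml_to_um(conc_ug_per_ml, mol_weight):
    """Convert a mass concentration (µg/mL) to micromolar.

    1 µg/mL equals 1 mg/L, so µM = (µg/mL) / (g/mol) * 1000.
    """
    return conc_ug_per_ml / mol_weight * 1000.0


def implied_mol_weight(conc_ug_per_ml, conc_um):
    """Molecular weight (g/mol) implied by a paired µg/mL and µM value."""
    return conc_ug_per_ml / conc_um * 1000.0


if __name__ == "__main__":
    # Inferred from the reported IC50 pair; not given in the source.
    mw = implied_mol_weight(6.249, 14.55)
    print(f"implied molecular weight: {mw:.1f} g/mol")
    print(f"SD of 0.30 µg/mL in µM: {ug_per_ml_to_um(0.30, mw):.2f}")
```

With the implied molecular weight of about 429.5 g/mol, the standard deviation of 0.30 µg/mL converts to 0.70 µM, which agrees with the reported ± 0.70 µM and suggests the mean and its error were converted with the same factor.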
Cancer immunotherapy targeting the PD-1/PD-L1 pathway has transformed modern oncology; however, developing small-molecule inhibitors as viable alternatives to monoclonal antibodies remains a major challenge. In this study, an integrated computational framework is established in which machine learning, molecular docking, molecular dynamics simulations, and binding free-energy calculations are combined to enable the discovery and optimization of novel PD-L1 inhibitors. A validated ML-QSAR model was constructed using the XGBoost algorithm (R2_train = 0.925, R2_test = 0.743) on a dataset of 74 known inhibitors with consistent assay conditions. Through virtual screening of FDA-approved drugs, Pralatrexate was subsequently identified as a promising repurposing candidate, demonstrating a higher predicted binding affinity than the reference inhibitors. Structure-based modification of Pralatrexate yielded the derivative D1, which exhibited improved computational binding properties across all evaluation methods. Molecular dynamics simulations indicated that the D1-PD-L1 complex achieved greater stability than both the free protein and the reference complex, with reduced RMSD fluctuations and preserved key interactions with tyrosine residues Tyr56(A/B). MM-GBSA calculations further confirmed D1's superior binding affinity (-86.21 kcal/mol vs. -73.65 kcal/mol for the reference), and predicted IC50 values suggested enhanced inhibitory potential. This multi-stage computational workflow effectively integrates machine learning predictions with atomic-level binding analyses, providing a robust platform for accelerated drug discovery. The optimized derivative D1 thus represents a promising candidate for experimental validation and further development as a potential cancer immunotherapeutic agent.
Achillea arabica Kotschy, known locally as "Thafera'a" in Saudi Arabia, has been widely used in traditional medicine for treating various human ailments, including diabetes and skin inflammation. In the current investigation, we sought to unravel the phytochemical profile and the antioxidant, antidiabetic, and anti-inflammatory activities of A. arabica ethanolic extract (AAEE) using in vitro and in silico approaches. The extract contained substantial total phenolic and flavonoid content (TPC = 87.15 ± 1.15 mg GAE/g DE and TFC = 26.2 ± 0.15 mg QE/g DE). Furthermore, UHPLC-QTOF-MS2 analysis revealed a broad spectrum of metabolites, chiefly phenolic acids and flavonoids. Key compounds included chlorogenic acid, isorhamnetin, kaempferide, kaempferol-3-O-glucoside, cyanidin-3-O-glucoside, delphinidin-3-O-β-glucopyranoside, naringenin and apigenin. This rich phytochemical profile underpinned the extract's potent bioactivities, as demonstrated by its ability to scavenge DPPH• radicals (IC50 = 135.99 ± 0.87 µg/mL) and ABTS+• radicals (IC50 = 422.02 ± 11.02 µg/mL), reduce metals (FRAP EC50 = 548.70 ± 0.06 µmol Trolox/g dry extract), inhibit the α-amylase enzyme (IC50 = 233.65 ± 5.03 µg/mL), and suppress protein denaturation (IC50 = 138.33 ± 2.23 µg/mL). Docking analysis showed strong binding of flavonoids to the target proteins with energies of -8.3 to -9.8 kcal/mol, while 200 ns molecular dynamics confirmed stable binding of the 1OSE-cosmosiin complex. ADMET predictions indicated favorable pharmacokinetic and safety profiles for naringenin and apigenin, and DFT calculations supported these findings by revealing suitable electronic properties. These results demonstrate that A. arabica is a significant source of biologically active metabolites with therapeutic potency, supporting its traditional medicinal use and warranting further in vivo and clinical investigations to confirm its effectiveness.
Calcineurin represents a prominent target for immunosuppressive drugs, where conventional macrocyclic inhibitors utilize an immunophilin-dependent mechanism for their inhibition but are consequently hindered by adverse effects and variable pharmacokinetics. The flavonoid quercetin has been shown to inhibit calcineurin in a non-competitive, immunophilin-independent manner, but its exact binding site remains ambiguous at the junction between calcineurin subunits A and B, with the open solvent-exposed nature of this region proving challenging to model. In this study, we employ funnel metadynamics with collective variables derived from a deep learning model to identify the binding site of quercetin. A selective strategy using multiple simulations with stricter funnel definitions and a smaller number of carefully chosen descriptors for model training proved more effective than a broad-based approach. These simulations were able to effectively distinguish the binding site of quercetin from three experimentally suggested sites, with a calculated free energy of binding of -8.36 ± 0.60 kcal/mol showing excellent agreement with experiment. Ligand-tryptophan distances similarly corroborated measurements from FRET assays with an r2 of 0.85. The corresponding binding pose showed that quercetin inserts itself into a channel between Arg122, Gly123, and Tyr124 on one side and Phe160, Thr161, and Asn345 on the other, stabilizing its binding through a network of hydrogen bonds. The findings of this study provide insights into the modelling of challenging binding sites and ligands using deep learning driven metadynamics simulations and provide the foundations for rational development of immunophilin-independent inhibitors of calcineurin.
Targeting leucine-rich repeat kinase 2 (LRRK2) has emerged as a promising strategy for the treatment of Parkinson's disease (PD). Here, we report the identification of new LRRK2 inhibitors using a multi-stage virtual screening strategy that integrates molecular docking, AI-driven predictive modeling, molecular dynamics (MD) simulations, and binding free energy change (ΔΔG) calculations. A library of 8,617 drug-like small molecules was screened, and ΔΔG analysis was subsequently used as a post-screening prioritization step to identify candidates predicted to maintain or enhance binding affinity against the pathogenic G2019S mutant. Notably, compound 3 exhibited an IC50 value of 14.21 nM against the wild-type (WT) and 14.75 nM against the G2019S mutant, along with preliminary kinase selectivity in profiling assays. MD simulations further revealed key interaction profiles that stabilize compound binding within the active sites of both WT and G2019S LRRK2. These findings underscore the utility of integrating AI-enhanced virtual screening with ΔΔG-based post-screening prioritization to identify mutation-resilient inhibitors, offering a robust foundation for further optimization and therapeutic development in PD.
Geminiviral C2 protein is a multifunctional protein that serves as a versatile regulatory hub, orchestrating the trans-activation of viral promoters while simultaneously subverting host immunity by inhibiting gene silencing pathways (TGS/PTGS) and disrupting the ubiquitin-proteasome system via the COP9 signalosome. Despite these critical roles, the structural architecture of C2 remains unelucidated, and the presence of predicted intrinsically disordered regions (IDRs), which may facilitate such functional plasticity, has not been explored in detail. In this study, we employ computational modeling to predict the three-dimensional structure of Tomato yellow leaf curl virus (TYLCV) C2 and characterize its disordered regions. Our findings provide a structural rationale for how this small protein coordinates diverse protein-protein interactions, offering new insights into the molecular mechanisms by which geminiviruses hijack host cellular machinery. Computational analyses predicted prominent intrinsically disordered regions between residues 40-120 and highlighted the zinc finger domain as a key structural element. Based on these computational predictions, domain deletion mutants (del NLS, del ZnFn, del AD) and zinc finger point mutants (C37A, C39A) were generated and transiently expressed in Nicotiana benthamiana as CFP fusion proteins. Subcellular localization was assessed by confocal microscopy, while HR induction and H2O2 accumulation were evaluated by visual inspection and DAB staining. Wild-type C2 localized predominantly to the nucleus and was associated with HR-like responses. In contrast, deletion of the NLS, Zinc finger, or activation domain, or cysteine mutations in the zinc finger were associated with loss of detectable HR-like responses. Molecular dynamics simulations further showed that deletion of the zinc finger destabilized the protein, whereas removal of the NLS or activation domain produced more compact conformations. 
Overall, these findings indicate that the zinc finger domain might be important for structural integrity and biological activity, while the NLS appears to be important for nuclear localization, highlighting a potential interplay between ordered domains and predicted intrinsically disordered regions in TYLCV C2.
Large-scale pharmacogenomic screens provide extensive measurements of drug response across diverse cancer cell lines; however, most computational approaches emphasize point-wise sensitivity prediction or static ranking, which are poorly aligned with practical decision-making, where only a limited number of candidate drugs can be tested. We propose NetPolicy-RL, a biologically informed and decision-centric framework for pharmacogenomic drug prioritization that integrates network diffusion modeling with offline reinforcement learning. Drug selection for each cell line is formulated as an offline contextual bandit problem, enabling implicit optimization of ranking quality through a decision-oriented reward formulation rather than surrogate regression objectives. Mechanistic biological context is incorporated by propagating drug targets over curated interaction networks (STRING and Reactome) using random walk with restart, and combining the resulting diffusion profiles with cell-specific molecular importance derived from multi-omics data to compute network disruption scores. These biologically grounded signals are integrated with normalized drug response measurements to construct a joint state representation, which is optimized using an offline actor-critic architecture. Across held-out test splits, NetPolicy-RL consistently outperforms global ranking heuristics and learning-to-rank baselines, achieving statistically significant improvements in per-cell Normalized Discounted Cumulative Gain (NDCG@10) and substantial reductions in per-cell regret. Relative to GlobalTopK, the policy improves NDCG@10 for 88.7% of cell lines, while improvements exceed 95% compared with LambdaMART and regression-to-ranking baselines. Ablation analyses indicate that neither empirical response signals nor network-derived features alone are sufficient within the evaluated setting and that their integration yields the most robust performance. 
Overall, this study demonstrates that combining mechanistic network biology with offline policy learning provides an effective and interpretable framework for drug prioritization in precision oncology.
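Two building blocks named in this abstract, random walk with restart over an interaction network and NDCG@10, can be illustrated with a minimal sketch. It is not the NetPolicy-RL implementation: the function names, the restart probability of 0.3, the dense adjacency-list input, and the use of linear gain in the DCG are all assumptions made for illustration, and relevance values are assumed to be non-negative.

```python
import math


def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=10000):
    """Diffuse probability mass from seed nodes (e.g. drug targets) over a graph.

    adjacency: square list of lists of non-negative edge weights (assumed input format).
    seeds: indices of the seed nodes; restart: probability of jumping back to a seed.
    """
    n = len(adjacency)
    # Column sums are used to column-normalize the adjacency matrix.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    seeds = sorted(set(seeds))
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            inflow = sum(
                adjacency[i][j] / col_sums[j] * p[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            p_next.append((1.0 - restart) * inflow + restart * p0[i])
        # Stop once the L1 change between iterations is below the tolerance.
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


def ndcg_at_k(relevance_in_ranked_order, k=10):
    """NDCG@k for one cell line, given true relevance in the order the policy ranked drugs."""
    def dcg(rels):
        return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevance_in_ranked_order, reverse=True))
    return dcg(list(relevance_in_ranked_order)) / ideal if ideal > 0 else 0.0
```

On a four-node path graph seeded at one end, the diffusion profile decays with distance from the seed, which is the behavior that lets propagated drug targets be compared against cell-specific molecular importance.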
Aqueous solubility is an important property for assessing the druggability and ecotoxicological effects of molecules. Successful drug candidates should have optimal aqueous solubility to improve bioavailability to target tissues. To effectively screen molecules in a short period of time, reliable predictive models are highly useful. In the present study, we conducted a round-robin exercise using a large, curated dataset of over 6000 compounds to predict aqueous solubility quantitatively. The six participating groups used an array of machine learning and deep learning algorithms to develop models with strong robustness and external predictive performance. All the models underwent rigorous leave-one-out and tenfold cross-validation. The diversity of training sets and descriptor types used by the different groups made it possible to explore the mechanistic basis of the contributing features. The best-performing model was selected using the statistical Sum of Ranking Differences (SRD) approach, considering performance on the training, cross-validation, and test sets, as well as the performance difference between the training and test sets. Additionally, a curated, true external set was screened by the six different models. Here, the best-performing model was selected using a consensus ranking strategy based on Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and [Formula: see text]. Under both criteria, namely the inherent model performance in terms of training, test, and cross-validation statistics, and the ability of the model to predict true external data, the Stacking Ensemble of Deep q-RASPR models emerged as the winner. This model performed comparably to a previously reported model whose dataset apparently lacked a proper curation workflow and contained a significant number of duplicates and mixtures, both of which can inflate model statistics.
Comparing the feature contributions reported by the different groups identified the useful structural and physicochemical aspects, which can help synthetic chemists optimize molecules.
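The consensus ranking on the external set can be sketched as follows. This is an illustrative assumption rather than the authors' procedure: the abstract does not say how the per-metric ranks are combined, so the sketch simply averages them, and it uses only MAE and RMSE because the third metric is not recoverable from the text. The function names and the toy numbers in the usage note are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_ranks(values):
    """Rank values ascending (1 = smallest error); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def consensus_ranking(y_true, predictions, metrics=(mae, rmse)):
    """Order models by their mean rank across error metrics (lower is better).

    predictions: dict mapping a model name to its predicted values (assumed format).
    Returns a list of (model_name, mean_rank) pairs, best model first.
    """
    names = list(predictions)
    per_metric = [
        average_ranks([metric(y_true, predictions[name]) for name in names])
        for metric in metrics
    ]
    mean_rank = {
        name: sum(ranks[i] for ranks in per_metric) / len(per_metric)
        for i, name in enumerate(names)
    }
    return sorted(mean_rank.items(), key=lambda item: item[1])
```

For example, `consensus_ranking([1.0, 2.0, 3.0], {"a": [1.0, 2.0, 3.1], "b": [2.0, 3.0, 4.0]})` places model `a` first because it has the lower error on both metrics. A metric where higher is better would need its sign flipped before ranking.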
Pleurocidin is an antimicrobial peptide (AMP) derived from the winter flounder (Pleuronectes americanus) that has attracted significant attention due to its unique structure and potent bioactivities. However, systematic evaluation and visual analysis of the fundamental knowledge, current status, and trends in this field remain insufficient. This study employs data analysis and visualization techniques to systematically assess the research landscape of pleurocidin-like peptides, aiming to identify current hotspots and forecast future directions. We analyzed 111 relevant publications from the Web of Science Core Collection database (2005-2025) using Scimago Graphica, VOSviewer, and CiteSpace. The quantitative evaluation encompassed annual publication trends, geographical and institutional distribution, collaboration networks, and keyword evolution. China and the United States are identified as the most influential countries, with Kyungpook National University in South Korea as the central hub in the collaborative network. Current research on pleurocidin-like peptides focuses on their diverse biological activities and the specific mechanisms behind them, as well as on optimized design and applications, with computer-aided molecular design emerging as a prominent new trend. In addition, various innovative modification strategies have been proposed for optimizing pleurocidin-like peptides and enhancing their stability, efficacy, and biocompatibility. This work aims to establish a theoretical foundation for advancing the applications of pleurocidin-like peptides in aquaculture, food preservation, and the pharmaceutical industry.
With the rapid growth of chemical data and information, there is an increasing need to analyze large chemical datasets and extract their key features. More than 100,000 metal-organic frameworks (MOFs), a class of materials recently recognized by the Nobel Prize in Chemistry, have been experimentally synthesized to date. The performance of MOFs in adsorption-separation applications depends on their specific void characteristics, including void count, spatial distribution, and volume. This study presents a complete workflow, built with Python tools, that covers data collection, recognition of key information, and automatic output of results for void information, solvent-accessible volume (SAV), and adsorbate molecules. By processing 219 CIF files collected from open-access publications, the CCDC, and supporting information files, we extracted 498 blocks in total, including 259 blocks with void information, 157 blocks with SAV data, and 286 blocks with squeeze details, along with 1573 individual voids. In addition, we identified adsorbate molecules (diethyl ether, chloroform, water, ethanol, toluene, carbon dioxide) in MOFs. The method is computationally efficient, requiring only standard CPU resources to process large datasets.
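The block-level extraction described here can be sketched in a few lines of standard-library Python. This is a hedged illustration, not the authors' tool: it assumes void data are stored in a PLATON SQUEEZE loop whose tags begin with `_platon_squeeze_void_`, that each loop row sits on one line, and that volumes carry no standard uncertainty in parentheses. The function names and the sample CIF text are hypothetical, and a line beginning with `data_` inside a multi-line text field would be misread as a new block.

```python
import re

# Made-up CIF fragment used only to exercise the functions below.
SAMPLE = """data_mof1
_cell_length_a 10.0
loop_
_platon_squeeze_void_nr
_platon_squeeze_void_average_x
_platon_squeeze_void_average_y
_platon_squeeze_void_average_z
_platon_squeeze_void_volume
_platon_squeeze_void_count_electrons
_platon_squeeze_void_content
1 0.000 0.000 0.000 512 130 ' '
2 0.500 0.500 0.500 498 127 ' '
_platon_squeeze_details
;
solvent removed with SQUEEZE
;
data_mof2
_cell_length_a 12.0
"""


def split_blocks(cif_text):
    """Return {block_name: [lines]} for every data_ block in a CIF string."""
    blocks, name = {}, None
    for line in cif_text.splitlines():
        match = re.match(r"data_(\S*)", line.strip(), re.IGNORECASE)
        if match:
            name = match.group(1)
            blocks[name] = []
        elif name is not None:
            blocks[name].append(line)
    return blocks


def squeeze_void_volumes(block_lines, prefix="_platon_squeeze_void_"):
    """Collect void volumes from the first loop_ whose tags all start with `prefix`."""
    tags, volumes, state = [], [], "idle"
    for raw in block_lines:
        line = raw.strip()
        if line.lower() == "loop_":
            if state == "rows":
                break  # the void loop has ended
            tags, state = [], "tags"
            continue
        if state == "tags":
            if line.startswith("_"):
                tags.append(line.split()[0].lower())
                continue
            if not line:
                continue
            # First data row reached: decide whether this is the void loop.
            is_void_loop = bool(tags) and all(t.startswith(prefix) for t in tags)
            state = "rows" if is_void_loop and prefix + "volume" in tags else "idle"
        if state == "rows":
            if not line or line[0] in "_#;":
                break
            fields = line.split()
            volumes.append(float(fields[tags.index(prefix + "volume")]))
    return volumes


def summarize(cif_text):
    """Per block: whether any SQUEEZE tag is present, plus the parsed void volumes."""
    summary = {}
    for name, lines in split_blocks(cif_text).items():
        has_squeeze = any(l.strip().lower().startswith("_platon_squeeze") for l in lines)
        summary[name] = {
            "has_squeeze": has_squeeze,
            "void_volumes": squeeze_void_volumes(lines),
        }
    return summary
```

Calling `summarize(SAMPLE)` yields one entry per data block, from which counts such as blocks with squeeze details or the number of individual voids can be tallied. A full CIF parser would be a safer choice for files with quoted values that contain spaces before the volume column.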