Before the current pandemic, influenza and respiratory syncytial virus (RSV) were the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. In this setting, medical doctors typically based the diagnosis of ARI on patients' symptoms alone and did not routinely conduct virological tests necessary to identify individual viruses, limiting the ability to study the interaction between multiple pathogens and to make public health recommendations. We consider a stochastic kinetic model (SKM) for two interacting ARI pathogens circulating in a large population and an empirically-motivated background process for infections with other pathogens causing similar symptoms. An extended marginal sampling approach, based on the linear noise approximation to the SKM, integrates multiple data sources and additional model components. We infer the parameters defining the pathogens' dynamics and interaction within a Bayesian model and explore the posterior trajectories of infections for each illness based on aggregate infection reports from six epidemic seasons collected by the state health department and a subset of virological tests from a sentinel program at a gen
This scientometric study analyzes Avian Influenza research from 2014 to 2023 using bibliographic data from the Web of Science database. We examined publication trends, sources, authorship, collaborative networks, document types, and geographical distribution to gain insights into the global research landscape. Results reveal a steady increase in publications, with high contributions from Chinese and American institutions. Journals such as PLoS One and the Journal of Virology published the highest number of studies, indicating their influence in this field. The most prolific institutions include the Chinese Academy of Sciences and the University of Hong Kong, while the College of Veterinary Medicine at South China Agricultural University emerged as the most productive department. China and the USA lead in publication volume, though developed nations like the United Kingdom and Germany exhibit a higher rate of international collaboration. "Articles" are the most common document type, constituting 84.6% of the total, while "Reviews" account for 7.6%. This study provides a comprehensive view of global trends in Avian Influenza research, emphasizing the need for collaborative efforts ac
Virological measurements are often treated as reports of virion structure, mechanics, dielectric response, infectivity, or titer. In practice, an experiment observes a protocol-conditioned projection of a richer latent virion--environment ensemble. This paper defines this process as experimental collapse within protocol-resolved virophysics. Its central object is the null-inclusive observation operator $P_{\mathrm{obs},t}^{\varnothing}(\,\cdot\mid E\,) = \mathcal{M}_{E,t}^{\varnothing}P_{\mathrm{ref},t}$, which maps a reference latent ensemble to the observed ensemble generated by protocol $E$, including null outcomes. The formulation separates latent-state transformation, detection weighting, readout, and non-observation, making protocol effects explicit components rather than bias terms. The framework introduces protocol-conditioned latent ensembles, collapse functionals, protocol blindness, observation equivalence, Fisher-information observability, inverse inference, and multi-protocol consistency. It identifies collapse mechanisms including preparation, surface immobilization, mechanical loading, field steering, medium filtering, amplification, censoring, and detection threshol
Background and Aims: Pegylated interferon (PEG-IFN) combined with oral antiviral agents is currently the most widely used and highly effective treatment regimen for chronic hepatitis B virus (HBV) infection. Although the regimen effectively suppresses HBV replication, its impact on liver fibrosis and inflammation remains a critical concern for clinicians and patients. Methods: A total of 625 patients who completed 48 weeks of PEG-IFN combined with oral antiviral therapy were enrolled in this real-world study. Based on their virological response at 48 weeks, patients were categorized into a Clearance group and a Non-clearance group. Changes in liver biochemistry, fibrosis, and renal function were compared between groups and before/after treatment. Results: No significant differences were observed in baseline blood tests, liver biochemical markers, or histopathological features between the Clearance group and the Non-clearance group. Similarly, baseline renal function showed no significant variation. Further analysis revealed that the Clearance group exhibited significant aggravation of liver fibrosis after 48 weeks of treatment, which correlated strongly with alterations in liver enzym
HIV-1 replication can be suppressed with antiretroviral therapy (ART), but individuals who stop taking ART soon become viremic again. Some people experience extended periods of detectable viremia despite optimal adherence to ART. In this issue of the JCI, White, Wu, and coauthors elucidate a source of nonsuppressible viremia (NSV) in treatment-adherent patients: clonally expanded T cells harboring HIV-1 proviruses with small deletions or mutations in the 5'-leader, the UTR that includes the major splice donor site of viral RNA. These mutations altered viral RNA-splicing efficiency and RNA dimerization and packaging, yet still allowed production of detectable levels of noninfectious virus particles. These particles lacked the HIV-1 Env surface protein required for cell entry and failed to form the mature capsid cone required for infectivity. These studies improve our understanding of NSV and of the regulation of viral functions in the 5'-leader, with implications for rationalized care in individuals with NSV.
In many areas of systems biology, including virology, pharmacokinetics, and population biology, dynamical systems are commonly used to describe biological processes. These systems can be characterized by estimating their parameters from sampled data. The key problem is how to optimally select sampling points to achieve accurate parameter estimation. Classical approaches often rely on Fisher information matrix-based criteria such as A-, D-, and E-optimality, which require an initial parameter estimate and may yield suboptimal results when the estimate is inaccurate. This study proposes two simulation-based methods for optimal sampling design that do not depend on initial parameter estimates. The first method, E-optimal-ranking (EOR), employs the E-optimal criterion, while the second utilizes a Long Short-Term Memory (LSTM) neural network. Simulation studies based on the Lotka-Volterra and three-compartment models demonstrate that the proposed methods outperform both random selection and classical E-optimal design.
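To make the classical baseline concrete, the sketch below scores sampling designs with the E-optimal criterion (the smallest eigenvalue of the Fisher information matrix) and picks the best design by exhaustive search. It is an illustration only, not the EOR or LSTM method of the abstract: the one-compartment decay model y(t) = A exp(-k t), the nominal values of A, k, and sigma, the candidate time grid, and all function names are assumptions chosen for brevity.

```python
from itertools import combinations

import numpy as np


def fisher_information(times, A=1.0, k=1.0, sigma=1.0):
    """FIM for the hypothetical model y(t) = A * exp(-k * t) with i.i.d. Gaussian noise."""
    t = np.asarray(times, dtype=float)
    decay = np.exp(-k * t)
    # Sensitivities of y with respect to (A, k), evaluated at the nominal parameters.
    jacobian = np.column_stack([decay, -A * t * decay])
    return jacobian.T @ jacobian / sigma**2


def e_criterion(times, **params):
    """E-optimality score: the smallest eigenvalue of the FIM (larger is better)."""
    eigenvalues = np.linalg.eigvalsh(fisher_information(times, **params))
    return float(eigenvalues[0])  # eigvalsh returns eigenvalues in ascending order


def best_e_optimal_design(candidates, n_points, **params):
    """Exhaustively search all n_points-subsets of the candidate sampling times."""
    return max(combinations(candidates, n_points),
               key=lambda design: e_criterion(design, **params))


if __name__ == "__main__":
    grid = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0]  # assumed candidate sampling times
    design = best_e_optimal_design(grid, n_points=2)
    print("selected times:", design, "score:", e_criterion(design))
```

The score depends on the nominal values of A and k passed to `fisher_information`, which is exactly the reliance on an initial parameter estimate that the abstract identifies as a weakness of classical designs.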
This study systematically evaluates 27 frontier Large Language Models on eight biology benchmarks spanning molecular biology, genetics, cloning, virology, and biosecurity. Models from major AI developers released between November 2022 and April 2025 were assessed through ten independent runs per benchmark. The findings reveal dramatic improvements in biological capabilities. Top model performance increased more than 4-fold on the challenging text-only subset of the Virology Capabilities Test over the study period, with OpenAI's o3 now performing twice as well as expert virologists. Several models now match or exceed expert-level performance on other challenging benchmarks, including the biology subsets of GPQA and WMDP and LAB-Bench CloningScenarios. Contrary to expectations, chain-of-thought did not substantially improve performance over zero-shot evaluation, while extended reasoning features in o3-mini and Claude 3.7 Sonnet typically improved performance as predicted by inference scaling. Benchmarks such as PubMedQA and the MMLU and WMDP biology subsets exhibited performance plateaus well below 100%, suggesting benchmark saturation and errors in the underlying benchmark data. The
We develop a continuous mathematical model of population dynamics that describes the sequential emergence of new genotypes under limited resources. The framework models genotype density as a nonlinear flow in mutation space, combining transport driven by a time-dependent mutation rate with logistic growth and nonlocal competition. For the advection-reaction regime without reverse mutations, we derive analytical solutions using the method of characteristics and obtain explicit expressions for time-varying carrying capacities and mutation velocities. We analyze how decaying and accelerating mutation rates shape the saturation and propagation of population fronts through level-set geometry. When reverse mutations are included, the system becomes a quasilinear parabolic equation with diffusion in genotype space; numerical experiments show that backward mutation flows stabilize the dynamics and smooth the evolving fronts. The proposed model generalizes classical quasispecies and Crow-Kimura formulations by incorporating logistic regulation, variable mutation rates, and reversible transitions, offering a unified approach to evolutionary processes relevant to virology, bacterial adaptatio
Investigations on airborne transmission of pathogens constitute a rapidly expanding field, primarily focused on understanding the expulsion patterns of respiratory particulates from infected hosts and their dispersion in confined spaces. Largely overlooked has been the crucial role of fluid dynamics in guiding inhaled virus-laden particulates within the respiratory cavity, thereby directing the pathogens to the infection-prone upper airway sites. Here, we discuss a multi-scale approach for modeling the onset parameters of airway infection based on flow physics. The findings are backed by Large Eddy Simulations of inhaled airflow and computed trajectories of pathogen-bearing aerosols/droplets within two clinically healthy and anatomically realistic airway geometries reconstructed from computed tomography imaging. As a representative anisotropic pathogen that can transmit aerially, we have picked smallpox from the Poxviridae family to demonstrate the approach. The fluid dynamics findings on inhaled transmission trends are integrated with virological and epidemiological parameters for smallpox (e.g., viral concentration in host ejecta, physical properties of virions, and typical expos
We present the Virology Capabilities Test (VCT), a large language model (LLM) benchmark that measures the capability to troubleshoot complex virology laboratory protocols. Constructed from the inputs of dozens of PhD-level expert virologists, VCT consists of $322$ multimodal questions covering fundamental, tacit, and visual knowledge that is essential for practical work in virology laboratories. VCT is difficult: expert virologists with access to the internet score an average of $22.1\%$ on questions specifically in their sub-areas of expertise. However, the most performant LLM, OpenAI's o3, reaches $43.8\%$ accuracy, outperforming $94\%$ of expert virologists even within their sub-areas of specialization. The ability to provide expert-level virology troubleshooting is inherently dual-use: it is useful for beneficial research, but it can also be misused. Therefore, the fact that publicly available models outperform virologists on VCT raises pressing governance considerations. We propose that the capability of LLMs to provide expert-level troubleshooting of dual-use virology work should be integrated into existing frameworks for handling dual-use technologies in the life sciences.
Introduction. Here we report the virological, entomological and epidemiological characteristics of the large autochthonous outbreak of dengue (DENV) that occurred in a small village of the Lombardy region (Northern Italy) during summer 2023. Methods. After the diagnosis of the first autochthonous case on 18 August 2023, public health measures, including epidemiological investigation and vector control measures, were carried out. Serological screening for DENV antibody detection was offered to the population. In the case of positive DENV IgM, a second sample was collected to detect DENV RNA and verify seroconversion. Entomological and epidemiological investigations were also performed. A modeling analysis was conducted to estimate the dengue generation time, transmission potential, and distance of transmission, and to assess diagnostic delays. Results. Overall, 416 subjects participated in the screening program and 20 were identified as DENV-1 cases (15 confirmed and 5 probable). In addition, DENV-1 infection was diagnosed in 24 symptomatic subjects referred to the local Emergency Room Department for suggestive symptoms and 1 case was identified through blood donation screening. The avera
Despite its endemic nature as well as the recent outbreaks, information on the opportunistic DENV in Anambra State has been sparse. This study thus aimed to provide seroepidemiological evidence of past dengue virus infection among HIV-infected patients in Onitsha, Anambra State, Nigeria. Plasma from 94 HIV-infected patients attending Saint Charles Borromeo Hospital, Onitsha, Anambra State, Nigeria was tested for IgG antibodies specific to the dengue virus by IgG ELISA. The prevalence of past dengue virus infection was 61.7% (n = 58/94). Seropositivity was highest among participants aged 0-15 years (77.30%), females (65.1%), married participants (63.9%), and those with no formal education (100.0%). In terms of immunological and virological markers, greater IgG seroprevalence was observed in individuals with a viral load of <40 copies/ml (64.0%) and a CD4 count of >350 cells/ul (63.2%). The high IgG seropositivity for dengue virus (DENV) among HIV-infected individuals in Onitsha is cause for concern.
Maybe not. We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive framework for identifying dataset errors using a novel error annotation protocol. Then, we create MMLU-Redux, which is a subset of 5,700 manually re-annotated questions across all 57 MMLU subjects. We estimate that 6.49% of MMLU questions contain errors. Using MMLU-Redux, we demonstrate significant discrepancies with the model performance metrics that were originally reported. Our results strongly advocate for revising MMLU's error-ridden questions to enhance its future utility and reliability as a benchmark. https://huggingface.co/datasets/edinburgh-dawg/mmlu-redux-2.0.
This demo will present the Research Assistant (RA) tool, developed to assist with six main types of research tasks. Each task is defined as a standardized instruction template, instantiated with user input, and finally applied as a prompt to AI tools well known for their sophisticated natural language processing abilities, such as ChatGPT (https://chat.openai.com/) and Gemini (https://gemini.google.com/app). The six research tasks addressed by RA are: creating FAIR research comparisons, ideating research topics, drafting grant applications, writing scientific blogs, aiding preliminary peer reviews, and formulating enhanced literature search queries. RA's reliance on generative AI tools like ChatGPT or Gemini means the same research task assistance can be offered in any scientific discipline. We demonstrate its versatility by sharing RA outputs in Computer Science, Virology, and Climate Science, where the output produced with the assistance of the RA tool mirrored that of a domain expert who performed the same research task.
It is well known that, during replication, RNA viruses spontaneously generate defective viral genomes (DVGs). DVGs are unable to complete an infectious cycle autonomously, and depend on coinfection with a helper wild-type virus (HV) for their replication and/or transmission. The study of the dynamics arising from a HV and its DVGs has been a longstanding question in virology. It has been shown that DVGs can modulate HV replication and, depending on the strength of interference, result in HV extinctions or self-sustained persistent fluctuations. Extensive experimental work has provided mechanistic explanations for DVG generation and compelling evidence of HV-DVG coevolution. Some of these observations have been captured in mathematical models. Here, we develop and investigate an epidemiological-like mathematical model specifically designed to study the dynamics of betacoronavirus in cell culture experiments. The dynamics of the model are governed by several degenerate normally hyperbolic invariant manifolds given by quasineutral planes, i.e., planes filled by equilibrium points. Three different quasineutral planes have been identified depending on parameters and involving: (i) pers
This essay reviews some key concepts in mathematical epidemiology and examines the intersection of this field with related scientific disciplines, such as chemical reaction network theory and Lagrange-Hamilton geometry. Through a synthesis of theoretical insights and practical perspectives, we underscore the significance of essentially non-negative kinetic systems in the development and implementation of robust epidemiological models. Our purpose is to make the case that mathematical modeling in epidemiology currently focuses too much on simple particular cases and not enough on more complex models, whose challenges would require cooperation with scientific computing experts and with researchers in the "sister disciplines" involving essentially non-negative kinetic systems (such as virology, ecology, chemical reaction networks, and population dynamics).
In this paper, we champion the use of structured and semantic content representation of discourse-based scholarly communication, inspired by tools like Wikipedia infoboxes or structured Amazon product descriptions. These representations provide users with a concise overview, aiding scientists in navigating the dense academic landscape. Our novel automated approach leverages the robust text generation capabilities of LLMs to produce structured scholarly contribution summaries, offering both a practical solution and insights into LLMs' emergent abilities. For LLMs, the prime focus is on improving their general intelligence as conversational agents. We argue that these models can also be applied effectively in information extraction (IE), specifically in complex IE tasks within terse domains like Science. This paradigm shift replaces the traditional modular, pipelined machine learning approach with a simpler objective expressed through instructions. Our results show that finetuned FLAN-T5 with 1000x fewer parameters than the state-of-the-art GPT-davinci is competitive for the task.
Our paper reviews some key concepts in chemical reaction network theory and mathematical epidemiology, and examines their intersection, with three goals. The first is to make the case that mathematical epidemiology (ME), and also related sciences like population dynamics, virology, ecology, etc., could benefit by adopting the universal language of essentially non-negative kinetic systems as developed by chemical reaction network (CRN) researchers. In this direction, our investigation of the relations between CRN and ME led us to propose for the first time a definition of ME models, stated in Open Problem 1. Our second goal is to inform researchers outside ME of the convenient next generation matrix (NGM) approach for studying the stability of boundary points, which does not seem sufficiently well known. Last but not least, we want to help students and researchers who know nothing about either ME or CRN to learn them quickly, by offering them a Mathematica package "BootCamp", located at https://github.com/adhalanay/epidemiology_crns, including illustrating notebooks (and certain sections below will contain associated suggested notebooks; however, readers with experience may safely ski
The COVID-19 pandemic has given rise to numerous articles from different scientific fields (epidemiology, virology, immunology, airflow physics...) without any effort to link these different insights. In this review, we aim to establish relationships between epidemiological data and the characteristics of the virus strain responsible for the epidemic wave concerned. We have carried out this study on the Wuhan, Alpha, Delta and Omicron strains allowing us to illustrate the evolution of the relationships we have highlighted according to these different viral strains. We addressed the following questions: 1) How can the mean infectious dose (one quantum, by definition in epidemiology) be measured and expressed as an amount of viral RNA molecules (in genome units, GU) or as a number of replicative viral particles (in plaque-forming units, PFU)? 2) How many infectious quanta are exhaled by an infected person per unit of time? 3) How many infectious quanta are exhaled, on average, integrated over the whole contagious period? 4) How do these quantities relate to the epidemic reproduction rate R as measured in epidemiology, and to the viral load, as measured by molecular biological methods
Cancer progression and monotonic accumulation models were developed to discover dependencies in the irreversible acquisition of binary traits from cross-sectional data. They have been used in computational oncology and virology but also in widely different problems such as malaria progression. These methods have been applied to predict future states of the system, identify routes of feature acquisition, and improve patient stratification, and they hold promise for evolutionary-based treatments. New methods continue to be developed. But these methods have shortcomings, which are yet to be systematically critiqued, regarding key evolutionary assumptions and interpretations. After an overview of the available methods, we focus on why inferences might not be about the processes we intend. Using fitness landscapes, we highlight difficulties that arise from bulk sequencing and reciprocal sign epistasis, from conflating lines of descent, path of the maximum, and mutational profiles, and from ambiguous use of the idea of exclusivity. We examine how the previous concerns change when bulk sequencing is explicitly considered, and underline opportunities for addressing dependencies due to freq