RNA viruses (e.g., SARS-CoV-2) evolve in a complex manner. Studying RNA virus evolution is vital for understanding molecular evolution and for developing medicines. Scientists lack, however, general frameworks to characterize the dynamics of RNA virus evolution directly from empirical data and to identify potential physical laws. To fill this gap, we present a theory that characterizes RNA virus evolution as a physical system with absorbing states and avalanche behaviors. This approach maps accessible biological data (e.g., phylogenetic trees and infection counts) to a general stochastic process of RNA virus infection and evolution, enabling researchers to verify potential self-organized criticality underlying RNA virus evolution. We apply our framework to SARS-CoV-2, the virus responsible for the global COVID-19 epidemic. We find that SARS-CoV-2 exhibits scale-invariant avalanches, as predicted by mean-field theory. The observed scaling relation, universal collapse, and slowly decaying auto-correlation suggest self-organized critical dynamics in SARS-CoV-2 evolution. Interestingly, the lineages that emerge from critical evolution processes coincidentally match threatening lineages of SARS
Reputedly intractable, the question of the origin of viruses has long been neglected. In the modern literature, 'Virus evolution' has come to refer to studies more akin to population genetics, such as the world-wide scrutiny of new polymorphisms appearing daily in the H5N1 avian flu virus [1], than to the fundamental question: where do viruses come from? This situation is now rapidly changing, due to the coincidence of bold new ideas (and sometimes the revival of old ones), the unexpected features exhibited by recently isolated spectacular viruses [2] (see www.giantvirus.org), and the steady increase of genomic sequences for 'regular' viruses and cellular organisms, which enhances the power of comparative genomics [3]. After being considered non-living and relegated to the wings by a majority of biologists, viruses are now moving back to center stage: they might have been at the origin of DNA, of the eukaryotic cell, and even of today's partition of biological organisms into 3 domains of life: bacteria, archaea and eukarya. Here, I quickly survey some of the recent discoveries and the new evolutionary thoughts they have prompted, before adding to the confusion with one
CRISPR is a newly discovered prokaryotic immune system. Bacteria and archaea with this system incorporate genetic material from invading viruses into their genomes, providing protection against future infection by similar viruses. The conditions for coexistence of prokaryotes and viruses are an interesting problem in evolutionary biology. In this work, we show an intriguing phase diagram of the virus extinction probability, which is more complex than that of the classical predator-prey model. As CRISPR incorporates genetic material, viruses are under pressure to evolve to escape recognition by CRISPR. When bacteria have a small rate of deleting spacers, a new parameter region arises in which bacteria and viruses can coexist, and it leads to a more complex coexistence pattern for bacteria and viruses. For example, when the virus mutation rate is low, the virus extinction probability changes non-monotonically with the bacterial exposure rate. Virus and bacteria co-evolution not only alters the virus extinction probability, but also changes the bacterial population structure. Additionally, we show that recombination is a successful strategy for viruses to escape from CRISPR re
Motivated by observations in sequence data of herpesviruses, we introduce a multi-locus model for the joint evolution of different genotypes in a virus population that is distributed across a population of hosts. In the model, virus particles replicate, recombine, and mutate within their hosts at rates that act on different time scales. Furthermore, virus particles are exchanged between hosts at reinfection events and hosts are replaced by primary infected hosts when they die. We determine the asymptotic type distribution observed in a single host in the limit of large host and virus populations under asymptotic rate assumptions by tracing back the ancestry of the sample. The proposed model may serve as a null model for the evolution of virus populations that are capable of persistence and can be used to estimate the strengths of different evolutionary forces driving genetic diversity, see also [4].
Since its emergence in 1968, influenza A (H3N2) has evolved extensively in genotype and antigenic phenotype. Antigenic evolution occurs in the context of a two-dimensional 'antigenic map', while genetic evolution shows a characteristic ladder-like genealogical tree. Here, we use a large-scale individual-based model to show that evolution in a Euclidean antigenic space provides a remarkable correspondence between model behavior and the epidemiological, antigenic, genealogical and geographic patterns observed in influenza virus. We find that evolution away from existing human immunity results in rapid population turnover in the influenza virus and that this population turnover occurs primarily along a single antigenic axis. Thus, selective dynamics induce a canalized evolutionary trajectory, in which the evolutionary fate of the influenza population is surprisingly repeatable and hence, in theory, predictable.
The paper deals with the setting where two viruses (say virus 1 and virus 2) coexist in a population and are not necessarily mutually exclusive, in the sense that infection by one virus does not preclude simultaneous infection by the other. We develop a coupled bi-virus susceptible-infected-susceptible (SIS) model from a 4^n-state Markov chain model, where n is the number of agents (i.e., individuals or subpopulations) in the population. We identify a sufficient condition for both viruses to eventually die out, and a sufficient condition for the existence, uniqueness and asymptotic stability of the endemic equilibrium of each virus. We establish a sufficient condition and multiple necessary conditions for local exponential convergence to the boundary equilibrium (i.e., one virus persists, the other one dies out) of each virus. Under mild assumptions on the healing rate, we show that there cannot exist a coexisting equilibrium where for each node there is a nonzero fraction infected only by virus 1; a nonzero fraction infected only by virus 2; but no fraction that is infected by both viruses 1 and 2. Likewise, assuming that healing rates are strictly p
This paper presents a general overview of the evolution of concealment methods in computer viruses and of the defensive techniques employed by anti-virus products. To evade anti-virus scanners, computer viruses gradually improve their code to make it invisible. In turn, anti-virus technologies continually track virus tricks and methodologies to counter their threats. In this process, anti-virus experts design and develop new methodologies that make their products stronger every day. The purpose of this paper is to review these methodologies and outline their strengths and weaknesses, to encourage those who are interested to investigate these areas further.
We test whether artificial intelligence architectural evolution obeys the same statistical laws as biological evolution. Compiling 935 ablation experiments from 161 publications, we show that the distribution of fitness effects (DFE) of architectural modifications follows a heavy-tailed Student's t-distribution with proportions (68% deleterious, 19% neutral, 13% beneficial for major ablations, n=568) that place AI between compact viral genomes and simple eukaryotes. The DFE shape matches D. melanogaster (normalized KS=0.07) and S. cerevisiae (KS=0.09); the elevated beneficial fraction (13% vs. 1-6% in biology) quantifies the advantage of directed over blind search while preserving the distributional form. Architectural origination follows logistic dynamics (R^2=0.994) with punctuated equilibria and adaptive radiation into domain niches. Fourteen architectural traits were independently invented 3-5 times, paralleling biological convergences. These results demonstrate that the statistical structure of evolution is substrate-independent, determined by fitness landscape topology rather than the mechanism of selection.
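The comparison of distribution shapes described above can be illustrated with a two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical cumulative distribution functions. The following is a minimal sketch only: the abstract does not state how its KS values were normalized, so treating them as this plain statistic is an assumption, and the function name `ks_statistic` and the sample values are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical CDFs
    of the two samples. Illustrative helper, not taken from the paper.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        # Step past every observation equal to the next smallest value,
        # so ties in either sample are handled together.
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


if __name__ == "__main__":
    # Hypothetical fitness effects (relative performance change), for illustration.
    ablation_effects = [-0.30, -0.12, -0.05, 0.00, 0.02]
    reference_effects = [-0.25, -0.10, -0.04, 0.00, 0.01]
    print(ks_statistic(ablation_effects, reference_effects))
```

A value near 0 means the two empirical distributions almost coincide, and a value of 1 means they do not overlap at all.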
Most viruses are capable of fixing up the first few bytes and repairing the original program, because they have to return control to the infected program. A heuristic cleaner exploits this fact to clean the infected file: because the virus knows how to repair the file, the cleaner uses the virus's own repair routine to restore it. In some infections, parts of the file are damaged by the virus. These infections are caused by 'file modifying viruses'. In these cases the chance of recovery is lower, but the anti-virus still has to try various methods. The virus cleaner must know the characteristics of a virus in order to remove it. It cannot remove an unknown virus whose methods of infection are not known. If a virus is misidentified as a different virus, the cleaner will perform the wrong operations and produce a garbage file.
Influenza virus contains two highly variable envelope glycoproteins, hemagglutinin (HA) and neuraminidase (NA). The structure and properties of HA, which is responsible for binding the virus to the cell that is being infected, change significantly when the virus is transmitted from avian or swine species to humans. Here we focus on much smaller human individual evolutionary amino acid mutational changes in NA, which cleaves sialic acid groups and is required for influenza virus replication. We show that very small amino acid changes can be monitored very accurately across many Uniprot and NCBI strains using hydropathicity scales to quantify the roughness of water film packages. Quantitative sequential analysis is most effective with the differential hydropathicity scale based on protein self-organized criticality (SOC). NA exhibits punctuated evolution at the molecular scale, millions of times smaller than the more familiar species scale, and thousands of times smaller than the genomic scale. Our analysis shows that large-scale vaccination programs have been responsible for a very large convergent reduction in influenza severity in the last century, a reduction which is hidden from
Recent studies show that newly sampled monkeypox virus (MPXV) genomes exhibit mutations consistent with Apolipoprotein B mRNA Editing Catalytic Polypeptide-like3 (APOBEC3)-mediated editing, compared to MPXV genomes collected earlier. It is unclear whether these single nucleotide polymorphisms (SNPs) result from APOBEC3-induced editing or are a consequence of genetic drift within one or more MPXV animal reservoirs. We develop a simple method based on a generalization of the General-Time-Reversible (GTR) model to show that the observed SNPs are likely the result of APOBEC3-induced editing. The statistical features allow us to extract lineage information and estimate evolutionary events.
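For orientation, the standard General-Time-Reversible model that the method above generalizes can be sketched as follows. The abstract does not specify the generalization, so this block shows only the textbook GTR rate matrix, in which the rate from nucleotide i to j is a symmetric exchangeability times the equilibrium frequency of j. The function name, exchangeability values and frequencies are hypothetical and chosen purely for illustration.

```python
NUCLEOTIDES = "ACGT"

# Hypothetical equilibrium frequencies, in the order A, C, G, T.
EXAMPLE_FREQUENCIES = [0.33, 0.17, 0.17, 0.33]

# Hypothetical symmetric exchangeabilities, one per unordered nucleotide pair.
EXAMPLE_EXCHANGEABILITIES = {
    frozenset("AC"): 1.0,
    frozenset("AG"): 4.0,
    frozenset("AT"): 1.0,
    frozenset("CG"): 1.0,
    frozenset("CT"): 4.0,
    frozenset("GT"): 1.0,
}


def gtr_rate_matrix(exchangeabilities, frequencies):
    """Build the 4x4 GTR rate matrix Q with Q[i][j] = s_ij * pi_j for i != j.

    Diagonal entries are set so that every row sums to zero.
    """
    size = len(NUCLEOTIDES)
    q = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:
                pair = frozenset((NUCLEOTIDES[i], NUCLEOTIDES[j]))
                q[i][j] = exchangeabilities[pair] * frequencies[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 when the row is summed
    return q


def is_time_reversible(q, frequencies, tolerance=1e-12):
    """Check detailed balance: pi_i * Q[i][j] == pi_j * Q[j][i] for all pairs."""
    size = len(q)
    return all(
        abs(frequencies[i] * q[i][j] - frequencies[j] * q[j][i]) < tolerance
        for i in range(size)
        for j in range(size)
    )


if __name__ == "__main__":
    rate_matrix = gtr_rate_matrix(EXAMPLE_EXCHANGEABILITIES, EXAMPLE_FREQUENCIES)
    for row in rate_matrix:
        print(["%.3f" % value for value in row])
    print(is_time_reversible(rate_matrix, EXAMPLE_FREQUENCIES))
```

Because the exchangeabilities are symmetric, this matrix always satisfies detailed balance. A strongly directional editing process would presumably require relaxing that symmetry, which may be one motivation for generalizing the model, though the abstract does not say so.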
Influenza virus contains two highly variable envelope glycoproteins, hemagglutinin (HA) and neuraminidase (NA). The structure and properties of HA, which is responsible for binding the virus to the cell that is being infected, change significantly when the virus is transmitted from avian or swine species to humans. Previously we identified much smaller human individual evolutionary amino acid mutational changes in NA, which cleaves sialic acid groups and is required for influenza virus replication. We showed that these smaller changes can be monitored very accurately across many Uniprot and NCBI strains using hydropathicity scales to quantify the roughness of water film packages, which increases gradually due to migration, but decreases abruptly under large-scale vaccination pressures. Here we show that, while HA evolution is much more complex, it still shows abrupt punctuation changes linked to those of NA. HA exhibits proteinquakes, which resemble earthquakes and are related to hydropathic shifting of sialic acid binding regions. HA proteinquakes based on sialic acid interactions are required for optimal balance between the receptor-binding and receptor-destroying activities of H
The evolution of the hemagglutinin amino acid sequences of Influenza A virus is studied by a method based on an information metric, originally introduced by Rohlin for partitions in abstract probability spaces. This metric does not require any prior functional or syntactic knowledge about the sequences, and it is sensitive to correlated variations in the arrangement of characters. Its efficiency is improved by algorithmic tools designed to enhance the detection of novelty and to reduce the noise of useless mutations. We focus on the USA data from 1993/94 to 2010/2011 for A/H3N2 and on USA data from 2006/07 to 2010/2011 for A/H1N1. We show that clustering the distance matrix gives strong evidence of a structure of domains in the sequence space, acting as weak attractors for the evolution, in very good agreement with the epidemiological history of the virus. The structure proves very robust with respect to variations of the clustering parameters, and extremely coherent when the observation window is restricted. The results suggest an efficient strategy in the vaccine forecast, based on the presence of "precursors" (or "buds") populating the most recent
Because anti-virus software runs at a trusted kernel level, any loophole in the anti-virus program can enable attackers to take full control of the computer system and steal data or cause serious damage. Hence anti-virus engines must be developed with proper security in mind. The anti-virus should be able to handle any type of specially crafted executable file, compression package or document that is intentionally created to exploit weaknesses in the anti-virus. Viruses are present in almost every system even when anti-virus software is installed. This is because every anti-virus, however good it may be, produces some false positives and false negatives. Our faith in the anti-virus system often makes us more careless about hygienic habits, which increases the possibility of infection. It is necessary for an anti-virus to detect and destroy the malware before its own files are detected and destroyed by the malware.
We have further extended our compartmental model describing the spread of the infection in Italy. The model is based on the assumption that the time evolution of all of the observable quantities (the number of people still positive for the infection, hospitalized cases and fatalities, recovered people, and the total number of people who have contracted the infection) depends on average parameters, namely the people diffusion coefficient, the infection cross-section, and the population density. The model provides valuable information on the tight relationship between the variation in reported infection cases and a well-defined observable physical quantity: the average number of people who lie within the daily displacement area of any single person. The extended model now includes a self-consistent evaluation of the reproduction index, the effect of immunization due to vaccination, and the potential impact of virus variants on the dynamical evolution of the outbreak. The model fits the epidemic data very well, and allows us to strictly relate the time evolution of the number of hospitalized cases and fatalities to the change in people's mobility, the vaccination rate, and the appearance of an initial concentration o
A system of 4 nonlinearly-coupled Ordinary Differential Equations has been recently introduced to investigate the evolution of human respiratory virus epidemics. In this paper we point out that some explicit solutions of that system can be obtained by algebraic operations, provided the parameters of the model satisfy certain constraints.
Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modelled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modelling both "smooth" conformational changes and "catastrophic" conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence-structure evolutionary motif present in a large number of
In this article, I put forward the idea that the neoplastic process (NP) has deep evolutionary roots and make specific predictions about the connection between cancer and the formation of the first embryo, which allowed for the evolutionary radiation of metazoans. My main hypothesis is that the NP is at the heart of cellular mechanisms responsible for animal morphogenesis and, given its embryological basis, also at the center of animal evolution. It is thus understood that NP-associated mechanisms are deeply rooted in evolutionary history and tied to the formation of the first animal embryo. In my consideration of these arguments, I expound on how cancer biology is perfectly intertwined with evolutionary biology. I describe essential cellular components of unicellular holozoans that served as a basis for the formation of the neoplastic functional module (NFM) and its subsequent exaptation, which brought forth two great biophysical revolutions within the first embryo. Finally, I examine the role of Physics in the modeling of the NFM and its contribution to morphogenesis to reveal the totipotency of the zygote.
In the present work we analyze the problem of adaptation and evolution of RNA virus populations, by defining the basic stochastic model as a multivariate branching process in close relation with the branching process advanced by Demetrius, Schuster and Sigmund ("Polynucleotide evolution and branching processes", Bull. Math. Biol. 46 (1985) 239-262), in their study of polynucleotide evolution. We show that in the absence of beneficial forces the model is exactly solvable. As a result it is possible to prove several key results directly related to known typical properties of these systems like (i) proof, in the context of the theory of branching processes, of the lethal mutagenesis criterion proposed by Bull, Sanjuán and Wilke ("Theory of lethal mutagenesis for viruses", J. Virology 18 (2007) 2930-2939); (ii) a new proposal for the notion of relaxation time with a quantitative prescription for its evaluation and (iii) the quantitative description of the evolution of the expected values in four distinct regimes: transient, "stationary" equilibrium, extinction threshold and lethal mutagenesis. Moreover, new insights on the dynamics of evolving virus populations can be foreseen.
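To make the extinction threshold mentioned above concrete, the following minimal sketch applies the standard classification of a multitype branching process by the largest eigenvalue of its mean matrix: a value above 1 means the population can grow, while a value at or below 1 means it dies out. The toy mean matrix is an assumption made for illustration and is not the paper's model: particles fall into replicative classes with hypothetical mean offspring numbers, and each offspring either stays in its parent's class or, with probability `deleterious_prob`, drops one class. There are no beneficial mutations. All function and parameter names are hypothetical.

```python
def mean_matrix(fitness, deleterious_prob):
    """Toy mean matrix M, where M[i][j] is the expected number of class-i
    offspring produced by one class-j parent.

    fitness[j] is the mean offspring number of class j (an assumed input).
    """
    size = len(fitness)
    matrix = [[0.0] * size for _ in range(size)]
    for parent, mean_offspring in enumerate(fitness):
        if parent == 0:
            # The lowest class has nowhere to drop to.
            matrix[0][0] = mean_offspring
        else:
            matrix[parent][parent] = mean_offspring * (1.0 - deleterious_prob)
            matrix[parent - 1][parent] = mean_offspring * deleterious_prob
    return matrix


def spectral_radius(matrix, iterations=500):
    """Estimate the dominant eigenvalue of a non-negative matrix by power iteration."""
    size = len(matrix)
    vector = [1.0 / size] * size  # strictly positive start, entries sum to 1
    growth = 0.0
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)
        ]
        growth = sum(product)  # growth factor, since the vector sums to 1
        if growth == 0.0:
            return 0.0
        vector = [value / growth for value in product]
    return growth


def is_supercritical(fitness, deleterious_prob):
    """True when the mean population size grows in the long run."""
    return spectral_radius(mean_matrix(fitness, deleterious_prob)) > 1.0


if __name__ == "__main__":
    classes = [0.0, 1.0, 2.0, 3.0]  # hypothetical mean offspring per class
    for u in (0.5, 0.8):
        print(u, spectral_radius(mean_matrix(classes, u)), is_supercritical(classes, u))
```

In this toy setting the dominant eigenvalue is the largest fitness times (1 - deleterious_prob), so raising the deleterious mutation probability far enough pushes the process below the threshold. This is the qualitative content of a lethal mutagenesis criterion, shown here under the simplifying assumptions stated above.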
The evolutionary dynamics of human Influenza A virus presents a challenging theoretical problem. An extremely high mutation rate allows the virus to escape, at each epidemic season, the host immune protection elicited by previous infections. At the same time, in any given epidemic season a single quasi-species, that is, a set of closely related strains, is observed. A non-trivial relation between the genetic (i.e., at the sequence level) and the antigenic (i.e., related to the host immune response) distances can shed light on this puzzle. In this paper we introduce a model in which, in accordance with experimental observations, a simple interaction rule based on spatial correlations among point mutations dynamically defines an immunity space in the space of sequences. We investigate the static and dynamic structure of this space and we discuss how it affects the dynamics of the virus-host interaction. Interestingly, we observe a staggered time structure in the virus evolution, as in the real Influenza evolutionary dynamics.