There have been several studies of the genome-wide temporal transcriptional program of viruses, based on microarray experiments, which are generally useful in the construction of gene regulation networks. The biological interpretations in these studies appear to be based directly on the normalized data and some crude statistics, which provide rough estimates of limited features of the profile and may incur biases. This paper introduces a hierarchical Bayesian shape-restricted regression method for making inference on the time course expression of virus genes. Estimates of many salient features of the expression profile, such as onset time, inflection point, maximum value, time to maximum value, and area under the curve, can be obtained immediately by this method. Applying this method to a baculovirus microarray time course expression data set, we show that many biological questions can be formulated quantitatively, and we are able to offer insights into baculovirus biology.
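To make the profile features named above concrete, the following is a minimal sketch that computes plug-in versions of onset time, maximum value, time to maximum, and area under the curve for a single sampled expression curve. It is not the hierarchical Bayesian shape-restricted regression of the paper. The function name `profile_features`, the `onset_fraction` threshold, and the definition of onset as the first sampled time at which expression reaches a fraction of the maximum are all assumptions made for illustration.

```python
def profile_features(times, values, onset_fraction=0.1):
    """Plug-in summaries of one expression curve sampled at increasing times.

    Assumption (illustrative only): onset is the first sampled time at which
    the expression reaches `onset_fraction` of the observed maximum.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time/value samples")

    # Index of the observed maximum (first occurrence if there are ties).
    peak_index = max(range(len(values)), key=lambda i: values[i])
    max_value = values[peak_index]
    time_to_max = times[peak_index]

    # Area under the curve by the trapezoidal rule.
    auc = sum(
        (times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2.0
        for i in range(len(times) - 1)
    )

    # First sampled time at which the curve reaches the onset threshold.
    threshold = onset_fraction * max_value
    onset_time = next(t for t, v in zip(times, values) if v >= threshold)

    return {
        "onset_time": onset_time,
        "max_value": max_value,
        "time_to_max": time_to_max,
        "auc": auc,
    }
```

Such raw summaries are exactly the kind of crude statistics the abstract cautions about. A model-based fit would instead derive these features from an estimated smooth curve and report their uncertainty.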
Motivated by observations in sequence data of herpesviruses, we introduce a multi-locus model for the joint evolution of different genotypes in a virus population that is distributed across a population of hosts. In the model, virus particles replicate, recombine, and mutate within their hosts at rates that act on different time scales. Furthermore, virus particles are exchanged between hosts at reinfection events and hosts are replaced by primary infected hosts when they die. We determine the asymptotic type distribution observed in a single host in the limit of large host and virus populations under asymptotic rate assumptions by tracing back the ancestry of the sample. The proposed model may serve as a null model for the evolution of virus populations that are capable of persistence and can be used to estimate the strengths of different evolutionary forces driving genetic diversity, see also [4].
The paper deals with the setting where two viruses (say virus 1 and virus 2) coexist in a population, and they are not necessarily mutually exclusive, in the sense that infection due to one virus does not preclude the possibility of simultaneous infection due to the other. We develop a coupled bi-virus susceptible-infected-susceptible (SIS) model from a 4^n-state Markov chain model, where n is the number of agents (i.e., individuals or subpopulations) in the population. We identify a sufficient condition for both viruses to eventually die out, and a sufficient condition for the existence, uniqueness and asymptotic stability of the endemic equilibrium of each virus. We establish a sufficient condition and multiple necessary conditions for local exponential convergence to the boundary equilibrium (i.e., one virus persists, the other one dies out) of each virus. Under mild assumptions on the healing rate, we show that there cannot exist a coexisting equilibrium where for each node there is a nonzero fraction infected only by virus 1; a nonzero fraction infected only by virus 2; but no fraction that is infected by both viruses 1 and 2. Likewise, assuming that healing rates are strictly p
Most viruses are capable of fixing up the first few bytes and repairing the original program, because they have to return control to the infected program. A heuristic cleaner uses this fact to clean the infected file: because the virus knows how to repair the original program, the cleaner uses the virus itself to repair the infected file. In some infections, parts of the file are damaged by the virus. These infections are caused by 'file modifying viruses'. In these cases the chance of recovery is lower, but the anti-virus must still attempt various recovery methods. The virus cleaner must know the characteristics of a virus in order to remove it. It cannot remove an unknown virus whose methods of infection are not known. If a virus is wrongly identified as a different virus, the cleaner will perform the wrong operations and produce a garbage file.
We study a minimal stochastic individual-based model for a microbial population challenged by a persistent (lytic) virus epidemic. We focus on the situation in which the resident microbial host population and the virus population are in stable coexistence upon arrival of a single new ``mutant'' host individual. We assume that this mutant is capable of switching to a reversible state of dormancy upon contact with virions as a means of avoiding infection by the virus. At the same time, we assume that this new dormancy trait comes with a cost, namely a reduced individual reproduction rate. We prove that there is a non-trivial range of parameters where the mutants can nevertheless invade the resident population with strictly positive probability (bounded away from 0) in the large population limit. Given the reduced reproductive rate, such an invasion would be impossible in the absence of either the dormancy trait or the virus epidemic. We explicitly characterize the parameter regime where this emergence of a (costly) host dormancy trait is possible, determine the success probability of a single invader and the typical amount of time it takes the successful mutants to reach a macroscopi
Spatial transcriptomics (ST) is a novel technology that enables the observation of gene expression at the resolution of individual spots within pathological tissues. ST quantifies the expression of tens of thousands of genes in a tissue section; however, heavy observational noise is often introduced during measurement. In prior studies, to ensure meaningful assessment, both training and evaluation have been restricted to only a small subset of highly variable genes, and genes outside this subset have also been excluded from the training process. However, since there are likely co-expression relationships between genes, low-expression genes may still contribute to the estimation of the evaluation target. In this paper, we propose Auxiliary Gene Learning (AGL) that utilizes the benefit of the ignored genes by reformulating their expression estimation as auxiliary tasks and training them jointly with the primary tasks. To effectively leverage auxiliary genes, we must select a subset of auxiliary genes that positively influence the prediction of the target genes. However, this is a challenging optimization problem due to the vast number of possible combinations. To overcome this
This paper studies a distributed continuous-time bi-virus model in which two competing viruses spread over a network consisting of multiple groups of individuals. Limiting behaviors of the network are characterized by analyzing the equilibria of the system and their stability. Specifically, when the two viruses spread over possibly different directed infection graphs, the system may have (1) a unique equilibrium, the healthy state, which is globally stable, implying that both viruses will eventually be eradicated, (2) two equilibria including the healthy state and a dominant virus state, which is almost globally stable, implying that one virus will pervade the entire network causing a single-virus epidemic while the other virus will be eradicated, or (3) at least three equilibria including the healthy state and two dominant virus states, depending on certain conditions on the healing and infection rates. When the two viruses spread over the same directed infection graph, the system may have zero or infinitely many coexisting epidemic equilibria, which represents the pervasion of the two viruses. Sensitivity properties of some nontrivial equilibria are investigated in the context of
In the present work, we further study the computational power of virus machines (VMs for short). VMs provide a computing paradigm inspired by the transmission and replication networks of viruses. VMs consist of process units (called hosts) structured by a directed graph whose arcs are called channels, and an instruction graph that controls the transmission of virus objects among hosts. The present work complements our understanding of the computing power of VMs by introducing normal forms; these expressions restrict the features in a given computing model. Some of the features that we restrict in our normal forms include (a) the number of hosts, (b) the number of instructions, and (c) the number of virus objects in each host. After recalling some known results on the computing power of VMs, we give our series of normal forms, restricting features such as the size of the loops in the network, and prove new characterisations of families of sets, such as finite sets, semilinear sets, or recursively enumerable sets (NRE).
Because anti-viruses run at a trusted kernel level, any loophole in the anti-virus program can enable attackers to take full control of the computer system and steal data or cause serious damage. Hence anti-virus engines must be developed with proper security in mind. The anti-virus should be able to handle any type of specially crafted executable file, compression package or document that is intentionally created to exploit a weakness in the anti-virus. Viruses are present in almost every system even when an anti-virus is installed. This is because every anti-virus, however good it may be, produces some false positives and false negatives. Our faith in the anti-virus system often makes us more careless about hygienic habits, which increases the possibility of infection. An anti-virus must detect and destroy the malware before its own files are detected and destroyed by the malware.
The recently discovered Acanthamoeba polyphaga Mimivirus is the largest known DNA virus. Its particle size (>400 nm), genome length (1.2 million bp) and large gene repertoire (911 protein coding genes) blur the established boundaries between viruses and parasitic cellular organisms. In addition, the analysis of its genome sequence identified new types of genes not expected to be seen in a virus, such as aminoacyl-tRNA synthetases and other central components of the translation machinery. In this article, we examine how the finding of a giant virus that, for the first time, overlaps with the world of cellular organisms in terms of size and genome complexity might durably influence the way we look at microbial biodiversity, and force us to fundamentally revise our classification of life forms. We propose to introduce the word "girus" to recognize the intermediate status of these giant DNA viruses, whose genome complexity makes them closer to small parasitic prokaryotes than to regular viruses.
Viruses are incapable of autonomous energy production. Although many experimental studies make it clear that viruses are parasitic entities that hijack the host's molecular resources, a detailed estimate for the energetic cost of viral synthesis is largely lacking. To quantify the energetic cost of viruses to their hosts, we enumerated the costs associated with two very distinct but representative DNA and RNA viruses, namely T4 and influenza. We found that for these viruses, translation of viral proteins is the most energetically expensive process. Interestingly, the costs of building a single T4 phage and a single influenza virus are nearly the same. Due to influenza's higher burst size, however, the overall cost of a T4 phage infection is only 2-3% of the cost of an influenza infection. The costs of these infections relative to their host's estimated energy budget during the infection reveal that a T4 infection consumes about a third of its host's energy budget, whereas an influenza infection consumes only 1%. Building on our estimates for T4, we show how the energetic costs of double-stranded DNA viruses scale with virus size, revealing that the dominant cost of building a virus can sw
Understanding how genes interact and relate to each other is a fundamental question in biology. However, current practices for describing these relationships, such as drawing diagrams or graphs in a somewhat arbitrary manner, limit our ability to integrate various aspects of gene function and view the genome holistically. To overcome these limitations, we need a more appropriate way to describe the intricate relationships between genes. Interestingly, category theory, an abstract field of mathematics seemingly unrelated to biology, has emerged as a powerful language for describing relations in general. We propose that category theory could provide a framework for unifying our knowledge of genes and their relationships. As a starting point, we construct a category of genes, with its morphisms abstracting various aspects of the relationships between genes. These relationships include, but are not limited to, the order of genes on the chromosomes, the physical or genetic interactions, the signalling pathways, the gene ontology causal activity models (GO-CAM) and gene groups. Previously, they were encoded by miscellaneous networks or graphs, while our work unifies them in a consisten
RNA viruses (e.g., SARS-CoV-2) evolve in a complex manner. Studying RNA virus evolution is vital for understanding molecular evolution and for medicine development. Scientists lack, however, general frameworks to characterize the dynamics of RNA virus evolution directly from empirical data and to identify potential physical laws. To fill this gap, we present a theory that characterizes RNA virus evolution as a physical system with absorbing states and avalanche behaviors. This approach maps accessible biological data (e.g., phylogenetic tree and infection) to a general stochastic process of RNA virus infection and evolution, enabling researchers to verify potential self-organized criticality underlying RNA virus evolution. We apply our framework to SARS-CoV-2, the virus responsible for the global COVID-19 epidemic. We find that SARS-CoV-2 exhibits scale-invariant avalanches, as predicted by mean-field theory. The observed scaling relation, universal collapse, and slowly decaying auto-correlation suggest self-organized critical dynamics of SARS-CoV-2 evolution. Interestingly, the lineages that emerge from critical evolution processes coincidentally match with threatening lineages of SARS
This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis. SMAS effectively combines gene selection based on statistical significance and expression changes, employing linear classifiers such as logistic regression to accurately differentiate between RT-qPCR positive and negative NHP samples. A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6 and IFI27, genes, including MX1, OAS1, and ISG15, were significantly upregulated, highlighting their essential roles in the immune response to EBOV. Our results underscore the efficacy of the SMAS method in revealing complex genetic interactions and response mechanisms during EBOV infection. This
Determining the full complement of protein-coding genes is a key goal of genome annotation. The most powerful approach for confirming protein coding potential is the detection of cellular protein expression through peptide mass spectrometry experiments. Here we map the peptides detected in 7 large-scale proteomics studies to almost 60% of the protein coding genes in the GENCODE annotation of the human genome. We find that conservation across vertebrate species and the age of the gene family are key indicators of whether a peptide will be detected in proteomics experiments. We find peptides for most highly conserved genes and for practically all genes that evolved before bilateria. At the same time there is almost no evidence of protein expression for genes that have appeared since primates, or for genes that do not have any protein-like features or cross-species conservation. We identify 19 non-protein-like features such as weak conservation, no protein features or ambiguous annotations in major databases that are indicators of low peptide detection rates. We use these features to describe a set of 2,001 genes that are potentially non-coding, and show that many of these genes behave m
The genome of bacterial species is much more flexible than that of eukaryotes. Moreover, the distributed genome hypothesis for bacteria states that the total number of genes present in a bacterial population is greater than the number of genes in the genome of any single individual. The pangenome, i.e. the set of all genes of a bacterial species (or a sample), comprises the core genes, which are present in all living individuals, and accessory genes, which are carried only by some individuals. In order to use accessory genes for adaptation to environmental forces, genes can be transferred horizontally between individuals. Here, we extend the infinitely many genes model from Baumdicker, Hess and Pfaffelhuber (2010) for horizontal gene transfer. We take a genealogical view and give a construction -- called the Ancestral Gene Transfer Graph -- of the joint genealogy of all genes in the pangenome. As an application, we compute moments of several statistics (e.g. the number of differences between two individuals and the gene frequency spectrum) under the infinitely many genes model with horizontal gene transfer.
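The two statistics named above can be illustrated on observed data. The following is a minimal sketch, assuming each individual's genome is represented as a collection of gene identifiers. The function names and this representation are hypothetical. The sketch computes the empirical statistics from a sample, not their moments under the infinitely many genes model.

```python
from collections import Counter


def gene_frequency_spectrum(genomes):
    """Return a list whose entry k-1 is the number of genes carried by exactly k individuals."""
    n = len(genomes)
    # Count, for every gene, how many individuals carry it (each individual counted once).
    carriers = Counter(gene for genome in genomes for gene in set(genome))
    spectrum = [0] * n
    for count in carriers.values():
        spectrum[count - 1] += 1
    return spectrum


def pairwise_differences(genome_a, genome_b):
    """Number of genes present in exactly one of the two individuals."""
    return len(set(genome_a) ^ set(genome_b))
```

In this representation the last entry of the spectrum counts the core genes of the sample (genes carried by all individuals), and the remaining entries count accessory genes by frequency.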
Our atmosphere is constantly changing: new pathogens emerge now and then, and existing pathogens mutate continuously. Some of these pathogens, such as SARS-CoV-2, become so deadly that they challenge the whole technological advancement of healthcare. Within this very decade humans have witnessed several other deadly virus outbreaks, such as Zika virus, Ebola virus and MERS-coronavirus. Though conventional techniques have succeeded in detecting these viruses to some extent, these techniques are time-consuming, costly, and require trained personnel. Plasmonic metamaterial-based biosensors might pave the way to low-cost rapid virus detection. This review therefore discusses in detail the latest developments in plasmonic and metamaterial-based biosensors for the detection of viruses, viral particles and antigens, and the future direction of research in this field. The emergence of quantum properties in biosensing and the application of machine learning, artificial intelligence and novel materials in biosensing are also discussed in brief.
Higher-order structures (cavities and cliques) of the gene network of influenza A virus reveal tight associations among viruses during evolution and are key signals that indicate viral cross-species infection and cause pandemics. As indicators for sensing the dynamic changes of viral genes, these higher-order structures have been the focus of attention in the field of virology. However, the viral gene network is usually huge, and searching for these structures in the network introduces unacceptable delay. To mitigate this issue, in this paper, we propose a simple-yet-effective model named HyperSearch based on deep learning to search cavities in a computable complex network for influenza virus genetics. Extensive experiments conducted on a public influenza virus dataset demonstrate the effectiveness of HyperSearch over other advanced deep-learning methods without any elaborate model crafting. Moreover, HyperSearch can finish the search in minutes while 0-1 programming takes days. Since the proposed method is simple and easy to transfer to other complex networks, HyperSearch has the potential to facilitate the monitoring of dynamic changes in viral genes and help
This paper addresses the issue of estimating the damage caused by a computer virus. First, an individual-level delayed SIR model capturing the spreading process of a digital virus is derived. Second, the damage inflicted by the virus is modeled as the sum of the economic losses and the cost of developing the antivirus. Next, the impact of different factors, including the delay and the network structure, on the damage is explored by means of computer simulations. On this basis, some measures for reducing the damage of a virus are recommended. To our knowledge, this is the first time the antivirus-developing cost is taken into account when estimating the damage of a virus.
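The damage decomposition described above can be sketched numerically. The following is a deliberately simplified illustration, not the paper's individual-level network model. It assumes a well-mixed population, a forward-Euler discretisation, a delay interpreted as the time before the antivirus becomes available (no recovery before that), economic loss proportional to the accumulated infected fraction, and a fixed antivirus development cost. The function name and all parameters (`beta`, `gamma`, `delay`, `loss_rate`, `antivirus_cost`) are hypothetical.

```python
def simulate_damage(beta, gamma, delay, t_end=100.0, dt=0.1,
                    i0=0.01, loss_rate=1.0, antivirus_cost=0.5):
    """Damage = loss_rate * (accumulated infected fraction over time) + antivirus_cost.

    Assumptions (illustrative only): well-mixed population, forward-Euler steps,
    and no recovery until the antivirus is released at time `delay`.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    infected_time = 0.0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        # Recovery is only possible once the antivirus exists.
        cure_rate = gamma if t >= delay else 0.0
        new_infections = beta * s * i * dt
        new_recoveries = cure_rate * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        # Accumulate the infected fraction over time as a proxy for economic loss.
        infected_time += i * dt
    return loss_rate * infected_time + antivirus_cost
```

Comparing, for example, `simulate_damage(0.5, 0.3, delay=0.0)` with `simulate_damage(0.5, 0.3, delay=20.0)` gives a rough sense of how a later antivirus release increases the estimated damage under these assumptions.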
This paper presents a general overview of the evolution of concealment methods in computer viruses and of the defensive techniques employed by anti-virus products. In order to evade anti-virus scanners, computer viruses gradually improve their code to make it invisible. On the other hand, anti-virus technologies continually track virus tricks and methodologies in order to overcome these threats. In this process, anti-virus experts design and develop new methodologies to make their products progressively stronger. The purpose of this paper is to review these methodologies and outline their strengths and weaknesses, to encourage those who are interested to investigate these areas further.