WebAssembly is a low-level binary format originally designed to enable high-performance applications to run in web browsers. As WebAssembly is increasingly being ported to various environments, the security verification of WebAssembly execution environments is becoming more critical. While a wide variety of WebAssembly binaries is required to verify these environments, collecting binaries from the wild poses clear limitations. To meet this demand, techniques for automatically generating WebAssembly binaries are necessary. However, automatic generation of WebAssembly binaries must consider not only syntactic validity but also the potential lack of diversity in the generated binaries. To overcome these challenges, this paper proposes a series of algorithms for generating valid and semantically rich sub-binaries from a given base WebAssembly binary. First, closure slices are extracted from the base WebAssembly binary using static program slicing. Then, a stack balance correction algorithm is applied to the closure slices to construct syntactically complete functions. Finally, the generated functions are assembled into a complete WebAssembly binary, and instruction-level mutation is applied to introduce diversity. Several experiments were designed to demonstrate the effectiveness and efficiency of these algorithms, and the evaluation results showed that the proposed algorithms are highly promising in generating sub-binaries.
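The stack balance correction step can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a tiny, hypothetical i32-only instruction subset, straight-line code without control flow, and a simple padding policy (insert `i32.const 0` where an operand is missing, `drop` surplus values at the end). The names `STACK_EFFECT`, `balance`, and `final_height` are illustrative.

```python
# Stack effect (pops, pushes) for a tiny, hypothetical subset of i32-only instructions.
STACK_EFFECT = {
    "i32.const": (0, 1),
    "local.get": (0, 1),
    "local.set": (1, 0),
    "i32.add": (2, 1),
    "i32.mul": (2, 1),
    "drop": (1, 0),
}


def balance(slice_instrs, result_count=1):
    """Pad a sliced instruction list so the operand stack never underflows
    and ends with exactly result_count values (assumption: all values are i32)."""
    fixed = []
    height = 0
    for instr in slice_instrs:
        pops, pushes = STACK_EFFECT[instr.split()[0]]
        # Supply dummy operands for values whose producers were sliced away.
        while height < pops:
            fixed.append("i32.const 0")
            height += 1
        fixed.append(instr)
        height += pushes - pops
    # Discard surplus values, or add dummies, to match the result arity.
    while height > result_count:
        fixed.append("drop")
        height -= 1
    while height < result_count:
        fixed.append("i32.const 0")
        height += 1
    return fixed


def final_height(instrs):
    """Simulate the operand stack and return its final height; raise on underflow."""
    height = 0
    for instr in instrs:
        pops, pushes = STACK_EFFECT[instr.split()[0]]
        if height < pops:
            raise ValueError(f"stack underflow at {instr!r}")
        height += pushes - pops
    return height
```

For example, `balance(["local.get 0", "i32.add"])` inserts one constant before `i32.add`, so the function body leaves exactly one value on the stack. A real implementation would also have to track value types and structured control flow.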
Genomics analyses often rely on command-line tools executed via remote servers, imposing usability barriers for non-technical users and raising privacy concerns. WebAssembly (WASM) enables native-code execution directly in web browsers, eliminating installations and data transfers. We introduce BioChef, a client-side genomic workflow platform that uses WASM. BioChef compiles a genomics toolkit into browser-executable modules and exposes them through a drag-and-drop GUI designed to be intuitive. The system provides real-time validation, flexible input methods (form-based and JSON), intermediate step inspections, and reproducible workflows exportable as bash scripts or configuration files. Performance benchmarks across major browsers (Chromium, Gecko, WebKit) demonstrate rapid initialization (LCP 0.583 s), responsive interactivity (INP 30.5 ms), minimal layout shifts (CLS 0.01), and acceptable overhead (average 181.5 ms initial WASM module load). Although browser execution introduced performance penalties (∼130× slower than native), BioChef workflows still significantly outperformed traditional web services such as Galaxy by avoiding network delays and server-side queueing (11.3× faster in a standard pipeline benchmark). BioChef shows how WebAssembly on the client side can democratize genomic data processing, ensuring privacy, reproducibility and ease of use without external dependencies. To our knowledge, this is the first fully client-side, graphical genomic workflow environment powered by WASM.
Vancomycin-resistant Enterococcus faecium (VREfm) are healthcare-associated opportunistic pathogens of global significance. Genetic tools are needed to understand the molecular basis for VREfm clinically relevant phenotypes, such as persistence within the human gut or antimicrobial resistance. Here, we present a transposon-directed insertion-site sequencing (TraDIS) platform optimized for E. faecium. We engineered a transposon delivery plasmid, pIMTA(tetM), that can generate high-density transposon mutant libraries, combined with Oxford Nanopore Technology amplicon sequencing to map the transposon insertion sites. We have also customized a bioinformatic analysis suite that includes a WebAssembly-powered visualization tool called Diana, for TraDIS data exploration and analysis (https://diana.cpg.org.au/). To demonstrate the performance of our platform, we assessed the impact of vancomycin exposure on a library of 48,458 unique transposon mutants. As expected, we could confirm the importance of the vanB operon for VREfm vancomycin resistance. However, we also identified an essential role for both vanWB and vanYB, previously designated as a protein of unknown function and as accessory for resistance, respectively. Our end-to-end platform for running TraDIS experiments in VREfm will permit accessible, genome-scale, forward genetic screens to probe molecular mechanisms of persistence and pathogenesis. IMPORTANCE: There are limited genetic tools specifically developed and optimized for function in Enterococcus faecium. Here, we addressed this gap through the development of a transposon-directed insertion-site sequencing platform with a plasmid we engineered to specifically function in E. faecium. The application of nanopore sequencing, with a highly accessible sequence data processing and bioinformatic analysis pipeline, streamlines and simplifies the methodology. These developments will allow the functional genomic analysis of important traits involved in the pathobiology of this understudied bacterium. The approach and tools we have described here are likely applicable to other Gram-positive bacteria.
Chromatic is a novel web-browser tool that enables researchers to visually inspect genomic variations identified through next-generation sequencing of cancer data sets to determine whether such calls are, in fact, valid. It is the first cancer bioinformatics tool developed using WebAssembly technology, which comprises a portable, low-level byte code format that provides for the rapid execution of programs within supported web browsers. It has been designed expressly for ease of use by scientists without extensive expertise in bioinformatics.
Application latency requirements, privacy, and security concerns have naturally pushed computing onto smartphone and IoT devices in a decentralized manner. In response to these demands, researchers have developed micro-runtimes for WebAssembly (Wasm) on IoT devices so that applications can be streamed to a runtime that executes device-independent binaries. However, the migration of Wasm and the associated security research has neglected the urgent need for access control on bare-metal, memory management unit (MMU)-less IoT devices that sense and actuate upon the physical environment. This paper presents Aerogel, an access control framework that addresses security gaps between the bare-metal IoT devices and the Wasm execution environment concerning access control for sensors, actuators, processor energy usage, and memory usage. In particular, we treat the runtime as a multi-tenant environment, where each Wasm-based application is a tenant. We leverage the inherent sandboxing mechanisms of Wasm to enforce the access control policies for sensors and actuators without trusting the bare-metal operating system. We evaluate our approach on a representative IoT development board: a Cortex-M4 based development board (nRF52840). Our results show that Aerogel can effectively enforce compute resource and peripheral access control policies while introducing as little as 0.19% to 1.04% runtime overhead and consuming only 18.8% to 45.9% extra energy.
As the number and complexity of RNA and DNA structures continue to expand, there is a growing need for robust yet accessible tools that support their accurate interpretation, validation, and refinement. We present DNATCO v5.0 (dnatco.datmos.org), an interactive web application for comprehensive structural analysis of nucleic acids. DNATCO integrates the NtC dinucleotide conformational classes and the CANA structural alphabet to provide an intuitive, geometrically complete description of local backbone and base orientations, complemented by interactive visualization of base pairing. The platform performs quantitative validation of conformational similarity and covalent bond lengths and angles, using newly established nucleic-acid valence-geometry standards. Quantitative validation encompasses the confal score and scattergrams mapping the fit between experimental electron density and geometry similarity to the closest NtC class. All outputs are downloadable. Integrated diagnostic tools help users identify unusual or problematic regions, explore alternative conformations, and generate torsion-restraint files for downstream use. DNATCO v5.0 is implemented entirely client-side via WebAssembly, ensuring fast performance and preserving data privacy, and supports both PDB and user-provided structural models. By combining a rigorous geometric framework with an approachable interface, DNATCO enables both non-experts and specialists to evaluate nucleic-acid structures with greater confidence and to improve models in ways that support accurate biological interpretation.
Antimony is a human-readable language for defining and sharing models developed by the systems biology community. It enables scientists to describe biochemical networks with a simple syntax, while supporting seamless conversion to and from the Systems Biology Markup Language (SBML) community standard. Since Antimony's original release, both SBML and modeling practices have evolved significantly, creating a need to update Antimony to maintain its standards compliance and practical relevance. In this paper, we introduce Antimony 3, a comprehensive update that formalizes its cumulative improvements and extends its support for SBML Level 3 Core and Flux Balance Constraints (FBC), Distributions, Layout, and Render packages. Antimony 3 enables model specifications that combine kinetic reactions with flux balance analysis, represent uncertainty using probability distributions, add biological context through annotations, and define publication-ready visualizations, all within a unified plain-text format. Antimony 3 is delivered as a lightweight C/C++ library with a stable C API. It is available through official bindings for Python, Julia, and JavaScript/WebAssembly, as well as a cross-platform desktop GUI, which enables straightforward use across scripting environments, desktop applications, and browser-based tools. Antimony 3 is released as open-source software under the BSD 3-Clause License and is available at https://github.com/sys-bio/antimony.
Extracellular matrix (ECM) remodeling is central to a wide variety of healthy and diseased tissue processes. Unfortunately, predicting ECM remodeling under various chemical and mechanical conditions has proven to be excessively challenging, due in part to its complex regulation by intracellular and extracellular molecular reaction networks that are spatially and temporally dynamic. We introduce ECMSim, a highly interactive, real-time web application designed to simulate heterogeneous matrix remodeling. The current model simulates cardiac scar tissue with configurable input conditions using a large-scale model of the cardiac fibroblast signaling network. Cardiac fibrosis is a major component of many forms of heart failure. ECMSim solves 1.37 million coupled ordinary differential equations (ODEs) and executes approximately 4.84 million operations per time step in real time, encompassing 137 molecular species and 259 regulatory interactions per cell across a 100 × 100 spatial array (10,000 cells), which accounts for inputs, receptors, intracellular signaling cascades, ECM production, feedback loops, and molecular diffusion. The algorithm is represented by a set of ODEs that are coupled with ECM molecular diffusion. The equations are solved on demand using compiled C++ and the WebAssembly standard. The platform includes brush-style cell selection to target a subset of cells with adjustable input molecule concentrations, parameter sliders to adjust parameters on demand, and multiple coupled real-time visualizations of network dynamics at multiple scales. Implementing ECMSim in standard web technologies enables a fully functional application that combines real-time simulation, visual interaction, and model editing. The software enables the investigation of pathological or experimental conditions, hypothetical scenarios, matrix remodeling, or the testing of the effects of an experimental drug(s) with a target receptor.
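The equation count is consistent with 137 species in each of 10,000 cells (137 × 10,000 = 1.37 million). How per-cell ODEs couple to diffusion on a grid can be sketched as follows. This is only an illustration under stated assumptions, not the ECMSim solver: it uses one species, a hypothetical linear production/decay reaction, an explicit Euler step, a five-point Laplacian with unit grid spacing, and no-flux boundaries. The function name `step` and all parameters are illustrative.

```python
def step(grid, dt, diffusion, production, decay):
    """Advance one species on a 2D grid by one explicit Euler step.

    Each cell follows dc/dt = production - decay * c + diffusion * laplacian(c).
    No-flux boundaries: an edge cell reuses its own value for a missing neighbour.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            c = grid[i][j]
            up = grid[i - 1][j] if i > 0 else c
            down = grid[i + 1][j] if i < rows - 1 else c
            left = grid[i][j - 1] if j > 0 else c
            right = grid[i][j + 1] if j < cols - 1 else c
            # Five-point discrete Laplacian (grid spacing assumed to be 1).
            laplacian = up + down + left + right - 4.0 * c
            # Hypothetical reaction term: constant production, first-order decay.
            reaction = production - decay * c
            new[i][j] = c + dt * (reaction + diffusion * laplacian)
    return new
```

With `production=0` and `decay=0` the step only redistributes material between neighbouring cells, so the grid total stays constant. Explicit Euler is stable for diffusion only with small steps (roughly `dt * diffusion <= 0.25` on this stencil), which is one reason production solvers often use other integrators.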
Mapping single-cell datasets to large atlases is often hindered by server constraints and privacy concerns. We present CytoVerse, a framework that runs scRNA-seq Foundation Models (scFM) entirely in the browser. Three key contributions enable this: (1) deploying models via ONNX without server-side compute; (2) using compressed indexing (IVFPQ) to search a reference of more than 20 million cells from the client; and (3) a lightweight protocol for sharing embeddings across consortia without exposing raw data. CytoVerse thereby provides a scalable, privacy-preserving framework for distributed single-cell analysis.
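The inverted-file half of IVFPQ can be sketched in a few lines. This is a simplified illustration, not CytoVerse's implementation: it assumes the coarse centroids are already given, stores full vectors instead of product-quantized codes, and uses squared Euclidean distance. The names `build_ivf` and `search` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(idx)
    return lists


def search(query, vectors, centroids, lists, nprobe=1, k=1):
    """Return the ids of the k closest vectors, scanning only the nprobe
    inverted lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [idx for c in order[:nprobe] for idx in lists[c]]
    # A real IVFPQ index would compare compressed codes here, not full vectors.
    candidates.sort(key=lambda idx: sq_dist(query, vectors[idx]))
    return candidates[:k]
```

Restricting the scan to a few lists keeps the work per query small, at the cost of missing a true neighbour that falls in an unprobed list. The product-quantization step additionally shrinks each stored vector to a short code, which is what would make a reference of that size feasible to query from a client.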
First released in 2016, Heatmapper provided the first comprehensive, web-based platform for easily visualizing and manipulating heat maps for a wide range of applications in biology, epidemiology, ecology, and many other areas of science and social science. However, as Heatmapper's popularity grew, limitations in its performance and functionality became more apparent, necessitating the development of a new version: Heatmapper2 (https://heatmapper2.ca/). Heatmapper2 represents a substantial upgrade to the original Heatmapper web server, with much of the code being completely rewritten to improve performance, enhance capabilities and integrate new web technologies. Among the key changes are the conversion of the back-end code from R to Python (for better processing speed), the migration away from R Shiny to Shiny Python, and the use of WebAssembly. WebAssembly enables high-performance, graphically intense applications to be run client-side in a web browser. Moving computationally intense calculations away from a central server and on to client computers eliminates server congestion and significantly improves performance. In addition to its significantly improved performance, Heatmapper2 now supports a wider range of heat mapping options, including time-series or animated heat maps (for geospatial applications), 3D heat maps (for mapping data on organisms or body parts), protein structure heat maps (for mapping molecular dynamic processes), molecular spatial heat maps (for spatial omics applications), and spectrometric heat maps (for mass spectrometry applications). Heatmapper2's redesigned interface also supports much more extensive customization, more easily editable tables, and more efficient handling of large datasets. These enhancements should make Heatmapper2 much more appealing for a wider range of researchers and research applications.
Harmonization and aggregation of heterogeneous data from Human Biomonitoring (HBM) studies is critical to enhance the reliability of conclusions and move towards FAIR (i.e., Findable, Accessible, Interoperable, Reusable) data. We introduce the HBM Data Toolkit developed by the Flemish Institute for Technological Research (Vlaamse Instelling voor Technologisch Onderzoek - VITO) with the primary goal of optimizing data integrity and interoperability, key steps towards FAIR, while using flexible templates and ensuring data confidentiality. The HBM Data Toolkit was built in 2023-2024 and made available for stakeholders (via https://hbm.vito.be/tools) within the Partnership for the Assessment of Risks from Chemicals (PARC, eu-parc.eu). The toolkit consists of four modules: data harmonization, data validation, derived variables, and summary statistics calculation. A Python package was created to interpret the templates, making validation and transformation possible. Using Pyodide and WebAssembly, the toolkit runs entirely in the web browser, enabling secure, local execution of Python code without uploading any data. In the validation module, input files in a common format (i.e., Excel) were used to configure data templates, aligning with standards and formats as specified under the HBM4EU project (hbm4eu.eu) and PARC. The HBM Data Toolkit allows harmonized data storage in the Personal Exposure and Health (PEH) data platform. Formatted and validated HBM data were made compatible with the Monte Carlo Risk Assessment (MCRA) platform. In the derived variables calculation module, the toolkit also allows users to calculate imputed censored data and standardize/normalize the biomarker data. Furthermore, summary statistics (e.g., geometric mean, percentiles) can be calculated and further visualized in the European HBM dashboard and integrated into the Information Platform for Chemical Monitoring (IPCHEM).
In conclusion, the current toolkit proves effective in advancing data quality, harmonization, and aggregation in HBM studies. With local execution, user-friendly codebooks, and standardized schemas, it supports a unified framework that enables consistent analysis and interpretation across diverse studies and datasets.
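The summary statistics named above (geometric mean, percentiles) can be computed with the Python standard library alone, which keeps such code easy to run under Pyodide. The sketch below is not the toolkit's code. The function name `summarize`, the default percentiles, and the "inclusive" interpolation method are assumptions made for illustration.

```python
import statistics


def summarize(values, percentiles=(5, 50, 95)):
    """Return n, geometric mean, and selected percentiles of positive measurements.

    Assumption: values are strictly positive, as the geometric mean requires.
    """
    # quantiles(n=100) returns 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    summary = {
        "n": len(values),
        "geometric_mean": statistics.geometric_mean(values),
    }
    for p in percentiles:
        summary[f"P{p}"] = cuts[p - 1]
    return summary
```

Calling `summarize([1, 2, 4, 8, 16])` gives a geometric mean and a median (`P50`) of 4. Percentile definitions differ between tools, so results from different platforms are only comparable if they use the same interpolation rule.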
The Next-Generation IoT integrates diverse technological enablers, allowing the creation of advanced systems with increasingly complex requirements and maximizing the use of available IoT-edge-cloud resources. This paper introduces an orchestrator architecture for dynamic IoT scenarios, inspired by ETSI NFV MANO and Cloud Native principles, where distributed computing nodes often have unfixed and changing networking configurations. Unlike traditional approaches, this architecture also focuses on managing services across massively distributed mobile nodes, as demonstrated in the automotive use case presented. Apart from working as a MANO framework, the proposed solution efficiently handles service lifecycle management in large fleets of vehicles without relying on public or static IP addresses for connectivity. Its modular, microservices-based approach ensures adaptability to emerging trends like Edge Native, WebAssembly and RISC-V, positioning it as a forward-looking innovation for IoT ecosystems.
The rapid expansion of the Internet of Things (IoT) has made software security and reliability a critical concern. With multi-language programs running on edge computing, embedded systems, and sensors, each connected device represents a potential attack vector, threatening data integrity and privacy. Symbolic execution is a key technique for automated vulnerability detection. However, unknown function interfaces, such as sensor interactions, limit traditional concrete or concolic execution because function returns are uncertain and symbolic expressions are missing. The traditional alternative to full system simulation is to construct an interface abstraction layer for the symbolic execution engine, which reduces the cost of simulation. The disadvantage of this solution is that manually modeling these functions is very inefficient and requires professional developers to spend hundreds of hours. To improve efficiency, we propose an LLM-based automated approach for modeling unknown functions. A fine-tuned 20-billion-parameter language model automatically generates function models based on annotations and function names. Our method improves symbolic execution efficiency and reduces the reliance on manual modeling that limits existing frameworks like KLEE. The experiments primarily compare the usability, accuracy, and efficiency of LLM-generated models with human-written ones. Our approach was integrated into a verification platform project and applied to the verification of smart contracts with distributed edge computing characteristics. Applying this method reduced the manual modeling effort from a month to just a few minutes. This provides a foundational validation of our method's feasibility, particularly in reducing modeling time while maintaining quality. This work is the first to integrate LLMs into formal verification, offering a scalable and automated verification solution for sensor-driven software, blockchain smart contracts, and WebAssembly systems, expanding the scope of secure IoT development.
The advent of next-generation and long-read sequencing technologies has provided an ever-increasing wealth of phylogenetic data that require specially designed algorithms to decipher the underlying evolutionary relationships. As large-scale data become increasingly accessible, there is a concomitant need for efficient computational libraries that facilitate the development and dissemination of specialized algorithms for phylogenetic comparative biology. We introduce Phylo-rs: a fast, extensible, general-purpose library for phylogenetic analysis and inference written in the Rust programming language. Phylo-rs leverages a combination of speed, memory-safety, and native WebAssembly support offered by Rust to provide a robust set of memory-efficient data structures and elementary phylogenetic algorithms. Phylo-rs focuses on the efficient and convenient deployment of software aimed at large-scale phylogenetic analysis and inference. Scalability analysis against popular libraries shows that Phylo-rs performs comparably or better on key algorithms. We utilized it to assess the phylogenetic diversity of influenza A virus in swine, identifying virus groups that are undergoing evolutionary expansion that could be targeted for control through multivalent vaccines. Additionally, we used Phylo-rs to enhance phylogenetic inference by visualizing tree space from Markov chain Monte Carlo (MCMC) Bayesian analysis, efficiently computing approximately five billion tree pair distances to evaluate convergence and select MCMC runs for genomic epidemiology. Phylo-rs enables the design and implementation of cutting-edge software for phylogenetic analysis, thereby facilitating the application and dissemination of theoretical advancements in biology. Phylo-rs is available under an open-source license on GitHub at https://github.com/sriram98v/phylo-rs, with documentation available at https://docs.rs/phylo/latest/phylo/.
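To make "tree pair distance" concrete, the sketch below computes a Robinson-Foulds style cluster distance between two rooted trees. The abstract does not say which metric was used, so this choice is an assumption. The code is also not the Phylo-rs API: trees are plain nested tuples of leaf names, and `clusters` and `rf_distance` are illustrative names.

```python
def clusters(tree):
    """Return (leaf set, set of clusters) for a rooted tree given as nested tuples.

    A leaf is a string; an internal node is a tuple of subtrees. Each internal
    node contributes one cluster: the frozenset of leaves below it.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)
    return leaves, found


def rf_distance(tree_a, tree_b):
    """Count the clusters present in exactly one of the two trees."""
    leaves_a, clusters_a = clusters(tree_a)
    leaves_b, clusters_b = clusters(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same leaf set")
    return len(clusters_a ^ clusters_b)
```

For `(("A", "B"), ("C", "D"))` and `(("A", "C"), ("B", "D"))` the distance is 4, because the two trees share only the root cluster. Evaluating every pair among n trees takes n(n-1)/2 such comparisons, which is why per-comparison cost and memory layout matter at the scale described.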
Extracellular matrix (ECM) remodeling is central to a wide variety of healthy and diseased tissue processes. Unfortunately, predicting ECM remodeling under various chemical and mechanical conditions has proven to be excessively challenging, due in part to its complex regulation by intracellular and extracellular molecular reaction networks that are spatially and temporally dynamic. We introduce ECMSim, a highly interactive, real-time web application designed to simulate heterogeneous matrix remodeling. The current model simulates cardiac scar tissue with configurable input conditions using a large-scale model of the cardiac fibroblast signaling network. Cardiac fibrosis is a major component of many forms of heart failure. ECMSim simulates over 1.3 million equations simultaneously in real time that include more than 125 species and more than 200 edges in each cell in a 100×100 spatial array (10,000 cells), which accounts for inputs, receptors, intracellular signaling cascades, ECM production, and feedback loops, as well as molecular diffusion. The algorithm is represented by a set of ordinary differential equations (ODEs) that are coupled with ECM molecular diffusion. The equations are solved on demand using compiled C++ and the WebAssembly standard. The platform includes brush-style cell selection to target a subset of cells with adjustable input molecule concentrations, parameter sliders to adjust parameters on demand, and multiple coupled real-time visualizations of network dynamics at multiple scales. Implementing ECMSim in standard web technologies enables a fully functional application that combines real-time simulation, visual interaction, and model editing. The software enables the investigation of pathological or experimental conditions, hypothetical scenarios, matrix remodeling, or the testing of the effects of an experimental drug(s) with a target receptor.
The barriers to effective data analysis are sometimes insurmountable. Concerns ranging from privacy and security to complexity can prevent researchers from using existing data analysis tools. JINet is a web browser-based platform intended to democratise access to advanced clinical and genomic data analysis software. It hosts numerous data analysis applications that run in the safety of each User's web browser, without the data ever leaving their machine. JINet promotes collaboration, standardisation and reproducibility by sharing scripts rather than data and by creating a self-sustaining community in which Users and Developers of data analysis tools interact through JINet's interoperability primitives.
Compartmentalization is vital for cell biological processes. The field of rule-based stochastic simulation has acknowledged this, and many tools and methods have capabilities for compartmentalization. However, mostly, this is limited to a static compartmental hierarchy and does not integrate compartmental changes. Integrating compartmental dynamics is challenging for the design of the modeling language and the simulation engine. The language should support a concise yet flexible modeling of compartmental dynamics. Our work is based on ML-Rules, a rule-based language for multi-level cell biological modeling that supports a wide variety of compartmental dynamics, whose syntax we slightly adapt. To develop an efficient simulation engine for compartmental dynamics, we combine specific data structures and new and existing algorithms and implement them in the Rust programming language. We evaluate the concept and implementation using two case studies from existing cell-biological models. The execution of these models outperforms previous simulations of ML-Rules by two orders of magnitude. Finally, we present a prototype of a WebAssembly-based implementation to allow for a low barrier of entry when exploring the language and associated models without the need for local installation.
An integrated computer software system for macromolecular crystallography (MX) data collection at the BL02U1 and BL10U2 beamlines of the Shanghai Synchrotron Radiation Facility is described. The system, Finback, implements a set of features designed for the automated MX beamlines and provides a user-friendly web-based graphical user interface (GUI) for interactive data collection. The Finback client GUI can run on modern browsers and has been developed using several modern web technologies including WebSocket, WebGL, WebWorker and WebAssembly. Finback supports multiple concurrent sessions, so on-site and remote users can access the beamline simultaneously. Finback also cooperates with the deployed experimental data and information management system; the relevant experimental parameters and results are automatically deposited to a database.
Neuroimaging involves the acquisition of extensive 3D images and 4D time series data to gain insights into brain structure and function. The analysis of such data necessitates both spatial and temporal processing. In this context, "fslmaths" has established itself as a foundational software tool within our field, facilitating domain-specific image processing. Here, we introduce "niimath," a clone of fslmaths. While the term "clone" often carries negative connotations, we illustrate the merits of replicating widely-used tools, touching on aspects of licensing, performance optimization, and portability. For instance, our work enables the popular functions of fslmaths to be disseminated in various forms, such as a high-performance compiled R package known as "imbibe", a Windows executable, and a WebAssembly plugin compatible with JavaScript. This versatility is demonstrated through our NiiVue live demo web page. This application allows 'edge computing' where image processing can be done with a zero-footprint tool that runs on any web device without requiring private data to be shared to the cloud. Furthermore, our efforts have contributed back to FSL, which has integrated the optimizations that we've developed. This synergy has enhanced the overall transparency, utility and efficiency of tools widely relied upon in the neuroimaging community.