The internet is a major source of medical information for patients, yet the quality of online health content remains highly variable. Existing assessment tools are often labor-intensive, unvalidated, or limited in scope. We developed and validated MedReadr, an in-browser, rule-based natural language processing (NLP) algorithm that automatically estimates the reliability of consumer health articles for patients and providers. Thirty-five consumer medical articles were independently assessed by two reviewers using validated manual scoring systems (QUEST and Sandvik). Interrater reliability was evaluated with Cohen's κ, and metrics with κ > 0.6 were selected for model fitting. MedReadr extracted key features from article text and metadata using predefined NLP rules. A multivariable linear regression model was trained to predict manual reliability scores, with internal validation performed on an independent set of 20 articles. High interrater reliability was achieved across all QUEST and most Sandvik domains (Cohen's κ > 0.6). The MedReadr model demonstrated strong performance, achieving R² = 0.90 and RMSE = 0.05 on the development set and R² = 0.83 and RMSE = 0.07 on the validation set. All model coefficients were statistically significant (p < 0.05). Key predictive features included currency and reference scores, sentiment polarity, engagement content, and the frequency of provider contact, intervention endorsement, intervention mechanism, and intervention uncertainty phrases. MedReadr demonstrates that structural reliability scoring of online health articles can be automated using a transparent, rule-based NLP approach. Applied to English-language articles from mainstream search results on common medical conditions, the tool showed strong agreement with validated manual scoring systems. However, it has only been validated on a narrow scope of content and is not designed to analyze search results for specific questions or detect misinformation. Future research should assess its performance across a broader range of web content and evaluate whether its integration improves patient comprehension, digital health literacy, and clinician-patient communication.
Many people use the internet to find medical information, but not all websites provide accurate or trustworthy content. It can be hard for patients and their families to know which articles they can rely on, especially when the information is complicated or presented in a confusing way. We created a tool called MedReadr to help solve this problem. MedReadr is a free browser extension that works in Google Chrome. When a person visits a health-related article online, MedReadr can quickly check the article and give it a reliability score. The score is based on specific features such as whether the article has up-to-date information, includes links to trustworthy sources, uses balanced language, and talks about the uncertainty of treatments. In this study, we tested MedReadr by comparing its scores to those given by human reviewers using two trusted scoring systems. We found that MedReadr’s scores were very similar to the scores given by people, which means it can be a reliable tool for checking medical information online. However, MedReadr is not designed to catch false or harmful information. It also does not evaluate things like how easy the article is to read or how the page is visually designed. Future versions may include more advanced features, but for now, MedReadr can help patients and healthcare providers quickly evaluate the general quality of health articles found on the web.
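The interrater step described above relies on Cohen's κ, which compares the observed agreement between two reviewers with the agreement expected by chance. The following is a minimal sketch of that statistic only. The function name and the example ratings are hypothetical and are not taken from the MedReadr study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical scores for one scoring item across ten articles (not study data).
reviewer_1 = [2, 1, 0, 2, 2, 1, 0, 1, 2, 0]
reviewer_2 = [2, 1, 0, 2, 1, 1, 0, 1, 2, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
keep_for_model_fitting = kappa > 0.6  # threshold stated in the abstract
```

With these made-up ratings the two reviewers disagree on one article out of ten, and the item would pass the κ > 0.6 threshold used to select metrics for model fitting.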
The diversity and utility of cinematic volume rendering (CVR) for medical image visualization have grown rapidly in recent years. At the same time, volume rendering on augmented and virtual reality systems is attracting greater interest with the advance of the WebXR standard. This paper introduces CVR extensions to the open-source visualization toolkit (vtk.js) that supports WebXR. This paper also summarizes two studies that were conducted to evaluate the speed and quality of various CVR techniques on a variety of medical data. This work is intended to provide the first open-source solution for CVR that can be used for in-browser rendering as well as for WebXR research and applications. This paper aims to help medical imaging researchers and developers make more informed decisions when selecting CVR algorithms for their applications. Our software and this paper also provide a foundation for new research and product development at the intersection of medical imaging, web visualization, XR, and CVR.
Despite the ample progress made toward faster and more accurate Monte Carlo (MC) simulation tools over the past decade, the limited usability and accessibility of these advanced modeling tools remain key barriers to widespread use among the broad user community. An open-source, high-performance, web-based MC simulator that builds upon modern cloud computing architectures is highly desirable to deliver state-of-the-art MC simulations and hardware acceleration to general users without the need for special hardware installation and optimization. We have developed a configuration-free, in-browser 3D MC simulation platform, Monte Carlo eXtreme (MCX) Cloud, built upon an array of robust and modern technologies, including a Docker Swarm-based cloud-computing backend and a web-based graphical user interface (GUI) that supports in-browser 3D visualization, asynchronous data communication, and automatic data validation via JavaScript Object Notation (JSON) schemas. The front-end of the MCX Cloud platform offers an intuitive simulation design, fast 3D data rendering, and convenient simulation sharing. The Docker Swarm container orchestration backend is highly scalable and can support high-demand GPU MC simulations using MCX over a dynamically expandable virtual cluster. MCX Cloud makes fast, scalable, and feature-rich MC simulations readily available to all biophotonics researchers without overhead. It is fully open-source and can be freely accessed at http://mcx.space/cloud.
Simulators used in teaching are interactive applications comprising a mathematical model of the system under study and a graphical user interface (GUI) that allows the user to control the model inputs and visualize the model results in an intuitive and educational way. Well-designed simulators promote active learning, enhance problem-solving skills, and encourage collaboration and small group discussion. However, creating simulators for teaching purposes is a challenging process that requires many contributors including educators, modelers, graphic designers, and programmers. The availability of a toolchain of user-friendly software tools for building simulators can facilitate this complex task. This paper aimed to describe an open-source software toolchain termed Bodylight.js that facilitates the creation of browser-based client-side simulators for teaching purposes, which are platform independent, do not require any installation, and can work offline. The toolchain interconnects state-of-the-art modeling tools with current Web technologies and is designed to be resilient to future changes in the software ecosystem. We used several open-source Web technologies, namely, WebAssembly and JavaScript, combined with the power of the Modelica modeling language and deployed them on the internet with interactive animations built using Adobe Animate. Models are implemented in the Modelica language using either OpenModelica or Dassault Systèmes Dymola and exported to a standardized Functional Mock-up Unit (FMU) to ensure future compatibility. The C code from the FMU is further compiled to WebAssembly using Emscripten. Industry-standard Adobe Animate is used to create interactive animations. A new tool called Bodylight.js Composer was developed for the toolchain that enables one to create the final simulator by composing the GUI using animations, plots, and control elements in a drag-and-drop style and binding them to the model variables. 
The resulting simulators are stand-alone HyperText Markup Language files including JavaScript and WebAssembly. Several simulators for physiology education were created using the Bodylight.js toolchain and have been received with general acclaim by teachers and students alike, thus validating our approach. The Nephron, Circulation, and Pressure-Volume Loop simulators are presented in this paper. Bodylight.js is licensed under General Public License 3.0 and is free for anyone to use. Bodylight.js enables us to effectively develop teaching simulators. Armed with this technology, we intend to focus on the development of new simulators and interactive textbooks for medical education. Bodylight.js usage is not limited to developing simulators for medical education and can facilitate the development of simulators for teaching complex topics in a variety of different fields.
pileup.js is a new browser-based genome viewer. It is designed to facilitate the investigation of evidence for genomic variants within larger web applications. It takes advantage of recent developments in the JavaScript ecosystem to provide a modular, reliable and easily embedded library. The code and documentation for pileup.js are publicly available at https://github.com/hammerlab/pileup.js under the Apache 2.0 license. Contact: correspondence@hammerlab.org.
Background: Confident chemical annotation in nontarget small-molecule mass spectrometry critically depends on the availability of high-quality tandem mass spectral (MS2) reference libraries. While community efforts have driven significant expansion of open-access repositories, technical challenges in assembling standardized, metadata-rich records continue to limit broader participation, underscoring the need for improved computational tools to assist contributors. Methods: To promote the creation and sharing of standardized reference MS2 spectral records, we have developed Librarian, a free, open-access web application designed for rapid and scalable assembly of high-resolution MS2 libraries. Librarian integrates automated retrieval and harmonization of chemical identifiers and metadata from PubChem, compound mixture design for high-resolution mass spectrometry (HRMS) acquisition, and assembly of curated MS2 spectra into repository-ready records compatible with public spectral databases. Results: Through a simple in-browser interface, Librarian offers a flexible end-to-end workflow compatible with popular open-source pre-processing tools to lower technical barriers and facilitate broader community participation in library development. As a demonstration, we used Librarian to create and deposit a spectral library comprising over 1500 new MS2 records into MassBank, which was further applied in retrospective analysis of environmental datasets. Conclusions: Librarian streamlines the creation of standardized, metadata-rich and repository-ready MS2 reference records. Addressing a key bottleneck in community spectral library development and sharing, Librarian supports the continued growth of open-access resources for metabolomics, exposomics, and environmental mass spectrometry. The Librarian web application is publicly accessible via the SciLifeLab Serve platform.
Deployment complexity and specialized hardware requirements hinder the adoption of deep learning models in neuroimaging. We present MindGrab, a lightweight, fully convolutional model for volumetric skull stripping across the evaluated imaging modalities. MindGrab's architecture is designed from first principles using a spectral interpretation of dilated convolutions, and demonstrates state-of-the-art performance on the tested benchmarks (mean Dice score across datasets and modalities: 95.9 ± 1.6), with up to 40-fold speedups and substantially lower memory demands compared to established methods. Its minimal footprint allows for fast, full-volume processing in resource-constrained environments, including direct in-browser execution. MindGrab is delivered via the BrainChop platform as both a simple command-line tool (pip install brainchop) and a zero-installation web application (brainchop.org). By removing traditional deployment barriers without sacrificing accuracy, MindGrab makes state-of-the-art neuroimaging analysis broadly accessible.
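The Dice score quoted above measures the overlap between a predicted brain mask and a reference mask. The sketch below computes it for two flattened binary masks. It assumes the reported values are on a percent scale, and the function name and toy masks are hypothetical rather than taken from MindGrab or BrainChop.

```python
def dice_score(pred, truth):
    """Dice coefficient, in percent, between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as brain in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of brain voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 100.0  # both masks empty: treated here as perfect agreement
    return 100.0 * 2 * intersection / total


# Toy eight-voxel masks (illustrative only, not benchmark data).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 0, 1, 1]
score = dice_score(predicted, reference)
```

Each toy mask marks four voxels and they share three, so the score is 2 × 3 / 8, or 75 on the percent scale.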
Advanced biological imaging analysis platforms such as Activity Quantification and Analysis (AQuA2) enable accurate spatiotemporal activity analysis across diverse cell populations within many species. These tools are increasingly important for investigating cellular signaling dynamics and behavior. However, despite advances in the accuracy and species capability of AQuA2, it remains computationally demanding for analysis of long time-series datasets and requires all users to maintain a MATLAB license, which may limit accessibility and large-scale deployment. To address these limitations, we have designed and made available AQuA2-Cloud, a portable software stack and web platform developed as an improvement and further evolution of AQuA2. This container-deployable system permits multi-user, cloud-based, high-accuracy activity quantification with intuitive workflows, export of analysis data and project files, and comparable processing times. The platform offers integrated features such as in-browser analysis control interfaces, asynchronous program state control, multiple users and user management, support for unreliable connections, file uploading and downloading via web browsers and File Transfer Protocol, and centralized organization of analysis output. AQuA2-Cloud constitutes a cloud-native solution for laboratories or research groups seeking to centralize analysis of spatiotemporal biological imaging datasets while reducing software installation and licensing barriers for end users. The platform enables researchers with minimal technical expertise to perform advanced bioimaging analysis through standard web browsers while maintaining the analytical capabilities of AQuA2. AQuA2-Cloud source code, deployment procedures, and documentation are freely available at https://github.com/yu-lab-vt/AQuA2-Cloud.
In our previous work, we demonstrated that it is feasible to perform analysis on mutation signature data without the need for downloads or installations and analyze individual patient data without compromising privacy. Building on this foundation, we developed an in-browser Software Development Kit (a JavaScript SDK), mSigSDK, to facilitate the orchestration of distributed data processing workflows and graphic visualization of mutational signature analysis results. We strictly adhered to modern web computing standards, particularly the modularization standards set by the ECMAScript ES6 framework (JavaScript modules). Our approach allows for the computation to be entirely performed by secure delegation to the computational resources of the user's own machine (in-browser), without any downloads or installations. The mSigSDK was developed primarily as a companion library to the mSig Portal resource of the National Cancer Institute Division of Cancer Epidemiology and Genetics (NIH/NCI/DCEG), with a focus on FAIR extensibility as components of other researchers' own data science constructs. Anticipated extensions include the programmatic operation of other mutation signature API ecosystems such as SIGNAL and COSMIC, advancing towards a data commons for mutational signature research (Grossman et al., 2016).
Spatial reasoning is essential for solving complex tasks in dynamic and high-dimensional environments. However, current training models for spatial tasks are computationally demanding and heavily reliant on human input. To address this gap, we present Snake-ML, a web-based simulation tool and proof-of-concept framework designed to demonstrate client-side training of spatial reasoning tasks. Snake-ML serves as an efficient and intuitive test bed for developing spatial navigation strategies in browser-based environments. We chose the snake game as our test bed because it is well suited for demonstrating spatial reasoning in low-dimensional visual spaces while remaining relevant to higher-dimensional tasks, compared to alternative methods. Through quantitative analysis, on the edge alone, Snake-ML achieves a 4.58× speedup in model inference. Additionally, we developed a direct TensorFlow.js GPU pipeline that achieves up to a 32× speedup in training time without any CPU/GPU synchronization. This pipeline has the potential to improve many edge-based AI visualization projects. Snake-ML shows potential for adaptability to complex spatial tasks, such as autonomous systems, robotics, and AI-driven environments. Our code and web-based simulation tool are publicly available.
Segmentation and automated genome annotation (SAGA) techniques, such as Segway and ChromHMM, assign labels to every part of the genome, identifying similar patterns across multiple genomic input signals. Inferring biological meaning in these patterns remains challenging. Doing so requires a time-consuming process of manually downloading reference data, running multiple analysis methods, and interpreting many individual results. To simplify these tasks, we developed the turnkey system Segzoo. As input, Segzoo only requires a genome annotation file in browser extensible data (BED) format. It automatically downloads the rest of the data required for comparisons. Segzoo performs analyses using these data and summarizes results in a single visualization. The source code for Python ≥ 3.7 on Linux is freely available for download at https://github.com/hoffmangroup/segzoo under the GNU General Public License (GPL) version 2. Segzoo is also available in the Bioconda package segzoo: https://anaconda.org/bioconda/segzoo. We have deposited in Zenodo the version of the Segzoo source which produced the results in this article (https://doi.org/10.5281/zenodo.10988775), other code and data used to produce the results (https://doi.org/10.5281/zenodo.10477083), and the results (https://doi.org/10.5281/zenodo.10477106).
Generative Artificial Intelligence (AI) is a cutting-edge technology capable of producing text, images, and various media content leveraging generative models and user prompts. Between 2022 and 2023, generative AI surged in popularity with a plethora of applications spanning from AI-powered movies to chatbots. This paper investigates the potential of generative AI within the realm of the World Wide Web, specifically focusing on image generation. Web developers already harness generative AI to help craft text and images, while Web browsers might use it in the future to locally generate images for tasks such as repairing broken webpages, conserving bandwidth, and enhancing privacy. To explore this research area, we developed WebDiffusion, a tool that simulates a Web powered by stable diffusion, a popular text-to-image model, from both a client and server perspective. Such a tool is the first of its kind, paving the way towards a futuristic World Wide Web where web images can be created using generative AI. WebDiffusion further supports crowdsourcing of user opinions, which is used to evaluate the quality and accuracy of 409 AI-generated images sourced from 60 webpages. Our findings suggest that generative AI is already capable of producing pertinent and high-quality Web images, even without requiring Web designers to manually input prompts, just by leveraging contextual information available within the webpages. However, direct in-browser image generation remains a challenge, as only highly powerful GPUs, such as the A40 and A100, can (partially) compete with classic image downloads. Nevertheless, this approach could be valuable for a subset of the images, for example, when fixing broken webpages or handling highly private content.
Detecting disease in papaya leaves is important for stable yield and profitability in the tropics, but it remains difficult because manual inspection is limited and crop-AI systems lack crop-specific models. This study therefore introduces PapayaNet, a lightweight attention-guided convolutional network designed for the automated classification of six papaya leaf states, including major diseases and healthy leaves. For real-world deployment in resource-scarce farming contexts, PapayaNet uses batch normalization and hierarchical attention across five convolution stages, improving both computational speed and discriminability. Trained on 6618 manually annotated, very high-resolution images from orchards in Bangladesh, it achieves 98.79% classification accuracy with 483,926 parameters and an average inference time of 0.01 s, a significant improvement over EfficientNetB6, DenseNet121, and VGG16. Explainable AI (XAI) methods, including Grad-CAM and LIME, showed that model decisions focus on the biologically informative parts of the leaf, improving interpretability and user confidence. Systematic ablation analysis also confirmed the importance of distributed attention for robust generalization across visually similar disease classes. An in-browser diagnostic portal built with Gradio provides real-time predictions with interpretability overlays, supporting practical use in the field. Given its low-latency inference and minimal computational footprint, PapayaNet is well-suited for integration into edge devices and drone platforms, offering a scalable solution for real-time in-situ crop health monitoring. This study advances the field of precision agriculture by delivering a crop-specialized, explainable, and deployable AI system for sustainable management of papaya diseases.
It remains unclear which interventions are effective in promoting more environmentally sustainable food choices within online grocery shopping environments. We set out to (1) use a plug-in (browser extension) to implement a pilot randomised controlled trial of eco-labels providing information on the environmental impact of specific food products, and (2) collect data to inform a larger trial investigating the effectiveness of eco-labels and other interventions promoting environmentally sustainable online food purchases. The plug-in was custom-built and active on a large UK supermarket website, accessed using the Google Chrome browser on a desktop or laptop. Of the 504 participants screened, 161 met eligibility criteria and were invited to participate in the study. Of these, 57 downloaded the plug-in (23 in the control group, 34 in the intervention group), of whom 22 shopped at least once over the 1-month trial. There was no significant difference in average eco-score of purchases between the control and intervention groups (mean ± SD: 32 ± 13 vs. 41 ± 14; p = 0.22). Of the 161 eligible participants, 69 responded to a follow-up survey and suggested that technical support, reminders, greater incentives, and more information about eco-labels were needed for the full trial. We showed that it is feasible to evaluate online grocery shopping interventions without the collaboration of a supermarket using a web browser extension. This pilot trial was not registered, as its main purpose was to test the implementation of the plug-in and gather data useful for planning the main trial, which is registered under ISRCTN18800054 as of 27/03/2024.
Motivation: The proliferation of genetic testing and consumer genomics represents a logistic challenge to the personalized use of GWAS data in VCF format. Specifically, the challenge lies in retrieving target genetic variation from large compressed files filled with unrelated variation information. Compounding the data traversal challenge, privacy-sensitive VCF files are typically managed as large stand-alone single files (no companion index file) composed of variable-sized compressed chunks, hosted in consumer-facing environments with no native support for hosted execution. Results: A portable JavaScript module was developed to support in-browser fetching of partial content using byte-range requests. This includes on-the-fly decompression of irregularly positioned compressed chunks, coupled with a binary search algorithm that iteratively identifies chromosome-position ranges. The in-browser zero-footprint solution (no downloads, no installations) enables the interoperability, reusability, and user-facing governance advanced by the FAIR principles for stewardship of scientific data. Availability: https://episphere.github.io/vcf, including supplementary material.
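The retrieval strategy described above combines partial fetches, chunk decompression, and a binary search over sorted chromosome-position ranges. The sketch below illustrates only that search logic, written in Python rather than the module's JavaScript and run against an in-memory byte string instead of a remote file. All names are hypothetical. As a simplifying assumption, chunk boundaries are passed in as a list of offsets, whereas the module described in the abstract works without a companion index.

```python
import zlib


def build_blob(records, chunk_size):
    """Pack sorted (chrom, pos, rest) records into separately compressed chunks."""
    blob, offsets = b"", []
    for i in range(0, len(records), chunk_size):
        text = "\n".join(f"{c}\t{p}\t{r}" for c, p, r in records[i:i + chunk_size])
        offsets.append(len(blob))
        blob += zlib.compress(text.encode())
    offsets.append(len(blob))  # end offset of the last chunk
    return blob, offsets


def fetch_range(blob, start, end):
    """Stand-in for an HTTP byte-range request covering bytes start..end-1."""
    return blob[start:end]


def read_chunk(blob, offsets, i):
    """Fetch one compressed chunk and decode its records."""
    raw = fetch_range(blob, offsets[i], offsets[i + 1])
    rows = []
    for line in zlib.decompress(raw).decode().split("\n"):
        chrom, pos, rest = line.split("\t")
        rows.append((int(chrom), int(pos), rest))
    return rows


def find_variant(blob, offsets, chrom, pos):
    """Binary search over chunks by the (chrom, pos) of each chunk's first record."""
    lo, hi = 0, len(offsets) - 2  # first and last chunk index
    fetched = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        first = read_chunk(blob, offsets, mid)[0]
        fetched += 1
        if (first[0], first[1]) <= (chrom, pos):
            lo = mid       # target lies in this chunk or a later one
        else:
            hi = mid - 1   # target lies in an earlier chunk
    fetched += 1
    for row in read_chunk(blob, offsets, lo):
        if (row[0], row[1]) == (chrom, pos):
            return row, fetched
    return None, fetched


# Synthetic records on two chromosomes (illustrative only, not real variants).
records = [(c, 100 * k, f"rs{c}_{k}") for c in (1, 2) for k in range(1, 41)]
blob, offsets = build_blob(records, chunk_size=7)
hit, requests_made = find_variant(blob, offsets, chrom=2, pos=1500)
```

In this toy setup the 80 records fall into 12 chunks, and the target is located after fetching a handful of chunks rather than reading the whole file, which is the property that makes range requests attractive for large stand-alone files.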
This report presents the findings of a project from the 8th Biomedical Linked Annotation Hackathon (BLAH) to explore lightweight technology stacks to enhance assistive linked annotations. Using modern JavaScript frameworks and edge functions, in-browser Named Entity Recognition (NER), serverless embedding and vector search within web interfaces, and efficient serverless full-text search were implemented. Through this experimental approach, a proof of concept was developed to demonstrate the feasibility and performance of these technologies. The results show that lightweight stacks can significantly improve the efficiency and cost-effectiveness of annotation tools and provide a local-first, privacy-oriented, and secure alternative to traditional server-based solutions in various use cases. This work emphasizes the potential of developing annotation interfaces that are more responsive, scalable, and user-friendly, which would benefit bioinformatics researchers, practitioners, and software developers.
Currently, the Polygenic Score (PGS) Catalog curates over 400 publications on over 500 traits corresponding to over 3000 polygenic risk scores (PRSs). To assess the feasibility of privately calculating the underlying multivariate relative risk for individuals with consumer genomics data, we developed an in-browser PRS calculator for genomic data that does not circulate any data or engage in any computation outside of the user's personal device. A prototype personal risk score calculator, created for research purposes, was developed to demonstrate how the PGS Catalog can be privately and easily applied to data from readily available direct-to-consumer genetic testing services, such as 23andMe. No software download, installation, or configuration is needed. The PRS web calculator matches individual PGS Catalog entries with an individual's 23andMe genome data composed of 600k to 1.4 M single-nucleotide polymorphisms (SNPs). Beta coefficients provide researchers with a convenient assessment of risk associated with matched SNPs. This in-browser application was tested on a variety of personal devices, including smartphones, establishing the feasibility of privately calculating personal risk scores with up to a few thousand reference genetic variations and from the full 23andMe SNP data file (compressed or not). The PRScalc web application is developed in JavaScript, HTML, and CSS and is available on GitHub (https://episphere.github.io/prs) under an MIT license. The datasets were derived from sources in the public domain: [PGS Catalog, Personal Genome Project].
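The matching step described above can be illustrated with a common additive formulation, in which each matched SNP contributes its beta coefficient multiplied by the number of effect alleles in the genotype. This is a minimal sketch under that assumption and may differ from the calculation used by PRScalc. The function names, rsIDs, and weights are hypothetical, and the input follows the tab-separated layout of 23andMe raw data files (rsid, chromosome, position, genotype).

```python
def parse_23andme(text):
    """Map rsid -> genotype from 23andMe-style raw data, skipping '#' comment lines."""
    genotypes = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        rsid, _chrom, _pos, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


def polygenic_score(genotypes, weights):
    """Sum beta * effect-allele count over SNPs present in both inputs."""
    score, matched = 0.0, 0
    for rsid, (effect_allele, beta) in weights.items():
        genotype = genotypes.get(rsid)
        if genotype is None or "-" in genotype:
            continue  # SNP not genotyped or a no-call
        score += beta * genotype.count(effect_allele)
        matched += 1
    return score, matched


# Made-up genotype file and scoring weights (not real PGS Catalog entries).
raw = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0001\t1\t1000\tAG\n"
    "rs0002\t2\t2000\tCC\n"
    "rs0003\t3\t3000\t--\n"
)
weights = {
    "rs0001": ("A", 0.5),
    "rs0002": ("C", -0.125),
    "rs0003": ("T", 1.0),
    "rs0004": ("G", 0.25),
}
score, matched = polygenic_score(parse_23andme(raw), weights)
```

Here two of the four weighted SNPs are matched: rs0003 is a no-call and rs0004 is absent from the genotype file, so neither contributes to the score. Everything runs on local data, mirroring the private, on-device design described in the abstract.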
The analysis of data over space and time is a core part of descriptive epidemiology, but the complexity of spatiotemporal data makes this challenging. There is a need for methods that simplify the exploration of such data for tasks such as surveillance and hypothesis generation. In this paper, we use combined clustering and dimensionality reduction methods (hereafter referred to as 'cluster embedding' methods) to spatially visualize patterns in epidemiological time-series data. We compare several cluster embedding techniques to see which performs best along a variety of internal cluster validation metrics. We find that methods based on k-means clustering generally perform better than self-organizing maps on real-world epidemiological data, with some minor exceptions. We also introduce EpiVECS, a tool which allows the user to perform cluster embedding and explore the results using interactive visualization. EpiVECS is available as a privacy-preserving, in-browser, open-source web application at https://episphere.github.io/epivecs.
Neuroimaging research requires sophisticated tools for analyzing complex data, but efficiently leveraging these tools can be a major challenge, especially on large datasets. CBRAIN is a web-based platform designed to simplify the use and accessibility of neuroimaging research tools for large-scale, collaborative studies. In this paper, we describe how CBRAIN's unique features and infrastructure were leveraged to integrate TAPAS PhysIO, an open-source MATLAB toolbox for physiological noise modeling in fMRI data. This case study highlights three key elements of CBRAIN's infrastructure that enable streamlined, multimodal tool integration: a user-friendly GUI, a Brain Imaging Data Structure (BIDS) data-entry schema, and convenient in-browser visualization of results. By incorporating PhysIO into CBRAIN, we achieved significant improvements in the speed, ease of use, and scalability of physiological preprocessing. Researchers now have access to a uniform and intuitive interface for analyzing data, which facilitates remote and collaborative evaluation of results. With these improvements, CBRAIN aims to become an essential open-science tool for integrative neuroimaging research, supporting FAIR principles and enabling efficient workflows for complex analysis pipelines.