Search - ResearchTracker

Data visualization is a fundamental tool in genomics research, enabling the exploration, interpretation, and communication of complex genomic features. While machine learning models show promise for transforming data into insightful visualizations, current models lack the training foundation for domain-specific tasks. In an effort to provide a foundational resource for genomics-focused model training, we present a framework for generating a dataset that pairs abstract, low-level questions about genomics data with corresponding visualizations. Building on prior work with statistical plots, our approach adapts to the complexity of genomics data and the specialized representations used to depict them. We further incorporate multiple linked queries and visualizations, along with justifications for design choices, figure captions, and image alt-texts for each item in the dataset. We use genomics data retrieved from three distinct genomics data repositories (4DN, ENCODE, Chromoscope) to produce GQVis: a dataset consisting of 1.14 million single-query data points, 628k query pairs, and 589k query chains. The GQVis dataset and generation code are available at https://huggingface.co/dataset

Processing-in-memory for genomics workloads

arXiv, 2025-05-31. Authors: William Andrew Simon, Leonid Yavits, Konstantina Koliogeorgi

Low-cost, high-throughput DNA and RNA sequencing (HTS) data is the backbone of the life sciences. Genome sequencing is now becoming a part of Predictive, Preventive, Personalized, and Participatory (termed 'P4') medicine. All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer, consuming substantial energy, and wasting valuable time. Therefore, there is a need for fast, energy-efficient, and cost-efficient technologies that enable genomics research without requiring data centers and cloud platforms. We recently launched the BioPIM Project to leverage emerging processing-in-memory (PIM) technologies to enable energy- and cost-efficient analysis of bioinformatics workloads. The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings.

Search results: Genomics & informatics

GQVis: A Dataset of Genomics Data Questions and Visualizations for Generative AI

Processing-in-memory for genomics workloads

Genomic reproducibility in the bioinformatics era

GREGoR: Accelerating Genomics for Rare Diseases

Linking heterogeneous microstructure informatics with expert characterization knowledge through customized and hybrid vision-language representations for industrial qualification

Leveraging State Space Models in Long Range Genomics

gggenomes: effective and versatile visualizations for comparative genomics

Revolutionizing Genomics with Reinforcement Learning Techniques

Large AI Models in Health Informatics: Applications, Challenges, and the Future

Cancer-inspired Genomics Mapper Model for the Generation of Synthetic DNA Sequences with Desired Genomics Signatures

AABAC -- Automated Attribute Based Access Control for Genomics Data

The big challenge for livestock genomics is to make sequence data pay

Distilling Genomic Models for Efficient mRNA Representation Learning via Embedding Matching

A Global Cybersecurity Standardization Framework for Healthcare Informatics

Pathway Tools version 28.0: Integrated Software for Pathway/Genome Informatics and Systems Biology

When repeats drive the vocabulary: a Byte-Pair Encoding analysis of T2T primate genomes

Genome-on-Diet: Taming Large-Scale Genomic Analyses via Sparsified Genomics

The Mechanistic Invariance Test: Genomic Language Models Fail to Learn Positional Regulatory Logic

bioETH-Beacon: A Confidential On-Chain Genomic Beacon with Encrypted Counts, Filters, and Bounded Noise over a Fully Homomorphic EVM

Deep Learning for Genomics: A Concise Overview