Online crowdsourcing platforms have made it increasingly easy to perform evaluations of algorithm outputs with survey questions like "which image is better, A or B?", leading to their proliferation in vision and graphics research papers. Results of these studies are often used as quantitative evidence in support of a paper's contributions. On the one hand we argue that, when conducted hastily as an afterthought, such studies lead to an increase of uninformative, and, potentially, misleading conclusions. On the other hand, in these same communities, user research is underutilized in driving project direction and forecasting user needs and reception. We call for increased attention to both the design and reporting of user studies in computer vision and graphics papers towards (1) improved replicability and (2) improved project direction. Together with this call, we offer an overview of methodologies from user experience research (UXR), human-computer interaction (HCI), and applied perception to increase exposure to the available methodologies and best practices. We discuss foundational user research methods (e.g., needfinding) that are presently underutilized in computer vision and g
Various data visualization applications such as reverse engineering and interactive authoring require a vocabulary that describes the structure of visualization scenes and the procedure to manipulate them. A few scene abstractions have been proposed, but they are restricted to specific applications for a limited set of visualization types. A unified and expressive model of data visualization scenes for different applications has been missing. To fill this gap, we present Manipulable Semantic Components (MSC), a computational representation of data visualization scenes, to support applications in scene understanding and augmentation. MSC consists of two parts: a unified object model describing the structure of a visualization scene in terms of semantic components, and a set of operations to generate and modify the scene components. We demonstrate the benefits of MSC in three applications: visualization authoring, visualization deconstruction and reuse, and animation specification.
Information visualization and natural language are intricately linked. However, the majority of research and relevant work in information and data visualization (and human-computer interaction) involve English-speaking populations as both researchers and participants, are published in English, and are presented predominantly at English-speaking venues. Although several solutions can be proposed such as translating English texts in visualization to other languages, there is little research that looks at the intersection of data visualization and different languages, and the implications that current visualization practices have on non-English speaking communities. In this position paper, we argue that linguistically diverse communities abound beyond the English-speaking world and offer a richness of experiences for the visualization research community to engage with. Through a case study of how two non-English languages interplay with data visualization reasoning in Madagascar, we describe how monolingualism in data visualization impacts the experiences of underrepresented populations and emphasize potential harm to these communities. Lastly, we raise several questions towards advoc
Trust plays a critical role in visual data communication and decision-making, yet existing visualization research employs varied trust measures, making it challenging to compare and synthesize findings across studies. In this work, we first took a bottom-up, data-driven approach to understand what visualization readers mean when they say they "trust" a visualization. We compiled and adapted a broad set of trust-related statements from existing inventories and collected responses to visualizations with varying degrees of trustworthiness. Through exploratory factor analysis, we derived an operational definition of trust in visualizations. Our findings indicate that people perceive a trustworthy visualization as one that presents credible information and is comprehensible and usable. Building on this insight, we developed an eight-item inventory: four core items measuring trust in visualizations and four optional items controlling for individual differences in baseline trust tendency. We established the inventory's internal consistency reliability using McDonald's omega, confirmed its content validity by demonstrating alignment with theoretically-grounded trust dimensions, and validat
Misleading visualizations pose a significant challenge to accurate data interpretation. While recent research has explored the use of Large Language Models (LLMs) for detecting such misinformation, practical tools that also support explanation and correction remain limited. We present MisVisFix, an interactive dashboard that leverages both Claude and GPT models to support the full workflow of detecting, explaining, and correcting misleading visualizations. MisVisFix correctly identifies 96% of visualization issues and addresses all 74 known visualization misinformation types, classifying them as major, minor, or potential concerns. It provides detailed explanations, actionable suggestions, and automatically generates corrected charts. An interactive chat interface allows users to ask about specific chart elements or request modifications. The dashboard adapts to newly emerging misinformation strategies through targeted user interactions. User studies with visualization experts and developers of fact-checking tools show that MisVisFix accurately identifies issues and offers useful suggestions for improvement. By transforming LLM-based detection into an accessible, interactive platfo
While large vision-language models can generate motion graphics animations from text prompts, they regularly fail to include all spatio-temporal properties described in the prompt. We introduce MoVer, a motion verification DSL based on first-order logic that can check spatio-temporal properties of a motion graphics animation. We identify a general set of such properties that people commonly use to describe animations (e.g., the direction and timing of motions, the relative positioning of objects, etc.). We implement these properties as predicates in MoVer and provide an execution engine that can apply a MoVer program to any input SVG-based motion graphics animation. We then demonstrate how MoVer can be used in an LLM-based synthesis and verification pipeline for iteratively refining motion graphics animations. Given a text prompt, our pipeline synthesizes a motion graphics animation and a corresponding MoVer program. Executing the verification program on the animation yields a report of the predicates that failed and the report can be automatically fed back to LLM to iteratively correct the animation. To evaluate our pipeline, we build a synthetic dataset of 5600 text prompts paire
We introduce two novel visualization designs to support practitioners in performing identification and discrimination tasks on large value ranges (i.e., several orders of magnitude) in time-series data: (1) The order of magnitude horizon graph, which extends the classic horizon graph; and (2) the order of magnitude line chart, which adapts the log-line chart. These new visualization designs visualize large value ranges by explicitly splitting the mantissa m and exponent e of a value v = m * 10e . We evaluate our novel designs against the most relevant state-of-the-art visualizations in an empirical user study. It focuses on four main tasks commonly employed in the analysis of time-series and large value ranges visualization: identification, discrimination, estimation, and trend detection. For each task we analyse error, confidence, and response time. The new order of magnitude horizon graph performs better or equal to all other designs in identification, discrimination, and estimation tasks. Only for trend detection tasks, the more traditional horizon graphs reported better performance. Our results are domain-independent, only requiring time-series data with large value ranges.
The IEEE VIS Conference (VIS) recently rebranded itself as a unified conference and officially positioned itself within the discipline of Data Science. Driven by this movement, we investigated (1) who contributed to VIS, and (2) where VIS stands in the scientific world. We examined the authors and fields of study of 3,240 VIS publications in the past 32 years based on data collected from OpenAlex and IEEE Xplore, among other sources. We also examined the citation flows from referenced papers (i.e., those referenced in VIS) to VIS, and from VIS to citing papers (i.e., those citing VIS). We found that VIS has been becoming increasingly popular and collaborative. The number of publications, of unique authors, and of participating countries have been steadily growing. Both cross-country collaborations, and collaborations between educational and non-educational affiliations, namely "cross-type collaborations", are increasing. The dominance of the US is decreasing, and authors from China are now an important part of VIS. In terms of author affiliation types, VIS is increasingly dominated by authors from universities. We found that the topics, inspirations, and influences of VIS research
We present the VIS30K dataset, a collection of 29,689 images that represents 30 years of figures and tables from each track of the IEEE Visualization conference series (Vis, SciVis, InfoVis, VAST). VIS30K's comprehensive coverage of the scientific literature in visualization not only reflects the progress of the field but also enables researchers to study the evolution of the state-of-the-art and to find relevant work based on graphical content. We describe the dataset and our semi-automatic collection process, which couples convolutional neural networks (CNN) with curation. Extracting figures and tables semi-automatically allows us to verify that no images are overlooked or extracted erroneously. To improve quality further, we engaged in a peer-search process for high-quality figures from early IEEE Visualization papers. With the resulting data, we also contribute VISImageNavigator (VIN, visimagenavigator.github.io), a web-based tool that facilitates searching and exploring VIS30K by author names, paper keywords, title and abstract, and years.
We present a method for registration and visualization of corresponding supine and prone virtual colonoscopy scans based on eigenfunction analysis and fold modeling. In virtual colonoscopy, CT scans are acquired with the patient in two positions, and their registration is desirable so that physicians can corroborate findings between scans. Our algorithm performs this registration efficiently through the use of Fiedler vector representation (the second eigenfunction of the Laplace-Beltrami operator). This representation is employed to first perform global registration of the two colon positions. The registration is then locally refined using the haustral folds, which are automatically segmented using the 3D level sets of the Fiedler vector. The use of Fiedler vectors and the segmented folds presents a precise way of visualizing corresponding regions across datasets and visual modalities. We present multiple methods of visualizing the results, including 2D flattened rendering and the corresponding 3D endoluminal views. The precise fold modeling is used to automatically find a suitable cut for the 2D flattening, which provides a less distorted visualization. Our approach is robust, an
Geospatial analysis is crucial for addressing many of the world's most pressing challenges. Given this, there is immense value in improving and expanding the visualization techniques used to communicate geospatial data. In this work, we explore this important intersection -- between geospatial analytics and visualization -- by examining a set of recent IEEE VIS Conference papers (a selection from 2017-2019) to assess the inclusion of geospatial data and geospatial analyses within these papers. After removing the papers with no geospatial data, we organize the remaining literature into geospatial data domain categories and provide insight into how these categories relate to VIS Conference paper types. We also contextualize our results by investigating the use of geospatial terms in IEEE Visualization publications over the last 30 years. Our work provides an understanding of the quantity and role of geospatial subject matter in recent IEEE VIS publications and supplies a foundation for future meta-analytical work around geospatial analytics and geovisualization that may shed light on opportunities for innovation.
Graphical overlays that layer visual elements onto charts, are effective to convey insights and context in financial narrative visualizations. However, automating graphical overlays is challenging due to complex narrative structures and limited understanding of effective overlays. To address the challenge, we first summarize the commonly used graphical overlays and narrative structures, and the proper correspondence between them in financial narrative visualizations, elected by a survey of 1752 layered charts with corresponding narratives. We then design FinFlier, a two-stage innovative system leveraging a knowledge-grounding large language model to automate graphical overlays for financial visualizations. The text-data binding module enhances the connection between financial vocabulary and tabular data through advanced prompt engineering, and the graphics overlaying module generates effective overlays with narrative sequencing. We demonstrate the feasibility and expressiveness of FinFlier through a gallery of graphical overlays covering diverse financial narrative visualizations. Performance evaluations and user studies further confirm system's effectiveness and the quality of gen
Analyzing user behavior from usability evaluation can be a challenging and time-consuming task, especially as the number of participants and the scale and complexity of the evaluation grows. We propose uxSense, a visual analytics system using machine learning methods to extract user behavior from audio and video recordings as parallel time-stamped data streams. Our implementation draws on pattern recognition, computer vision, natural language processing, and machine learning to extract user sentiment, actions, posture, spoken words, and other features from such recordings. These streams are visualized as parallel timelines in a web-based front-end, enabling the researcher to search, filter, and annotate data across time and space. We present the results of a user study involving professional UX researchers evaluating user data using uxSense. In fact, we used uxSense itself to evaluate their sessions.
Sensory illusions - where a sensory stimulus causes people to perceive effects that are altered by a different sensory stimulus - have the potential to enrich mixed-reality based interactions. The well-known colour-temperature illusion is a sensory illusion that causes people to, somewhat counterintuitively, perceive blue objects to feel warmer and red objects to feel colder. There is currently little information about whether this illusion can be recreated in mixed reality (MR). Additionally, it is unknown whether dynamic graphical effects made possible by mixed-reality systems could create a similar or potentially stronger effect to the color-temperature illusion. The results of our study (n=30) support that the color-temperature illusion can be recreated in MR and that dynamic graphics can create a new temperature-sensory illusion. Our dynamic-graphics-temperature illusion creates a stronger effect than the color-temperature illusion and has more intuitive relationship between the stimulus and the effect: cold graphical effects (a virtual ice ball) are perceived as colder and hot graphical effects (a virtual fire ball) as hotter. Our results demonstrate that mixed reality has th
We present a novel distributed union-find algorithm that features asynchronous parallelism and k-d tree based load balancing for scalable visualization and analysis of scientific data. Applications of union-find include level set extraction and critical point tracking, but distributed union-find can suffer from high synchronization costs and imbalanced workloads across parallel processes. In this study, we prove that global synchronizations in existing distributed union-find can be eliminated without changing final results, allowing overlapped communications and computations for scalable processing. We also use a k-d tree decomposition to redistribute inputs, in order to improve workload balancing. We benchmark the scalability of our algorithm with up to 1,024 processes using both synthetic and application data. We demonstrate the use of our algorithm in critical point tracking and super-level set extraction with high-speed imaging experiments and fusion plasma simulations, respectively.
Non-rigid registration is challenging because it is ill-posed with high degrees of freedom and is thus sensitive to noise and outliers. We propose a robust non-rigid registration method using reweighted sparsities on position and transformation to estimate the deformations between 3-D shapes. We formulate the energy function with position and transformation sparsity on both the data term and the smoothness term, and define the smoothness constraint using local rigidity. The double sparsity based non-rigid registration model is enhanced with a reweighting scheme, and solved by transferring the model into four alternately-optimized subproblems which have exact solutions and guaranteed convergence. Experimental results on both public datasets and real scanned datasets show that our method outperforms the state-of-the-art methods and is more robust to noise and outliers than conventional non-rigid registration methods.
This paper evaluates the visualization literacy of modern Large Language Models (LLMs) and introduces a novel prompting technique called Charts-of-Thought. We tested three state-of-the-art LLMs (Claude-3.7-sonnet, GPT-4.5 preview, and Gemini-2.0-pro) on the Visualization Literacy Assessment Test (VLAT) using standard prompts and our structured approach. The Charts-of-Thought method guides LLMs through a systematic data extraction, verification, and analysis process before answering visualization questions. Our results show Claude-3.7-sonnet achieved a score of 50.17 using this method, far exceeding the human baseline of 28.82. This approach improved performance across all models, with score increases of 21.8% for GPT-4.5, 9.4% for Gemini-2.0, and 13.5% for Claude-3.7 compared to standard prompting. The performance gains were consistent across original and modified VLAT charts, with Claude correctly answering 100% of questions for several chart types that previously challenged LLMs. Our study reveals that modern multimodal LLMs can surpass human performance on visualization literacy tasks when given the proper analytical framework. These findings establish a new benchmark for LLM vi
This paper presents a survey of ocean simulation and rendering methods in computer graphics. To model and animate the ocean's surface, these methods mainly rely on two main approaches: on the one hand, those which approximate ocean dynamics with parametric, spectral or hybrid models and use empirical laws from oceanographic research. We will see that this type of methods essentially allows the simulation of ocean scenes in the deep water domain, without breaking waves. On the other hand, physically-based methods use Navier-Stokes Equations (NSE) to represent breaking waves and more generally ocean surface near the shore. We also describe ocean rendering methods in computer graphics, with a special interest in the simulation of phenomena such as foam and spray, and light's interaction with the ocean surface.
Leveraging hypergraph structures to model advanced processes has gained much attention over the last few years in many areas, ranging from protein-interaction in computational biology to image retrieval using machine learning. Hypergraph models can provide a more accurate representation of the underlying processes while reducing the overall number of links compared to regular representations. However, interactive visualization methods for hypergraphs and hypergraph-based models have rarely been explored or systematically analyzed. This paper reviews the existing research landscape for hypergraph and hypergraph model visualizations and assesses the currently employed techniques. We provide an overview and a categorization of proposed approaches, focusing on performance, scalability, interaction support, successful evaluation, and the ability to represent different underlying data structures, including a recent demand for a temporal representation of interaction networks and their improvements beyond graph-based methods. Lastly, we discuss the strengths and weaknesses of the approaches and give an insight into the future challenges arising in this emerging research field.
Identifying an appropriate radius for unbiased kernel estimation is crucial for the efficiency of radiance estimation. However, determining both the radius and unbiasedness still faces big challenges. In this paper, we first propose a statistical model of photon samples and associated contributions for progressive kernel estimation, under which the kernel estimation is unbiased if the null hypothesis of this statistical model stands. Then, we present a method to decide whether to reject the null hypothesis about the statistical population (i.e., photon samples) by the F-test in the Analysis of Variance. Hereby, we implement a progressive photon mapping (PPM) algorithm, wherein the kernel radius is determined by this hypothesis test for unbiased radiance estimation. Secondly, we propose VCM+, a reinforcement of Vertex Connection and Merging (VCM), and derive its theoretically unbiased formulation. VCM+ combines hypothesis testing-based PPM with bidirectional path tracing (BDPT) via multiple importance sampling (MIS), wherein our kernel radius can leverage the contributions from PPM and BDPT. We test our new algorithms, improved PPM and VCM+, on diverse scenarios with different light