Specializing an OS to optimize the performance of a particular application is typically a manual process that requires great expertise. Specialization through configuration lends itself well to automation; however, it is challenging due to the sheer size of the configuration space of modern OSes, the difficulty to quantify that space, the long time it takes to evaluate a configuration, and the large number of invalid configurations. Hence, existing attempts at specializing OSes automatically are limited to switching features on and off to minimize memory consumption or attack surface, and cannot target metrics such as performance. We present Wayfinder, a framework specializing the configuration of OSes completely automatically and without expert knowledge. It can specialize all aspects of an OS configuration (compile-/boot-/run-time) towards any quantifiable performance, resource consumption, or security metric, for an application processing a given workload on a given hardware setup. Wayfinder consists of an automated OS benchmarking platform, and a neural network-based search algorithm driving the specialization process. This is achieved by learning on the fly which configuration
Navigation aids are central to immersive virtual reality (VR) experiences that involve physical locomotion. Their effectiveness depends not only on how much spatial information they provide, but also on how directly that information supports movement decisions. We compared three common guidance techniques for immersive VR wayfinding: a directional arrow, a minimap, and a compass. In a controlled room-scale VR study with 42 participants completing 1008 trials, participants navigated to target landmarks in a time-pressured maze with reduced visibility and forced route replanning. Across behavioral and eye-tracking measures, arrow guidance produced the strongest navigation performance, minimap guidance yielded intermediate performance, and compass cues performed worst, suggesting that during immersive locomotion users benefit from guidance that can be interpreted rapidly while moving. These results suggest that in demanding immersive locomotion tasks, interfaces that translate spatial information directly into actionable movement cues can outperform richer but more interpretive spatial representations. Our findings highlight the importance of designing XR navigation interfaces that mi
Robotic guidance systems have shown promise in supporting blind and visually impaired (BVI) individuals with wayfinding and obstacle avoidance. However, most existing systems assume a clear path and do not support a critical aspect of navigation - environmental interactions that require manipulating objects to enable movement. These interactions are challenging for a human-robot pair because they demand (i) precise localization and manipulation of interaction targets (e.g., pressing elevator buttons) and (ii) dynamic coordination between the user's and robot's movements (e.g., pulling out a chair to sit). We present a collaborative human-robot approach that combines our robotic guide dog's precise sensing and localization capabilities with the user's ability to perform physical manipulation. The system alternates between two modes: lead mode, where the robot detects and guides the user to the target, and adaptation mode, where the robot adjusts its motion as the user interacts with the environment (e.g., opening a door). Evaluation results show that our system enables navigation that is safer, smoother, and more efficient than both a traditional white cane and a non-adaptive guidin
Independent navigation in unfamiliar environments remains a major challenge for blind and visually impaired individuals, despite the availability of assistive technologies. This paper presents the results of a fully accessible online survey investigating navigation experiences, challenges, and technology preferences among people with visual impairments worldwide. The survey was distributed through individuals and organizations supporting visually impaired communities. Our results indicate that smartphone-based applications are the most used digital navigation aids, while a substantial proportion of participants report not using any assistive navigation technology due to cost, accessibility, or usability barriers. Participants reported persistent difficulties in obstacle detection, wayfinding, and navigation in complex environments. Despite a widespread focus on smartphone-based solutions, they expressed a clear preference for wearable and hands-free systems, highlighting a gap between current technology use and user needs. The findings provide a user-centered overview of navigation needs and offer insights into the design and evaluation of future assistive navigation systems.
This paper investigates the potential of vision-language models (VLMs) to assist people with blindness and low vision (pBLV) in navigation tasks. We evaluate state-of-the-art closed-source models, including GPT-4V, GPT-4o, Gemini-1.5-Pro, and Claude-3.5-Sonnet, alongside open-source models, such as Llava-v1.6-mistral and Llava-onevision-qwen, to analyze their capabilities in foundational visual skills: counting ambient obstacles, relative spatial reasoning, and common-sense wayfinding-pertinent scene understanding. We further assess their performance in navigation scenarios, using pBLV-specific prompts designed to simulate real-world assistance tasks. Our findings reveal notable performance disparities between these models: GPT-4o consistently outperforms others across all tasks, particularly in spatial reasoning and scene understanding. In contrast, open-source models struggle with nuanced reasoning and adaptability in complex environments. Common challenges include difficulties in accurately counting objects in cluttered settings, biases in spatial reasoning, and a tendency to prioritize object details over spatial feedback, limiting their usability for pBLV in navigation tasks.
Wayfinding, or the ability to navigate one's surroundings, is crucial for independent living and requires a complex combination of cognitive abilities, environmental awareness, and technology to manage this successfully. Individuals with cognitive impairment (IwCI) often face significant challenges in learning and navigating their environment. Despite its importance, mainstream navigation technologies are rarely designed with their diverse needs in mind. This study reframes the search for places as a socially distributed task and emphasizes the role of proxy stakeholders, who act on behalf or in coordination with IwCI during navigation. Using a qualitatively led mixed-methods approach, which includes an international survey and a three-stage interview study, we examine the real-world strategies that proxy stakeholders employ to support daily navigation. The findings are synthesized into a set of empirically grounded design recommendations that emphasize customisability, collaborative use, and support for routine-based navigation. Our findings highlight key challenges and adaptive practices, which are synthesized into design recommendations that prioritize customisability, routine-b
Navigating health questions can be daunting in the modern information landscape. Large language models (LLMs) may provide tailored, accessible information, but also risk being inaccurate, biased or misleading. We present insights from 4 mixed-methods studies (total N=163), examining how people interact with LLMs for their own health questions. Qualitative studies revealed the importance of context-seeking in conversational AIs to elicit specific details a person may not volunteer or know to share. Context-seeking by LLMs was valued by participants, even if it meant deferring an answer for several turns. Incorporating these insights, we developed a "Wayfinding AI" to proactively solicit context. In a randomized, blinded study, participants rated the Wayfinding AI as more helpful, relevant, and tailored to their concerns compared to a baseline AI. These results demonstrate the strong impact of proactive context-seeking on conversational dynamics, and suggest design patterns for conversational AI to help navigate health topics.
This paper deals with improving the capabilities of Large Language Models (LLM) to provide route instructions for pedestrian wayfinders by means of qualitative spatial relations.
Visual neuroprostheses (bionic eye) aim to restore a rudimentary form of vision by translating camera input into patterns of electrical stimulation. To improve scene understanding under extreme resolution and bandwidth constraints, prior work has explored computer vision techniques such as semantic segmentation and depth estimation. However, presenting all task-relevant information simultaneously can overwhelm users in cluttered environments. We compare two complementary approaches to semantic preprocessing in immersive virtual reality: SemanticEdges, which highlights all relevant objects at once, and SemanticRaster, which staggers object categories over time to reduce visual clutter. Using a biologically grounded simulation of prosthetic vision, 18 sighted participants performed a wayfinding task in a dynamic urban environment across three conditions: edge-based baseline (Control), SemanticEdges, and SemanticRaster. Both semantic strategies improved performance and user experience relative to the baseline, with each offering distinct trade-offs: SemanticEdges increased the odds of success, while SemanticRaster boosted the likelihood of collision-free completions. These findings un
Working memory involves the temporary retention of information over short periods. It is a critical cognitive function that enables humans to perform various online processing tasks, such as dialing a phone number, recalling misplaced items' locations, or navigating through a store. However, inherent limitations in an individual's capacity to retain information often result in forgetting important details during such tasks. Although previous research has successfully utilized wearable and assistive technologies to enhance long-term memory functions (e.g., episodic memory), their application to supporting short-term recall in daily activities remains underexplored. To address this gap, we present Memento, a framework that uses multimodal wearable sensor data to detect significant changes in cognitive state and provide intelligent in situ cues to enhance recall. Through two user studies involving 15 and 25 participants in visual search navigation tasks, we demonstrate that participants receiving visual cues from Memento achieved significantly better route recall, improving approximately 20-23% compared to free recall. Furthermore, Memento reduced cognitive load and review time by 46%
Hospitals are among the most cognitively demanding indoor environments, especially for patients and visitors unfamiliar with their layout. This study investigates the effectiveness of an augmented reality (AR)-based handheld navigation system compared to traditional paper maps in a large hospital setting. Through a mixed-methods experiment with 32 participants, we measured navigation performance, cognitive workload (NASA-TLX), situational anxiety (STAI-State), spatial behavior, and user satisfaction. Results show that AR users completed navigation tasks significantly faster, made fewer errors, and reported lower anxiety and workload. However, paper map users demonstrated stronger spatial memory in sketch-based recall tasks, highlighting a trade-off between real-time efficiency and long-term spatial learning. We discuss implications for inclusive AR design, spatial cognition, and healthcare accessibility, offering actionable design strategies for adaptive indoor navigation tools.
Brain computer interfaces enable real-time monitoring of cognitive load, but their effectiveness in dynamic navigation contexts is not well established. Using an existing VR navigation dataset, we examined whether EEG signals can classify cognitive load during map-based wayfinding and whether classification accuracy depends more on task complexity or on individual traits. EEG recordings from forty-six participants navigating routes with 3, 5, or 7 map landmarks were analyzed with a nested cross-validation framework across multiple machine learning models. Classification achieved mean accuracies up to 90.8% for binary contrasts (3 vs. 7 landmarks) and 78.7% for the three-class problem, both well above chance. Demographic and cognitive variables (age, gender, spatial ability, working memory) showed no significant influence. These findings demonstrate that task demands outweigh individual differences in shaping classification performance, highlighting the potential for task-adaptive navigation systems that dynamically adjust map complexity in response to real-time cognitive states.
Wayfinding, the ability to recall the environment and navigate through it, is an essential cognitive skill relied upon almost every day in a person's life. A crucial component of wayfinding is the construction of cognitive maps, mental representations of the environments through which a person travels. Age, disease or injury can severely affect cognitive mapping, making assessment of this basic survival skill particularly important to clinicians and therapists. Cognitive mapping has also been the focus of decades of basic research by cognitive psychologists. Both communities have evolved a number of techniques for assessing cognitive mapping ability. We present the Cognitive Map Probe (CMP), a new computerized tool for assessment of cognitive mapping ability that increases consistency and promises improvements in flexibility, accessibility, sensitivity and control. The CMP uses a tangible user interface that affords spatial manipulation. We describe the design of the CMP, and find that it is sensitive to factors known to affect cognitive mapping performance in extensive experimental testing.
Navigational signs are common aids for human wayfinding and scene understanding, but are underutilized by robots. We argue that they benefit robot navigation and scene understanding, by directly encoding privileged information on actions, spatial regions, and relations. Interpreting signs in open-world settings remains a challenge owing to the complexity of scenes and signs, but recent advances in vision-language models (VLMs) make this feasible. To advance progress in this area, we introduce the task of navigational sign understanding which parses locations and associated directions from signs. We offer a benchmark for this task, proposing appropriate evaluation metrics and curating a test set capturing signs with varying complexity and design across diverse public spaces, from hospitals to shopping malls to transport hubs. We also provide a baseline approach using VLMs, and demonstrate their promise on navigational sign understanding. Code and dataset are available on Github.
Accurate indoor localization underpins applications ranging from wayfinding and emergency response to asset tracking and smart-building services. Radio-frequency solutions (e.g. Wi-Fi, RFID, UWB) are widely adopted but remain vulnerable to multipath fading, interference, and uncontrollable coverage variation. We explore an orthogonal modality -- visible light communication (VLC) -- and demonstrate that the spectral signatures captured by a low-cost AS7341 sensor can serve as robust location fingerprints. We introduce a two-stage framework that (i) trains a multi-layer perceptron (MLP) on real spectral measurements and (ii) enlarges the training corpus with synthetic samples produced by TabGAN. The augmented dataset reduces the mean localization error from 62.9cm to 49.3cm -- a 20% improvement -- while requiring only 5% additional data-collection effort. Experimental results obtained on 42 reference points in a U-shaped laboratory confirm that GAN-based augmentation mitigates data-scarcity issues and enhances generalization.
Modeling the cognitive and experiential factors of human navigation is central to deepening our understanding of human-environment interaction and to enabling safe social navigation and effective assistive wayfinding. Most existing methods focus on forecasting motions in fully observed scenes and often neglect human factors that capture how people feel and respond to space. To address this gap, We propose EgoCogNav, a multimodal egocentric navigation framework that predicts perceived path uncertainty as a latent state and jointly forecasts trajectories and head motion by fusing scene features with sensory cues. To facilitate research in the field, we introduce the Cognition-aware Egocentric Navigation (CEN) dataset consisting 6 hours of real-world egocentric recordings capturing diverse navigation behaviors in real-world scenarios. Experiments show that EgoCogNav learns the perceived uncertainty that highly correlates with human-like behaviors such as scanning, hesitation, and backtracking while generalizing to unseen environments.
We introduce two iOS apps that have been designed to support wayfinding and backtracking for blind travelers navigating in indoor building environments. Wayfinding involves determining and following a route through the building's corridors to reach a destination, and assumes that the app has access to the floor plan of the building. Backtracking one's route, on the other hand, requires no map knowledge. Our apps only use the inertial and magnetic sensors of the smartphone, and thus require no infrastructure modification (e.g., installation and support of BLE beacons). Unlike systems that use the phone's camera, users of our apps can conveniently keep their phone tucked inside a pocket, while interacting with the apps using a smartwatch. Routing directions are given via speech. Both apps were tested in a user study with seven blind participants, who used them while navigating a campus building.
We present a novel approach to automatically synthesize "wayfinding instructions" for an embodied robot agent. In contrast to prior approaches that are heavily reliant on human-annotated datasets designed exclusively for specific simulation platforms, our algorithm uses in-context learning to condition an LLM to generate instructions using just a few references. Using an LLM-based Visual Question Answering strategy, we gather detailed information about the environment which is used by the LLM for instruction synthesis. We implement our approach on multiple simulation platforms including Matterport3D, AI Habitat and ThreeDWorld, thereby demonstrating its platform-agnostic nature. We subjectively evaluate our approach via a user study and observe that 83.3% of users find the synthesized instructions accurately capture the details of the environment and show characteristics similar to those of human-generated instructions. Further, we conduct zero-shot navigation with multiple approaches on the REVERIE dataset using the generated instructions, and observe very close correlation with the baseline on standard success metrics (< 1% change in SR), quantifying the viability of generated
Objective. Evaluate the feasibility of Virtual Reality (VR) wayfinding training with aging adults, and examine the impact of the training on wayfinding performance. Design. Design involved wayfinding tasks in a study with three groups: active VR training, passive video training, and no training, assigned randomly. The training featured 5 tasks in a digital version of a real building. Post-training assessments had 10 tasks in this building, half familiar from training and half new. The study was double-blinded, with each intervention lasting 10 minutes. Participants. A convenience sample of 49 participants; inclusion criteria: age >58, unfamiliar with the building; exclusion criteria: mobility or vision impairments, history of motion sickness, or medical implants. Outcomes. Time spent and Distance traveled on each wayfinding task with a fixed 10-min limit. Results. Participants in VR group reported moderate usability (63.82, SD=14.55) with respect to the training intervention and high Self Location (3.71, SD=0.94). There were no differences in task performance among the groups in the similar tasks. In the new tasks, compared to the control condition, Time spent on tasks was margi
In the quest for understanding human executive function, eye movements represent a unique insight into how we process and comprehend our environment. Eye movements reveal patterns in how we focus, navigate, and make decisions across various contexts. The proposed dataset includes electrooculography (EOG) signals from 27 healthy subjects, capturing both vertical and horizontal eye movements. The recorded signals were obtained during the video-watching stage of the Leiden Navigation Test, designed to assess spatial navigation abilities. In addition to other data, the dataset includes scores from the Mini- Mental State Examination and the Wayfinding Questionnaire. The dataset comprises carefully curated components, including relevant information, the Mini-Mental State Examination scores, and the Wayfinding Questionnaire scores, encompassing navigation, orientation, distance estimation, spatial anxiety, as well as raw and processed EOG signals. These assessments contribute more information about the participants' cognitive function and navigational abilities. This dataset can be valuable for researchers investigating spatial navigation abilities through EOG signal analysis.