JavaScript Web applications are a common product in industry. As with most applications, Web applications can acquire software flaws (known as bugs), whose symptoms are seen during the development stage and, even worse, in production. The use of debuggers is beneficial for detecting bugs. Unfortunately, most JavaScript debuggers (1) only support the "step into/through" feature for detecting a bug in a running program, and (2) do not allow developers to go back in time in the application execution to take actions that would locate the bug accurately. For example, the second limitation prevents developers from modifying the value of a variable to fix a bug while the application is running, or from testing whether the same bug is triggered with other values of that variable. Using concepts such as continuations and static analysis, this article presents a usable debugger for JavaScript, named DeloreanJS, which enables developers to go back in time to different execution points and resume the execution of a Web application to improve their understanding of a bug, or even to experiment with hypothetical scenarios around the bug. Using a version that is available online, we illustrate the benefits of DeloreanJS through five examples of bugs in JavaScript. Although DeloreanJS is developed for JavaScript, a language with a dynamic prototype-based object model and side effects (mutable variables), we compare our proposal with state-of-the-art and state-of-the-practice debuggers in terms of features. For example, modern browsers like Mozilla Firefox include a debugger in their distribution that only supports the breakpoint feature. However, DeloreanJS uses a graphical user interface that considers back-in-time features. The aim of this study is to evaluate and compare the usability of DeloreanJS and Mozilla Firefox's debugger using the system usability scale approach. We asked 30 undergraduate students from two computer science programs to solve five tasks. Among the findings, we highlight two results. 
First, we found that 100% (15) of participants recommended DeloreanJS, and only 53% (eight) recommended Firefox's debugger to complete the tasks. Second, whereas the average score for DeloreanJS is 71.6 ("Good"), the average score for Firefox's debugger is 55.8 ("Acceptable").
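The scores quoted above come from the system usability scale (SUS). As a minimal sketch, assuming the standard ten-item SUS questionnaire with 1 to 5 Likert responses (the abstract does not reproduce the questionnaire, and the function names below are hypothetical), a per-participant score and a group average could be computed as follows:

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant.

    `responses` holds ten answers on a 1-5 scale, in questionnaire order.
    Assumes the standard SUS layout: odd-numbered items are positively
    worded, even-numbered items are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            total += answer - 1   # items 1, 3, 5, 7, 9
        else:
            total += 5 - answer   # items 2, 4, 6, 8, 10
    return total * 2.5


def mean_sus(all_responses):
    """Average the SUS scores of several participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)
```

A group average such as the 71.6 reported for DeloreanJS would then be the result of `mean_sus` over that group's questionnaires.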
Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team et al., 2025). However, developers rarely execute programs step by step; instead, they use debuggers to stop execution at certain breakpoints and step through relevant portions only while inspecting or modifying program variables. Existing neural interpreter approaches lack such interactive control. To address this limitation, we introduce neural debuggers: language models that emulate traditional debuggers, supporting operations such as stepping into, over, or out of functions, as well as setting breakpoints at specific source lines. We show that neural debuggers -- obtained via fine-tuning large LLMs or pre-training smaller models from scratch -- can reliably model both forward execution (predicting future states and outputs) and inverse execution (inferring prior states or inputs) conditioned on debugger actions. Evaluated on CruxEval, our models achieve strong performance on both output and input prediction tasks, demonstrating robust condit
Large Language Models (LLMs) frequently generate buggy code with complex logic errors that are challenging to diagnose. While existing LLM-based self-repair approaches conduct intensive static semantic analysis or rely on superficial execution logs, they miss the in-depth runtime behaviors that often expose bug root causes, lacking the interactive dynamic analysis capabilities that make human debugging effective. We present InspectCoder, the first agentic program repair system that empowers LLMs to actively conduct dynamic analysis via interactive debugger control. Our dual-agent framework enables strategic breakpoint placement, targeted state inspection, and incremental runtime experimentation within stateful debugger sessions. Unlike existing methods that follow fixed log collection procedures, InspectCoder adaptively inspects and perturbs relevant intermediate states at runtime, and leverages immediate process rewards from debugger feedback to guide multi-step reasoning, transforming the LLM debugging paradigm from blind trial-and-error into systematic root cause diagnosis. We conduct comprehensive experiments on two challenging self-repair benchmarks: BigCodeBench-R and LiveCodeBen
Field-Programmable Gate Array (FPGA) development tool chains are widely used in FPGA design, simulation, and verification in critical areas like communications, automotive electronics, and aerospace. Commercial FPGA tool chains such as Xilinx's Vivado aid developers in swiftly identifying and rectifying bugs and issues in FPGA designs through a robust built-in debugger, ensuring the correctness and development efficiency of the FPGA design. Hardening such FPGA chip debugger tools by testing is crucial since engineers might misinterpret code and introduce incorrect fixes, leading to security risks. However, FPGA chip debugger tools are challenging to test as they require assessing both RTL designs and a series of debugging actions, including setting breakpoints and stepping through the code. To address this issue, we propose an interactive differential testing approach called DB-Hunter to detect bugs in Vivado's FPGA chip debugger tools. Specifically, DB-Hunter consists of three components: an RTL design transformation component, a debug action transformation component, and an interactive differential testing component. By performing RTL design and debug action transformations, DB-Hunter gen
The Visual Debugger is an IntelliJ IDEA plugin that presents debug information as an object diagram to enhance program understanding. Reflecting on our past development, we detail the lessons learned and roadblocks we have experienced while implementing and integrating the Visual Debugger into the IntelliJ IDEA. Furthermore, we describe recent improvements to the Visual Debugger, greatly enhancing the plugin in the present. Looking into the future, we propose solutions to overcome the roadblocks encountered while developing the plugin and further plans for the Visual Debugger.
Automated debugging, long pursued in a variety of fields from software engineering to cybersecurity, requires a framework that offers the building blocks for a programmable debugging workflow. However, existing debuggers are primarily tailored for human interaction, and those designed for programmatic debugging focus on kernel space, resulting in limited functionality in userland. To fill this gap, we introduce libdebug, a Python library for programmatic debugging of userland binary executables. libdebug offers a user-friendly API that enables developers to build custom debugging tools for various applications, including software engineering, reverse engineering, and software security. It is released as an open-source project, along with comprehensive documentation to encourage use and collaboration across the community. We demonstrate the versatility and performance of libdebug through case studies and benchmarks, all of which are publicly available. We find that the median latency of syscall and breakpoint handling in libdebug is 3 to 4 times lower compared to that of GDB.
Compiler diagnostics for type inference failures are notoriously bad, and type classes only make the problem worse. By introducing a complex search process during inference, type classes can lead to wholly inscrutable or useless errors. We describe a system, Argus, for interactively visualizing type class inferences to help programmers debug inference failures, applied specifically to Rust's trait system. The core insight of Argus is to avoid the traditional model of compiler diagnostics as one-size-fits-all, instead providing the programmer with different views on the search tree corresponding to different debugging goals. Argus carefully uses defaults to improve debugging productivity, including interface design (e.g., not showing full paths of types by default) and heuristics (e.g., sorting obligations based on the expected complexity of fixing them). We evaluated Argus in a user study where $N = 25$ participants debugged type inference failures in realistic Rust programs, finding that participants using Argus correctly localized $2.2\times$ as many faults and localized $3.3\times$ faster compared to not using Argus.
Stress, arising from the dynamic interaction between external stressors, individual appraisals, and physiological or psychological responses, significantly impacts health yet is often underreported and inconsistently documented. When documented, stress-related information is often captured as unstructured narrative text, limiting systematic assessment, secondary use, and computational analysis. This study aimed to develop a mental stress ontology and to explore the feasibility of using a Large Language Model (LLM) to extract and structure stress-related information from narrative text in an ontology-guided manner. Mental Stress Ontology (MeSO) was developed using Protégé by integrating theoretical frameworks on stress with concepts derived from 11 validated stress assessment instruments. MeSO was evaluated for content coverage using additional concepts collected from 58 text sources and for structural quality using the OntOlogy Pitfall Scanner! (OOPS!) and the Protégé Debugger. A mental health expert provided an overall qualitative evaluation of the ontology. Ontology-guided extraction of stress-related information was performed on 35 Reddit posts using an LLM (Claude Sonnet 4) and MeSO for six categories of stress-related information including stressor, stress response, coping strategy, duration, onset, and temporal profile. Human reviewers assessed the appropriateness of the extracted information and MeSO coverage of the identified stress concepts. The final ontology included 181 concepts across eight top-level classes. Human reviewers identified 220 extractable stress-related items from 35 Reddit posts. Ontology-guided extraction using an LLM resulted in 172 correctly extracted items (78.2%), with 27 items (12.3%) misclassified and 21 items (9.5%) missed. Of the extracted items, 22 represented numeric stress duration values and were excluded from ontology-based concept mapping. Of the remaining 150 items, 120 were successfully mapped to MeSO. 
This study provides initial evidence that ontology-guided large language models may facilitate the structuring of stress-related information from narrative text, offering a foundation for future research toward systematic stress assessment and documentation.
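The extraction figures reported above can be checked with a few lines of arithmetic. This is only a consistency sketch: the counts are taken from the abstract, while the variable names are illustrative and the 80% mapping rate is derived here rather than stated in the source.

```python
# Counts reported in the abstract for the 220 human-identified items.
counts = {"correct": 172, "misclassified": 27, "missed": 21}
total_items = sum(counts.values())

# Percentage share of each outcome, rounded to one decimal place.
shares = {name: round(100 * n / total_items, 1) for name, n in counts.items()}

# Items left for concept mapping after excluding numeric duration values.
numeric_durations = 22
mappable = counts["correct"] - numeric_durations
mapped = 120
mapping_rate = round(100 * mapped / mappable, 1)  # derived, not in the abstract

print(total_items, shares, mappable, mapping_rate)
```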
As IoT devices become widely used, malicious code is increasingly appearing in Linux environments. Sophisticated Linux malware employs various evasive techniques to deter analysis. The Embedded Trace Macrocell (ETM) supported by modern Arm CPUs is a suitable hardware tracer for analyzing evasive malware because it is almost artifact-free and has negligible overhead. In this paper, we present an efficient method to automatically find debugger-detection routines using the ETM hardware tracer. The proposed scheme reconstructs the execution flow of the compiled binary code from ETM trace data. In addition, it automatically identifies and patches the debugger-detection routine by comparing two traces (with and without the debugger). The proposed method was implemented as a plug-in for Ghidra, one of the most widely used disassemblers. To verify its effectiveness, 15 debugger-detection techniques were investigated in the Arm-Linux environment to determine whether they could be detected. We also confirmed that our implementation works successfully for the popular Mirai malware in Linux. Experiments were further conducted on 423 malware samples collected from the Internet, demonstrating that our implementation works well for real malware samples.
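The idea of comparing a trace recorded with a debugger attached against one recorded without it can be sketched in a few lines. The abstract does not describe the actual comparison algorithm, so the following is a naive first-divergence search under stated assumptions: each trace is a list of executed instruction addresses already reconstructed from ETM data, and all function names and addresses are hypothetical.

```python
def first_divergence(trace_plain, trace_debugged):
    """Return (index, addr_plain, addr_debugged) where the traces first differ.

    Returns None when both traces are identical. If one trace is a strict
    prefix of the other, the missing side is reported as None.
    """
    for i, (a, b) in enumerate(zip(trace_plain, trace_debugged)):
        if a != b:
            return i, a, b
    if len(trace_plain) != len(trace_debugged):
        i = min(len(trace_plain), len(trace_debugged))
        a = trace_plain[i] if i < len(trace_plain) else None
        b = trace_debugged[i] if i < len(trace_debugged) else None
        return i, a, b
    return None


def candidate_check_address(trace_plain, trace_debugged):
    """Return the last shared address before the traces diverge.

    Under the assumptions above, this is a plausible location of the branch
    that reacts to the debugger, and hence a candidate for patching.
    """
    divergence = first_divergence(trace_plain, trace_debugged)
    if divergence is None or divergence[0] == 0:
        return None
    return trace_plain[divergence[0] - 1]


# Made-up addresses: the debugged run jumps to an exit path at 0x2000.
plain = [0x1000, 0x1004, 0x1008, 0x100C]
debugged = [0x1000, 0x1004, 0x2000, 0x2004]
print(hex(candidate_check_address(plain, debugged)))
```

A real implementation would also have to align traces that differ for unrelated reasons (timing, interrupts, address randomization), which this sketch ignores.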
Open source software (OSS) has become one of the modern software development methods. OSS is mainly developed by developers, volunteers, and users all over the world, but its reliability has been widely questioned. When OSS faults are detected, volunteers or users report them to developers by email or over the network. After a developer confirms the fault, it is randomly assigned to a debugger, who may be a developer, a volunteer, or a user. These open source community contributors also learn as they remove faults, so as detected faults are removed, the number of newly introduced faults decreases gradually. Therefore, this study proposes a software reliability model with a decreasing trend of fault introduction in the process of OSS development and testing. The validity of the proposed model and the accuracy of estimating residual faults are verified by experiments. The proposed model can be used to evaluate the reliability and predict the remaining faults in the actual OSS development and testing process.
Developers use cloud computing platforms to process a large quantity of data in parallel when developing big data analytics. Debugging the massive parallel computations that run in today's data-centers is time consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires re-thinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay and naively inspecting millions of records using a watchpoint is too time consuming for an end user. First, BIGDEBUG's simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact.
In medical visualization, segmentation is an important step prior to rendering. However, it is also a difficult procedure because of the restrictions imposed by variations in image characteristics, human anatomy, and pathology. Moreover, what is interesting from a clinical point of view is usually not only an organ or a tissue itself, but also its properties together with adjacent organs or related vessel systems that are going in and coming out. For an informative rendering, these necessitate the use of different segmentation methods in a single application, and combining and representing the results together in a proper way. This paper describes the implementation of an interface that can be used to plug in and then apply a segmentation method to a medical image series. The design is based on handling each segmentation procedure as an object where all parameters of each object can be specified individually. Thus, it is possible to use different plug-ins with different interfaces and parameters for the segmentation of different tissues in the same dataset, while rendering all of the results together is still possible. The design allows access to Insight Segmentation and Registration Toolkit (ITK), Java, and MATLAB functionality together, eases sharing and comparing segmentation techniques, and serves as a visual debugger for algorithm developers.
The Insight Toolkit (ITK) initiative from the National Library of Medicine has provided a suite of state-of-the-art segmentation and registration algorithms ideally suited to volume visualization and analysis. A volume visualization application that effectively utilizes these algorithms provides many benefits: it allows access to ITK functionality for non-programmers, it creates a vehicle for sharing and comparing segmentation techniques, and it serves as a visual debugger for algorithm developers. This paper describes the integration of image processing functionalities provided by the ITK into VolView, a visualization application for high performance volume rendering. A free version of this visualization application is publicly available and is also provided with the online version of this paper. The process for developing ITK plugins for VolView according to the publicly available API is described in detail, and an application of ITK VolView plugins to the segmentation of Abdominal Aortic Aneurysms (AAAs) is presented. The source code of the ITK plugins is also publicly available and is included in the online version.
Domain experts think and reason at a high level of abstraction when they solve problems in their domain of expertise. We present the design and motivation behind a domain specific language called Phi-LOG to enable biologists (domain experts) to program solutions to phylogenetic inference problems at a very high level of abstraction. The implementation infrastructure (interpreter, compiler, debugger) for the DSL is automatically obtained through a software engineering framework based on Denotational Semantics and Logic Programming.
Domain experts think and reason at a high level of abstraction when they solve problems in their domain of expertise. We present the design and motivation behind a domain specific language, called phi LOG, to enable biologists to program solutions to phylogenetic inference problems at a very high level of abstraction. The implementation infrastructure (interpreter, compiler, debugger) for the DSL is automatically obtained through a software engineering framework based on Denotational Semantics and Logic Programming.
Intelligent robots are part of a new generation of robots that are able to sense the surrounding environment, plan their own actions and eventually reach their targets. In recent years, reliance upon robots in both daily life and industry has increased. The protocol proposed in this paper describes the design and production of a handling robot with an intelligent search algorithm and an autonomous identification function. First, the various working modules are mechanically assembled to complete the construction of the work platform and the installation of the robotic manipulator. Then, with the aid of debugging software, we design a closed-loop control system and a four-quadrant motor control strategy, and set the steering gear identity (ID), baud rate and other working parameters to ensure that the robot achieves the desired dynamic performance and low energy consumption. Next, we debug the sensors to achieve multi-sensor fusion and accurately acquire environmental information. Finally, we implement the relevant algorithm, which can verify that the robot performs its function successfully for a given application. The advantage of this approach is its reliability and flexibility, as users can develop a variety of hardware construction programs and utilize the comprehensive debugger to implement an intelligent control strategy. This allows users to set personalized requirements based on their needs with high efficiency and robustness.
Integrated development environments (IDEs) provide many useful tools such as a code editor, a compiler, and a debugger for creating software. These tools are highly sophisticated, and their development requires a significant effort. Traditionally, an IDE supports different programming languages via plugins that are not usually reusable in other IDEs. Given the high complexity and constant evolution of popular programming languages, such as C++ and even Java, the effort to update those plugins has become unbearable. Thus, recent work aims to modularize IDEs and reuse the existing parser implementation directly in compilers. However, when IDE debugging tools are insufficient at detecting performance defects in large and multithreaded systems, developers must use tracing and trace visualization tools in their software development process. Those tools are often standalone applications and do not interoperate with the new modular IDEs, thus losing the power and the benefits of many features provided by the IDE. The structure and use cases of tracing tools, with the potentially massive execution traces, significantly differ from the other tools in IDEs. Thus, it is a considerable challenge, one which has not been addressed previously, to integrate them into the new modular IDEs. In this paper, we propose an efficient modular client-server architecture for trace analysis and visualization that solves those problems. The proposed architecture is well suited for performance analysis on Internet of Things (IoT) devices, where resource limitations often prohibit data collection, processing, and visualization all on the same device. The experimental evaluation demonstrated that our proposed flexible and reusable solution is scalable and has a small acceptable performance overhead compared to the standalone approach.
In this work, we present a mixed software/hardware implementation of a 2-D signal encoder/decoder using the dyadic discrete wavelet transform (DWT) based on quadrature mirror filters (QMF) and Mallat's fast wavelet algorithm. This work is designed and compiled with the embedded development kit EDK6.3i and the synthesis software ISE6.3i, which are available with the Xilinx Virtex-II V2MB1000 FPGA. A Huffman coding scheme is used to encode the wavelet coefficients so that they can be transmitted progressively through an Ethernet TCP/IP based connection. Reconfiguration of the device can be exploited to attain higher performance. The design will be integrated with the neutron radiography system that is used with the Es-Salem research reactor.
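One level of a dyadic 2-D DWT in the style of Mallat's algorithm can be sketched in software as follows. The abstract does not name the filters used, so this sketch assumes the Haar pair, the simplest quadrature mirror filter bank. The function names and the sub-band labels (LL, LH, HL, HH) are illustrative conventions, not taken from the paper.

```python
import math

SCALE = 1 / math.sqrt(2)  # orthonormal Haar filter coefficient


def haar_analysis(signal):
    """Split an even-length 1-D signal into approximation and detail halves."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * SCALE for a, b in pairs]   # low-pass, downsampled by 2
    detail = [(a - b) * SCALE for a, b in pairs]   # high-pass, downsampled by 2
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert haar_analysis (perfect reconstruction up to rounding)."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * SCALE)
        signal.append((a - d) * SCALE)
    return signal


def _filter_columns(block):
    """Apply haar_analysis to every column of a 2-D block."""
    per_column = [haar_analysis(list(col)) for col in zip(*block)]
    low = [list(row) for row in zip(*[a for a, _ in per_column])]
    high = [list(row) for row in zip(*[d for _, d in per_column])]
    return low, high


def haar_dwt2_level(image):
    """One decomposition level: filter the rows, then the columns."""
    rows = [haar_analysis(row) for row in image]
    row_low = [a for a, _ in rows]
    row_high = [d for _, d in rows]
    ll, lh = _filter_columns(row_low)
    hl, hh = _filter_columns(row_high)
    return ll, lh, hl, hh
```

A dyadic decomposition repeats `haar_dwt2_level` on the LL sub-band, and the resulting coefficients are what an entropy coder such as Huffman coding would then compress.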
Emergent (http://grey.colorado.edu/emergent) is a powerful tool for the simulation of biologically plausible, complex neural systems that was released in August 2007. Inheriting decades of research and experience in network algorithms and modeling principles from its predecessors, PDP++ and PDP, Emergent has been redesigned as an efficient workspace for academic research and an engaging, easy-to-navigate environment for students. The system provides a modern and intuitive interface for programming and visualization centered around hierarchical, tree-based navigation and drag-and-drop reorganization. Emergent contains familiar, high-level simulation constructs such as Layers and Projections, a wide variety of algorithms, general-purpose data handling and analysis facilities and an integrated virtual environment for developing closed-loop cognitive agents. For students, the traditional role of a textbook has been enhanced by wikis embedded in every project that serve to explain, document, and help newcomers engage the interface and step through models using familiar hyperlinks. For advanced users, the software is easily extensible in all respects via runtime plugins, has a powerful shell with an integrated debugger, and a scripting language that is fully symmetric with the interface. Emergent strikes a balance between detailed, computationally expensive spiking neuron models and abstract, Bayesian or symbolic systems. This middle level of detail allows for the rapid development and successful execution of complex cognitive models while maintaining biological plausibility.
Communication systems that work in harsh environments such as space are affected by soft errors, for example single event upsets (SEUs) or multiple bit upsets (MBUs), that can cause circuits to malfunction. To avoid this erroneous behavior, such systems are usually protected using redundant logic such as triple modular redundancy (TMR) or error correction codes (ECCs). After the implementation of the protected modules, the communication modules must be tested to assess the achieved reliability. These tests can be carried out in accelerator facilities through ionization processes, or they can be performed using fault injection tools based on software simulation, such as the SEUs simulation tool (SST), or on field-programmable gate array (FPGA) emulation, like the one described in this work. In this paper, a tutorial for the setup of a fault injection emulation platform based on the Xilinx soft error mitigation (SEM) intellectual property (IP) controller is presented step by step, showing a complete cycle. To illustrate this procedure, an online repository with a complete project and a step-by-step guide is provided, using as the device under test a classical communication component, a finite impulse response (FIR) filter. Finally, the integration of the automatic configuration memory error-injection (ACME) tool to speed up the fault injection process is explained in detail.
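The protection offered by TMR, and its limits, can be illustrated with a small software model. This is a conceptual sketch only: it models replicated data words and a bitwise majority voter in Python, not the Xilinx SEM IP controller or the ACME tool, and the function names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise majority of three replicas: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def inject_seu(word, bit):
    """Model a single event upset by flipping one bit of a stored word."""
    return word ^ (1 << bit)


golden = 0b1011

# One upset in one replica is masked by the voter.
masked = majority_vote(inject_seu(golden, 2), golden, golden)

# The same bit upset in two replicas defeats the voter.
corrupted = majority_vote(inject_seu(golden, 2), inject_seu(golden, 2), golden)

print(masked == golden, corrupted == golden)
```

Fault injection campaigns like the one described above exist to measure how often upsets end up in the second, unmasked category for a given design.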