Search - ResearchTracker

Continuous Integration (CI) systems often run many builds concurrently. In this setting, a legitimate build failure may not be caused by the code push that triggered it. Such unrelated build failures can waste developer effort because developers must determine whether the failure is actionable for their current change. We study 77,354 CI build failures from seven open source Apache projects to understand and predict unrelated build failures. We find that developers spend a median of 4 hours identifying whether a failure is related or unrelated to their push. We also perform a document analysis of 371 confirmed unrelated build failures sampled from 10,316 potentially unrelated failures. The analysis shows that unrelated test failures account for 20% of the cases in which developers classify build failures as unrelated. To predict unrelated build failures, we extract 33 features from issue reports, issue comments, and commits associated with the triggering push. We build semi-supervised Positive and Unlabeled (PU) learning models for seven Apache projects. The models achieve precision from 0.70 to 0.88, recall from 0.30 to 1.00, F1-score from 0.44 to 0.91, and AUC from 0.63 to 0.97.

Characterizing Metastable Faults and Failures

arXiv · 2026-05-31 · Authors: Ali Farahbakhsh, Qingjie Lu, Lorenzo Alvisi

Metastable failures are hard to detect, prevent, and mitigate. During a metastable failure, a system exhibits self-sustaining bad behavior even in the absence of adversarial conditions. Prior work focuses on symptoms and has portrayed metastable failures as instances of self-sustaining overload. This characterization leaves the underlying failure causes and dynamics unknown, and does not account for metastable failures that do not manifest as overload. We present the first causal characterization of metastable failures by identifying their origin in metastable faults, i.e., structural destabilizing cycles of interaction among system components that, in isolation, are stabilizing. Metastable failures arise when scheduling decisions let these destabilizing interactions gain the upper hand over the individual components' stabilizing tendencies. We then derive a methodology to predict metastable failures, and to build metastable-fault-tolerant (MFT) systems. We apply our methodology to three case studies, showcasing the generality of our results.

Search results: Failures

Is this Build Failure Related to my Patch? An Empirical Study of Unrelated Build Failures in Continuous Integration

Characterizing Metastable Faults and Failures

Towards CXL Resilience to CPU Failures

Large Language Model Reasoning Failures

Invisible failures in human-AI interactions

Bank Failures: The Roles of Solvency and Liquidity

Mass-Producing Failures of Multimodal Systems with Language Models

230,439 Test Failures Later: An Empirical Evaluation of Flaky Failure Classifiers

Understanding Silent Failures in Medical Image Classification

A Mixed-Methods Approach to Understanding User Trust after Voice Assistant Failures

On the Diagnosis of Flaky Job Failures: Understanding and Prioritizing Failure Categories

Predicting Cascading Failures with a Hyperparametric Diffusion Model

PREVENT: An Unsupervised Approach to Predict Software Failures in Production

FAIL: Analyzing Software Failures from the News Using LLMs

A model of actors and grey failures

RIS-aided Localization under Pixel Failures

Using Online Customer Reviews to Classify, Predict, and Learn about Domestic Robot Failures

Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies

Rational Uniform Consensus with General Omission Failures

Gazing at Failure: Investigating Human Gaze in Response to Robot Failure in Collaborative Tasks