Third-party dependency updates can cause a build to fail if the new dependency version introduces a change that is incompatible with the usage: this is called a breaking dependency update. Research on breaking dependency updates is active, with works on characterization, understanding, automatic repair of breaking updates, and other software engineering aspects. All such research projects require a benchmark of breaking updates that has the following properties: 1) it contains real-world breaking updates; 2) the breaking updates can be executed; 3) the benchmark provides stable scientific artifacts of breaking updates over time, a property we call reproducibility. To the best of our knowledge, such a benchmark is missing. To address this problem, we present BUMP, a new benchmark that contains reproducible breaking dependency updates in the context of Java projects built with the Maven build system. BUMP contains 571 breaking dependency updates collected from 153 Java projects. BUMP ensures long-term reproducibility of dependency updates on different platforms, guaranteeing consistent build failures. We categorize the different causes of build breakage in BUMP, providing novel insig
This paper investigates how Conflict-free Replicated Data Types (CRDTs) can be used for dynamic software updates of distributed applications. We propose to model application updates as a new App CRDT that stores the application code associated with a semantic version, which defines a total order of the code updates. The App CRDT works with an API-compatible message delivery middleware, which allows applications to continue working with partially updated components in the face of backwards-incompatible software updates. We implemented our approach in AmbientTalk, an ambient-oriented programming language designed for distributed systems. We show how this CRDT can be integrated with existing AmbientTalk applications, requiring minimal changes. We also implemented our approach in LuAT, an ambient-oriented programming framework for Lua. This shows that our approach of using CRDTs to replicate code can be generalised to other programming languages.
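The idea of storing application code under a totally ordered semantic version can be illustrated with a small state-based register. This is a minimal sketch, not the App CRDT as implemented in AmbientTalk or LuAT: the names `AppState` and `merge` are hypothetical, versions are simplified to (major, minor, patch) tuples, and each version is assumed to identify exactly one piece of code.

```python
from dataclasses import dataclass
from typing import Tuple

# Simplified semantic version: (major, minor, patch). Tuples compare
# lexicographically, which gives a total order on versions.
Version = Tuple[int, int, int]

@dataclass(frozen=True)
class AppState:
    """Replica state: application code tagged with its version (hypothetical type)."""
    version: Version
    code: str

def merge(a: AppState, b: AppState) -> AppState:
    """Join of two replica states: keep the code with the higher version.

    Assumes a version uniquely identifies its code; otherwise ties would
    make the merge depend on argument order.
    """
    return a if a.version >= b.version else b

r1 = AppState((1, 0, 0), "print('v1')")
r2 = AppState((1, 2, 0), "print('v1.2')")
r3 = AppState((2, 0, 0), "print('v2')")

# Replicas converge regardless of the order in which states are exchanged.
latest = merge(merge(r1, r3), r2)
```

Because the merge is a maximum over a total order, it is commutative, associative, and idempotent, which is what lets replicas converge without coordination.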
Safety guarantees are a prerequisite to the deployment of reinforcement learning (RL) agents in safety-critical tasks. Often, deployment environments exhibit non-stationary dynamics or are subject to changing performance goals, requiring updates to the learned policy. This leads to a fundamental challenge: how to update an RL policy while preserving its safety properties on previously encountered tasks? The majority of current approaches either do not provide formal guarantees or verify policy safety only a posteriori. We propose a novel a priori approach to safe policy updates in continual RL by introducing the Rashomon set: a region in policy parameter space certified to meet safety constraints within the demonstration data distribution. We then show that one can provide formal, provable guarantees for arbitrary RL algorithms used to update a policy by projecting their updates onto the Rashomon set. Empirically, we validate this approach across grid-world navigation environments (Frozen Lake and Poisoned Apple) where we guarantee an a priori provably deterministic safety on the source task during downstream adaptation. In contrast, we observe that regularisation-based baselines e
Due to its speed and simplicity, subgradient descent is one of the most used optimization algorithms in convex machine learning. However, tuning its learning rate is probably the most severe bottleneck to achieving consistently good performance. A common way to reduce the dependency on the learning rate is to use implicit/proximal updates. One such variant is Importance Weight Aware (IWA) updates, which consist of infinitely many infinitesimal updates on each loss function. However, IWA updates' empirical success is not completely explained by their theory. In this paper, we show for the first time that IWA updates have a strictly better regret upper bound than plain gradient updates in the online learning setting. Our analysis is based on generalized implicit Follow-the-Regularized-Leader (FTRL) (Chen and Orabona, 2023), a new framework for analyzing generalized implicit updates using a dual formulation. In particular, our results imply that IWA updates can be considered as approximate implicit/proximal updates.
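To make "infinitely many infinitesimal updates" concrete, the sketch below works out the special case of the squared loss 0.5*(w.x - y)^2, where the limit has a closed form: the residual decays exponentially along the gradient flow, so the prediction approaches the label without overshooting. The function names are hypothetical, and the choice of squared loss is an assumption made for illustration; the regret analysis in the abstract is not reproduced here.

```python
import math
from typing import List

def dot(a: List[float], b: List[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))

def plain_gradient_update(w, x, y, eta, h):
    """One gradient step on 0.5*(w.x - y)^2 with importance weight h."""
    residual = dot(w, x) - y
    return [wi - eta * h * residual * xi for wi, xi in zip(w, x)]

def iwa_squared_loss_update(w, x, y, eta, h):
    """Limit of infinitely many infinitesimal gradient steps on the same loss.

    Along the flow dw/dt = -eta * (w.x - y) * x for t in [0, h], the residual
    decays as exp(-eta * ||x||^2 * t), which yields the closed form below.
    """
    sq_norm = dot(x, x)
    if sq_norm == 0.0:
        return list(w)  # zero feature vector: gradient is zero, nothing to update
    residual = dot(w, x) - y
    scale = residual * (1.0 - math.exp(-eta * h * sq_norm)) / sq_norm
    return [wi - scale * xi for wi, xi in zip(w, x)]

w0, x, y = [0.0, 0.0], [1.0, 2.0], 1.0
pred_plain = dot(plain_gradient_update(w0, x, y, eta=1.0, h=10.0), x)
pred_iwa = dot(iwa_squared_loss_update(w0, x, y, eta=1.0, h=10.0), x)
```

With a large importance weight, the plain step overshoots the label by a wide margin, while the closed-form update moves the prediction toward the label and never past it.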
In the training of large language models (LLMs), updating parameters more efficiently and stably has always been an important challenge. To make parameter updates efficient, existing methods usually achieve performance comparable to full parameter updates through techniques such as low-dimensional decomposition or layer-wise selective updates. In this work, we propose AlphaAdam, an optimization framework for LLMs from the perspective of intra-layer parameter updates. By decoupling parameter updates and dynamically adjusting their strength, AlphaAdam accelerates convergence and improves training stability. We construct parameter masks based on the consistency of historical momentum and gradient direction and combine them with an adaptive mask strength strategy to ensure efficient optimization and theoretical convergence guarantees, which is also applicable to most momentum-based optimizers. Extensive experiments show that AlphaAdam outperforms state-of-the-art methods such as AdamW in terms of convergence speed and computational efficiency across tasks, including GPT-2 pre-training and fine-tuning of RoBERTa and Llama-7B. Our AlphaAdam implements an optimizer enhancement framework for LLM
As machine learning models become increasingly embedded in societal infrastructure, auditing them for bias is of growing importance. However, in real-world deployments, auditing is complicated by the fact that model owners may adaptively update their models in response to changing environments, such as financial markets. These updates can alter the underlying model class while preserving certain properties of interest, raising fundamental questions about what can be reliably audited under such shifts. In this work, we study group fairness auditing under arbitrary updates. We consider general shifts that modify the pre-audit model class while maintaining invariance of the audited property. Our goals are two-fold: (i) to characterize the information complexity of allowable updates, by identifying which strategic changes preserve the property under audit; and (ii) to efficiently estimate auditing properties, such as group fairness, using a minimal number of labeled samples. We propose a generic framework for PAC auditing based on an Empirical Property Optimization (EPO) oracle. For statistical parity, we establish distribution-free auditing bounds characterized by the SP dimension, a
Federated learning has recently emerged as a decentralized approach to learn a high-performance model without access to user data. Despite its effectiveness, federated learning gives malicious users opportunities to manipulate the model by uploading poisoned model updates to the server. In this paper, we propose a review mechanism called FedReview to identify and reject potentially poisoned updates in federated learning. Under our mechanism, the server randomly assigns a subset of clients as reviewers to evaluate the model updates on their training datasets in each round. The reviewers rank the model updates based on the evaluation results and count the number of updates with relatively low quality as the estimated number of poisoned updates. Based on the review reports, the server employs a majority voting mechanism to integrate the rankings and remove the potentially poisoned updates in the model aggregation process. Extensive evaluation on multiple datasets demonstrates that FedReview can help the server learn a well-performing global model in an adversarial environment.
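The review-then-vote step can be sketched as follows. The abstract does not specify the exact voting rule, so this example makes an assumption: each reviewer flags its k lowest-ranked updates, where k is that reviewer's own estimate of the number of poisoned updates, and the server removes any update flagged by a strict majority of reviewers. The names `Report` and `majority_flagged` are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

# One report per reviewer: (ranking from best to worst, estimated number
# of poisoned updates). This report shape is an assumption for illustration.
Report = Tuple[List[str], int]

def majority_flagged(reports: List[Report]) -> Set[str]:
    """Return the updates flagged as poisoned by a strict majority of reviewers."""
    votes: Counter = Counter()
    for ranking, k in reports:
        if k > 0:  # guard: ranking[-0:] would select the whole list
            votes.update(ranking[-k:])
    return {update for update, count in votes.items() if 2 * count > len(reports)}

reports = [
    (["a", "b", "c", "d"], 1),  # flags d
    (["b", "a", "c", "d"], 2),  # flags c and d
    (["a", "c", "d", "b"], 1),  # flags b
]
removed = majority_flagged(reports)
kept = [u for u in ["a", "b", "c", "d"] if u not in removed]
```

Only update "d" is flagged by two of the three reviewers, so it alone is excluded before aggregation; a single dissenting reviewer cannot remove an update on its own.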
Safety-critical environments are inherently dynamic. Distribution shifts, emerging vulnerabilities, and evolving requirements demand continuous updates to machine learning models. Yet even benign parameter updates can have unintended consequences, such as catastrophic forgetting in classical models or alignment drift in foundation models. Existing heuristic approaches (e.g., regularization, parameter isolation) can mitigate these effects but cannot certify that updated models continue to satisfy required performance specifications. We address this problem by introducing a framework for provably safe model updates. Our approach first formalizes the problem as computing the largest locally invariant domain (LID): a connected region in parameter space where all points are certified to satisfy a given specification. While exact maximal LID computation is intractable, we show that relaxing the problem to parameterized abstract domains (orthotopes, zonotopes) yields a tractable primal-dual formulation. This enables efficient certification of updates - independent of the data or algorithm used - by projecting them onto the safe domain. Our formulation further allows computation of multipl
Orthonormalized updates accelerate training, improve stability, and enable robust hyperparameter transfer, but existing methods like Muon rely on dense matrix operations that clash with sharded weights in large-scale LLM training, causing high compute and communication cost. We introduce Dion (Distributed Orthonormalization), a scalable and efficient update rule that replaces Newton-Schulz iteration with amortized power iteration on a momentum buffer, avoiding full-matrix reconstruction and integrating cleanly with weight sharding. The rank-fraction parameter with error feedback enables low-rank updates that balance quality with significant cost savings. On language models from 160M to 3B parameters, Dion retains the benefits of orthonormalized updates, while markedly reducing wall-clock time at scale, making it a practical optimizer for next-generation foundation models. Code is available at: https://github.com/microsoft/dion/
Decentralized federated learning is a distributed edge intelligence framework in which participants exchange parameter updates instead of training data in order to retrain or fine-tune deep learning models for mobile intelligent applications. Considering the various topologies of edge networks in the mobile internet, the impact of the transmission delay of updates during model training is non-negligible for data-intensive intelligent applications on mobile devices, e.g., intelligent medical services and automated driving vehicles. To address this problem, we analyze the impact of delayed updates on decentralized federated learning and provide a theoretical bound for these updates to achieve model convergence. Within the theoretical bound on the updating period, the latest versions of the delayed updates are reused to continue aggregation in case the model parameters from a specific neighbor are not collected or updated in time.
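The reuse of delayed updates can be pictured with a small aggregation routine. This is a sketch under stated assumptions: `max_staleness` is a hypothetical stand-in for the theoretical bound on the updating period, plain averaging stands in for the aggregation rule, and neighbors whose cached parameters exceed the bound are simply skipped, a choice the abstract does not specify.

```python
from typing import Dict, List, Tuple

# Cache of the latest parameters received from each neighbor, together with
# the round in which they were received (hypothetical structure).
Cache = Dict[str, Tuple[List[float], int]]

def aggregate(own: List[float], cache: Cache,
              current_round: int, max_staleness: int) -> List[float]:
    """Average own parameters with cached neighbor parameters that are fresh enough."""
    usable = [params for params, received in cache.values()
              if current_round - received <= max_staleness]
    models = [own] + usable
    return [sum(values) / len(models) for values in zip(*models)]

cache = {
    "n1": ([3.0, 5.0], 9),      # one round old: reused
    "n2": ([100.0, 100.0], 2),  # eight rounds old: outside the bound, skipped
}
result = aggregate([1.0, 1.0], cache, current_round=10, max_staleness=3)
```

Aggregation continues even though neither neighbor delivered fresh parameters in the current round, which is the behavior the abstract describes for updates that arrive late.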
Serving many task-specialized LLM variants is often limited by the large size of fine-tuned checkpoints and the resulting cold-start latency. Since fine-tuned weights differ from their base model by relatively small structured residuals, a natural approach is to represent them as compressed deltas. We propose a simple 1-bit delta scheme that stores only the sign of the weight difference together with lightweight per-axis (row/column) FP16 scaling factors, learned from a small calibration set. This design preserves the compactness of 1-bit deltas while more accurately capturing variation across weight dimensions, leading to improved reconstruction quality over scalar alternatives. From a systems perspective, a streamlined loader that transfers packed deltas in a single operation per module reduces cold-start latency and storage overhead, with artifacts several times smaller than a full FP16 checkpoint. The method is drop-in, requires minimal calibration data, and maintains inference efficiency by avoiding dense reconstruction. Our experimental setup and source code are available at https://github.com/kuiumdjiev/Per-Axis-Weight-Deltas-for-Frequent-Model-Updates.
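A per-axis 1-bit delta can be sketched in a few lines. This example makes several assumptions that differ from the described method: scales are per row only, they are set to the mean absolute delta of each row (the least-squares optimum for reconstructing the weights) rather than learned from a calibration set, plain Python floats replace FP16, and the weights are densely reconstructed for clarity. The function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def compress_delta(base: Matrix, finetuned: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Return the sign of each weight difference and one scale per row."""
    signs, scales = [], []
    for base_row, tuned_row in zip(base, finetuned):
        delta = [t - b for t, b in zip(tuned_row, base_row)]
        signs.append([1 if d >= 0 else -1 for d in delta])
        # Closed-form stand-in for the learned scale: mean absolute delta.
        scales.append(sum(abs(d) for d in delta) / len(delta))
    return signs, scales

def reconstruct(base: Matrix, signs: List[List[int]], scales: List[float]) -> Matrix:
    """Approximate the fine-tuned weights as base + scale_row * sign."""
    return [[b + scale * s for b, s in zip(base_row, sign_row)]
            for base_row, sign_row, scale in zip(base, signs, scales)]

base = [[1.0, 2.0], [3.0, 4.0]]
finetuned = [[1.5, 1.5], [3.0, 6.0]]
signs, scales = compress_delta(base, finetuned)
approx = reconstruct(base, signs, scales)
```

The first row is recovered exactly because its deltas share one magnitude, while the second row shows the approximation error that a single scale per row cannot remove.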
We consider a system where the updates from independent sources are disseminated via a publish-subscribe mechanism. The sources are the publishers and a decision process (DP), acting as a subscriber, derives decision updates from the source data. We derive the stationary expected age of information (AoI) of decision updates delivered to a monitor. We show that a lazy computation policy in which the DP may sit idle before computing its next decision update can reduce the average AoI at the monitor even though the DP exerts no control over the generation of source updates. This AoI reduction is shown to occur because lazy computation can offset the negative effect of high variance in the computation time.
We consider an information updating system where a source produces updates as requested by a transmitter. The transmitter further processes these updates in order to generate partial updates, which carry less information than the original updates, to be sent to a receiver. We study the problem of generating partial updates, and finding their corresponding real-valued codeword lengths, in order to minimize the average age experienced by the receiver, while maintaining a desired level of mutual information between the original and partial updates. This problem is NP-hard. We relax the problem and develop an alternating minimization based iterative algorithm that generates a pmf for the partial updates, and the corresponding age-optimal real-valued codeword length for each update. We observe that there is a tradeoff between the attained average age and the mutual information between the original and partial updates.
We study a pull-based communication system where a sensing agent updates an actuation agent using a query control policy, which is adjusted in the evolution of an observed information source and the usefulness of each update for achieving a specific goal. For that, a controller decides whether to pull an update at each slot, predicting what is probably occurring at the source and how much effective impact that update could have at the endpoint. Thus, temporal changes in the source evolution could modify the query arrivals so as to capture important updates. The amount of impact is determined by a grade of effectiveness (GoE) metric, which incorporates both freshness and usefulness attributes of the communicated updates. Applying an iterative algorithm, we derive query decisions that maximize the long-term average GoE for the communicated packets, subject to cost constraints. Our analytical and numerical results show that the proposed query policy exhibits higher effectiveness than existing periodic and probabilistic query policies for a wide range of query arrival rates.
Many convex optimization methods are conceived of and analyzed in a largely separate fashion. In contrast to this traditional separation, this manuscript points out and demonstrates the utility of an important but largely unremarked common thread running through many prominent optimization methods. In particular, we show that methods such as successive orthogonal projection, gradient descent, projected gradient descent, the proximal-point method, forward-backward splitting, the alternating direction method of multipliers, and under- or over-relaxed variants of the preceding all involve updates that are of a common type --- namely, the updates satisfy a property known as pseudocontractivity. Moreover, since the property of pseudocontractivity is preserved under both composition and convex combination, updates constructed via these operations from pseudocontractive updates are themselves pseudocontractive. Having demonstrated that pseudocontractive updates are to be found in many optimization methods, we then provide a unified basic analysis of methods with pseudocontractive updates. Specifically, we prove a novel bound satisfied by the norm of the difference in iterates of pseudocon
Updates to network configurations are notoriously difficult to implement correctly. Even if the old and new configurations are correct, the update process can introduce transient errors such as forwarding loops, dropped packets, and access control violations. The key factor that makes updates difficult to implement is that networks are distributed systems with hundreds or even thousands of nodes, but updates must be rolled out one node at a time. In networks today, the task of determining a correct sequence of updates is usually done manually -- a tedious and error-prone process for network operators. This paper presents a new tool for synthesizing network updates automatically. The tool generates efficient updates that are guaranteed to respect invariants specified by the operator. It works by navigating through the (restricted) space of possible solutions, learning from counterexamples to improve scalability and optimize performance. We have implemented our tool in OCaml, and conducted experiments showing that it scales to networks with a thousand switches and tens of switches updating.
This survey highlights and discusses remote OTA software updates in the automotive sector, mainly from the security perspective. In particular, the major objective of this survey is to provide a comprehensive and structured outline of various research directions and approaches in OTA update technologies in vehicles. First, we discuss connected car technology and then relate remote OTA update features to the connected car. We also present the benefits of remote OTA updates for cars along with relevant statistics. Then, we emphasize the security challenges and requirements of remote OTA updates along with use cases and standard road safety regulations followed in different countries. We also provide a classification of the existing works in the literature that deal with implementing different secured techniques for remote OTA updates in vehicles. We further provide an analytical discussion of the present scenario of remote OTA updates with respect to car manufacturers. Finally, we identify possible future research directions of remote OTA updates for automobiles, particularly in the area of security.
Network updates such as policy and routing changes occur frequently in Software Defined Networks (SDN). Updates should be performed consistently, preventing temporary disruptions, and should require as little overhead as possible. Scalability is increasingly becoming an essential requirement in SDN. In this paper we propose to use time-triggered network updates to achieve consistent updates. Our proposed solution requires lower overhead than existing update approaches, without compromising the consistency during the update. We demonstrate that accurate time enables far more scalable consistent updates in SDN than previously available. In addition, it provides the SDN programmer with fine-grained control over the tradeoff between consistency and scalability.
Consider a stream of status updates generated by a source, where each update is of one of two types: high priority or ordinary (low priority). These updates are to be transmitted through a network to a monitor. However, the transmission policy of each packet depends on the type of stream it belongs to. For the low priority stream, we analyze and compare the performances of two transmission schemes: in scheme (i), ordinary updates are served in a First-Come-First-Served (FCFS) fashion, whereas in scheme (ii), ordinary updates are transmitted according to an M/G/1/1 with preemption policy. In both schemes, high priority updates are transmitted according to an M/G/1/1 with preemption policy and receive preferential treatment. An arriving high priority update discards and replaces any currently-in-service high priority update, and preempts (with eventual resume for scheme (i)) any ordinary update. We model the arrival processes of the two kinds of updates, in both schemes, as independent Poisson processes. For scheme (i), we find the arrival and service rates under which the system is stable and give closed-form expressions for average peak age and a lower bound on the average age of the ordinary stream
Dynamic vector commitments that enable local updates of opening proofs have applications ranging from verifiable databases with membership changes to stateless clients on blockchains. In these applications, each user maintains a relevant subset of the committed messages and the corresponding opening proofs with the goal of ensuring a succinct global state. When the messages are updated, users are given some global update information and update their opening proofs to match the new vector commitment. We investigate the relation between the size of the update information and the runtime complexity needed to update an individual opening proof. Existing vector commitment schemes require that either the information size or the runtime scale linearly in the number $k$ of updated state elements. We construct a vector commitment scheme that asymptotically achieves both length and runtime that are sublinear in $k$, namely $k^{\nu}$ and $k^{1-\nu}$ for any $\nu \in (0,1)$. We prove an information-theoretic lower bound on the relation between the update information size and runtime complexity that shows the asymptotic optimality of our scheme. For $\nu = 1/2$, our constructions outperform Verkle commitmen