ResearchTracker: Research and Industry Trend Tracking Platform

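Tracks global research progress and industry developments in real time, with intelligent literature search, content summarization, and trend insights.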

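Data sources

arXiv: academic preprints
OpenAlex: global scholarly metadata (200 million+ papers)
PubMed: biomedical literature
SearXNG: aggregated web search
Multiple RSS technology news feeds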

© 2026 ResearchTracker. All rights reserved.

Search results: Reduce

20 results found


Fault-tolerant Reduce and Allreduce operations based on correction

arXiv · 2026-02-25 · Authors: Martin Kuettler, Hermann Haertig

Implementations of Broadcast based on some information dissemination algorithm -- e.g., gossip or tree-based communication -- followed by a correction algorithm have been proposed previously. This work describes an approach to apply a similar idea to Reduce. In it, a correction-like communication phase precedes a tree-based phase. This provides a Reduce algorithm which is tolerant to a number of failed processes. Semantics of the resulting algorithm are provided and proven. Based on these results, Broadcast and Reduce are combined to provide Allreduce.

On the Computation Rate of All-Reduce

arXiv · 2026-02-25 · Authors: Yufeng Zhou, Hua Sun

In the All-Reduce problem, each one of the K nodes holds an input and wishes to compute the sum of all K inputs through a communication network where each pair of nodes is connected by a parallel link with arbitrary bandwidth. The computation rate of All-Reduce is defined as the number of sum instances that can be computed over each network use. For the computation rate, we provide a cut-set upper bound and a linear programming lower bound based on time (bandwidth) sharing over all schemes that first perform Reduce (aggregating all inputs at one node) and then perform Broadcast (sending the sum from that node to all other nodes). Specializing the two general bounds gives us the optimal computation rate for a class of communication networks and the best-known rate bounds (where the upper bound is no more than twice the lower bound) for cyclic, complete, and hypercube networks.
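The Reduce-then-Broadcast structure described in this abstract can be illustrated with a minimal sketch. It simulates the K nodes as a plain Python list and ignores link bandwidths entirely, so it shows only the two phases, not the rate analysis from the paper. The function names `reduce_to_root`, `broadcast_from_root`, and `all_reduce` are hypothetical and chosen for illustration.

```python
from typing import List


def reduce_to_root(inputs: List[float]) -> float:
    """Reduce phase: aggregate every node's input at a single node."""
    total = 0.0
    for value in inputs:
        total += value
    return total


def broadcast_from_root(value: float, num_nodes: int) -> List[float]:
    """Broadcast phase: send the aggregated value to all nodes."""
    return [value] * num_nodes


def all_reduce(inputs: List[float]) -> List[float]:
    """All-Reduce as Reduce followed by Broadcast (simulated, no network model)."""
    total = reduce_to_root(inputs)
    return broadcast_from_root(total, len(inputs))
```

For example, `all_reduce([1.0, 2.0, 3.5])` leaves every simulated node holding the same sum of the three inputs.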

Near-Optimal Wafer-Scale Reduce

arXiv · 2024-04-24 · Authors: Piotr Luczynski, Lukas Gianinazzi, Patrick Iff

Efficient Reduce and AllReduce communication collectives are a critical cornerstone of high-performance computing (HPC) applications. We present the first systematic investigation of Reduce and AllReduce on the Cerebras Wafer-Scale Engine (WSE). This architecture has been shown to achieve unprecedented performance both for machine learning workloads and other computational problems like FFT. We introduce a performance model to estimate the execution time of algorithms on the WSE and validate our predictions experimentally for a wide range of input sizes. In addition to existing implementations, we design and implement several new algorithms specifically tailored to the architecture. Moreover, we establish a lower bound for the runtime of a Reduce operation on the WSE. Based on our model, we automatically generate code that achieves near-optimal performance across the whole range of input sizes. Experiments demonstrate that our new Reduce and AllReduce algorithms outperform the current vendor solution by up to 3.27x. Additionally, our model predicts performance with less than 4% error. The proposed communication collectives increase the range of HPC applications that can benefit fro

Semi-Centennial REDUCE

arXiv · 2025-05-02 · Authors: Arthur C. Norman, Stephen M. Watt

We present a version of the REDUCE computer algebra system as it was in the early 1970s. We show how this historical version of REDUCE may be built and run in very modest present-day environments and outline some of its capabilities.

Momentum Does Not Reduce Stochastic Noise in Stochastic Gradient Descent

arXiv · 2024-02-04 · Authors: Naoki Sato, Hideaki Iiduka

For nonconvex objective functions, including those found in training deep neural networks, stochastic gradient descent (SGD) with momentum is said to converge faster and have better generalizability than SGD without momentum. In particular, adding momentum is thought to reduce stochastic noise. To verify this, we estimated the magnitude of gradient noise by using convergence analysis and an optimal batch size estimation formula and found that momentum does not reduce gradient noise. We also analyzed the effect of search direction noise, which is stochastic noise defined as the error between the search direction of the optimizer and the steepest descent direction, and found that it inherently smooths the objective function and that momentum does not reduce search direction noise either. Finally, an analysis of the degree of smoothing introduced by search direction noise revealed that adding momentum offers limited advantage to SGD.

A parallel pattern for iterative stencil + reduce

arXiv · 2016-09-15 · Authors: M. Aldinucci, M. Danelutto, M. Drocco

We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce, and, crucially, their usage in a loop in both data-parallel and streaming applications, or a combination of both. The pattern makes it possible to deploy a single stencil computation kernel on different GPUs. We discuss the implementation of Loop-of-stencil-reduce in FastFlow, a framework for the implementation of applications based on the parallel patterns. Experiments are presented to illustrate the use of Loop-of-stencil-reduce in developing data-parallel kernels running on heterogeneous systems.
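As a rough illustration of the pattern named in this abstract, the sketch below runs a sequential loop in which each iteration applies a stencil and then reduces the result to a single value that decides whether to continue. It is plain Python, not FastFlow, and the specific choices (a 3-point averaging stencil, a max-change reduction, the names `stencil_step`, `reduce_max_change`, and `loop_of_stencil_reduce`) are assumptions made for the example.

```python
from typing import List, Tuple


def stencil_step(grid: List[float]) -> List[float]:
    """Apply a 3-point averaging stencil; the two boundary cells stay fixed."""
    new_grid = grid[:]
    for i in range(1, len(grid) - 1):
        new_grid[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new_grid


def reduce_max_change(old: List[float], new: List[float]) -> float:
    """Reduce step: collapse the per-cell change into one scalar."""
    return max(abs(a - b) for a, b in zip(old, new))


def loop_of_stencil_reduce(
    grid: List[float], tol: float = 1e-9, max_iters: int = 10_000
) -> Tuple[List[float], int]:
    """Repeat stencil + reduce until the reduced value drops below tol."""
    for iteration in range(1, max_iters + 1):
        new_grid = stencil_step(grid)
        change = reduce_max_change(grid, new_grid)
        grid = new_grid
        if change < tol:
            return grid, iteration
    return grid, max_iters
```

With fixed boundary values 0 and 1, `loop_of_stencil_reduce([0.0, 0.0, 0.0, 0.0, 1.0])` settles toward a linear ramp between the two ends; in a data-parallel setting the stencil and the reduction would each run in parallel.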

AdaptGrad: Adaptive Sampling to Reduce Noise

arXiv · 2024-10-10 · Authors: Linjiang Zhou, Chao Ma, Zepeng Wang

Gradient Smoothing is an efficient approach to reducing noise in gradient-based model explanation methods. SmoothGrad adds Gaussian noise to mitigate much of this noise. However, the crucial hyper-parameter in this method, the variance $σ$ of the Gaussian noise, is set manually or with a heuristic approach, so the smoothed gradients still contain a certain amount of noise. In this paper, we aim to interpret SmoothGrad as a corollary of convolution, thereby re-understanding the gradient noise and the role of $σ$ from the perspective of confidence level. Furthermore, we propose an adaptive gradient smoothing method, AdaptGrad, based on these insights. Through comprehensive experiments, both qualitative and quantitative results demonstrate that AdaptGrad could effectively reduce almost all the noise in vanilla gradients compared with baseline methods. AdaptGrad is simple and universal, making it applicable for enhancing gradient-based interpretability methods for better visualization.

Leveraging policy instruments and financial incentives to reduce embodied carbon in energy retrofits

arXiv · 2023-04-06 · Author: Haonan Zhang

The existing buildings and building construction sectors together are responsible for over one-third of the total global energy consumption and nearly 40% of total greenhouse gas (GHG) emissions. GHG emissions from the building sector are made up of embodied emissions and operational emissions. Recognizing the importance of reducing energy use and emissions associated with the building sector, governments have introduced policies, standards, and design guidelines to improve building energy performance and reduce GHG emissions associated with operating buildings. However, policy initiatives that reduce embodied emissions of the existing building sector are lacking. This research aims to develop policy strategies to reduce embodied carbon emissions in retrofits. In order to achieve this goal, this research conducted a literature review and identification of policies and financial incentives in British Columbia (BC) for reducing overall GHG emissions from the existing building sector. Then, this research analyzed worldwide policies and incentives that reduce embodied carbon emissions in the existing building sector. After reviewing the two categories of retrofit policies, the author i

Entanglement as a Method to Reduce Uncertainty

arXiv · 2023-02-12 · Authors: Diederik Aerts, Jonito Aerts Arguëlles, Lester Beltran

In physics, entanglement 'reduces' the entropy of an entity, because the (von Neumann) entropy of, e.g., a composite bipartite entity in a pure entangled state is systematically lower than the entropy of the component sub-entities. We show here that this 'genuinely non-classical reduction of entropy as a result of composition' also holds whenever two concepts combine in human cognition and, more generally, it is valid in human culture. We exploit these results and make a 'new hypothesis' on the nature of entanglement, namely, the production of entanglement in the preparation of a composite entity can be seen as a 'dynamical process of collaboration between its sub-entities to reduce uncertainty', because the composite entity is in a pure state while its sub-entities are in a non-pure, or density, state, as a result of the preparation. We identify within the nature of this entanglement a mechanism of contextual updating and illustrate the mechanism in the example we analyze. Our hypothesis naturally explains the 'non-classical nature' of some quantum logical connectives, as due to Bell-type correlations.

Reduce: A Framework for Reducing the Overheads of Fault-Aware Retraining

arXiv · 2023-05-21 · Authors: Muhammad Abdullah Hanif, Muhammad Shafique

Fault-aware retraining has emerged as a prominent technique for mitigating permanent faults in Deep Neural Network (DNN) hardware accelerators. However, retraining leads to huge overheads, specifically when used for fine-tuning large DNNs designed for solving complex problems. Moreover, as each fabricated chip can have a distinct fault pattern, fault-aware retraining is required to be performed for each chip individually considering its unique fault map, which further aggravates the problem. To reduce the overall retraining cost, in this work, we introduce the concept of resilience-driven retraining amount selection. To realize this concept, we propose a novel framework, Reduce, that, at first, computes the resilience of the given DNN to faults at different fault rates and with different amounts of retraining. Then, based on the resilience, it computes the amount of retraining required for each chip considering its unique fault map. We demonstrate the effectiveness of our methodology for a systolic array-based DNN accelerator experiencing permanent faults in the computational array.

Reduced atmospheres of post-impact worlds: The early Earth

arXiv · 2022-04-21 · Authors: J. P. Itcovitz, A. S. P. Rae, R. I. Citron

Impacts may have had a significant effect on the atmospheric chemistry of the early Earth. Reduced phases in the impactor (e.g., metallic iron) can reduce the planet's H$_2$O inventory to produce massive atmospheres rich in H$_2$. Whilst previous studies have focused on the interactions between the impactor and atmosphere in such scenarios, we investigate two further effects, 1) the distribution of the impactor's iron inventory during impact between the target interior, target atmosphere, and escaping the target, and 2) interactions between the post-impact atmosphere and the impact-generated melt phase. We find that these two effects can potentially counterbalance each other, with the melt-atmosphere interactions acting to restore reducing power to the atmosphere that was initially accreted by the melt phase. For a $\sim10^{22}\,\mathrm{kg}$ impactor, when the iron accreted by the melt phase is fully available to reduce this melt, we find an equilibrium atmosphere with H$_2$ column density $\sim10^4\,\mathrm{moles\,cm^{-2}}$ ($p\mathrm{H2}\sim120\,\mathrm{bars}\mathrm{,}~X_\mathrm{H2}\sim0.77$), consistent with previous estimates. However, when the iron is not available to reduce t

Localized Reduced Basis Additive Schwarz Methods

arXiv · 2021-03-19 · Authors: Martin J. Gander, Stephan Rave

Reduced basis methods build low-rank approximation spaces for the solution sets of parameterized PDEs by computing solutions of the given PDE for appropriately selected snapshot parameters. Localized reduced basis methods reduce the offline cost of computing these snapshot solutions by instead constructing a global space from spatially localized less expensive problems. In the case of online enrichment, these local problems are iteratively solved in regions of high residual and correspond to subdomain solves in domain decomposition methods. We show in this note that indeed there is a close relationship between online-enriched localized reduced basis and domain decomposition methods by introducing a Localized Reduced Basis Additive Schwarz method (LRBAS), which can be interpreted as a locally adaptive multi-preconditioning scheme for the CG method.

Analysis of a reduced-order HDG method for the Stokes equations

arXiv · 2015-02-06 · Author: Issei Oikawa

In this paper, we analyze a hybridized discontinuous Galerkin (HDG) method with reduced stabilization for the Stokes equations. The reduced stabilization enables us to reduce the number of facet unknowns and improve the computational efficiency of the method. We provide optimal error estimates in the energy and $L^2$ norms. It is shown that the reduced method with the lowest-order approximation is closely related to the nonconforming Crouzeix-Raviart finite element method. We also prove that the solution of the reduced method converges to the nonconforming Gauss-Legendre finite element solution as a stabilization parameter $τ$ tends to infinity and that the convergence rate is $O(τ^{-1})$.

WRPN: Training and Inference using Wide Reduced-Precision Networks

arXiv · 2017-04-10 · Authors: Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi

For computer vision applications, prior works have shown the efficacy of reducing the numeric precision of model parameters (network weights) in deep neural networks but also that reducing the precision of activations hurts model accuracy much more than reducing the precision of model parameters. We study schemes to train networks from scratch using reduced-precision activations without hurting the model accuracy. We reduce the precision of activation maps (along with model parameters) using a novel quantization scheme and increase the number of filter maps in a layer, and find that this scheme compensates or surpasses the accuracy of the baseline full-precision network. As a result, one can significantly reduce the dynamic memory footprint, memory bandwidth, computational energy and speed up the training and inference process with appropriate hardware support. We call our scheme WRPN - wide reduced-precision networks. We report results using our proposed schemes and show that our results are better than previously reported accuracies on ILSVRC-12 dataset while being computationally less expensive compared to previously reported reduced-precision networks.
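To make the idea of reduced-precision activations concrete, here is a minimal sketch of a generic uniform k-bit quantizer. It is an assumption for illustration only and is not taken from the paper: the abstract does not spell out its quantization scheme, and the name `quantize_activation` and the clipping range [0, 1] are hypothetical choices.

```python
def quantize_activation(x: float, bits: int) -> float:
    """Clip x to [0, 1], then round to the nearest of 2**bits evenly spaced levels.

    Generic uniform quantizer (an illustrative assumption, not the WRPN scheme).
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    steps = 2 ** bits - 1  # number of intervals between the 2**bits levels
    clipped = min(max(x, 0.0), 1.0)
    return round(clipped * steps) / steps
```

With `bits=2` the only representable values are 0, 1/3, 2/3, and 1, so each activation can be stored in 2 bits instead of 32; this is the kind of memory saving that a wider layer is then meant to offset in accuracy.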

Analyzing whether workplace smoking bans can reduce the probability of smoking

arXiv · 2022-02-14 · Author: Tianjiao He

The rapid increase of smoking-related diseases and deaths globally is driving us to find an effective approach to reduce the smoking rate. This study aims to determine whether indoor smoking bans at workplaces can effectively reduce the smoking rate. The Smokeban dataset used for this study is an observational dataset that contains some socio-demographic factors, whether people smoke, and whether smoking bans exist. Since the observational data used in the study did not randomize people into a with-smoking-bans group and a without-smoking-bans group, confounders may bias the estimate of whether smoking bans can reduce smoking rates. The propensity score matching (PSM) method can reduce these biases by using a logistic regression model to predict the similarity of people in the two groups and the nearest-neighbour matching technique to match the people who are most similar. After reducing the bias, another regression model was created to interpret the relationship between the probability of smoking and the indoor smoking bans. We conclude by arguing that with the existence of indoor smoking bans, the probability that people smoke can be decreased greatly.
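The nearest-neighbour matching step mentioned in this abstract can be sketched as follows. The sketch assumes the propensity scores have already been estimated (for instance by a logistic regression, which is not shown) and matches with replacement; both are assumptions, since the abstract does not give these details. The function name `match_nearest` and the example scores are hypothetical.

```python
from typing import List


def match_nearest(treated_scores: List[float], control_scores: List[float]) -> List[int]:
    """For each treated unit, return the index of the control unit whose
    propensity score is closest (matching with replacement)."""
    if not control_scores:
        raise ValueError("need at least one control unit")
    matches = []
    for score in treated_scores:
        best = min(
            range(len(control_scores)),
            key=lambda j: abs(control_scores[j] - score),
        )
        matches.append(best)
    return matches
```

For instance, `match_nearest([0.30, 0.80], [0.10, 0.33, 0.75, 0.90])` pairs the first treated unit with the control scored 0.33 and the second with the control scored 0.75; outcomes are then compared within matched pairs.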

Pseudospectra of Isospectrally Reduced Matrices and Systems

arXiv · 2012-10-17 · Authors: Fernando Guevara Vasquez, Benjamin Z. Webb

The isospectral reduction of a matrix, which is closely related to its Schur complement, allows one to reduce the size of a matrix while maintaining its eigenvalues up to a known set. Here we generalize this procedure by increasing the number of possible ways a matrix can be isospectrally reduced. The reduced matrix has rational functions as entries. We show that the notion of pseudospectrum can be extended to this class of matrices and that the pseudospectrum of a matrix shrinks as the matrix is reduced. Hence the eigenvalues of a reduced matrix are more robust to entry-wise perturbations than the eigenvalues of the original matrix. We also introduce the notion of inverse pseudospectrum (or pseudoresonances), which indicates how stable the poles of a matrix with rational function entries are to certain matrix perturbations. A mass spring system is used to illustrate and give a physical interpretation to both pseudospectra and inverse pseudospectra.

An adaptive reduced basis ANOVA method for high-dimensional Bayesian inverse problems

arXiv · 2018-11-13 · Authors: Qifeng Liao, Jinglai Li

In Bayesian inverse problems sampling the posterior distribution is often a challenging task when the underlying models are computationally intensive. To this end, surrogates or reduced models are often used to accelerate the computation. However, in many practical problems, the parameter of interest can be of high dimensionality, which renders standard model reduction techniques infeasible. In this paper, we present an approach that employs the ANOVA decomposition method to reduce the model with respect to the unknown parameters, and the reduced basis method to reduce the model with respect to the physical parameters. Moreover, we provide an adaptive scheme within the MCMC iterations, to perform the ANOVA decomposition with respect to the posterior distribution. With numerical examples, we demonstrate that the proposed model reduction method can significantly reduce the computational cost of Bayesian inverse problems, without sacrificing much accuracy.

Optimal control of multiscale systems using reduced-order models

arXiv · 2014-06-13 · Authors: Wei Zhang, Juan C. Latorre, Grigorios A. Pavliotis

We study optimal control of diffusions with slow and fast variables and address a question raised by practitioners: is it possible to first eliminate the fast variables before solving the optimal control problem and then use the optimal control computed from the reduced-order model to control the original, high-dimensional system? The strategy "first reduce, then optimize"--rather than "first optimize, then reduce"--is motivated by the fact that solving optimal control problems for high-dimensional multiscale systems is numerically challenging and often computationally prohibitive. We state sufficient and necessary conditions under which the "first reduce, then control" strategy can be employed and discuss when it should be avoided. We further give numerical examples that illustrate the "first reduce, then optimize" approach and discuss possible pitfalls.

Adaptive $h$-refinement for reduced-order models

arXiv · 2014-04-02 · Author: Kevin Carlberg

This work presents a method to adaptively refine reduced-order models \emph{a posteriori} without requiring additional full-order-model solves. The technique is analogous to mesh-adaptive $h$-refinement: it enriches the reduced-basis space online by `splitting' a given basis vector into several vectors with disjoint support. The splitting scheme is defined by a tree structure constructed offline via recursive $k$-means clustering of the state variables using snapshot data. The method identifies the vectors to split online using a dual-weighted-residual approach that aims to reduce error in an output quantity of interest. The resulting method generates a hierarchy of subspaces online without requiring large-scale operations or full-order-model solves. Further, it enables the reduced-order model to satisfy \emph{any prescribed error tolerance} regardless of its original fidelity, as a completely refined reduced-order model is mathematically equivalent to the original full-order model. Experiments on a parameterized inviscid Burgers equation highlight the ability of the method to capture phenomena (e.g., moving shocks) not contained in the span of the original reduced basis.

Reducing Compute Waste in LLMs through Kernel-Level DVFS

arXiv · 2026-01-13 · Authors: Jeffrey Spaan, Kuan-Hsun Chen, Ana-Lucia Varbanescu

The rapid growth of AI has fueled the expansion of accelerator- or GPU-based data centers. However, the rising operational energy consumption has emerged as a critical bottleneck and a major sustainability concern. Dynamic Voltage and Frequency Scaling (DVFS) is a well-known technique used to reduce energy consumption, and thus improve energy-efficiency, since it requires little effort and works with existing hardware. Reducing the energy consumption of training and inference of Large Language Models (LLMs) through DVFS or power capping is feasible: related work has shown energy savings can be significant, but at the cost of significant slowdowns. In this work, we focus on reducing waste in LLM operations: i.e., reducing energy consumption without losing performance. We propose a fine-grained, kernel-level, DVFS approach that explores new frequency configurations, and prove these save more energy than previous, pass- or iteration-level solutions. For example, for a GPT-3 training run, a pass-level approach could reduce energy consumption by 2% (without losing performance), while our kernel-level approach saves as much as 14.6% (with a 0.6% slowdown). We further investigate the effe
