We consider gradient descent with constant stepsizes and derive exact worst-case convergence rates on the minimum gradient norm of the iterates. Our analysis covers all possible stepsizes and arbitrary upper/lower bounds on the curvature of the objective function, thus including convex, strongly convex and weakly convex (hypoconvex) objective functions. Among the challenging parts of the analysis, we note the necessity to exploit dependencies between non-consecutive iterates. While this complicates the proofs to some extent, it enables us to achieve an exact full-range analysis of gradient descent for any constant stepsize (covering, in particular, normalized stepsizes greater than one), whereas the literature contained only conjectured rates of this type. In the nonconvex case, allowing arbitrary bounds on upper and lower curvatures extends existing partial results that are valid only for gradient Lipschitz functions (i.e., where lower and upper bounds on curvature are equal), leading to improved rates for weakly convex functions. From our exact worst-case performance bounds, we deduce the optimal constant stepsize for gradient descent. Leveraging our analysis, we also introduce a
Formed from the debris of planet formation, interstellar comets provide invaluable insights into the chemical compositions of planetary systems outside of our Solar System. Spectroscopic observations of 3I/ATLAS, the third interstellar object, reveal production of numerous volatiles and refractory species throughout its trajectory. In this paper we present a framework to calculate the change in radius of an object on an arbitrary trajectory at any point in its orbit, applicable to any small body experiencing mass-loss. We next provide a comprehensive, machine readable table containing volatile and refractory production rates from all reported observations of 3I/ATLAS pre- and post-perihelion. Applying these equations to 3I/ATLAS, we calculate that it has lost $\sim$ 1.05 -- 6.56 meters of its surface during its passage through the Solar System, corresponding to $\sim$ 10$^9$ -- 10$^{10}$ kg and $\sim$ 0.10 -- 1.13% of its total mass. These numbers could be lower estimates if the dust-to-gas ratio of its outflow was sustained at a high level. Conservative and optimistic estimates were calculated over a range of heliocentric distances defined by the onset of activity in reported obse
At the low-temperature and high-density conditions of a neutron star crust, neutrons are degenerate. In this work, we study the effect of this degeneracy on the capture rates of neutrons on neutron-rich nuclei in accreted crusts. We use a statistical Hauser-Feshbach model to calculate neutron capture rates and find that neutron degeneracy can increase rates significantly. The enhancement grows from a factor of a few to many orders of magnitude near the neutron drip line. We also quantify uncertainties due to model inputs for masses, $γ$-strength functions, and level densities. We find that uncertainties increase dramatically away from stability and that degeneracy tends to increase these uncertainties further, except for cases near the neutron drip line where degeneracy leads to more robustness. As in the case of capture of classically distributed neutrons, variations in the mass model have the strongest impact. Corresponding variations in the reaction rates can be as high as 3 to 4 orders of magnitude, and can be more than 5 times larger than under classical conditions. To ease the incorporation of neutron degeneracy in nucleosynthesis networks, we provide tabulated results of capture rates
We introduce "logically contractive mappings", i.e., nonexpansive self-maps that contract along a subsequence of iterates, and prove a fixed-point theorem that extends Banach's principle. We obtain event-indexed convergence rates and, under bounded gaps between events, explicit iteration-count rates. A worked example shows a nonexpansive map whose square is a strict contraction, and we clarify relations to Meir--Keeler and asymptotically nonexpansive mappings. We further generalize to variable-factor events and show that $\prod_k λ_k = 0$ (equivalently $\sum_k -\ln λ_k = \infty$) implies convergence. These results unify several generalized contraction phenomena and suggest new rate questions tied to event sparsity.
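As a hedged illustration of the kind of worked example mentioned in this abstract (the map below is an assumption chosen for simplicity, not necessarily the one used in the paper), consider $T(x, y) = (y, x/2)$ on the Euclidean plane. It is nonexpansive but not a strict contraction, since it preserves distances between points that differ only in $y$, while its square $T^2(x, y) = (x/2, y/2)$ contracts with factor $1/2$. The names `T`, `T2`, `lipschitz_ratio`, and `iterate` are illustrative.

```python
import math


def T(p):
    """Assumed example map: nonexpansive, but not a strict contraction."""
    x, y = p
    return (y, x / 2.0)


def T2(p):
    """Second iterate of T; algebraically equal to (x/2, y/2)."""
    return T(T(p))


def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def lipschitz_ratio(f, p, q):
    """Ratio d(f(p), f(q)) / d(p, q) for two distinct points p and q."""
    return dist(f(p), f(q)) / dist(p, q)


def iterate(f, p, n):
    """Apply f to p a total of n times and return the final point."""
    for _ in range(n):
        p = f(p)
    return p
```

Iterating `T` from any starting point drives the iterates to the fixed point $(0, 0)$, because every second step halves the distance to the origin.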
Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several approaches proposed to improve error rates. Our work considers the popular high-dimensional regime with random data, where the number of training samples $n$ and the input dimension $d$ grow at a proportional rate $d / n \to γ$, and it studies a family of one-pass DP gradient descent (DP-GD) algorithms satisfying $ρ^2 / 2$ zero-concentrated DP. In this setting, we establish a deterministic equivalent for the DP-GD trajectory in terms of a system of ordinary differential equations. This allows us to analyze the effect of gradient clipping constants that are smaller than the typical norm of the per-sample gradients, a setup shown to improve performance in practice. For well-conditioned data, we show that DP-GD, upon properly choosing clipping constant and learning rate, achieves the non-asymptotic risk of $O(γ + γ^2 / ρ^2)$, and we establish that this rate is minimax optimal. Then, we consider the ill-conditioned case where the data covariance spectrum follows a power-law distribution, and we show that the risk displays a power-like scaling law in $γ$, high
We study convergence rates of the Trotter splitting $e^{A+L} = \lim_{n \to \infty} (e^{L/n} e^{A/n})^n$ in the strong operator topology. In the first part, we use complex interpolation theory to treat generators $L$ and $A$ of contraction semigroups on Banach spaces, with $L$ relatively $A$-bounded. In the second part, we study unitary dynamics on Hilbert spaces and develop a new technique based on the concept of energy constraints. Our results provide a complete picture of the convergence rates for the Trotter splitting for all common types of Schrödinger and Dirac operators, including singular, confining and magnetic vector potentials, as well as molecular many-body Hamiltonians in dimension $d=3$. Using the Brezis-Mironescu inequality, we derive convergence rates for the Schrödinger operator with $V(x)=\pm |x|^{-a}$ potential. In each case, our conditions are fully explicit.
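The Trotter product formula can be checked numerically in a finite-dimensional toy setting. The sketch below is only an illustration under strong simplifying assumptions: it uses two non-commuting $2 \times 2$ real matrices in place of the unbounded generators treated in the paper, and a truncated Taylor series for the matrix exponential. All function names (`expm`, `trotter`, `trotter_error`, and the helpers) are hypothetical.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Product of two square matrices of equal size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(X, c):
    """Multiply every entry of X by the scalar c."""
    return [[c * v for v in row] for row in X]


def expm(X, terms=30):
    """Truncated Taylor series for exp(X); adequate only for small norms."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)  # term now equals X^k / k!
        result = add(result, term)
    return result


def matpow(X, n):
    """X raised to the non-negative integer power n (binary exponentiation)."""
    result = identity(len(X))
    base = X
    while n > 0:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result


def trotter(A, L, n):
    """The n-step Trotter product (e^{L/n} e^{A/n})^n."""
    step = matmul(expm(scale(L, 1.0 / n)), expm(scale(A, 1.0 / n)))
    return matpow(step, n)


def trotter_error(A, L, n):
    """Largest entrywise deviation of the n-step product from e^{A+L}."""
    exact = expm(add(A, L))
    approx = trotter(A, L, n)
    return max(abs(a - b) for ra, rb in zip(approx, exact)
               for a, b in zip(ra, rb))
```

For bounded matrices such as `A = [[0.0, 1.0], [0.0, 0.0]]` and `L = [[0.0, 0.0], [1.0, 0.0]]`, the error is expected to decay roughly like $1/n$; the rates studied in the abstract concern the much harder unbounded case.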
Classical optimisation theory guarantees monotonic objective decrease for gradient descent (GD) when employed in a small step size, or ``stable'', regime. In contrast, gradient descent on neural networks is frequently performed in a large step size regime called the ``edge of stability'', in which the objective decreases non-monotonically with an observed implicit bias towards flat minima. In this paper, we take a step toward quantifying this phenomenon by providing convergence rates for gradient descent with large learning rates in an overparametrised least squares setting. The key insight behind our analysis is that, as a consequence of overparametrisation, the set of global minimisers forms a Riemannian manifold $M$, which enables the decomposition of the GD dynamics into components parallel and orthogonal to $M$. The parallel component corresponds to Riemannian gradient descent on the objective sharpness, while the orthogonal component is a bifurcating dynamical system. This insight allows us to derive convergence rates in three regimes characterised by the learning rate size: (a) the subcritical regime, in which transient instability is overcome in finite time before linear conv
We estimate lepton capture and emission rates, as well as neutrino energy loss rates, for nuclei in the mass range A=65-80. These rates are calculated on a temperature/density grid appropriate for a wide range of astrophysical applications including simulations of late time stellar evolution and x-ray bursts. The basic inputs in our single particle and empirically inspired model are i) experimentally measured level and weak decay information, ii) estimates of matrix elements for allowed experimentally-unmeasured transitions based on the systematics of experimentally observed allowed transitions, and iii) estimates of the centroids of the GT resonances motivated by shell model calculations in the fp shell as well as by (n,p) and (p,n) experiments. Transitions involving Fermi resonances (isobaric analog states) are also included and dominate the rates for many interesting proton rich nuclei for which an experimentally-determined ground state lifetime is unavailable. To compare our results with more detailed shell model based calculations we also calculate weak rates for nuclei in the mass range A=60-65 for which Langanke and Martinez-Pinedo have provided rates. The typical deviation
We present an alternative approach to the forecasting of motor vehicle collision rates. We adopt an oft-used tool in mathematical finance, the Heston Stochastic Volatility model, to forecast the short-term and long-term evolution of motor vehicle collision rates. We incorporate a number of extensions to the Heston model to make it fit for modelling motor vehicle collision rates. We incorporate the temporally-unstable and non-deterministic nature of collision rate fluctuations, and introduce a parameter to account for periods of accelerated safety. We also adjust estimates to account for the seasonality of collision patterns. Using these parameters, we perform a short-term forecast of collision rates and explore a number of plausible scenarios using long-term forecasts. The short-term forecast shows a close affinity with realised rates (over 95% accuracy), and outperforms forecasting models currently used in road safety research (Vasicek, SARIMA, SARIMA-GARCH). The long-term scenarios suggest that modest targets to reduce collision rates (1.83% annually) and targets to reduce the fluctuations of month-to-month collision rates (by half) could have significant benefits for road safety
Nuclear weak interaction rates for fp-shell nuclei in stellar matter and the associated energy losses are calculated using a modified form of proton-neutron quasiparticle RPA model with separable Gamow-Teller forces. The stellar weak rates are calculated over a wide range of densities (10 < ρY_{e} (gcm^{-3}) < 10^{11}) and temperatures (10^{7} < T(K) < 30 x 10^{9}). This is the first ever extensive compilation of weak interaction rates in stellar matter calculated over a wide temperature-density grid and over a larger mass range. The calculated capture and decay rates take into consideration the latest experimental energy levels and ft value compilations. We have calculated stellar weak interaction rates for a total of 619 nuclei in the mass range A = 40 to 100. These also include many important neutron-rich nuclei which play an important role in the evolution process of stellar collapse. This is our second paper in a series where we will be presenting our results on an abbreviated scale of temperature and density for the mass range A = 18 to 100. This paper contains the stellar weak rates in the mass range 40 to 60.
We consider random walks in a uniformly elliptic, balanced, i.i.d. random environment in the integer lattice $Z^d$ for $d\geq 2$ and the corresponding problem of stochastic homogenization of non-divergence form difference operators. We first derive a quantitative law of large numbers for the invariant measure, which is nearly optimal. A mixing property of the field of the invariant measure is then achieved. We next obtain rates of convergence for the homogenization of the Dirichlet problem for non-divergence form operators, which are generically optimal for $d\geq 3$ and nearly optimal when $d=2$. Furthermore, we establish the existence, stationarity and uniqueness properties of the corrector problem for all dimensions $d\ge 2$. Afterwards, we quantify the ergodicity of the environmental process for both the continuous-time and discrete-time random walks, and as a consequence, we get explicit convergence rates for the quenched central limit theorem of the balanced random walk.
Models which postulate lognormal dynamics for interest rates which are compounded according to market conventions, such as forward LIBOR or forward swap rates, can be constructed initially in a discrete tenor framework. Interpolating interest rates between maturities in the discrete tenor structure is equivalent to extending the model to continuous tenor. The present paper sets forth an alternative way of performing this extension; one which preserves the Markovian properties of the discrete tenor models and guarantees the positivity of all interpolated rates.
This manuscript is an updated version of Kalogera et al. (2004) published in ApJ Letters to correct our calculation of the Galactic DNS in-spiral rate. The details of the original erratum submitted to ApJ Letters are given on page 6 of this manuscript. We report on the newly increased event rates due to the recent discovery of the highly relativistic binary pulsar J0737--3039 (Burgay et al. 2003). Using a rigorous statistical method, we present the calculations reported by Burgay et al., which produce an in-spiral rate for Galactic double neutron star (DNS) systems that is higher by a factor of 5-7 compared to estimates made prior to the new discovery. Our method takes into account known pulsar-survey selection effects and biases due to small-number statistics. This rate increase has dramatic implications for gravitational wave detectors. For the initial Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors, the most probable detection rates for DNS in-spirals are one event per 10-630 yr; at 95% confidence, we obtain rates up to one per 3 yr. For the advanced LIGO detectors, the most probable rates are 10-500 events per year. These predictions, for the first time, bri
The theory of the asymptotic manipulation of pure bipartite quantum systems can be considered completely understood: The rates at which bipartite entangled states can be asymptotically transformed into each other are fully determined by a single number each, the respective entanglement entropy. In the multi-partite setting, similar questions of the optimally achievable rates of transforming one pure state into another are notoriously open. This seems particularly unfortunate in the light of the revived interest in such questions due to the perspective of experimentally realizing multi-partite quantum networks. In this work, we report substantial progress by deriving surprisingly simple upper and lower bounds on the rates that can be achieved in asymptotic multi-partite entanglement transformations. These bounds are based on ideas of entanglement combing and state merging. We identify cases where the bounds coincide and hence provide the exact rates. As an example, we bound rates at which resource states for the cryptographic scheme of quantum secret sharing can be distilled from arbitrary pure tripartite quantum states, providing further scope for quantum internet applications beyo
We perform a multifractal analysis of homological growth rates of oriented geodesics on hyperbolic surfaces. Our main result provides a formula for the Hausdorff dimension of level sets of prescribed growth rates in terms of a generalized Poincaré exponent of the Fuchsian group. We employ symbolic dynamics developed by Bowen and Series, ergodic theory and thermodynamic formalism to prove the analyticity of the dimension spectrum.
Sparse additive models are families of $d$-variate functions that have the additive decomposition $f^* = \sum_{j \in S} f^*_j$, where $S$ is an unknown subset of cardinality $s \ll d$. In this paper, we consider the case where each univariate component function $f^*_j$ lies in a reproducing kernel Hilbert space (RKHS), and analyze a method for estimating the unknown function $f^*$ based on kernels combined with $\ell_1$-type convex regularization. Working within a high-dimensional framework that allows both the dimension $d$ and sparsity $s$ to increase with $n$, we derive convergence rates (upper bounds) in the $L^2(\mathbb{P})$ and $L^2(\mathbb{P}_n)$ norms over the class $\MyBigClass$ of sparse additive models with each univariate function $f^*_j$ in the unit ball of a univariate RKHS with bounded kernel function. We complement our upper bounds by deriving minimax lower bounds on the $L^2(\mathbb{P})$ error, thereby showing the optimality of our method. Thus, we obtain optimal minimax rates for many interesting classes of sparse additive models, including polynomials, splines, and Sobolev classes. We also show that if, in contrast to our univariate conditions, the multivariate f
We study a nonlinear kinetic model of mass exchange between interacting grains. The transition rates follow the Arrhenius equation with an activation energy that depends on the grain mass. We show that the activation parameter can be absorbed in the initial conditions for the grain masses, and that the total mass is conserved. We obtain numerical solutions of the coupled, nonlinear, ordinary differential equations of mass exchange for the two-grain system, and we compare them with approximate theoretical solutions in specific neighborhoods of the phase space. Using phase plane methods, we determine that the system exhibits regimes of diffusive and growth-decay (reverse diffusion) kinetics. The equilibrium states are determined by the mass equipartition and separation nullcline curves. If the transfer rates are perturbed by white noise, numerical simulations show that the system still exhibits diffusive and growth-decay regimes, although the noise can reverse the sign of equilibrium mass difference. Finally, we present theoretical analysis and numerical simulations of a system with many interacting grains. Diffusive and growth-decay regimes are established as well, but the approach
We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle. Sampling labels in batches is highly desirable in practice due to the smaller number of interactive rounds with the labeling oracle (often human beings). However, batch active learning typically pays the price of reduced adaptivity, leading to suboptimal results. In this paper we propose a solution which requires a careful trade-off between the informativeness of the queried points and their diversity. We theoretically investigate batch active learning in the practically relevant scenario where the unlabeled pool of data is available beforehand ({\em pool-based} active learning). We analyze a novel stage-wise greedy algorithm and show that, as a function of the label complexity, the excess risk of this algorithm matches the known minimax rates in standard statistical learning settings. Our results also exhibit a mild dependence on the batch size. These are the first theoretical results that employ careful trade-offs between informativeness and diversity to rigorously quantify the statistical performance of batch active learning in the pool-based scenario.
Recurrence plots, which are used to investigate the presence of distinctive recurrent behaviours in natural processes, can also be applied to the analysis of economic data and, in particular, to the characterization of currency exchange rates. In this paper, we show that these plots are able to characterize the periods of oscillation and random walk of currencies and, by means of texture transitions, to highlight their response to news and events. The examples of recurrence plots given here are obtained from time series of Euro exchange rates.
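For readers unfamiliar with recurrence plots, the following minimal sketch shows one common way to build the underlying binary recurrence matrix from a scalar time series: entry $(i, j)$ is set when the values at times $i$ and $j$ lie within a threshold of each other. The threshold `eps`, the absence of any time-delay embedding, and the function names are assumptions made for illustration; the abstract does not specify the construction used by the authors.

```python
def recurrence_matrix(series, eps):
    """Binary matrix with R[i][j] = 1 when |series[i] - series[j]| <= eps."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]


def render(matrix):
    """Text rendering of a recurrence matrix: '#' for 1 and '.' for 0."""
    return "\n".join("".join("#" if v else "." for v in row) for row in matrix)
```

A periodic series produces diagonal stripes in the rendered plot, whereas a random walk tends to produce irregular blocks; changes between such textures are the kind of transitions the abstract refers to.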
This paper characterizes the second-order coding rates for lossy source coding with side information available at both the encoder and the decoder. We first provide non-asymptotic bounds for this problem and then specialize the non-asymptotic bounds for three different scenarios: discrete memoryless sources, Gaussian sources, and Markov sources. We obtain the second-order coding rates for these settings. It is interesting to observe that the second-order coding rate for Gaussian source coding with Gaussian side information available at both the encoder and the decoder is the same as that for Gaussian source coding without side information. Furthermore, regardless of the variance of the side information, the dispersion is $1/2$ nats squared per source symbol.
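To make the role of the dispersion concrete, the sketch below evaluates the usual normal-approximation form of a second-order rate, $R(D) + \sqrt{V/n}\, Q^{-1}(\varepsilon)$, for a Gaussian source under squared-error distortion with $V = 1/2$ nats squared as stated above. This is a hedged illustration rather than the paper's bounds: lower-order terms are ignored, the function names are hypothetical, and for the side-information setting `variance` would be read as the conditional variance of the source given the side information, which is an assumption here.

```python
import math
from statistics import NormalDist


def gaussian_rate_distortion(variance, distortion):
    """R(D) = 0.5 * ln(variance / D) in nats per symbol, for 0 < D < variance."""
    if not 0.0 < distortion < variance:
        raise ValueError("requires 0 < distortion < variance")
    return 0.5 * math.log(variance / distortion)


def normal_approx_rate(variance, distortion, n, eps, dispersion=0.5):
    """Normal approximation R(D) + sqrt(V / n) * Q^{-1}(eps), in nats per symbol.

    n is the blocklength and eps the allowed excess-distortion probability;
    lower-order terms in n are deliberately omitted from this sketch.
    """
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # inverse of the Gaussian tail Q
    return gaussian_rate_distortion(variance, distortion) + math.sqrt(dispersion / n) * q_inv
```

The second term shrinks like $1/\sqrt{n}$, so the approximate rate approaches $R(D)$ from above as the blocklength grows whenever $\varepsilon < 1/2$.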