20 results found
We prove that dichotomies given by growth rates that are either faster or slower than exponential either do not occur or are inconsequential in the setting of skew-products with compact base. A similar conclusion is obtained for the nonuniform exponential behavior. To achieve this, we study families of translated linear nonautonomous differential equations for which we prove propagation results. We also study translations of growth rates under a comparison criterion.
Post-training compression reduces LLM parameter counts but often produces irregular tensor dimensions that degrade GPU performance -- a phenomenon we call \emph{dimensional misalignment}. We present a full-stack analysis tracing root causes at three levels: framework, library, and hardware. The key insight is that model inference becomes slower because the resulting dimensions are unfriendly to the GPU execution stack. For example, Llama-3-8B compressed with activation-aware singular value decomposition (ASVD) has 15\% fewer parameters yet runs no faster than the uncompressed baseline, because 95\% of its dimensions are misaligned. We propose \textbf{GAC} (GPU-Aligned Compression), a new compression paradigm that wraps any dimension-reducing compressor and re-selects hardware-aligned dimensions via multi-choice knapsack optimization under the same parameter budget. We evaluate GAC on Llama-3-8B with ASVD and LLM-Pruner, achieving 100\% alignment and recovering up to 1.5$\times$ speedup while preserving model quality.
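The multi-choice knapsack step mentioned in this abstract can be illustrated with a minimal sketch. It is not the GAC implementation: the function name `multi_choice_knapsack`, the candidate costs (multiples of 8, standing in for aligned dimensions), and the integer quality scores are all hypothetical, chosen only to show how one option per layer is picked under a shared budget.

```python
from typing import List, Optional, Tuple

Option = Tuple[int, int]  # (cost, value); both are hypothetical units


def multi_choice_knapsack(
    groups: List[List[Option]], budget: int
) -> Optional[Tuple[int, List[int]]]:
    """Pick exactly one option per group, maximizing total value
    subject to total cost <= budget. Returns (value, chosen indices),
    or None if no selection fits the budget."""
    # best maps an exact total cost to the best (value, choices) reaching it.
    best = {0: (0, [])}
    for options in groups:
        nxt = {}
        for cost_so_far, (value_so_far, choices) in best.items():
            for idx, (cost, value) in enumerate(options):
                total = cost_so_far + cost
                if total > budget:
                    continue
                cand = (value_so_far + value, choices + [idx])
                if total not in nxt or cand[0] > nxt[total][0]:
                    nxt[total] = cand
        best = nxt
    if not best:
        return None
    return max(best.values(), key=lambda t: t[0])


# Two hypothetical layers, each with three aligned candidate sizes.
layers = [
    [(8, 10), (16, 18), (24, 22)],
    [(8, 5), (16, 15), (24, 19)],
]
print(multi_choice_knapsack(layers, budget=32))  # (33, [1, 1])
```

The dynamic program runs in time proportional to the number of groups times the budget times the options per group, which is why restricting candidates to a few aligned sizes keeps the search cheap.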
We describe the design, construction, and characterization of a permanent magnet based, transverse-field Zeeman slower for lithium atoms. We use off-the-shelf compact permanent bar magnets in the Halbach configuration to create a uniform magnetic field in the transverse direction. We develop a general approach for a mechanical structure that supports the spatial distribution of magnets using 3D printing technology. The approach allows for flexible assembly and dismantling of the magnetic field on the target vacuum system. Finally, we verify that the Zeeman slower supports a high flux of slow atoms in the region of the magneto-optical trap.
We present a compact design of a dual-beam Zeeman slower optimized for efficient cold-atom production. Traditional single-beam configurations face challenges from substantial residual atomic flux impacting downstream optical windows, resulting in increased system size, atomic deposition contamination, and a reduced operational lifetime. Our approach employs two oblique laser beams and a capillary-array collimation system to address these challenges while maintaining efficient deceleration. For rubidium ($^{87}$Rb), simulations demonstrate a significant increase in the fraction of atoms captured by a two-dimensional magneto-optical trap (2D-MOT) and a near elimination of the atom-induced contamination probability at optical windows, all within a compact Zeeman slower length of 44 cm. Experimental validation with Rb and Yb demonstrates highly efficient atomic loading within the same compact design. This advancement represents a substantial improvement for high-flux cold atom applications, providing reliable performance for high-precision metrology, quantum computation and simulation.
Critical-data-size accounts of grokking suggest a natural post-threshold intuition: once training data is sufficient to identify the underlying rule, additional data should accelerate validation convergence. We show that this intuition can fail in a controlled structured-output task. In Needleman--Wunsch (NW) matrix generation, small Transformers reach high validation exact-match accuracy fastest at an intermediate dataset size, not at the largest one. Past this dataset-size sweet spot, generalization remains achievable but requires more gradient updates. Conversely, in the regime where partial validation competence first appears, larger datasets can require fewer updates to reach high training accuracy, suggesting that emerging rule structure can accelerate fitting beyond example-wise memorization. A multiplication baseline does not show the same post-threshold slowdown. These results separate the critical data size for the onset of generalization from the dataset size that optimizes update-based convergence, and identify structured-output tasks where learning the rule and completing exact-fitting can diverge.
Reasoning models have attracted increasing attention for their ability to tackle complex tasks, embodying the System II (slow thinking) paradigm in contrast to System I (fast, intuitive responses). Yet a key question remains: Does slower reasoning necessarily lead to more truthful answers? Our findings suggest otherwise. We conduct the first systematic study of the inverse scaling law in slow-thinking paradigms for multimodal reasoning. We find that when confronted with incomplete or misleading visual inputs, slow-thinking models are more prone to fabricating plausible yet false details to justify untruthful reasoning. To analyze this behavior, we construct a 5,000-sample hierarchical prompt dataset annotated by 50 human participants. The prompts progressively increase in complexity, revealing a consistent pattern: slower reasoning models tend to follow depth-first search (DFS) thinking, persistently exploring flawed premises, while faster chat models favor breadth-first search (BFS) inference, showing greater caution under uncertainty. These findings reveal a critical vulnerability of reasoning models: while effective in structured domains such as math, their DFS-style reasoning b
We report the construction and characterization of an experimental setup for producing a cold gas of $^{40}$Ca atoms and exciting them to high Rydberg states with a resonant three-photon-excitation scheme. The apparatus comprises four stages, each designed in-house. An oven heated to $\sim 500^\circ$C generates an atomic beam that is collimated by a capillary stack. The beam is sent into a passive, permanent-magnet-based Zeeman slower that reduces the atomic velocity to $30$ m/s. The slow atoms are captured in a magneto-optical trap (MOT) and cooled to $1.0(3)$ mK with a trapping time of $16(2)$ ms. Ground-state atoms in the cold gas are excited to high Rydberg states via resonant excitation through the intermediate $4s4p\, ^1P_1$ and $4s4d\, ^1D_2$ states. The MOT is operated at the center of an electrode stack, which serves to apply continuous and pulsed electric fields and field-ionize the Rydberg atoms for detection. We benchmark our MOT against previous implementations and find its performance consistent with state-of-the-art results in terms of temperature and trapping lifetime. Finally, we demonstrate Rydberg spectroscopy of calcium, confirming the system's suitability for ult
Video generators are increasingly evaluated as potential world models, which requires them to encode and understand physical laws. We investigate their representation of a fundamental law: gravity. Out-of-the-box video generators consistently generate objects falling at an effectively slower acceleration. However, these physical tests are often confounded by ambiguous metric scale. We first investigate if observed physical errors are artifacts of these ambiguities (e.g., incorrect frame rate assumptions). We find that even temporal rescaling cannot correct the high-variance gravity artifacts. To rigorously isolate the underlying physical representation from these confounds, we introduce a unit-free, two-object protocol that tests the timing ratio $t_1^2/t_2^2 = h_1/h_2$, a relationship independent of $g$, focal length, and scale. This relative test reveals violations of Galileo's equivalence principle. We then demonstrate that this physical gap can be partially mitigated with targeted specialization. A lightweight low-rank adaptor fine-tuned on only 100 single-ball clips raises $g_{\mathrm{eff}}$ from $1.81\,\mathrm{m/s^2}$ to $6.43\,\mathrm{m/s^2}$ (reaching $65\%$ of terrestrial
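The unit-free timing ratio quoted in this abstract follows from the free-fall relation $t = \sqrt{2h/g}$ for an object released from rest. The sketch below only checks that relation numerically; the helper names `fall_time` and `squared_time_ratio` are illustrative and are not taken from the paper.

```python
import math


def fall_time(height_m: float, g: float) -> float:
    """Time to fall height_m from rest under constant acceleration g."""
    return math.sqrt(2.0 * height_m / g)


def squared_time_ratio(h1: float, h2: float, g: float) -> float:
    """t1^2 / t2^2 for two drops under the same g; equals h1 / h2."""
    return fall_time(h1, g) ** 2 / fall_time(h2, g) ** 2


# The ratio is the same for terrestrial g and for the reduced g_eff
# values quoted in the abstract, because g cancels.
for g in (9.81, 6.43, 1.81):
    print(g, round(squared_time_ratio(2.0, 1.0, g), 6))
```

Because $g$ cancels, a generator that gets the ratio wrong is violating the relationship itself, not merely mis-scaling time or distance.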
In shift-symmetric Einstein-scalar-Gauss-Bonnet gravity, stationary black holes have a non-vanishing scalar charge. During the inspiral, the phase evolution is modified by several effects, primarily an additional scalar dipole radiation, which enters at -1PN order. This effect accelerates the inspiral when compared to general relativity, when including corrections up to 2PN. Using fully non-linear numerical simulations of quasi-circular, comparable mass binaries, we find that in the late stages the orbital dynamics are altered so that the overall effect is instead a decelerated merger phase for the modified gravity case. We attribute this to a change in the conservative dynamics, and show that at the late inspiral stage more energy must be emitted in scalar-Gauss-Bonnet gravity to induce a given change in frequency. In longer signals, this should lead to a distinctive switch between a faster and slower frequency evolution relative to general relativity as the binary approaches merger. This work suggests we should revisit existing constraints on the theory that are obtained assuming PN approximations apply up to merger, or based on order by order approximations that neglect backreac
Higher-order ODE solvers have become a standard tool for accelerating diffusion probabilistic model (DPM) sampling, motivating the widespread view that first-order methods are inherently slower and that increasing discretization order is the primary path to faster generation. This paper challenges this belief and revisits acceleration from a complementary angle: beyond solver order, the placement of DPM evaluations along the reverse-time dynamics can substantially affect sampling accuracy in the low-neural function evaluation (NFE) regime. We propose a novel training-free, first-order sampler whose leading discretization error has the opposite sign to that of DDIM. Algorithmically, the method approximates the forward-value evaluation via a cheap one-step lookahead predictor. We provide theoretical guarantees showing that the resulting sampler provably approximates the ideal forward-value trajectory while retaining first-order convergence. Empirically, across standard image generation benchmarks (CIFAR-10, ImageNet, FFHQ, and LSUN), the proposed sampler consistently improves sample quality under the same NFE budget and can be competitive with, and sometimes outperform, state-of-the-
High-speed solar wind streams (HSSs) interact with the preceding ambient solar wind to form Stream Interaction Regions (SIRs), which are a primary source of recurrent geomagnetic storms. However, HSSs may also encounter and subsequently interact with Interplanetary Coronal Mass Ejections (ICMEs). In particular, the impact of the interaction between slower ICMEs and faster HSSs represents an unexplored area that requires further in-depth investigation. This specific interaction can give rise to unexpected geomagnetic storm signatures, diverging from the conventional expectations of individual SIR events sharing similar HSS properties. Our study presents a comprehensive analysis of solar wind data spanning from 1996 to 2020, capturing 23 instances where such encounters led to geomagnetic storms ($SymH$ $< -30$ nT). We determined that interaction events between preceding slower ICMEs and faster HSSs possess the potential to induce substantial storm activity, statistically nearly doubling the geoeffective impact in comparison to SIR storm events. The increase in the amplitude of the $SymH$ index appears to result from heightened dynamic pressure, often coupled with the concurrent a
This note demonstrates that, for all compact convex sets, high-precision linear minimization can be performed via a single evaluation of the projection and a scalar-vector multiplication. In consequence, if $\varepsilon$-approximate linear minimization takes at least $L(\varepsilon)$ real vector-arithmetic operations and projection requires $P$ operations, then $\mathcal{O}(P)\geq \mathcal{O}(L(\varepsilon))$ is guaranteed. This concept is expounded with examples, an explicit error bound, and an exact linear minimization result for polyhedral sets.
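One way to read the claim is that projecting a large multiple of the negated cost vector onto the set returns a near-minimizer of the linear objective. The sketch below illustrates that reading on the unit box $[0,1]^n$, where projection is coordinate-wise clipping; the scale `t` and the function names are assumptions for illustration, not the note's notation.

```python
from typing import Callable, List

Vector = List[float]


def project_box(y: Vector, lo: float = 0.0, hi: float = 1.0) -> Vector:
    """Euclidean projection onto the box [lo, hi]^n (coordinate-wise clip)."""
    return [min(max(v, lo), hi) for v in y]


def lmo_via_projection(
    c: Vector, project: Callable[[Vector], Vector], t: float = 1e6
) -> Vector:
    """Approximate argmin over x in C of <c, x> with one projection:
    scale -c by a large t (scalar-vector multiplication), then project."""
    return project([-t * ci for ci in c])


c = [2.0, -1.0, 0.5]
print(lmo_via_projection(c, project_box))  # [0.0, 1.0, 0.0]
```

For the box, the result is exact once `t` is large enough that every nonzero coordinate of `-t * c` leaves the interval, which matches the abstract's remark about exact results for polyhedral sets.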
Low-Rank Adaptation (LoRA) is one of the most widely used techniques for fine-tuning large language models (LLMs). By introducing a small number of trainable low-rank weight matrices, LoRA substantially reduces the number of parameters that need to be updated, offering significant advantages in memory consumption and computational efficiency compared to full fine-tuning. However, we observed that LoRA does not consistently provide speed improvements across all model architectures and training setups. Motivated by this inconsistency, we conduct a comprehensive analysis of LoRA's performance and investigate the underlying factors limiting its speedup. Based on our findings, we propose several methods for more efficient fine-tuning of LLMs. We empirically evaluate these methods and compare them to LoRA, demonstrating that our approach achieves comparable or superior performance while delivering more consistent training speed improvements. Our work offers valuable insights and practical guidelines for practitioners seeking to optimize LLM fine-tuning under resource constraints.
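The parameter saving described here comes from replacing a full update of a $d_{\text{out}} \times d_{\text{in}}$ weight with two rank-$r$ factors. A minimal count, using illustrative layer sizes that are not taken from the paper:

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer and rank
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(full, lora, lora / full)  # 16777216 65536 0.00390625
```

The count shows why memory drops sharply, but it says nothing about wall-clock time: the frozen base weight is still multiplied in the forward and backward passes, which is consistent with the abstract's observation that speedups are not guaranteed.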
Electron-hole pairs in semiconductors are essential for solar cells and fast electronic circuitry, but the competition between carrier transport and relaxation into heat limits the efficiency and speed. Here we use ultrafast electron diffraction with terahertz pulse compression to measure the electron-phonon decay rate in single-crystal silicon as a function of laser excitation strength. We find that the excited electrons relax more slowly into phonons for higher carrier densities. The electron-phonon scattering time changes in a nonlinear way from 400 fs at ~$2 \times 10^{20} \text{ cm}^{-3}$ to 1.2 ps at ~$4 \times 10^{20} \text{ cm}^{-3}$. These results indicate that a hot electron gas quenches the scattering into phonons in a temperature-dependent way. Ultrafast electronic circuitry of silicon therefore should work faster and provide higher bandwidths at lower carrier densities.
A transverse Zeeman slower composed of an array of compact discrete neodymium magnets is considered. A simple and precise model of such a slower based on magnetic dipoles is developed. The theory of a general Zeeman slower is modified to include spatial nonuniformity of the slowing laser beam intensity due to its convergence and absorption by slowed atoms. The slower needs no high currents or water cooling and the spatial distribution of its magnetic field can be adjusted. In addition the slower provides a possibility to cool the slowed atoms transversally along the whole length of the slower. Such a slower would be ideal for transportable optical atomic clocks and their future applications in space physics.
Galactic bars can form via the internal bar instability or external tidal perturbations by other galaxies. We systematically compare the properties of bars formed through the two mechanisms with a series of controlled $N$-body simulations that form bars through internal or external mechanisms. We create three disk galaxy models with different dynamical ``hotness'' and evolve them in isolation and under flyby interactions. In the cold and warm disk models, where bars can form spontaneously in isolation, tidally-induced bars are promoted to a more ``advanced'' evolutionary stage. However, these bars have similar pattern speeds to those formed spontaneously within the same disk. Bars formed from both mechanisms have similar distributions in pattern speed--bar strength ($\Omega_p-A_2$) space and exhibit comparable ratios of co-rotation radius to bar length (${\cal R}={R_{\mathrm {CR}}}/{R_{\mathrm {bar}}}$). Dynamical analyses suggest that the inner stellar disk loses the same amount of angular momentum, irrespective of the presence or intensity of the perturbation, which possibly explains the resemblance between tidally and spontaneously formed bars. In the hot disk model, which avoids the
We compare the $(1,\lambda)$-EA and the $(1 + \lambda)$-EA on the recently introduced benchmark DisOM, which is the OneMax function with randomly planted local optima. Previous work showed that if all local optima have the same relative height, then the plus strategy never loses more than a factor $O(n\log n)$ compared to the comma strategy. Here we show that even small random fluctuations in the heights of the local optima have a devastating effect for the plus strategy and lead to super-polynomial runtimes. On the other hand, due to their ability to escape local optima, comma strategies are unaffected by the height of the local optima and remain efficient. Our results hold for a broad class of possible distortions and show that the plus strategy, but not the comma strategy, is generally deceived by sparse unstructured fluctuations of a smooth landscape.
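The difference between the two selection rules can be seen in a minimal sketch on plain OneMax. This is not the DisOM benchmark (no local optima are planted), and the function `run_ea`, the mutation rate $1/n$, and the tie-breaking rule are assumptions for illustration.

```python
import random
from typing import List, Tuple


def onemax(x: List[int]) -> int:
    """Number of one-bits in the bit string."""
    return sum(x)


def run_ea(n: int, lam: int, plus: bool, max_gens: int, seed: int = 0) -> Tuple[int, int]:
    """Run a (1+lam)-EA if plus is True, else a (1,lam)-EA, on OneMax.
    Returns (fitness of final parent, generations used)."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    for gen in range(max_gens):
        if onemax(parent) == n:
            return n, gen
        # Standard bit mutation: flip each bit independently with prob. 1/n.
        offspring = [
            [b ^ (rng.random() < 1.0 / n) for b in parent] for _ in range(lam)
        ]
        best = max(offspring, key=onemax)
        if plus:
            # Plus selection: the parent survives unless an offspring is at least as good.
            if onemax(best) >= onemax(parent):
                parent = best
        else:
            # Comma selection: the parent is always replaced, even by a worse offspring.
            parent = best
    return onemax(parent), max_gens


print(run_ea(n=20, lam=8, plus=True, max_gens=2000, seed=1))
print(run_ea(n=20, lam=8, plus=False, max_gens=2000, seed=1))
```

The comma branch is what allows a step downhill: a parent sitting on a local optimum is discarded regardless of its fitness, whereas the plus branch can only leave by finding an equal or better point.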
We describe the design, construction and operation of a versatile dual-species Zeeman slower for both Cs and Yb, which is easily adaptable for use with other alkali metals and alkaline earths. With the aid of analytic models and numerical simulation of decelerator action, we highlight several real-world problems affecting the performance of a slower and discuss effective solutions. To capture Yb into a magneto-optical trap (MOT), we use the broad $^1S_0$ to $^1P_1$ transition at 399 nm for the slower and the narrow $^1S_0$ to $^3P_1$ intercombination line at 556 nm for the MOT. The Cs MOT and slower both use the D2 line ($6^2S_{1/2}$ to $6^2P_{3/2}$) at 852 nm. We demonstrate that within a few seconds the Zeeman slower loads more than $10^9$ Yb atoms and $10^8$ Cs atoms into their respective MOTs. These are ideal starting numbers for further experiments on ultracold mixtures and molecules.
Longitudinal Zeeman slowers composed of arrays of compact discrete neodymium magnets are proposed. The general properties of these slowers, as well as specific designs of short spin-flip Zeeman slowers for Sr and Rb atoms are described. The advantages of these slowers are their simplicity, low cost and absence of consumed electrical power and corresponding water cooling. The smoothness of the magnetic field together with ease of adjustability makes it possible to operate these slowers near the theoretical limits of deceleration, making them more compact and efficient.
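For orientation, the theoretical deceleration limit mentioned here is usually taken as the maximum radiation-pressure deceleration $a_{\max} = \hbar k \Gamma / (2m)$, which gives a minimum slower length $L = v_0^2 / (2 a_{\max})$ for capture velocity $v_0$. The sketch below evaluates this for the $^{87}$Rb D2 line; the wavelength, linewidth, mass, and the 300 m/s capture velocity are standard or illustrative values supplied here, not figures from the abstract.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, J s
AMU = 1.66053906660e-27        # atomic mass unit, kg

# Assumed 87Rb D2-line parameters (not taken from the abstract).
WAVELENGTH_M = 780.241e-9      # transition wavelength
GAMMA = 2 * math.pi * 6.065e6  # natural linewidth, rad/s
MASS_KG = 86.909 * AMU


def max_deceleration(wavelength_m: float, gamma: float, mass_kg: float) -> float:
    """a_max = hbar * k * Gamma / (2 m), the saturated two-level limit."""
    k = 2 * math.pi / wavelength_m
    return HBAR * k * gamma / (2 * mass_kg)


def min_slower_length(v0: float, a: float) -> float:
    """Shortest distance to stop atoms of speed v0 at constant deceleration a."""
    return v0 ** 2 / (2 * a)


a_max = max_deceleration(WAVELENGTH_M, GAMMA, MASS_KG)
print(f"a_max ~ {a_max:.3e} m/s^2")
print(f"L_min ~ {min_slower_length(300.0, a_max):.2f} m for v0 = 300 m/s")
```

Real slowers are designed for a fraction of $a_{\max}$ to tolerate field and intensity imperfections, so a smoother, adjustable field lets the design fraction move closer to one and the slower become shorter.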
Context. Mounting evidence has shown that EUV waves consist of a fast-mode magnetohydrodynamic (MHD) wave (or shock wave) followed by a slower nonwave component, as predicted by the magnetic fieldline stretching model. However, not all observed events display both wavefronts, particularly the slower nonwave component. Even when the slower nonwave component is present, the intensity distribution often exhibits strong anisotropy. Aims. This study is intended to unveil the formation condition of the slower nonwave component of EUV waves. Methods. We analyzed the EUV wave event on 8 March 2019, and compared the EUV wave intensity map with the extrapolated coronal potential magnetic field. Data-inspired MHD simulation was also performed. Results. Two types of EUV waves are identified, and the slower nonwave component exhibits strong anisotropy. By reconstructing 3D coronal magnetic fields, we found that the slower nonwave component of EUV waves is more pronounced in the regions where magnetic fields are backward-inclined, which is further reproduced by our MHD simulations. Conclusions. The anisotropy of the slower nonwave component of EUV waves is strongly related to the magnet