Found 20 results
Naomi Gleit has weathered many controversies at Meta, but remains in what she tells the BBC is her "dream job"
Context retrieval systems for LLM inference face a critical challenge: high retrieval latency creates a fundamental tension between waiting for complete context (poor time-to-first-token) and proceeding without it (reduced quality). Streaming context incrementally--overlapping retrieval with inference--can mitigate this latency, but doing so with concurrent requests introduces new challenges: requests contend for GPU compute and memory, and scheduling must adapt to dynamic context arrivals. We present Stream2LLM, a streaming-aware LLM serving system for concurrent prefill-decode disaggregated deployments. Stream2LLM introduces adaptive scheduling and preemption for two distinct retrieval patterns: append-mode (progressive context accumulation) and update-mode (iterative refinement with cache invalidation). It decouples scheduling decisions from resource acquisition, enabling flexible preemption strategies guided by hardware-specific cost models, and uses longest common prefix matching to minimize redundant computation when input changes dynamically. To evaluate Stream2LLM, we collect two large-scale, real-world streaming workloads based on web crawling and approximate nearest neigh
We study online task allocation for multi-robot, multi-queue systems with asymmetric stochastic arrivals and switching delays. We formulate the problem in discrete time: each location can host at most one robot per slot, servicing a task consumes one slot, switching between locations incurs a one-slot travel delay, and arrivals at locations are independent Bernoulli processes with heterogeneous rates. Building on our previous structural result that optimal policies are of exhaustive type, we formulate a discounted-cost Markov decision process and develop an exhaustive-assignment actor-critic policy architecture that enforces exhaustive service by construction and learns only the next-queue allocation for idle robots. Unlike the exhaustive-serve-longest (ESL) queue rule, whose optimality is known only under symmetry, the proposed policy adapts to asymmetry in arrival rates. Across different server-location ratios, loads, and asymmetric arrival profiles, the proposed policy consistently achieves lower discounted holding cost and smaller mean queue length than the ESL baseline, while remaining near-optimal on instances where an optimal benchmark is available. These results show that s
We study the fair capacitated vehicle routing problem, in which a fleet of vehicles must serve a set of customers such that the difference between the longest and shortest route, the range, is minimized. A key challenge is that the range objective is non-monotonic: it can be reduced by artificially lengthening routes, leading to solutions that violate TSP-optimality of individual routes. Existing exact methods struggle to handle this efficiently. We propose a branch-price-and-cut framework that enforces TSP-optimality through TSP-optimality cuts, which forbid TSP-dominated arc sequences. We strengthen the cuts through a dedicated lifting procedure. Computational experiments on benchmark instances with up to 25 customers show the method solves nearly all instances to optimality, achieving an average gap of 0.27% on the hardest configurations.
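The non-monotonicity of the range objective described in the abstract above can be illustrated with a small numeric sketch. The route lengths below are hypothetical values chosen for illustration and do not come from the paper; `route_range` is an illustrative helper name.

```python
def route_range(lengths):
    """Range objective: longest route length minus shortest route length."""
    return max(lengths) - min(lengths)


# Hypothetical fleet of three routes, each assumed to be TSP-optimal
# for the customers it serves.
tsp_optimal = [10.0, 7.0, 4.0]

# Same customer assignment, but the shortest route is padded with a
# detour (4.0 -> 9.0), so it is no longer TSP-optimal.
padded = [10.0, 7.0, 9.0]

print(route_range(tsp_optimal))  # range with TSP-optimal routes
print(route_range(padded))       # smaller range, yet more total distance
print(sum(tsp_optimal), sum(padded))
```

The padded solution scores better on range while driving more total distance, which is the kind of solution that TSP-optimality cuts are meant to exclude.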
We study online task allocation for multi-robot, multi-queue systems with stochastic arrivals and switching delays. Time is slotted; each location can host at most one robot per slot; service consumes one slot; switching between locations incurs a one-slot travel delay; and arrivals are independent Bernoulli processes. We formulate a discounted-cost Markov decision process and propose Exhaustive-Serve-Longest (ESL), a simple real-time policy that serves exhaustively when the current location is nonempty and, when idle, switches to a longest unoccupied nonempty location, and we prove the optimality of this policy. As baselines, we tune a fixed-dwell cyclic policy via a discrete-time delay expression and implement a first-come-first-serve policy. Across server-to-location ratios and loads, ESL consistently yields lower discounted holding cost and smaller mean queue lengths, with action-time fractions showing more serving and restrained switching. Its simplicity and robustness make ESL a practical default for real-time multi-robot scheduling systems.
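The Exhaustive-Serve-Longest rule described in the abstract above can be sketched as a per-slot decision function. This is a minimal illustration, not the authors' implementation: the function name `esl_actions`, the tie-breaking by lowest location index, and the `claimed` set that stops two idle robots from targeting the same location in one slot are all assumptions added here. Robots are assumed to start at distinct locations.

```python
from typing import List, Tuple

Action = Tuple[str, int]


def esl_actions(queues: List[int], robot_locs: List[int]) -> List[Action]:
    """Choose one action per robot for the current time slot.

    queues[i] is the number of waiting tasks at location i.
    robot_locs[r] is the location of robot r (assumed distinct).
    Returns ("serve", loc), ("switch", target), or ("idle", loc).
    """
    occupied = set(robot_locs)
    claimed = set()  # assumption: targets already chosen this slot
    actions: List[Action] = []
    for loc in robot_locs:
        if queues[loc] > 0:
            # Exhaustive service: keep serving while the queue is nonempty.
            actions.append(("serve", loc))
            continue
        # Idle robot: consider nonempty locations with no robot present.
        candidates = [
            q for q in range(len(queues))
            if queues[q] > 0 and q not in occupied and q not in claimed
        ]
        if not candidates:
            actions.append(("idle", loc))
            continue
        # Longest queue first; ties go to the lowest index (assumption).
        target = max(candidates, key=lambda q: (queues[q], -q))
        claimed.add(target)
        actions.append(("switch", target))
    return actions


print(esl_actions([3, 0, 5, 2], [0, 1]))
```

In a full simulation, a "switch" action would make the robot unavailable for one slot to model the travel delay, and Bernoulli arrivals would be added to `queues` after each slot.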
Black hole quasinormal modes (QNMs) can exhibit resonant excitations associated with avoided crossings in their complex frequency spectrum. Such resonance phenomena can serve as novel signatures for probing new physics, where additional degrees of freedom are commonly introduced. Motivated by this possibility, we investigate QNMs in systems where multiple degrees of freedom are coupled with each other, and introduce a definition of excitation factors suitable for such systems. To demonstrate our formulation, we apply it to a black hole in the Einstein-Maxwell-axion theory, where we find that avoided crossings can appear even between longest-lived modes originating from the fundamental modes of different degrees of freedom, in contrast to the Kerr case in General Relativity. We show that the excitation factors are indeed amplified as a manifestation of resonance at parameter values corresponding to the avoided crossings.
The supermarket model is a system of $n$ queues each with serving rates $1$ and arrival rates $λ$ per vertex, where tasks will move on arrival to the shortest adjacent queue. We consider the supermarket model in the small $λ$ regime on a large dynamic configuration hypergraph with stubs swapping their hyperedge membership at rate $κ$. This interpolates previous investigations of the supermarket model on static graphs of bounded degree (where an exponential tail produces a logarithmic queue) and with independently drawn neighbourhoods (where the ``power of two choices'' phenomenon is a doubly logarithmic queue). We find with high probability, over any polynomial timeframe, the order of the longest queue is \[ \log\log n + \frac{\log n}{\log κ} \wedge \log n \] so in the sense of controlling the order of maximal queue length, we identify which speed orders are sufficiently fast that there is no gain in moving the environment faster. Additional results describe mixing of the system and propagation of chaos over time.
The efficient deployment of large language models (LLMs) in online settings requires optimizing inference performance under stringent latency constraints, particularly the time-to-first-token (TTFT) and time-per-output-token (TPOT). This paper focuses on the query scheduling problem for LLM inference with prefix reuse, a technique that leverages shared prefixes across queries to reduce computational overhead. Our work reveals previously unknown limitations of the existing first-come-first-serve (FCFS) and longest-prefix-match (LPM) scheduling strategies with respect to satisfying latency constraints. We present a formal theoretical framework for LLM query scheduling under RadixAttention, a prefix reuse mechanism that stores and reuses intermediate representations in a radix tree structure. Our analysis establishes the NP-hardness of the scheduling problem with prefix reuse under TTFT constraints and proposes a novel scheduling algorithm, $k$-LPM, which generalizes existing methods by balancing prefix reuse and fairness in query processing. Theoretical guarantees demonstrate that $k$-LPM achieves improved TTFT performance under realistic traffic patterns captured by a data generativ
JADES-GS-z6-0, a high-redshift galaxy ($z \sim 6.7$) recently observed as part of the James Webb Space Telescope (JWST) Advanced Deep Extragalactic Survey (JADES), exhibits a distinct bump in its rest-frame ultraviolet (UV) spectrum indicative of a large quantity of hydrocarbon grains, a sign of rapid metal and dust enrichment in its interstellar medium (ISM). This galaxy serves as an ideal case for examining rapid dust formation processes in the early universe. We investigated diverse dust production channels from a possible maximal formation redshift of $z_{\rm form} \approx 17$, enabling dust contributions from asymptotic giant branch (AGB) stars over the longest possible timescale. Our model simultaneously reproduces key spectral features of JADES-GS-z6-0 such as its Balmer decrement, UV slope, and UV bump. The match is obtained by adopting a star-formation history in which a burst at $\sim 600$~Myr accounts for approximately 30\% of the galaxy's final stellar mass. Our findings indicate two pathways for the formation of hydrocarbon grains, such as polycyclic aromatic hydrocarbons (PAHs): (1) efficient dust accretion within the ISM, necessitating a low depletion of metals into
Large language model (LLM) inference workload dominates a wide variety of modern AI applications, ranging from multi-turn conversation to document analysis. Balancing fairness and efficiency is critical for managing diverse client workloads with varying prefix patterns. Unfortunately, existing fair scheduling algorithms for LLM serving, such as Virtual Token Counter (VTC), fail to take prefix locality into consideration and thus suffer from poor performance. On the other hand, locality-aware scheduling algorithms in existing LLM serving frameworks tend to maximize the prefix cache hit rate without considering fair sharing among clients. This paper introduces the first locality-aware fair scheduling algorithm, Deficit Longest Prefix Match (DLPM), which can maintain a high degree of prefix locality with a fairness guarantee. We also introduce a novel algorithm, Double Deficit LPM (D$^2$LPM), extending DLPM for the distributed setup that can find a balance point among fairness, locality, and load-balancing. Our extensive evaluation demonstrates the superior performance of DLPM and D$^2$LPM in ensuring fairness while maintaining high throughput (up to 2.87$\times$ higher than VTC) and
We propose and demonstrate that peripheral neutron-$α$ scattering at low energies can serve as a sensitive and clean probe of the long-range three-nucleon forces. To this aim, we perform ab initio quantum Monte Carlo calculations using two- and three-nucleon interactions derived in chiral effective field theory up to third expansion order. We show that the longest-range three-nucleon force stemming from the two-pion exchange plays a crucial role in the proper description of the neutron-$α$ $D$-wave phase shifts. Our Letter reveals the predictive power of chiral symmetry in the few-body sector and opens a new direction for probing and constraining three-nucleon forces.
The manufacturing industry is undergoing a transformative shift, driven by cutting-edge technologies like 5G, AI, and cloud computing. Despite these advancements, effective system control, which is crucial for optimizing production efficiency, remains a complex challenge due to the intricate, knowledge-dependent nature of manufacturing processes and the reliance on domain-specific expertise. Conventional control methods often demand heavy customization, considerable computational resources, and lack transparency in decision-making. In this work, we investigate the feasibility of using Large Language Models (LLMs), particularly GPT-4, as a straightforward, adaptable solution for controlling manufacturing systems, specifically, mobile robot scheduling. We introduce an LLM-based control framework to assign mobile robots to different machines in robot assisted serial production lines, evaluating its performance in terms of system throughput. Our proposed framework outperforms traditional scheduling approaches such as First-Come-First-Served (FCFS), Shortest Processing Time (SPT), and Longest Processing Time (LPT). While it achieves performance that is on par with state-of-the-art metho
Surface operators in four-dimensional gauge theories are two-dimensional defects, serving as natural generalizations of Wilson lines and 't Hooft line operators. They act as ideal probes for exploring the non-perturbative structure of the theory. Rigid surface operators are a specific class of surface operators characterized by the absence of continuous deformation parameters. It is expected that a closed $S$-duality map should exist among these rigid operators. While progress has been made on specific examples or subclasses by leveraging invariants and empirical conjectures, a complete picture remains elusive. A significant challenge arises when multiple rigid surface operators share identical invariants, making the determination of $S$-duality relations difficult. More critically, a mismatch exists in the number of rigid surface operators between dual theories when classified by invariants; this is referred to as the "mismatch problem." This discrepancy suggests the necessity of extending the scope of consideration beyond strictly rigid operators. In this paper, we propose a direct, natural, and precise $S$-duality map for rigid surface operators. Our map is realized by mo
Our interpretation of terrestrial exoplanet atmospheric spectra will always be limited by the accuracy of the data we use as input in our forward and retrieval models. Ultraviolet molecular absorption cross sections are one category of these essential model inputs; however, they are often poorly characterized at the longest wavelengths relevant to photo-dissociation. Photolysis reactions dominate the chemical kinetics of temperate terrestrial planet atmospheres. One molecule of particular importance is CO$_2$, which is likely present in all terrestrial planet atmospheres. The photolysis of CO$_2$ can introduce CO and O, as well as shield tropospheric water vapor from undergoing photolysis. This is important because H$_2$O photolysis produces OH, which serves as a major reactive sink to many atmospheric trace gases. Here, we construct CO$_2$ cross-section prescriptions at 195K and 300K extrapolated beyond 200 nm from measured cross sections. We compare results from the implementation of these new cross sections to the most commonly used CO$_2$ prescriptions for temperate, terrestrial planets with Archean-like atmospheres. We generally find that the observational consequences of CO$_
The opening of the Gotthard Base Tunnel in 2017, the longest railway tunnel in the world, marked a milestone in Swiss transport policy. The tunnel, a part of the New Rail Link through the Alps, serves as a key instrument of the so-called "modal shift policy," which aims to transfer transalpine freight traffic from road to rail. The reduction in travel time by train between northern and southern Switzerland raised expectations that a substantial share of tourist-oriented passenger traffic would also shift from car to rail. In this paper, we conduct a causal analysis of the impact of the Gotthard Base Tunnel's opening at the end of 2016 on the number of cars using the parallel Gotthard motorway section in the subsequent years. To this end, we apply the synthetic control and the synthetic difference-in-differences methods to construct a synthetic Gotthard motorway section based on a weighted combination of other alpine road crossings (a so-called donor pool) that did not experience the construction of a competing rail infrastructure. Our results reveal only a modest but statistically significant decline in the number of cars between the actual and the synthetic Gotthard motorway in th
This study presents all available, multi-epoch 3.6 and 4.5 $μ$m photometry from Spitzer Space Telescope observations of white dwarf debris disks, including weekly cadence observations of 16 relatively bright systems, and 5 h staring-mode observations for five of these. Significant variability is detected in 85 per cent of disks and across all timescales probed, from minutes to weeks to years, where the largest flux changes correlate with the longest time baselines, and the infrared excesses persist utterly. While each source is idiosyncratic, the overall results indicate the most variable disks correlate with those that are the brightest (dustiest), and also among those with detected gas, demonstrating both dust and gas are produced via ongoing collisions. There is a correlation between flux and colour changes, where disks tend to appear redder when dimmer and bluer when brighter, consistent with an excess of small dust grains produced in collisions, followed by a gradual return to equilibrium. The overall results are a drastic departure from the predictions of the canonical - geometrically thin, optically thick - disk in both flux and colour, but are broadly consistent with collis
Instruction tuning data are often quantity-saturated due to the large volume of data collection and fast model iteration, leaving data selection important but underexplored. Existing quality-driven data selection methods, such as LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024), generally ignore the equal importance of data diversity and complexity. In this work, we aim to design a diversity-aware data selection strategy and creatively propose using sparse autoencoders (SAEs) to tackle the challenge of measuring data diversity. In addition, SAEs can also provide more interpretability of model behavior and explain, e.g., the surprising effectiveness of selecting the longest response (ICML 2024). Using effective data selection, we experimentally prove that models trained on our selected data can outperform other methods in terms of model capabilities, reduce training cost, and potentially gain more control over model behaviors. We prove that SAEs can serve as a good alternative to diversity measure and design our method to be scalable for potential industrial large-scale pruning, and we will also release our trained SAEs for use b
This paper addresses the Restricted Longest Common Subsequence (RLCS) problem, an extension of the well-known Longest Common Subsequence (LCS) problem. This problem has significant applications in bioinformatics, particularly for identifying similarities and discovering mutual patterns and important motifs among DNA, RNA, and protein sequences. Building on recent advancements in solving this problem through a general search framework, this paper introduces two novel heuristic approaches designed to enhance the search process by steering it towards promising regions in the search space. The first heuristic employs a probabilistic model to evaluate partial solutions during the search process. The second heuristic is based on a neural network model trained offline using a genetic algorithm. A key aspect of this approach is extracting problem-specific features of partial solutions and the complete problem instance. An effective hybrid method, referred to as the learning beam search, is developed by combining the trained neural network model with a beam search framework. An important contribution of this paper is found in the generation of real-world instances where scientific abstracts
Quantifying similarities between time series in a meaningful way remains a challenge in time series analysis, despite many advances in the field. Most real-world solutions still rely on a few popular measures, such as Euclidean Distance (EuD), Longest Common Subsequence (LCSS), and Dynamic Time Warping (DTW). The strengths and weaknesses of these measures have been studied extensively, and incremental improvements have been proposed. In this study, however, we present a different similarity measure that fuses the notion of Dubuc's variation from fractal analysis with the Intersection-over-Union (IoU) measure which is widely used in object recognition (also known as the Jaccard Index). In this proof-of-concept paper, we introduce the Multiscale Dubuc Distance (MDD) measure and prove that it is a metric, possessing desirable properties such as the triangle inequality. We use 95 datasets from the UCR Time Series Classification Archive to compare MDD's performance with EuD, LCSS, and DTW. Our experiments show that MDD's overall success, without any case-specific customization, is comparable to DTW with optimized window sizes per dataset. We also highlight several datasets where MDD's p
The cosmic 21 cm signal serves as a crucial probe for studying the evolutionary history of the Universe. However, detecting the 21 cm signal poses significant challenges due to its extremely faint nature. To mitigate the interference from the Earth's radio frequency interference (RFI), the ground and the ionospheric effects, the Discovering the Sky at the Longest Wavelength (DSL) project will deploy a constellation of satellites in Lunar orbit, with its high-frequency daughter satellite tasked with detecting the global 21 cm signal from cosmic dawn and reionization era (CD/EoR). We intend to employ the Vari-Zeroth-Order Polynomial (VZOP) for foreground fitting and subtracting. We have studied the effect of thermal noise, thermal radiation from the Moon, the Lunar reflection, anisotropic frequency-dependent beam, inaccurate antenna beam pattern, and RFI contamination. We discovered that the RFI contamination can significantly affect the fitting process and thus prevent us from detecting the signal. Therefore, experimenting on the far side of the moon is crucial. We also discovered that using VZOP together with DSL, after 1080 orbits around the Moon, which takes about 103 days, we ca