Search results: longest-serving

20 results found

Mark Zuckerberg's longest-serving employee on AI, jobs - and her boss

BBC Tech, 2026-06-04

Naomi Gleit has weathered many controversies at Meta, but remains in what she tells the BBC is her "dream job"

Stream2LLM: Overlap Context Streaming and Prefill for Reduced Time-to-First-Token (TTFT)

arXiv, 2026-03-29. Authors: Rajveer Bachkaniwala, Chengqi Luo, Richard So

Context retrieval systems for LLM inference face a critical challenge: high retrieval latency creates a fundamental tension between waiting for complete context (poor time-to-first-token) and proceeding without it (reduced quality). Streaming context incrementally--overlapping retrieval with inference--can mitigate this latency, but doing so with concurrent requests introduces new challenges: requests contend for GPU compute and memory, and scheduling must adapt to dynamic context arrivals. We present Stream2LLM, a streaming-aware LLM serving system for concurrent prefill-decode disaggregated deployments. Stream2LLM introduces adaptive scheduling and preemption for two distinct retrieval patterns: append-mode (progressive context accumulation) and update-mode (iterative refinement with cache invalidation). It decouples scheduling decisions from resource acquisition, enabling flexible preemption strategies guided by hardware-specific cost models, and uses longest common prefix matching to minimize redundant computation when input changes dynamically. To evaluate Stream2LLM, we collect two large-scale, real-world streaming workloads based on web crawling and approximate nearest neigh

Multi-Robot Multi-Queue Control via Exhaustive Assignment Actor-Critic Learning

Enforcing TSP-Optimality in Fair Vehicle Routing by Cutting Planes

Exhaustive-Serve-Longest Control for Multi-robot Scheduling Systems

Resonance of black hole quasinormal modes in coupled systems

The Supermarket Model on a Dynamic Regular Hypergraph

LLM Query Scheduling with Prefix Reuse and Latency Constraints

Origins of Carbon Dust in a JWST-Observed Primeval Galaxy at $z\sim$6.7

Locality-aware Fair Scheduling in LLM Serving

Chiral symmetry and peripheral neutron-$α$ scattering

Pretrained LLMs as Real-Time Controllers for Robot Operated Serial Production Line

Exact $S$-duality Map for Rigid Surface Operators

The Impact of Extended CO$_2$ Cross Sections on Temperate Anoxic Planet Atmospheres

The Effect of the Gotthard Base Tunnel on Road Traffic: A Synthetic Control Approach

Activity in White Dwarf Debris Disks I: Spitzer Legacy Reveals Variability Incompatible with the Canonical Model

Diversity-driven Data Selection for Language Model Tuning through Sparse Autoencoder

A Learning Search Algorithm for the Restricted Longest Common Subsequence Problem

Multiscale Dubuc: A New Similarity Measure for Time Series

Forecast measurement of the 21 cm global spectrum from Lunar orbit with the Vari-Zeroth-Order Polynomial (VZOP) method