To address the high mobility impacts and the ultra-reliable and low-latency communication (URLLC) requirements in autonomous driving scenarios, rate-splitting multiple access (RSMA) combined with short-packet communication (SPC) emerges as a promising solution. Autonomous vehicles rely on real-time information exchange to ensure safety and coordination, making information freshness essential. By jointly capturing transmission delays and packet errors, age of information (AoI) serves as a comprehensive metric for freshness. In this paper, we investigate short-packet rate splitting to enhance information freshness measured by the AoI. By splitting the unicast messages into common and private parts, encoding all common parts together with the multicast message into a common stream, and encoding each private part into a private stream, RSMA effectively manages interference and enables achieving lower AoI. By considering critical factors such as transmit power, vehicle velocity, blocklength, and the number of transmit antennas, we derive closed-form expressions for the average AoI (AAoI) of the common stream under partial decoding and the overall AAoI under complete decoding. To enhance the A
In large-scale industrial recommendation systems, retrieval must produce high-quality candidates from massive corpora under strict latency. Recently, Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR), which quantizes items into a finite token space and decodes candidates autoregressively, providing a scalable path that explicitly models target-history interactions via cross-attention. However, deploying GR in short-video feeds remains challenged by long-short interest interference, context-induced noise in hierarchical SID generation, and the lack of explicit learning from exposed-but-unclicked feedback. To address these challenges, we propose DualGR, which combines (i) a Dual-Branch Long/Short-Term Router (DBR) with selective activation, (ii) Search-based SID Decoding (S2D) that constrains fine-level decoding within the current coarse bucket for efficiency and noise control, and (iii) an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats unclicked exposures as coarse-level hard negatives to promote timely interest fade-out. On the large-scale Kuaishou short-video recommendation system, DualGR has achieved outstanding pe
Algorithmic decision-making in high-stakes settings can have profound impacts on individuals and populations. While much prior work studies fairness in static settings, recent results show that enforcing static fairness constraints may exacerbate long-run disparities. Motivated by this tension, we study a stylized sequential selection problem in which a decision-maker repeatedly selects individuals, affecting both immediate utility and the population distribution over time. We introduce notions of group fairness for both the short and long term and theoretically analyze the trade-off between fairness and utility via the Price of Fairness (PoF). We characterize optimal and fair policies in the short term and show that the PoF can be large even when group distributions are nearly identical. In contrast, we show that long-term disparities can vanish under simple investment policies that achieve a low PoF. We also empirically validate these theoretical observations using both synthetic and real datasets.
The short text matching task employs a model to determine whether two short texts have the same semantic meaning or intent. Existing short text matching models usually rely only on the content of the short texts, which often lack information or miss key clues. Therefore, short texts need external knowledge to complete their semantic meaning. To address this issue, we propose a new short text matching framework that introduces external knowledge to enhance the contextual representation of short texts. In detail, we apply a self-attention mechanism to enrich the short text representation with external contexts. Experiments on two Chinese datasets and one English dataset demonstrate that our framework outperforms state-of-the-art short text matching models.
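As a rough illustration of how self-attention can mix external context into a short text representation, here is a minimal NumPy sketch. It is not the framework's actual architecture: the function name `enrich_with_context`, the single-head attention without learned projections, and the random embeddings are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def enrich_with_context(text_emb, context_emb):
    """Hypothetical helper: single-head scaled dot-product self-attention
    over the concatenation of text tokens and external-context tokens.

    text_emb:    array of shape (n_text, d)
    context_emb: array of shape (n_ctx, d)
    Returns the attended text rows (n_text, d) and the full attention
    matrix of shape (n_text + n_ctx, n_text + n_ctx).
    """
    seq = np.concatenate([text_emb, context_emb], axis=0)
    d = seq.shape[1]
    # Assumption: queries, keys, and values are the raw embeddings
    # (no learned projection matrices), to keep the sketch short.
    scores = seq @ seq.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    attended = weights @ seq
    # Keep only the rows that correspond to the original short text.
    return attended[: text_emb.shape[0]], weights


rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))      # 4 short-text tokens, dimension 8
context = rng.normal(size=(6, 8))   # 6 external-context tokens
enriched, weights = enrich_with_context(text, context)
print(enriched.shape)
```

Each enriched text token is a weighted average of all text and context tokens, so information from the external context can flow into the short text representation.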
We evaluate multimodal large language models (MLLMs) for topic-aligned captioning in financial short-form videos (SVs) by testing joint reasoning over transcripts (T), audio (A), and video (V). Using 624 annotated YouTube SVs, we assess all seven modality combinations (T, A, V, TA, TV, AV, TAV) across five topics: main recommendation, sentiment analysis, video purpose, visual analysis, and financial entity recognition. Video alone performs strongly on four of five topics, underscoring its value for capturing visual context and affective cues such as emotions, gestures, and body language. Selective pairs such as TV or AV often surpass TAV, implying that too many modalities may introduce noise. These results establish the first baselines for financial short-form video captioning and illustrate the potential and challenges of grounding complex visual cues in this domain. All code and data can be found on our GitHub under the CC-BY-NC-SA 4.0 license.
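The seven modality combinations listed above are simply the non-empty subsets of {T, A, V}. A short sketch that enumerates them, where the function name `modality_combinations` is a hypothetical helper and not part of the released code:

```python
from itertools import combinations

MODALITIES = ("T", "A", "V")  # transcript, audio, video


def modality_combinations(modalities=MODALITIES):
    """Return every non-empty subset of the modalities, smallest first."""
    combos = []
    for size in range(1, len(modalities) + 1):
        for subset in combinations(modalities, size):
            combos.append("".join(subset))
    return combos


combos = modality_combinations()
print(combos)  # ['T', 'A', 'V', 'TA', 'TV', 'AV', 'TAV']
```

With three modalities there are 2^3 - 1 = 7 such subsets, which matches the count in the abstract.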
We show that once $θ>17/30$, every sufficiently long interval $[x,x+x^θ]$ contains many $k$-term arithmetic progressions of primes, uniformly in the starting point $x$. More precisely, for each fixed $k\ge3$ and $θ>17/30$, for all sufficiently large $X$ and all $x\in[X,2X]$, \[ \#\{\text{$k$-APs of primes in }[x,x+x^θ]\}\ \gg_{k,θ}\ \frac{N^{2}}{\big((\varphi(W)/W)^{k}(\log R)^{k}\big)}\ \asymp\ \frac{X^{2θ}}{(\log X)^{k+1+o(1)}}, \] where $W:=\prod_{p\le \tfrac12\log\log X}p$, $N:=\lfloor x^θ/W\rfloor$, and $R:=N^η$ for a small fixed $η=η(k,θ)>0$. This is obtained by combining the uniform short-interval prime number theorem at exponents $θ>17/30$ (a consequence of recent zero-density estimates of Guth and Maynard) with the Green-Tao transference principle (in the relative Szemerédi form) on a window-aligned $W$-tricked block. We also record a concise Maynard-type lemma on dense clusters \emph{restricted to a fixed congruence class} in tiny intervals $(\log x)^\varepsilon$, which we use as a warm-up and for context. An appendix contains a short-interval Barban-Davenport-Halberstam mean square bound (uniform in $x$) that we use as a black box for variance estimates. The
Short answer assessment is a vital component of science education, allowing evaluation of students' complex three-dimensional understanding. Large language models (LLMs) that possess human-like ability in linguistic tasks are increasingly popular in assisting human graders to reduce their workload. However, LLMs' limitations in domain knowledge restrict their understanding of task-specific requirements and hinder their ability to achieve satisfactory performance. Retrieval-augmented generation (RAG) emerges as a promising solution by enabling LLMs to access relevant domain-specific knowledge during assessment. In this work, we propose an adaptive RAG framework for automated grading that dynamically retrieves and incorporates domain-specific knowledge based on the question and student answer context. Our approach combines semantic search and curated educational sources to retrieve valuable reference materials. Experimental results on a science education dataset demonstrate that our system achieves an improvement in grading accuracy compared to baseline LLM approaches. The findings suggest that RAG-enhanced grading systems can serve as reliable support with efficient performance gain
The rapid proliferation of user-generated content (UGC) on short-form video platforms has made video engagement prediction increasingly important for optimizing recommendation systems and guiding content creation. However, this task remains challenging due to the complex interplay of factors such as semantic content, visual quality, audio characteristics, and user background. Prior studies have leveraged various types of features from different modalities, such as visual quality, semantic content, background sound, etc., but often struggle to effectively model their cross-feature and cross-modality interactions. In this work, we empirically investigate the potential of large multimodal models (LMMs) for video engagement prediction. We adopt two representative LMMs: VideoLLaMA2, which integrates audio, visual, and language modalities, and Qwen2.5-VL, which models only visual and language modalities. Specifically, VideoLLaMA2 jointly processes key video frames, text-based metadata, and background sound, while Qwen2.5-VL utilizes only key video frames and text-based metadata. Trained on the SnapUGC dataset, both models demonstrate competitive performance against state-of-the-art basel
Short-form videos have exploded in popularity and now dominate social media trends. Prevailing short-video platforms, e.g., Kuaishou (Kwai), TikTok, Instagram Reels, and YouTube Shorts, have changed the way we consume and create content. For video content creation and understanding, shot boundary detection (SBD) is one of the most essential components in various scenarios. In this work, we release a new public Short video sHot bOundary deTection dataset, named SHOT, consisting of 853 complete short videos and 11,606 shot annotations, with 2,716 high-quality shot boundary annotations in 200 test videos. Leveraging this new data wealth, we propose to optimize the model design for video SBD by conducting neural architecture search in a search space encapsulating various advanced 3D ConvNets and Transformers. Our proposed approach, named AutoShot, achieves higher F1 scores than previous state-of-the-art approaches, e.g., outperforming TransNetV2 by 4.2%, when being derived and evaluated on our newly constructed SHOT dataset. Moreover, to validate the generalizability of the AutoShot architecture, we directly evaluate it on another three public datasets: ClipSh
Short text classification is the task of assigning predefined labels to short sentences. However, the limited length of short text leads to the challenging problem of sparse features. Most existing methods treat each short sentence as independent and identically distributed (IID): they focus only on the local context within the sentence itself, and the relational information between sentences is lost. To overcome these limitations, we propose a PathWalk model that combines the strengths of graph networks and short sentences to address the sparseness of short text. Experimental results on four available datasets show that our PathWalk method achieves state-of-the-art results, demonstrating the efficiency and robustness of graph networks for short text classification.
The redshift distribution of the short-duration GRBs is a crucial, but currently fragmentary, clue to the nature of their progenitors. Here we present optical observations of nine short GRBs obtained with Gemini, Magellan, and the Hubble Space Telescope. We detect the afterglows and host galaxies of two short bursts, and host galaxies for two additional bursts with known optical afterglow positions, and five with X-ray positions (<6'' radius). In eight of the nine cases we find that the most probable host galaxies are faint, R~23-26.5 mag, and are therefore starkly different from the first few short GRB hosts with R~17-22 mag and z<0.5. Indeed, we measure spectroscopic redshifts of z~0.4-1.1 for the four brightest hosts. A comparison to large field galaxy samples, as well as the hosts of long GRBs and previous short GRBs, indicates that the fainter hosts likely reside at z>1. Our most conservative limit is that at least half of the five hosts without a known redshift reside at z>0.7 (97% confidence level), suggesting that about 1/3-2/3 of all short GRBs originate at higher redshifts than previously determined. This has two important implications: (i) We constrain the ac
We generalize the classical Bombieri-Vinogradov theorem to a short interval, non-abelian setting. This leads to variants of the prime number theorem for short intervals where the primes lie in arithmetic progressions that are "twisted" by a splitting condition in a Galois extension $L/K$ of number fields. Using this result in conjunction with recent work of Maynard, we prove that rational primes in short intervals with a given splitting condition in a Galois extension $L/\mathbb{Q}$ exhibit dense clusters in short intervals. We explore several arithmetic applications related to questions of Serre regarding the nonvanishing Fourier coefficients of cuspidal modular forms, including finding dense clusters of fundamental discriminants $ d $ in short intervals for which the central values of $d$-quadratic twists of modular $L$-functions are non-vanishing.
We exhibit an explicit short basis of the Stickelberger ideal of cyclotomic fields of any conductor $m$, i.e., a basis containing only short elements. By definition, an element of $\mathbb{Z}[G_m]$, where $G_m$ denotes the Galois group of the field, is called short whenever it writes as $\sum_{σ\in G_m} \varepsilon_σσ$ with all $\varepsilon_σ\in\{0,1\}$. One ingredient for building such a basis consists in picking wisely generators $α_m(b)$ in a large family of short elements. As a direct practical consequence, we deduce from this short basis an explicit upper bound on the relative class number, that is valid for any conductor. This basis also has several concrete applications, in particular for the cryptanalysis of the Shortest Vector Problem on Ideal lattices.
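The definition of a short element can be checked mechanically. The sketch below uses the standard identification of $G_m$ with the units of $\mathbb{Z}/m\mathbb{Z}$ and represents an element of $\mathbb{Z}[G_m]$ as a dictionary of coefficients; this representation and the helper names `galois_group` and `is_short` are assumptions for illustration, not the paper's code.

```python
from math import gcd


def galois_group(m):
    """Residues a mod m with gcd(a, m) == 1; a indexes sigma_a: zeta -> zeta^a."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def is_short(element, m):
    """Check whether an element of Z[G_m] is short.

    `element` maps a residue a (standing for sigma_a) to its integer
    coefficient; residues that are absent have coefficient 0.
    """
    group = set(galois_group(m))
    if not set(element) <= group:
        raise ValueError("element has a term outside G_m")
    # Short means every coefficient epsilon_sigma lies in {0, 1}.
    return all(coeff in (0, 1) for coeff in element.values())


print(is_short({1: 1, 3: 0, 5: 1}, 8))  # True
print(is_short({1: 2, 3: 0}, 8))        # False
```

For conductor 8 the group is {1, 3, 5, 7}, so the first element is $\sigma_1 + \sigma_5$ and is short, while the second has a coefficient 2 and is not.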
Multi-messenger astronomy received a great boost following the discovery of kilonova AT2017gfo, the optical counterpart of the gravitational wave source GW170817 associated with the short gamma-ray burst GRB 170817A. AT2017gfo was the first kilonova that could be extensively monitored in time both photometrically and spectroscopically. Previously, only a few candidates had been observed against the glare of short GRB afterglows. In this work, we aim to search for the fingerprints of AT2017gfo-like kilonova emissions in the optical/NIR light curves of 39 short GRBs with known redshift. For the first time, our results allow us to study separately the range of luminosity of the blue and red components of AT2017gfo-like kilonovae in short GRBs. In particular, the red component is similar in luminosity to AT2017gfo, while the blue kilonova can be more than 10 times brighter. Finally, we find further evidence to support all the claimed kilonova detections and we exclude an AT2017gfo-like kilonova in GRBs 050509B and 061201.
In this article, I provide significant mathematical evidence in support of the existence of short-time approximations of any polynomial order for the computation of density matrices of physical systems described by arbitrarily smooth and bounded from below potentials. While for Theorem 2, which is ``experimental'', I only provide a ``physicist's'' proof, I believe the present development is mathematically sound. As a verification, I explicitly construct two short-time approximations to the density matrix having convergence orders 3 and 4, respectively. Furthermore, in the Appendix, I derive the convergence constant for the trapezoidal Trotter path integral technique. The convergence orders and constants are then verified by numerical simulations. While the two short-time approximations constructed are of sure interest to physicists and chemists involved in Monte Carlo path integral simulations, the present article is also aimed at the mathematical community, who might find the results interesting and worth exploring. I conclude the paper by discussing the implications of the present findings with respect to the solvability of the dynamical sign problem appearing in real-time Feynma
Understanding the major fraud problems in the world and interpreting the data available for analysis is a current challenge that requires interdisciplinary knowledge to complement the knowledge of computer professionals. Collaborative events (called Hackathons, Datathons, Codefests, Hack Days, etc.) have become relevant in several fields. Examples of fields which are explored in these events include startup development, open civic innovation, corporate innovation, and social issues. These events have features that favor knowledge exchange to solve challenges. In this paper, we present an event format called Short Datathon, a Hackathon for the development of exploratory data analysis and visualization skills. Our goal is to evaluate if participating in a Short Datathon can help participants learn basic data analysis and visualization concepts. We evaluated the Short Datathon in two case studies, with a total of 20 participants, carried out at the Federal University of Technology - Paraná. In both case studies we addressed the issue of tax evasion using real world data. We describe, as a result of this work, the qualitative aspects of the case studies and the perception of the partic
Short-term memory in the brain cannot in general be explained the way long-term memory can -- as a gradual modification of synaptic weights -- since it takes place too quickly. Theories based on some form of cellular bistability, however, do not seem able to account for the fact that noisy neurons can collectively store information in a robust manner. We show how a sufficiently clustered network of simple model neurons can be instantly induced into metastable states capable of retaining information for a short time (a few seconds). The mechanism is robust to different network topologies and kinds of neural model. This could constitute a viable means available to the brain for sensory and/or short-term memory with no need of synaptic learning. Relevant phenomena described by neurobiology and psychology, such as local synchronization of synaptic inputs and power-law statistics of forgetting avalanches, emerge naturally from this mechanism, and we suggest possible experiments to test its viability in more biological settings.
In this work, we introduce a new problem, named {\em story-preserving long video truncation}, that requires an algorithm to automatically truncate a long-duration video into multiple short and attractive sub-videos, each containing an unbroken story. This differs from traditional video highlight detection or video summarization problems in that each sub-video is required to maintain a coherent and integral story, which is becoming particularly important for resource-production video sharing platforms such as YouTube, Facebook, TikTok, Kwai, etc. To address the problem, we collect and annotate a new large video truncation dataset, named TruNet, which contains 1470 videos with on average 11 short stories per video. With the new dataset, we further develop and train a neural architecture for video truncation that consists of two components: a Boundary Aware Network (BAN) and a Fast-Forward Long Short-Term Memory (FF-LSTM). We first use the BAN to generate high-quality temporal proposals by jointly considering frame-level attractiveness and boundaryness. We then apply the FF-LSTM, which tends to capture high-order dependencies among a sequence of frames, to decide whether
This is a short proof of Ledoit-Péché's RIE formula for covariance matrices. The proof is based on the Stein formula, which gives a very simple way to derive the result. One of the advantages of this approach is that it shows that the only really needed hypothesis, for the machinery to work, is that the mean of the eigenvalues of the true covariance matrix and the largest of them have the same order.
The Centrifugal Mirror Fusion Experiment (CMFX) is an axisymmetric magnetic mirror with a central cathode which generates an azimuthal, radially sheared, supersonic \( E \times B \) flow. The induced rotation stabilizes, confines, and heats the plasma. The diagnostic set on CMFX is sparse, giving limited insight into the state of the plasma. In this work, we developed a time-dependent interpretive analysis framework that uses applied voltage, input power, and measured neutron yield rate to infer evolving plasma conditions throughout a discharge. The 0D MCTrans++ code serves as the core physics model, incorporating centrifugal effects, viscous heating, and angular momentum confinement to infer plasma parameters from operating conditions and experimental observables. An iterative Newton's method was implemented to solve for the plasma state evolution consistent with experimental measurements averaged over successive time intervals. The interpretive analysis was applied to experiments comparing different fueling strategies, revealing a path to improved performance via several short puffs of fuel spread across the discharge. This insight led to operations at voltages up to 70 kV. Deuterium