In mid-2024, asteroid 2019 UO$_{14}$ was identified as the first-ever Saturn Trojan through ground-based archival observations and numerical simulations. Trojans, including those associated with Jupiter and other planets, raise important questions about the formation processes of our solar system. Exploring this Trojan object with spacecraft may provide direct answers and definitive evidence regarding these questions. This paper thoroughly investigates potential mission scenarios to the first Saturn Trojan, 2019 UO$_{14}$, to determine the necessary launch window and spacecraft specifications. First, by assuming a ballistic flight using chemical engines, optimal sequences of events, including (powered) gravity-assist maneuvers and deep-space maneuvers, are identified through a meta-heuristic global trajectory optimization algorithm. The analysis indicates that flyby exploration is feasible with a launch window around 2034 and a $ΔV$ ranging from 92 m/s to 1041 m/s within an 11-year mission duration, while a rendezvous can be achieved with a departure around 2035 and a $ΔV$ of 2-3 km/s. Specifically, the itinerary via Saturn requires a $ΔV$ of 2 km/s and a flight time of 24.6 years,
We describe the testing of a prototype SiPM-on-tile iron-scintillator calorimeter at the Relativistic Heavy Ion Collider (RHIC) during its 200 GeV $pp$ run in 2024. The prototype, measuring $20 \times 20 \, \text{cm}^{2}$ and 24 radiation lengths in depth, was positioned in the STAR experimental hall, approximately 8 m from the interaction point and 65 cm from the beam line, covering a pseudorapidity range of about $3.1<η<3.4$. By using the dark current of a reference SiPM as a radiation monitor, we estimate that the prototype was exposed to a fluence of about $10^{10}$ 1-MeV $n_{\mathrm{eq}}$/cm$^2$. Channel-by-channel calibration was performed in a data-driven way with the signature from minimum-ionizing particles during beam-on conditions. A Geant4 detector simulation, with inputs from the Pythia8 event generator, describes measurements of energy spectra and hit multiplicities reasonably well. These results mark the first deployment, commissioning, calibration, and long-term operation of a SiPM-on-tile calorimeter in a collider environment. This experimental campaign will guide detector designs and operational strategies for the ePIC detector at the future EIC, as well as
Vibrations from experimental setups and the environment are a persistent source of noise for low-temperature calorimeters searching for rare events, including neutrinoless double beta ($0νββ$) decay or dark matter interactions. Such noise can significantly limit experimental sensitivity to the physics case under investigation. Here we report the first detection of marine microseismic vibrations using mK-scale calorimeters. This study employs a multi-device analysis correlating data from CUORE, the leading experiment in the search for $0νββ$ decay with mK-scale calorimeters, and the Copernicus Earth Observation program, revealing the seasonal impact of Mediterranean Sea activity on CUORE's energy thresholds, resolution, and sensitivity over four years. The detection of marine microseisms underscores the need to address faint environmental noise in ultra-sensitive experiments. Understanding how such noise couples to the detector and developing mitigation strategies are essential for next-generation experiments. We demonstrate one such strategy: a noise decorrelation algorithm implemented in CUORE using auxiliary sensors, which reduces vibrational noise and improves detector performance
Let $sym^{2} f$ denote the symmetric square lift of a Hecke eigenform $f \in S_{k}(Γ_{0}(N))$ with $n^{\rm th}$ Fourier coefficients $λ_{sym^{2}f}(n)$. In this article, we prove an estimate for the first moment of the sequence $\{ λ_{sym^{2}f}(\mathcal{Q}(\underline{x}))\}_{\mathcal{Q} \in \mathcal{S}_{D}, \underline{x} \in \mathbb{Z}^{2}}$ where $\mathcal{S}_{D}$ denotes the set of inequivalent reduced forms of discriminant $D$. More precisely, we establish an estimate for the following sum: \begin{equation*} \begin{split} S(sym^{2}f, D; X ) &= \sideset{}{^{\flat }}\sum_{\substack{\mathcal{Q}(\underline{x}) \leq X \\ \underline{x} \in \mathbb{Z}^{2} ,~ \mathcal{Q} \in \mathcal{S}_{D} \\ \gcd(\mathcal{Q}(\underline{x}),N) =1 }} λ_{sym^{2}f}(\mathcal{Q}(\underline{x})). \end{split} \end{equation*} Moreover, we consider a question concerning the behavior of signs of the Fourier coefficients $λ_{sym^{2}f}(n)$ supported on the set of integers represented by reduced forms of discriminant $D$. We determine the size of $n_{sym^{2}f, D}$ (see definition before \thmref{ExtMatKLSW}) in terms of the conductor of the associated $L$-functions.
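The index set of the sum above can be illustrated with a minimal sketch, under assumptions the abstract does not state: that $D<0$, and that $\mathcal{S}_{D}$ consists of primitive, positive definite reduced forms $ax^{2}+bxy+cy^{2}$ with $-a < b \leq a \leq c$ (and $b \geq 0$ when $a = c$). The function names `reduced_forms` and `represented_values` are hypothetical, and the additional restrictions in the sum (the $\flat$ condition and the coprimality with $N$) are not reproduced here.

```python
from math import gcd, isqrt


def reduced_forms(D):
    """List reduced primitive positive definite forms (a, b, c) with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative integer congruent to 0 or 1 mod 4")
    forms = []
    # Reduction gives |b| <= a <= c, so |D| = 4ac - b^2 >= 3a^2.
    a_max = isqrt(-D // 3)
    for a in range(1, a_max + 1):
        for b in range(-a + 1, a + 1):  # enforces -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue  # tie-breaking rule for a == c
            if gcd(gcd(a, abs(b)), c) != 1:
                continue  # keep primitive forms only
            forms.append((a, b, c))
    return forms


def represented_values(form, X):
    """Sorted positive integers n <= X of the shape n = a x^2 + b x y + c y^2."""
    a, b, c = form
    abs_D = 4 * a * c - b * b
    # 4aQ = (2ax + by)^2 + |D| y^2 bounds y; the symmetric identity bounds x.
    y_max = isqrt(4 * a * X // abs_D)
    x_max = isqrt(4 * c * X // abs_D)
    values = set()
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            n = a * x * x + b * x * y + c * y * y
            if 1 <= n <= X:
                values.add(n)
    return sorted(values)
```

For example, `reduced_forms(-23)` lists the three reduced forms of discriminant $-23$, and `represented_values((1, 0, 1), 5)` lists the sums of two squares up to 5.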
In the quest for artificial general intelligence, Multi-modal Large Language Models (MLLMs) have emerged as a focal point in recent advancements. However, the predominant focus remains on developing their capabilities in static image understanding. The potential of MLLMs in processing sequential visual data is still insufficiently explored, highlighting the absence of a comprehensive, high-quality assessment of their performance. In this paper, we introduce Video-MME, the first-ever full-spectrum, Multi-Modal Evaluation benchmark of MLLMs in Video analysis. Our work distinguishes itself from existing benchmarks through four key features: 1) Diversity in video types, spanning 6 primary visual domains with 30 subfields to ensure broad scenario generalizability; 2) Duration in temporal dimension, encompassing short-, medium-, and long-term videos, ranging from 11 seconds to 1 hour, for robust contextual dynamics; 3) Breadth in data modalities, integrating multi-modal inputs besides video frames, including subtitles and audio, to unveil the all-round capabilities of MLLMs; 4) Quality in annotations, utilizing rigorous manual labeling by expert annotators to facilitate precise and reliab
Non-inflationary sources of gravitational waves in the early Universe generically predict causality-limited tensor power spectra at low frequencies. We report the first-ever constraints on such sources based on cosmic microwave background (CMB) $B$-mode polarization measurements. Using data from BICEP/Keck, SPTpol, SPT-3G, Planck, and WMAP, we constrain the amplitude of an early causal tensor (ECT) power spectrum parameterized by $r_{ect}$, the ratio of causal tensor power to total scalar power at $k~=~0.01$ Mpc$^{-1}$, and obtain a 95% CL upper limit of $r_{ect}<$ 0.0077. Since $r_{ect}$ can easily be related to the parameters of a given theory, our bound robustly constrains a broad class of well-motivated gravitational wave sources in the early universe, including first-order cosmological phase transitions, enhanced small-scale density perturbations, and various topological defects. Finally, we translate our limit into a bound on the present-day energy density in gravitational waves at ultra-low frequencies otherwise inaccessible to traditional gravitational wave detection strategies, including pulsar timing arrays, interferometers, and resonant cavities.
Harmful memes in Internet communities are ever-shifting, and their type-shifting and temporal-evolving nature makes them difficult to analyze. Although these memes are shifting, we find that different memes may share invariant principles, i.e., the underlying design concept of malicious users, which can help us analyze why these memes are harmful. In this paper, we propose RepMD, an ever-shifting harmful meme detection method based on design concept reproduction. We first refer to the attack tree to define the Design Concept Graph (DCG), which describes steps that people may take to design a harmful meme. Then, we derive the DCG from historical memes with design step reproduction and graph pruning. Finally, we use the DCG to guide the Multimodal Large Language Model (MLLM) to detect harmful memes. The evaluation results show that RepMD achieves the highest accuracy, 81.1%, and shows only slight accuracy decreases when generalized to type-shifting and temporal-evolving memes. Human evaluation shows that RepMD can improve the efficiency of human discovery of harmful memes, with 15$\sim$30 seconds per meme.
The Askaryan Radio Array (ARA) is an ultrahigh energy (UHE) neutrino detector at the South Pole, designed to search for radio pulses emitted by neutrino-initiated particle showers in ice. ARA consists of an array of five autonomous stations with 2 km spacing. Each station consists of 16 radio antennas embedded ${\sim}200$ m deep in the ice that are sensitive to either vertically- or horizontally-polarized signals. Radio arrays like ARA represent a cost-efficient means of achieving the enormous $O(10~\text{km}^3)$ detection volumes necessary for UHE neutrino detection. This contribution presents the current status of the first-ever array-wide search for UHE neutrinos, leveraging ARA's unprecedented ${\sim}28$ station-years of livetime. This search will have the best sensitivity of any neutrino detector above $3$ EeV, sufficient to probe the $220$ PeV flux inferred from KM3NeT's observation of KM3-230213A. Importantly, this study demonstrates the feasibility of array-wide neutrino searches, which are necessary for next-generation detectors, like RNO-G (35 stations planned) and IceCube-Gen2 Radio (361 stations proposed), to achieve their design sensitivity. We discuss the progress tow
Multi-group multicast (MGM) is an increasingly important form of multi-user wireless communications with several potential applications, such as video streaming, federated learning, and safety-critical vehicular communications. Rate-Splitting Multiple Access (RSMA) is a powerful interference management technique that can, in principle, achieve higher data rates and greater fairness for all types of multi-user wireless communications, including MGM. This paper presents the first-ever experimental evaluation of RSMA-based MGM, as well as the first-ever three-way comparison of RSMA-based, Space Division Multiple Access (SDMA)-based and Non-Orthogonal Multiple Access (NOMA)-based MGM. Using a measurement setup involving a two-antenna transmitter and two groups of two single-antenna users per group, we consider the problem of realizing throughput (max-min) fairness across groups for each of three multiple access schemes, over nine experimental cases in a line-of-sight environment capturing varying levels of pathloss difference and channel correlation across the groups. Over these cases, we observe that RSMA-based MGM achieves fairness at a higher throughput for each group than SDMA- and
Minibeam and microbeam radiation therapy promise improved treatment outcomes through reduced normal tissue toxicity at better tumor control rates. The lack of suitable compact radiation sources limits the clinical application of minibeams to superficial tumors and renders it impossible for microbeams. We developed the first prototype of a compact line-focus X-ray tube (LFXT) with technology potentially suitable for clinical translation of minibeams and microbeams. We give an overview of the commissioning process preceding first operation, present optical and radiological focal spot characterization methods, and dosimetric measurements. Additionally, we report on first preclinical in vitro cell and in vivo mouse brain irradiations conducted with the LFXT prototype. The LFXT was high voltage conditioned up to 300 kV. The focal spot characterization resulted in a strongly eccentric electron distribution with a width of 72.3 $μ$m. Dosimetry showed sharp microbeam dose profiles with steep lateral penumbras and a peak-to-valley dose ratio above 10 throughout a 70 mm thick PMMA phantom. An open-field dose rate of 4.3 Gy/s was measured at an acceleration voltage of 150 kV and a beam current
Recent progress in (multimodal) large language models ((M)LLMs) has shifted focus from pre-training to inference-time computation and post-training optimization, largely due to concerns over the availability of high-quality human data. However, these strategies alone are insufficient to drive substantial model improvements. We argue that effective model advancement requires strong synergy among pre-training, inference-time computation, and post-training optimization. In this paper, we introduce Self-Improving cognition (SIcog), a self-learning framework for constructing next-generation foundation MLLMs by imparting multimodal knowledge and enhancing systematic cognitive capabilities through multimodal pre-training with self-generated data. Specifically, we propose Chain-of-Description for step-by-step visual understanding and integrate structured Chain-of-Thought (CoT) reasoning to support in-depth multimodal reasoning. SIcog first equips a base model with systematic perception and reasoning using minimal external supervision. The enhanced models then generate candidate image captions and CoT reasoning responses for unlabeled images and image-question pairs across diverse tasks, wh
Reasoning is the fundamental capability of large language models (LLMs). Due to the rapid progress of LLMs, current benchmarks face two main issues: i) they can be crushed in a short time (less than 1 year), and ii) they may be easily hacked. To handle these issues, we propose ever-scalingness as a principle for building benchmarks that scale over complexity against crushing, instance against hacking and exploitation, oversight for easy verification, and coverage for real-world relevance. This paper presents Nondeterministic Polynomial-time Problem Challenge (NPPC), an ever-scaling reasoning benchmark for LLMs. Specifically, the NPPC has three main modules: i) npgym, which provides a unified interface of 25 well-known NP-complete problems and can generate any number of instances at any level of complexity, ii) npsolver, which provides a unified interface to evaluate the problem instances with both online and offline models via APIs and local deployments, respectively, and iii) npeval, which provides comprehensive and ready-to-use tools to analyze the performance of LLMs over different problems, the number of tokens, the reasoning errors and
The quest for radio signals from technologically-advanced extraterrestrial intelligence has traditionally concentrated on the vicinity of 1.4 GHz. In this paper, we extend the search to unprecedented territories, detailing our extensive observations at 6 GHz and initiating the first-ever survey at 18 GHz with the Sardinia Radio Telescope (SRT). Our strategy entailed rigorous observation sessions, totaling 36 hours, directed towards the Galactic Center and 72 selected TESS targets, making this the most comprehensive high-frequency technosignature search to date. Our narrowband signal search found no definitive evidence of drifting signals that could suggest an extraterrestrial origin from the surveyed regions. Nevertheless, our efforts have enabled us to set new constraints on the presence of radio emissions from approximately $5\times 10^{5}$ stars, establishing an isotropic radiated power limit of $1.8\times 10^{19}$ W. We also provide a comparative analysis of the 'hits' recorded across both frequencies to highlight the significance of pursuing technosignature searches at higher frequencies, where the spectral landscape is less congested and more conducive to detection.
We present 4DiM, a cascaded diffusion model for 4D novel view synthesis (NVS), supporting generation with arbitrary camera trajectories and timestamps, in natural scenes, conditioned on one or more images. With a novel architecture and sampling procedure, we enable training on a mixture of 3D (with camera pose), 4D (pose+time) and video (time but no pose) data, which greatly improves generalization to unseen images and camera pose trajectories over prior works that focus on limited domains (e.g., object centric). 4DiM is the first-ever NVS method with intuitive metric-scale camera pose control enabled by our novel calibration pipeline for structure-from-motion-posed data. Experiments demonstrate that 4DiM outperforms prior 3D NVS models both in terms of image fidelity and pose alignment, while also enabling the generation of scene dynamics. 4DiM provides a general framework for a variety of tasks including single-image-to-3D, two-image-to-video (interpolation and extrapolation), and pose-conditioned video-to-video translation, which we illustrate qualitatively on a variety of scenes. For an overview see https://4d-diffusion.github.io
Assuming the Generalized Riemann Hypothesis, the non-trivial zeros of $L$-functions lie on the critical line with the real part $1/2$. We find an upper bound of the lowest first zero in families of even cuspidal newforms of prime level tending to infinity. We obtain explicit bounds using the $n$-level densities and results towards the Katz-Sarnak density conjecture. We prove that as the level tends to infinity, there is at least one form with a normalized zero within $1/4$ of the average spacing. We also obtain the first-ever bounds on the percentage of forms in these families with a fixed number of zeros within a small distance near the central point.
The BL Lacertae (BL Lac) object OJ 287 underwent an intense X-ray activity phase, exhibiting its brightest recorded X-ray flare in 2016-2017, characterized by much softer X-ray spectra and, concurrently, its first-ever recorded very-high-energy (VHE) emission (100--560 GeV), reported by the VERITAS observatory. The broadband spectral energy distribution reveals a new jet emission component similar to high-synchrotron-peaked BL Lac objects, thereby implying the soft X-ray spectrum for the synchrotron emission. Taking advantage of the simultaneous X-ray and VHE spectral information, as well as the source being a low-synchrotron-peaked BL Lac object, we systematically explored the extragalactic background light (EBL) spectrum by demanding that the VHE spectrum cannot be harder than the X-ray spectrum. We used three different phenomenological forms of the EBL spectral shape (power-law, parabola, and polynomial), motivated by current constraints on the EBL, with a Bayesian Monte Carlo approach to infer the credible EBL range. Our study favors an almost flat power-law spectral shape and is consistent with previous studies. The other spectral forms capable of capturing curvature though result
This study addresses the widening gap in Automatic Speech Recognition (ASR) research between high-resource and extremely low-resource languages, with a particular focus on Manchu, a critically endangered language. Manchu exemplifies the challenges faced by marginalized linguistic communities in accessing state-of-the-art technologies. In a pioneering effort, we introduce the first-ever Manchu ASR model, ManWav, leveraging Wav2Vec2-XLSR-53. The results of the first Manchu ASR model are promising, especially when trained with our augmented data. Wav2Vec2-XLSR-53 fine-tuned with augmented data demonstrates a 0.02 drop in CER and a 0.13 drop in WER compared to the same base model fine-tuned with original data.
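The two metrics quoted above, character error rate (CER) and word error rate (WER), are conventionally defined as the Levenshtein edit distance between reference and hypothesis divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names are hypothetical, and any text normalization applied in the paper is not modelled.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Under these definitions a "0.13 drop in WER" is an absolute difference between two such ratios, each computed over the same reference transcripts.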
A canonical use case of Integrated Sensing and Communications (ISAC) in multiple-input multiple-output (MIMO) systems involves a multi-antenna transmitter communicating with $K$ users and sensing targets in its vicinity. For this setup, precoder and multiple access designs are of utmost importance, as the limited transmit power budget must be efficiently directed towards the desired directions (users and targets) to maximize both communications and sensing performance. This problem has been widely investigated analytically under various design choices, in particular (a) whether or not a dedicated sensing signal is needed, and (b) for different MIMO multiple access techniques, such as Space Division Multiple Access (SDMA) and Rate-Splitting Multiple Access (RSMA). However, a conclusive answer on which design choice achieves the best ISAC performance, backed by experimental results, remains elusive. We address this vacuum by experimentally evaluating and comparing RSMA and SDMA for communicating with two users $(K = 2)$ and sensing (ranging) one target. Over three scenarios that are representative of \emph{vehicular} ISAC, covering different levels of inter-user interference and sepa
As voice assistants cement their place in our technologically advanced society, there remains a need to cater to the diverse linguistic landscape, including colloquial forms of low-resource languages. Our study introduces the first-ever comprehensive dataset for intent detection and slot filling in formal Bangla, colloquial Bangla, and Sylheti languages, totaling 984 samples across 10 unique intents. Our analysis reveals the robustness of large language models for tackling downstream tasks with inadequate data. The GPT-3.5 model achieves an impressive F1 score of 0.94 in intent detection and 0.51 in slot filling for colloquial Bangla.
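For readers unfamiliar with the metric, an F1 score for intent detection can be computed per intent label and then averaged. The sketch below shows a macro-averaged F1 over label sequences; whether the reported 0.94 is macro-, micro-, or weighted-averaged is not stated in the abstract, so the averaging choice and the function name `macro_f1` are assumptions made for illustration only.

```python
def macro_f1(gold, pred):
    """Macro-averaged F1 over all labels that appear in gold or pred."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive for any listed label.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)
```

Slot filling is usually scored differently, on predicted spans rather than on one label per utterance, so this sketch applies to the intent-detection figure only.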
In multi-user multi-antenna communications, it is well-known in theory that Rate-Splitting Multiple Access (RSMA) can achieve a higher spectral efficiency than both Space Division Multiple Access (SDMA) and Non-Orthogonal Multiple Access (NOMA). However, an experimental evaluation of RSMA's performance, relative to SDMA and NOMA, is missing in the literature, which is essential to address the ongoing debate between RSMA and NOMA over which is better suited to handle most efficiently the available resources and interference in 6G. In this paper, we address this critical knowledge gap by realizing the first-ever RSMA prototype using software-defined radios. Through measurements using our prototype, we empirically solve the modulation and coding scheme limited sum throughput maximization problem for RSMA, SDMA and NOMA for the two-user multiple-input single-output (MISO) scenario over (a) different pairs of line-of-sight channels that vary in terms of their relative pathloss and spatial correlation, and with (b) different channel state information quality. We observe that RSMA achieves the highest sum throughput across all these cases, whereas SDMA and NOMA are effective only in some