共找到 20 条结果
We find a new proof for the celebrated theorem of Keith and Zhong that a $(1,p)$-Poincaré inequality self-improves to a $(1,p-ε)$-Poincaré inequality. The paper consists of a novel characterization of Poincaré inequalities and then uses it to give an entirely new proof which is closely related to Muckenhoupt-weights. This new characterization, and the alternative proof, demonstrate a formal similarity between Muckenhoupt-weights and Poincaré inequalities. The proofs we give are short and somewhat more direct. With them we can give the first completely transparent bounds for the quantity of self-improvement and the constants involved. We observe that the quantity of self-improvement is, for large $p$, directly proportional to $p$, and inversely proportional to a power of the doubling constant and the constant in the Poincaré inequality. The proofs can be localized and thus we obtain more transparent proofs of the self-improvement of local Poincaré inequalities.
We prove a sharp Zhong-Yang type eigenvalue lower bound for closed Riemannian manifolds with control on integral Ricci curvature.
A refinement of Zhong's variational principle [Nonlin. Anal., 29 (1997), 1421-1431] is given, in the realm of almost metric structures. Applications to equilibrium points are also provided.
A comment has been recently posted on the arXiv (arXiv:1902.07504) that discuss our recent work on encircling multiple exceptional points in Nat. Commun. 9, 4808 (2018). In that comment, the authors claim that our approach is prone to errors. In discussing their findings, they also indicate that a method presented by their team in Phys. Rev. A 98, 023818 (2018) (presumed to give correct results in all situations) was published prior to our work. First, we would like to note that both their work and ours were posted on the arXiv within a week from each other. Second and more importantly, as we will show below, their analysis and conclusion concerning the validity of our approach as presented in the comment article (arXiv:1902.07504) are not correct. As we will demonstrate, a proper application of our method does indeed provide the correct results.
The effective Hamiltonian of $mα^6$ and $mα^6(m/M)$ order corrections for hydrogen molecular ions has been derived in [ Z.-X. Zhong, \emph{et al.}, Phys. Rev. A {\bf98}, 032502(2018).], in this work we express the energy correction in the form of finite-value effective operators. The cut-off regularization scheme is used to determine finite part of divergent operators of the leading-order recoil corrections. Numerical calculations of first-order contributions are performed in the Hylleraas basis set. Combining the second-order terms calculated in recent work [V. I. Korobov, \emph{et al.}, Mol. Phys. e2563023 (2025).], the $mα^6$-order corrections for the fundamental rovibrational transition are obtained with an uncertainty three times smaller than in previous calculations.
Motivated by the study of unlikely intersection in the moduli space of rational maps, we initiate our investigation on algebraic dynamics for families of regular polynomial skew products in this article. Our goals are threefold. (1) We classify special loci -- which contain a Zariski dense set of postcritically finite points -- in the moduli space of quadratic regular polynomial skew products. More precisely, special loci include families of homogeneous polynomial endomorphisms, families of split endomorphisms, and polynomial endomorphisms of the form $(x^2,y^2+bx)$ up to conjugacy. As a consequence, we verify a special case of a conjecture proposed by Zhong. (2) Let $F_t$ be a family of regular polynomial skew products defined over a number field $K$ and let $P_t, Q_t\in K[t]\times K[t]$ be two initial marked points. We introduce a good height $h_{P_t}(t)$ which is built from the theory of adelic line bundles for quasi projective varieties. We show that the set of parameters $t_0\in \overline{K}$ for which $P_{t_0}$ and $Q_{t_0}$ are simultaneously $F_{t_0}$-preperiodic is infinite if and only if $h_{P_t}=h_{Q_t}$. (3) As an application of $h_{P_t}$, we show that, under some degre
In many robotic manipulation tasks, the robot repeatedly solves motion-planning problems that differ mainly in the location of the goal object and its associated obstacle, while the surrounding workspace remains fixed. Prior works have shown that leveraging experience and offline computation can accelerate repeated planning queries, but they lack guarantees of covering the continuous task space and require storing large libraries of solutions. In this work, we present COAD, a framework that provides constant-time planning over a continuous goal-parameterized task space. COAD discretizes the continuous task space into finitely many Task Coverage Regions. Instead of planning and storing solutions for every region offline, it constructs a compressed library by only solving representative root problems. Other problems are handled through fast adaptation from these root solutions. At query time, the system retrieves a root motion in constant time and adapts it to the desired goal using lightweight adaptation modules such as linear interpolation, Dynamic Movement Primitives, or simple trajectory optimization. We evaluate the framework on various manipulators and environments in simulatio
Coded elastic computing enables virtual machines to be preempted for high-priority tasks while allowing new virtual machines to join ongoing computation seamlessly. This paper addresses coded elastic computing for matrix-matrix multiplications with straggler tolerance by encoding both storage and download using Lagrange codes. In 2018, Yang et al. introduced the first coded elastic computing scheme for matrix-matrix multiplications, achieving a lower computational load requirement. However, this scheme lacks straggler tolerance and suffers from high upload cost. Zhong et al. (2023) later tackled these shortcomings by employing uncoded storage and Lagrange-coded download. However, their approach requires each machine to store the entire dataset. This paper introduces a new class of elastic computing schemes that utilize Lagrange codes to encode both storage and download, achieving a reduced storage size. The proposed schemes efficiently mitigate both elasticity and straggler effects, with a storage size reduced to a fraction $\frac{1}{L}$ of Zhong et al.'s approach, at the expense of doubling the download cost. Moreover, we evaluate the proposed schemes on AWS EC2 by measuring compu
Recent Large Multimodal Models (LMMs), primarily designed for offline settings, are ill-suited for the dynamic requirements of streaming video. While recent online adaptations improve real-time processing, they still face critical scalability challenges, with resource demands typically growing at least linearly with video duration. To overcome this bottleneck, we propose FlowNar, a novel framework for scalable streaming video narration. The core of FlowNar is a dynamic context management strategy for historical visual context removal, combined with our CLAM (Cross Linear Attentive Memory) module for streaming visual history retention, ensuring bounded visual memory usage and computational complexity, crucial for efficient streaming. We also introduce a realistic self-conditioned evaluation protocol and complementary evaluation metrics to assess streaming narration models under deployment-like conditions. Experiments on the Ego4D, EgoExo4D, and EpicKitchens100 datasets demonstrate that FlowNar substantially improves narration quality over strong baselines while being highly efficient, supporting processing of 10$\times$ longer videos and achieving 3$\times$ higher throughput (FPS).
We introduce a new low-noise condition for classification, the Model Margin Noise (MM noise) assumption, and derive enhanced $\mathcal{H}$-consistency bounds under this condition. MM noise is weaker than Tsybakov noise condition: it is implied by Tsybakov noise condition but can hold even when Tsybakov fails, because it depends on the discrepancy between a given hypothesis and the Bayes-classifier rather than on the intrinsic distributional minimal margin (see Figure 1 for an illustration of an explicit example). This hypothesis-dependent assumption yields enhanced $\mathcal{H}$-consistency bounds for both binary and multi-class classification. Our results extend the enhanced $\mathcal{H}$-consistency bounds of Mao, Mohri, and Zhong (2025a) with the same favorable exponents but under a weaker assumption than the Tsybakov noise condition; they interpolate smoothly between linear and square-root regimes for intermediate noise levels. We also instantiate these bounds for common surrogate loss families and provide illustrative tables.
Low-level visual perception underpins reliable remote sensing (RS) image analysis, yet current image quality assessment (IQA) methods output uninterpretable scalar scores rather than characterizing physics-driven RS degradations, deviating markedly from the diagnostic needs of RS experts. While Vision-Language Models (VLMs) present a compelling alternative by delivering language-grounded IQA, their visual priors are heavily biased toward ground-level natural images. Consequently, whether VLMs can overcome this domain gap to perceive and articulate RS artifacts remains insufficiently studied. To bridge this gap, we propose \textbf{SenseBench}, the first dedicated diagnostic benchmark for RS low-level visual perception and description. Driven by a physics-based hierarchical taxonomy that unifies both non-reference and reference-based paradigms, SenseBench features over 10K meticulously curated instances across 6 major and 22 fine-grained RS degradation categories. Specifically, two complementary protocols are designed for evaluation: objective low-level visual \textit{perception} and subjective diagnostic \textit{description}. Comprehensive evaluation of 29 state-of-the-art VLMs reve
This paper presents a comprehensive investigation into the decay mechanisms inherent in linear complexity sequence models. We systematically delineate the design space of decay mechanisms across four pivotal dimensions: parameterization strategy, which refers to the computational methodology for decay; parameter sharing, which involves the utilization of supplementary parameters for decay computation; decay granularity, comparing scalar versus vector-based decay; and compatibility with relative positional encoding methods, such as Rotary Position Embedding (RoPE). Through an extensive series of experiments conducted on diverse language modeling tasks, we uncovered several critical insights. Firstly, the design of the parameterization strategy for decay requires meticulous consideration. Our findings indicate that effective configurations are typically confined to a specific range of parameters. Secondly, parameter sharing cannot be used arbitrarily, as it may cause decay values to be too large or too small, thereby significantly impacting performance. Thirdly, under identical parameterization strategies, scalar decay generally underperforms compared to its vector-based counterpart.
DeFi (Decentralized Finance) is one of the most important applications of today's cryptocurrencies and smart contracts. It manages hundreds of billions in Total Value Locked (TVL) on-chain, yet it remains susceptible to common DeFi price manipulation attacks. Despite state-of-the-art (SOTA) systems like DeFiRanger and DeFort, we found that they are less effective to non-standard price models in custom DeFi protocols, which account for 44.2% of the 95 DeFi price manipulation attacks reported over the past three years. In this paper, we introduce the first LLM-based approach, DeFiScope, for detecting DeFi price manipulation attacks in both standard and custom price models. Our insight is that large language models (LLMs) have certain intelligence to abstract price calculation from smart contract source code and infer the trend of token price changes based on the extracted price models. To further strengthen LLMs in this aspect, we leverage Foundry to synthesize on-chain data and use it to fine-tune a DeFi price-specific LLM. Together with the high-level DeFi operations recovered from low-level transaction data, DeFiScope detects various DeFi price manipulations according to systemati
Mixture-of-Expert (MoE) models outperform conventional models by selectively activating different subnets, named experts, on a per-token basis. This gated computation generates dynamic communications that cannot be determined beforehand, challenging the existing GPU interconnects that remain static during the distributed training process. In this paper, we advocate for a first-of-its-kind system, called MixNet, that unlocks topology reconfiguration during distributed MoE training. Towards this vision, we first perform a production measurement study and show that the MoE dynamic communication pattern has strong locality, alleviating the requirement of global reconfiguration. Based on this, we design and implement a regionally reconfigurable high-bandwidth domain on top of existing electrical interconnects using optical circuit switching (OCS), achieving scalability while maintaining rapid adaptability. We have built a fully functional MixNet prototype with commodity hardware and a customized collective communication runtime that trains state-of-the-art MoE models with in-training topology reconfiguration across 32 A100 GPUs. Large-scale packet-level simulations show that MixNet deli
Conventional federated learning (FL) assumes a closed world with a fixed total number of clients. In contrast, new clients continuously join the FL process in real-world scenarios, introducing new knowledge. This raises two critical demands: detecting new knowledge, i.e., knowledge discovery, and integrating it into the global model, i.e., knowledge adaptation. Existing research focuses on coarse-grained knowledge discovery, and often sacrifices source domain performance and adaptation efficiency. To this end, we propose a fine-grained federated domain adaptation approach in open set (Gains). Gains splits the model into an encoder and a classifier, empirically revealing features extracted by the encoder are sensitive to domain shifts while classifier parameters are sensitive to class increments. Based on this, we develop fine-grained knowledge discovery and contribution-driven aggregation techniques to identify and incorporate new knowledge. Additionally, an anti-forgetting mechanism is designed to preserve source domain performance, ensuring balanced adaptation. Experimental results on multi-domain datasets across three typical data-shift scenarios demonstrate that Gains significa
In recent years, Multimodal Emotion Recognition (MER) has made substantial progress. Nevertheless, most existing approaches neglect the semantic inconsistencies that may arise across modalities, such as conflicting emotional cues between text and visual inputs. Besides, current methods are often dominated by the text modality due to its strong representational capacity, which can compromise recognition accuracy. To address these challenges, we propose a model termed Calibrated Multimodal Consensus (CMC). CMC introduces a Pseudo Label Generation Module (PLGM) to produce pseudo unimodal labels, enabling unimodal pretraining in a self-supervised fashion. It then employs a Parameter-free Fusion Module (PFM) and a Multimodal Consensus Router (MCR) for multimodal finetuning, thereby mitigating text dominance and guiding the fusion process toward a more reliable consensus. Experimental results demonstrate that CMC achieves performance on par with or superior to state-of-the-art methods across four datasets, CH-SIMS, CH-SIMS v2, CMU-MOSI, and CMU-MOSEI, and exhibits notable advantages in scenarios with semantic inconsistencies on CH-SIMS and CH-SIMS v2. The implementation of this work is p
Recent advancements in Multimodal Emotion Recognition (MER) face challenges in addressing both modality missing and Out-Of-Distribution (OOD) data simultaneously. Existing methods often rely on specific models or introduce excessive parameters, which limits their practicality. To address these issues, we propose a novel robust MER framework, Causal Inference Distiller (CIDer), and introduce a new task, Random Modality Feature Missing (RMFM), to generalize the definition of modality missing. CIDer integrates two key components: a Model-Specific Self-Distillation (MSSD) module and a Model-Agnostic Causal Inference (MACI) module. MSSD enhances robustness under the RMFM task through a weight-sharing self-distillation approach applied across low-level features, attention maps, and high-level representations. Additionally, a Word-level Self-aligned Attention Module (WSAM) reduces computational complexity, while a Multimodal Composite Transformer (MCT) facilitates efficient multimodal fusion. To tackle OOD challenges, MACI employs a tailored causal graph to mitigate label and language biases using a Multimodal Causal Module (MCM) and fine-grained counterfactual texts. Notably, MACI can in
The interest in linear complexity models for large language models is on the rise, although their scaling capacity remains uncertain. In this study, we present the scaling laws for linear complexity language models to establish a foundation for their scalability. Specifically, we examine the scaling behaviors of three efficient linear architectures. These include TNL, a linear attention model with data-independent decay; HGRN2, a linear RNN with data-dependent decay; and cosFormer2, a linear attention model without decay. We also include LLaMA as a baseline architecture for softmax attention for comparison. These models were trained with six variants, ranging from 70M to 7B parameters on a 300B-token corpus, and evaluated with a total of 1,376 intermediate checkpoints on various downstream tasks. These tasks include validation loss, commonsense reasoning, and information retrieval and generation. The study reveals that existing linear complexity language models exhibit similar scaling capabilities as conventional transformer-based models while also demonstrating superior linguistic proficiency and knowledge retention.
Objective: To establish an automated pipeline for post-processing of quantitative spin-lattice relaxation time constant in the rotating frame ($T_{1ρ}$) imaging of knee articular cartilage. Design: The proposed post-processing pipeline commences with an image standardisation procedure, followed by deep learning-based segmentation to generate cartilage masks. The articular cartilage is then automatically parcellated into 20 subregions, where $T_{1ρ}$ quantification is performed. The proposed pipeline was retrospectively validated on a dataset comprising knee $T_{1ρ}$ images of 10 healthy volunteers and 30 patients with knee osteoarthritis. Three experiments were conducted, namely an assessment of segmentation model performance (using Dice similarity coefficients, DSCs); an evaluation of the impact of standardisation; and a test of $T_{1ρ}$ quantification accuracy (using paired t-tests; root-mean-square deviations, RMSDs; and coefficients of variance of RMSDs, $CV_{RMSD}$). Statistical significance was set as p<0.05. Results: There was a substantial agreement between the subregional $T_{1ρ}$ quantification from the model-predicted masks and those from the manual segmentation label
We present the Linear Complexity Sequence Model (LCSM), a comprehensive solution that unites various sequence modeling techniques with linear complexity, including linear attention, state space model, long convolution, and linear RNN, within a single framework. The goal is to enhance comprehension of these models by analyzing the impact of each component from a cohesive and streamlined viewpoint. Specifically, we segment the modeling processes of these models into three distinct stages: Expand, Oscillation, and Shrink (EOS), with each model having its own specific settings. The Expand stage involves projecting the input signal onto a high-dimensional memory state. This is followed by recursive operations performed on the memory state in the Oscillation stage. Finally, the memory state is projected back to a low-dimensional space in the Shrink stage. We perform comprehensive experiments to analyze the impact of different stage settings on language modeling and retrieval tasks. Our results show that data-driven methods are crucial for the effectiveness of the three stages in language modeling, whereas hand-crafted methods yield better performance in retrieval tasks.