Search - ResearchTracker

In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and mathematically prove that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss that consists of (1) a weighted Mean Square Error (wMSE) loss and (2) a Cross-Entropy loss incorporating soft labels. Thanks to the decoupled structure of DKL loss, we have identified two areas for improvement. Firstly, we address the limitation of KL loss in scenarios like knowledge distillation by breaking its asymmetric optimization property along with a smoother weight function. This modification effectively alleviates convergence challenges in optimization, particularly for classes with high predicted scores in soft labels. Secondly, we introduce class-wise global information into KL/DKL to reduce bias arising from individual samples. With these two enhancements, we derive the Generalized Kullback-Leibler (GKL) Divergence loss and evaluate its effectiveness by conducting experiments on CIFAR-10/100, ImageNet, and vision-language datasets, focusing on adversarial training and knowledge distillation tasks. Specifically, we achieve new state-of-the-art adversarial robustness on the public leaderboard

Representation Fréchet Loss for Visual Generation

arXiv, 2026-04-30. Authors: Jiawei Yang, Zhengyang Geng, Xuan Ju

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr$^k$, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
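
For readers unfamiliar with the quantity the abstract above refers to, the following is a minimal sketch of the standard Fréchet distance between two feature sets, each summarized by a Gaussian (mean and covariance). It is not the authors' FD-loss: it is a plain NumPy evaluation, it is not differentiable, and it does not implement the decoupling of population size from batch size described in the abstract. The function name `frechet_distance` and the array shapes are assumptions chosen for illustration.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Illustrative helper only; not taken from the paper.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs. Clip small negative rounding errors.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(1000, 4))
    # Shifting every feature by a constant changes only the mean term.
    print(frechet_distance(real, real + 2.0))
```

In this sketch the feature arrays stand in for representations such as Inception features; how the paper estimates the statistics over a large population while backpropagating through a smaller batch is not shown here.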

Search results: loss

Generalized Kullback-Leibler Divergence Loss

Representation Fréchet Loss for Visual Generation

GT-Mean Loss: A Simple Yet Effective Solution for Brightness Mismatch in Low-Light Image Enhancement

Measurement incompatibility under loss

Anneal-free ultra-low loss silicon nitride integrated photonics

A Perceptual Shape Loss for Monocular 3D Face Reconstruction

Stabilizing optical solitons by frequency-dependent linear gain-loss and the collisional Raman frequency shift

Functional Analysis of Loss-development Patterns in P&C Insurance

Enhancing Performance of Point Cloud Completion Networks with Consistency Loss

Exact Sampling of Gibbs Measures with Estimated Losses

Conditional Copula models using loss-based Bayesian Additive Regression Trees

Visualizing the Loss Landscape of Neural Nets

Precision measurement of the microwave dielectric loss of sapphire in the quantum regime with parts-per-billion sensitivity

CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models

General Bayesian Loss Function Selection and the use of Improper Models

VoiceID Loss: Speech Enhancement for Speaker Verification

Loss Aversion and State-Dependent Linear Utility Functions for Monetary Returns

Using Focal Loss to Fight Shallow Heuristics: An Empirical Analysis of Modulated Cross-Entropy in Natural Language Inference

The IEEE-IS2 2024 Music Packet Loss Concealment Challenge

Beam Loss in Linacs
