Search - ResearchTracker

Recent preference learning frameworks for large language models (LLMs) simplify human preferences with binary pairwise comparisons and scalar rewards. This simplification could make LLMs' responses biased to mostly preferred features, and would be exacerbated during the iterations of online preference learning steps. To address these challenges, we propose a novel framework coined PFP (Preference Feature Preservation). The key idea of PFP is maintaining the distribution of human preference features and utilizing such rich signals throughout the online preference learning process. Specifically, PFP first extracts preference features from offline pairwise human preference data and trains a feature classifier. Then, using the trained classifier and distribution-preserving optimization, PFP maps appropriate preference features for a new input instruction during online learning. Lastly, PFP trains the LLM using the existing preference learning method, by incorporating the preference feature into system prompts and enabling the LLM to explicitly handle various human preferences. Our experiments demonstrate that PFP successfully mitigates the bias in preference features during online learning, and

MagicWand: A Universal Agent for Generation and Evaluation Aligned with User Preference

arXiv, 2025-11-23. Authors: Zitong Xu, Dake Shen, Yaosong Du

Recent advances in AIGC (Artificial Intelligence Generated Content) models have enabled significant progress in image and video generation. However, users still struggle to obtain content that aligns with their preferences due to the difficulty of crafting detailed prompts and the lack of mechanisms to retain their preferences. To address these challenges, we construct UniPrefer-100K, a large-scale dataset comprising images, videos, and associated text that describes the styles users tend to prefer. Based on UniPrefer-100K, we propose MagicWand, a universal generation and evaluation agent that enhances prompts based on user preferences, leverages advanced generation models for high-quality content, and applies preference-aligned evaluation and refinement. In addition, we introduce UniPreferBench, the first large-scale benchmark with over 120K annotations for assessing user preference alignment across diverse AIGC tasks. Experiments on UniPreferBench demonstrate that MagicWand consistently generates content and evaluations that are well aligned with user preferences across a wide range of scenarios.

Search results: prefer

Debiasing Online Preference Learning via Preference Feature Preservation

MagicWand: A Universal Agent for Generation and Evaluation Aligned with User Preference

Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas

Misaligned by Reward: Socially Undesirable Preferences in LLMs

Consensus and fragmentation in academic publication preferences

What's In My Human Feedback? Learning Interpretable Descriptions of Preference Data

A Good Plan is Hard to Find: Aligning Models with Preferences is Misaligned with What Helps Users

Multi-Domain Explainability of Preferences

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering

PrefPalette: Personalized Preference Modeling with Latent Attributes

Preference for redistribution and institutional trust: Comparison before and after COVID-19

Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences

Empowering Retrieval-based Conversational Recommendation with Contrasting User Preferences

Optimizing Data Delivery: Insights from User Preferences on Visuals, Tables, and Text

Investigating Language Preference of Multilingual RAG Systems

PREFER: An Ontology for the PREcision FERmentation Community

Do LLM Evaluators Prefer Themselves for a Reason?

Do readers prefer AI-generated Italian short stories?

On the Role of Preference Variance in Preference Optimization