ResearchTracker: Tracking Research and Industry Developments

Making Large Language Models Better Reasoners with Alignment

Academic paper | arXiv | 2023-09-05 | Authors: Peiyi Wang, Lei Li, Liang Chen

Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capability is essential for large language models (LLMs) to serve as the brain of the artificial general intelligence agent. Recent studies reveal that fine-tuning LLMs on data with the chain of thought (COT) reasoning process can significantly enhance their reasoning capabilities. However, we find that the fine-tuned LLMs suffer from an Assessment Misalignment problem, i.e., they frequently assign higher scores to subpar COTs, leading to potential limitations in their reasoning abilities. To address this problem, we introduce an Alignment Fine-Tuning (AFT) paradigm, which involves three steps: 1) fine-tuning LLMs with COT training data; 2) generating multiple COT responses for each question, and categorizing them into positive and negative ones based on whether they achieve the correct answer; 3) calibrating the scores of positive and negative responses given by LLMs with a novel constraint alignment loss. Specifically, the constraint alignment loss has two objectives: a) Alignment, which guarantees that positive scores surpass negative scores to encourage answers with h
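Steps 2 and 3 of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract is cut off before the constraint alignment loss is fully described, so the loss below is a generic pairwise ranking penalty that only captures the stated alignment objective (positive scores should exceed negative scores). The function names, the dictionary keys "answer" and "score", and the idea that a score is something like a model log-probability are all assumptions made for illustration.

```python
import math


def split_by_correctness(responses, gold_answer):
    """Step 2 (sketch): a sampled COT is positive if its final answer
    matches the reference answer, negative otherwise.
    Each response is a dict with hypothetical keys "answer" and "score"."""
    positives = [r for r in responses if r["answer"] == gold_answer]
    negatives = [r for r in responses if r["answer"] != gold_answer]
    return positives, negatives


def alignment_ranking_loss(pos_scores, neg_scores):
    """Step 3 (sketch): a generic ranking penalty, NOT the paper's exact loss.
    Computes log(1 + sum over all (positive, negative) pairs of exp(s_neg - s_pos)),
    which shrinks toward 0 as positive scores rise above negative scores."""
    total = sum(math.exp(n - p) for p in pos_scores for n in neg_scores)
    return math.log1p(total)


# Hypothetical sampled responses for one question whose reference answer is "8".
samples = [
    {"answer": "8", "score": -0.2},
    {"answer": "7", "score": -0.9},
    {"answer": "8", "score": -0.5},
]
positives, negatives = split_by_correctness(samples, "8")
loss = alignment_ranking_loss(
    [r["score"] for r in positives],
    [r["score"] for r in negatives],
)
```

In this sketch the loss is lower when the correct responses already receive higher scores than the incorrect ones, which is the direction of calibration the abstract describes. The second objective of the constraint alignment loss is not covered because the text is truncated before it is stated.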

Search results: large language model

Making Large Language Models Better Reasoners with Alignment

Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap

A Critical Review of Causal Reasoning Benchmarks for Large Language Models

THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models

Attacks on Third-Party APIs of Large Language Models

Are Compressed Language Models Less Subgroup Robust?

Unforgettable Generalization in Language Models

TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

Enhancing Human-Like Responses in Large Language Models

Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities

Speaker attribution in German parliamentary debates with QLoRA-adapted large language models

Discriminating Similar Languages: Evaluations and Explorations

A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks

Scaling Behavior of Machine Translation with Large Language Models under Prompt Injection Attacks

Modeling Language Change in Historical Corpora: The Case of Portuguese

Large Language Models Merging for Enhancing the Link Stealing Attack on Graph Neural Networks

Large Language Models in Ambulatory Devices for Home Health Diagnostics: A case study of Sickle Cell Anemia Management

Robust Language Identification for Romansh Varieties

Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle