Search results: based

20 results found

Rating-based Reinforcement Learning

arXiv, 2023-07-30. Authors: Devin White, Mingkang Wu, Ellen Novoseller

This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Unlike existing preference-based and ranking-based reinforcement learning paradigms, which rely on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach relies on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.

Rating-based Reinforcement Learning

TimeWeaver: Age-Consistent Reference-Based Face Restoration with Identity Preservation

From Model-Based Screening to Data-Driven Surrogates: A Multi-Stage Workflow for Exploring Stochastic Agent-Based Models

Register-based Census in Thailand: a Case Study in Chachoengsao Province

LLM-guided Task and Motion Planning using Knowledge-based Reasoning

ActMiner: Applying Causality Tracking and Increment Aligning for Graph-based Cyber Threat Hunting

Tensor Decomposition Based Attention Module for Spiking Neural Networks

Anomal-E: A Self-Supervised Network Intrusion Detection System based on Graph Neural Networks

Model-based Transfer Learning for Automatic Optical Inspection based on domain discrepancy

A Performance Survey on Stack-based and Register-based Virtual Machines

Agent-Based Modelling: An Overview with Application to Disease Dynamics

A Methodology to Engineer and Validate Dynamic Multi-level Multi-agent Based Simulations

End-to-End Reinforcement Learning for Torque Based Variable Height Hopping

Incremental cycle bases for cycle-based pose graph optimization

What will be the maximum Tc in the iron-based superconductors?

Propagation based phase retrieval of simulated intensity measurements using artificial neural networks

Learning Attention-based Representations from Multiple Patterns for Relation Prediction in Knowledge Graphs

ResCap-DBP: A Lightweight Residual-Capsule Network for Accurate DNA-Binding Protein Prediction Using Global ProteinBERT Embeddings

A Multi-Stage Hybrid CNN-Transformer Network for Automated Pediatric Lung Sound Classification

FairUDT: Fairness-aware Uplift Decision Trees
