Ensemble Bootstrapping for Q-Learning

2021-08-23 13:18:47 Author: Qingshi

【Authors】Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir

【Paper link】https://arxiv.org/pdf/2103.00445.pdf

【Why we recommend it】Q-learning (QL) is a common reinforcement learning algorithm that suffers from overestimation bias because of the maximization term in the optimal Bellman operator. This bias can lead to suboptimal behavior. Double-Q-learning addresses the problem by using two estimators, but it introduces an underestimation bias instead. Like overestimation in Q-learning, underestimation can degrade performance in some scenarios. This work introduces a new bias-reduced algorithm called Ensemble Bootstrapped Q-Learning (EBQL), a natural extension of Double-Q-learning to ensembles. The method is analyzed both theoretically and empirically. Theoretically, the authors prove that EBQL-like updates yield a lower MSE when estimating the maximal mean of a set of independent random variables. Empirically, they show that there are domains in which both overestimation and underestimation lead to suboptimal performance. Finally, they show that a deep RL variant of EBQL outperforms other deep QL algorithms on a suite of ATARI games.
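To make the idea of extending Double-Q-learning to an ensemble more concrete, here is a minimal tabular sketch in Python. It is not the authors' code. It assumes that one ensemble member is picked per step, that this member chooses the greedy next action, and that the average of the remaining members evaluates that action. The function names, the table layout `q_tables[k][state][action]`, and the default values of `alpha` and `gamma` are hypothetical choices made for illustration. Acting greedily on the ensemble mean is likewise an assumption.

```python
import random


def ebql_update(q_tables, s, a, r, s_next, done,
                alpha=0.1, gamma=0.99, k=None, rng=random):
    """One tabular update in the style of EBQL (illustrative sketch).

    q_tables is a list of K >= 2 tables indexed as q_tables[k][state][action].
    Returns the index of the ensemble member that was updated.
    """
    K = len(q_tables)
    if K < 2:
        raise ValueError("need at least two ensemble members")
    # Pick the member to update (uniformly at random unless k is given).
    if k is None:
        k = rng.randrange(K)
    if done:
        target = r
    else:
        # Member k selects the greedy next action ...
        next_values = q_tables[k][s_next]
        a_star = max(range(len(next_values)), key=lambda i: next_values[i])
        # ... and the other K-1 members evaluate it with their average.
        others = [q_tables[j][s_next][a_star] for j in range(K) if j != k]
        target = r + gamma * sum(others) / (K - 1)
    # Standard temporal-difference step on member k only.
    q_tables[k][s][a] += alpha * (target - q_tables[k][s][a])
    return k


def ensemble_greedy_action(q_tables, s):
    """Greedy action with respect to the ensemble mean (an assumption)."""
    n_actions = len(q_tables[0][s])
    mean_q = [sum(q[s][i] for q in q_tables) / len(q_tables)
              for i in range(n_actions)]
    return max(range(n_actions), key=lambda i: mean_q[i])
```

With `K = 2` the averaged evaluation collapses to the single other table, so the sketch reduces to the usual Double-Q-learning update. Separating action selection (member `k`) from action evaluation (the other members) is what is meant to counter the overestimation caused by the max operator.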

Copyright notice
Author: Qingshi. Please include the original link when reprinting. Thank you.
https://en.qdmana.com/2021/08/20210823131843618H.html
