Reinforcement Learning: Study Notes on DQN, Q-Learning, and Sarsa

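I came across a new term, Sarsa, and assumed it was something new; it turns out to be very close to Q-Learning. The difference is that Sarsa is on-policy (it bootstraps from the action actually taken in the next state), while Q-Learning is off-policy (it bootstraps from the best action available in the next state).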
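
To make the comparison concrete, here is a minimal sketch of the two tabular update rules side by side. The function names, the dictionary-of-dictionaries Q-table, and the default values for `alpha` and `gamma` are assumptions chosen for illustration; they do not come from any of the tutorials linked below.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in s_next."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def make_table():
    # Hypothetical two-state Q-table used only for this comparison.
    return {
        0: {"left": 0.0, "right": 0.0},
        1: {"left": 1.0, "right": 5.0},
    }


q_table = make_table()
q_learning_update(q_table, s=0, a="right", r=0.0, s_next=1)

sarsa_table = make_table()
sarsa_update(sarsa_table, s=0, a="right", r=0.0, s_next=1, a_next="left")

print(q_table[0]["right"], sarsa_table[0]["right"])
```

The only difference is the bootstrap term: Q-Learning uses the maximum over the next state's actions, while Sarsa uses the value of the next action the agent really picked. That is why the two tables end up with different values for the same transition.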

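The Q-Learning update rule is simple, yet it neatly handles the delayed-reward problem: the agent (or network) can learn a policy even when the value of an individual step only shows up many steps later. DQN in turn addresses the problem of state inputs that carry too much information.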
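
The delayed-reward point can be illustrated with a small tabular sketch. The environment is a hypothetical five-state chain invented for this example, where the only reward arrives on reaching the last state; the hyperparameters and helper names (`step`, `train`) are likewise assumptions, not taken from the linked tutorials.

```python
import random

N_STATES = 5      # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Move along the chain; reward 1.0 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Explore with probability epsilon, and also break ties randomly.
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.choice(ACTIONS)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next, r, done = step(s, a)
            # Q-learning target: reward plus discounted best next value.
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q


Q = train()
for s in range(N_STATES - 1):
    print(s, [round(v, 3) for v in Q[s]])
```

Although only the final transition is rewarded, the value propagates backwards one state at a time through the `max` term. With `gamma = 0.9`, the values for moving right should approach 0.729, 0.81, 0.9, and 1.0 for states 0 to 3, so the early steps acquire value long before their payoff arrives.
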
An In-Depth Look at the OpenAI Gym System
http://tech.163.com/16/0510/09/BMMOPSCR00094OE0.html

How can the Q-learning process be explained with a simple example?
https://www.zhihu.com/question/26408259
A Painless Q-learning Tutorial (a concise tutorial on the Q-learning algorithm)
http://blog.csdn.net/itplus/article/details/9361915

Flappy Bird Q-learning
https://enhuiz.github.io/flappybird-ql/

ConvNetJS Deep Q Learning Demo
https://www.zhihu.com/question/26408259

Deep Q-Network Study Notes (2): Combining Q-Learning with Neural Networks (with code)
http://blog.csdn.net/gongxiaojiu/article/details/73345808

What Is DQN
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-A-DQN/
DQN Algorithm Update (Tensorflow)
https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/4-1-DQN1/

Reinforcement Learning Series, Part 9: Deep Q Network (DQN)
http://www.algorithmdog.com/drl

Fixed Target Q-network
http://blog.csdn.net/songrotek/article/details/50917286


Posted: 2017-10-20 15:30:25

Original link (please keep when reposting): http://www.multisilicon.com/blog/a25323846.html
