5 months ago

Dueling Network Architectures for Deep Reinforcement Learning

Ziyu Wang Tom Schaul Matteo Hessel Hado van Hasselt Marc Lanctot Nando de Freitas

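Abstract

In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs (long short-term memory networks), or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning. Our dueling network represents two separate estimators: one for the state value function and one for the state-dependent action advantage function. The main benefit of this factoring is to generalize learning across actions without imposing any change to the underlying reinforcement learning algorithm. Our results show that this architecture leads to better policy evaluation in the presence of many similar-valued actions. Moreover, the dueling architecture enables our RL agent to outperform the state of the art on the Atari 2600 domain.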
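To make the two-estimator idea concrete, the following is a minimal sketch of how a scalar state value and a vector of per-action advantages might be combined into action values. The abstract does not spell out the combination rule; subtracting the mean advantage is assumed here as one common way to aggregate the two streams. The function name `dueling_q_values` and the numbers are hypothetical and chosen only for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine a scalar V(s) with per-action A(s, a) into Q(s, a).

    Assumption: the streams are merged as Q = V + (A - mean(A)),
    so that a constant shift of the advantages does not change Q.
    """
    if not advantages:
        raise ValueError("need at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical stream outputs for one state with three actions.
q = dueling_q_values(state_value=2.0, advantages=[0.5, -0.5, 0.0])
print(q)  # [2.5, 1.5, 2.0]

# Shifting every advantage by the same constant leaves Q unchanged.
q_shifted = dueling_q_values(state_value=2.0, advantages=[1.5, 0.5, 1.0])
print(q_shifted == q)  # True

# The greedy action depends only on the advantage stream's ordering.
greedy_action = max(range(len(q)), key=lambda i: q[i])
print(greedy_action)  # 0
```

Because the value stream is shared by all actions, an update to it moves every action value for that state at once, which is one way to read the abstract's claim about generalizing learning across actions.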

Code repositories

Repository	Framework	Source
prajwalgatti/DRL-Continuous-Control	-	Mentioned on GitHub
wtingda/DeepRLBreakout	-	Mentioned on GitHub
facebookresearch/Horizon	pytorch	Mentioned on GitHub
nbopardi/smb	-	Mentioned on GitHub
shehrum/RL_Navigation	pytorch	Mentioned on GitHub
alessandrositta/Flatland_challenge	pytorch	Mentioned on GitHub
R-Sweke/DeepQ-Decoding	-	Mentioned on GitHub
gouxiangchen/dueling-DQN-pytorch	pytorch	Mentioned on GitHub
hemilpanchiwala/Dueling_Network_Architectures	pytorch	Mentioned on GitHub
dxyang/DQN_pytorch	pytorch	Mentioned on GitHub
utarumo/RL_implementation	-	Mentioned on GitHub
JuliaPOMDP/DeepQLearning.jl	-	Mentioned on GitHub
cove9988/TradingGym	-	Mentioned on GitHub
iDataist/Navigation-with-Deep-Q-Network	pytorch	Mentioned on GitHub
170928/-Review-Dueling-Deep-Q-Network	-	Mentioned on GitHub
la3lma/chezjulia	-	Mentioned on GitHub
zynk13/dueling-dqn-Reinforcement-learning	-	Mentioned on GitHub
mindspore-courses/Deep-Reinforcement-Learning-Algorithms-with-MindSpore	mindspore	-
guillaumeboniface/bananaland	pytorch	Mentioned on GitHub
Montherapy/Deep-reinforcement-learning-for-multi-class-imbalanced-classification	-	Mentioned on GitHub
chainer/chainerrl	pytorch	Mentioned on GitHub
botforge/simplementation	pytorch	Mentioned on GitHub
jsztompka/DuelDQN	pytorch	Mentioned on GitHub
cocolico14/N-step-Dueling-DDQN-PER-Pacman	-	Mentioned on GitHub
near32/regym	pytorch	Mentioned on GitHub
1jsingh/rl_navigation	pytorch	Mentioned on GitHub
BY571/DQN-Atari-Agents	pytorch	Mentioned on GitHub
jezzarax/drlnd_p1_navigation	pytorch	Mentioned on GitHub
eddynelson/dqn	-	Mentioned on GitHub
tensorlayer/RLzoo	-	Mentioned on GitHub
fengsterooni/dql	pytorch	Mentioned on GitHub
rybread1/deep-rl-trex	-	Mentioned on GitHub
kshitij-ingale/Reinforcement-Learning	-	Mentioned on GitHub
ZainRaza14/deepRL	pytorch	Mentioned on GitHub
la3lma/Chez	-	Mentioned on GitHub
Adrelf/DRL-navigation	pytorch	Mentioned on GitHub
MEOWMEOW114/nd893-p1-navigation-banana	pytorch	Mentioned on GitHub
opplieam/Pong-Deep-RL	pytorch	Mentioned on GitHub
mindspore-courses/Rainbow-MindSpore	mindspore	Mentioned on GitHub
abryeemessi/Wednesday	-	Mentioned on GitHub
kmdanielduan/DQN_Family_PyTorch	pytorch	Mentioned on GitHub
JBGUIMBAUD/deep-reenforcement-learning	pytorch	Mentioned on GitHub
xusophia/DataSciFinalProj	pytorch	Mentioned on GitHub
tensorpack/tensorpack/tree/master/examples/DeepQNetwork	-	Mentioned on GitHub
rybread1/DeepRlTrex	-	Mentioned on GitHub
OMS1996/Carla_The_RL_Self-Driving-Car	-	Mentioned on GitHub
hemilpanchiwala/Dueling-Network-Architectures	pytorch	Mentioned on GitHub
ethanmclark1/carla_aebs	pytorch	Mentioned on GitHub
labmlai/annotated_deep_learning_paper_implementations	pytorch	-
KDL-umass/saliency_maps	-	Mentioned on GitHub
Curt-Park/rainbow-is-all-you-need	-	Mentioned on GitHub
nathanin/pad	-	Mentioned on GitHub
opendilab/DI-engine	pytorch	-
mohit8935/Deep-Q-Learning-Paper	pytorch	Mentioned on GitHub
HussonnoisMaxence/RL_Algorithms	pytorch	Mentioned on GitHub
NervanaSystems/coach	-	Mentioned on GitHub
facebookresearch/ReAgent	pytorch	Mentioned on GitHub
mightypirate1/DRL-Tetris	-	Mentioned on GitHub
austinsilveria/Banana-Collection-DQN	pytorch	Mentioned on GitHub
shashwatsaxena571/DRL-navigation	pytorch	Mentioned on GitHub
marload/DeepRL-TensorFlow2	-	Mentioned on GitHub
chandar-lab/RLHive	pytorch	-
manvibharat/Stock-price-pridiction-using-Deep-reienforcement-learning	-	Mentioned on GitHub
philtabor/Deep-Q-Learning-Paper-To-Code	pytorch	Mentioned on GitHub
atavakol/action-branching-agents	-	Mentioned on GitHub
MohammadAsadolahi/Dueling_Deep-Double_Q-Learning-for-solving-OpenAi-Gym-LunarLander-v2-in-python	-	Mentioned on GitHub
Brandon-Rozek/DeepRL	-	Mentioned on GitHub
FaboNo/DRLND	pytorch	Mentioned on GitHub
SayhoKim/tetrisRL	-	Mentioned on GitHub
clarky104/carla_aebs	pytorch	Mentioned on GitHub
prajwalgatti/DRL-Navigation	-	Mentioned on GitHub
ku2482/sac-discrete.pytorch	pytorch	Mentioned on GitHub
MOVzeroOne/DQN	pytorch	Mentioned on GitHub

Benchmarks

Benchmark	Method	Metric
atari-games-on-atari-2600-alien	Prior+Duel noop	Score: 3941.0
atari-games-on-atari-2600-alien	Duel noop	Score: 4461.4
atari-games-on-atari-2600-alien	DDQN (tuned) noop	Score: 3747.7
atari-games-on-atari-2600-alien	Prior+Duel hs	Score: 823.7
atari-games-on-atari-2600-alien	Duel hs	Score: 1486.5
atari-games-on-atari-2600-amidar	Prior+Duel hs	Score: 238.4
atari-games-on-atari-2600-amidar	DDQN (tuned) noop	Score: 1793.3
atari-games-on-atari-2600-amidar	Duel hs	Score: 172.7
atari-games-on-atari-2600-amidar	Duel noop	Score: 2354.5
atari-games-on-atari-2600-amidar	Prior+Duel noop	Score: 2296.8
atari-games-on-atari-2600-assault	Prior+Duel noop	Score: 11477.0
atari-games-on-atari-2600-assault	Prior+Duel hs	Score: 10950.6
atari-games-on-atari-2600-assault	DDQN (tuned) noop	Score: 5393.2
atari-games-on-atari-2600-assault	Duel noop	Score: 4621.0
atari-games-on-atari-2600-assault	Duel hs	Score: 3994.8
atari-games-on-atari-2600-asterix	Prior+Duel hs	Score: 364200.0
atari-games-on-atari-2600-asterix	DDQN (tuned) noop	Score: 17356.5
atari-games-on-atari-2600-asterix	Prior+Duel noop	Score: 375080.0
atari-games-on-atari-2600-asterix	Duel hs	Score: 15840.0
atari-games-on-atari-2600-asterix	Duel noop	Score: 28188.0
atari-games-on-atari-2600-asteroids	Prior+Duel noop	Score: 1192.7
atari-games-on-atari-2600-asteroids	Duel noop	Score: 2837.7
atari-games-on-atari-2600-asteroids	Duel hs	Score: 2035.4
atari-games-on-atari-2600-asteroids	DDQN (tuned) noop	Score: 734.7
atari-games-on-atari-2600-atlantis	DDQN (tuned) noop	Score: 106056.0
atari-games-on-atari-2600-atlantis	Duel noop	Score: 382572.0
atari-games-on-atari-2600-atlantis	Duel hs	Score: 445360.0
atari-games-on-atari-2600-atlantis	Prior+Duel noop	Score: 395762.0
atari-games-on-atari-2600-bank-heist	DDQN (tuned) noop	Score: 1030.6
atari-games-on-atari-2600-bank-heist	Duel hs	Score: 1129.3
atari-games-on-atari-2600-bank-heist	Prior+Duel noop	Score: 1503.1
atari-games-on-atari-2600-bank-heist	Duel noop	Score: 1611.9
atari-games-on-atari-2600-battle-zone	Duel hs	Score: 31320.0
atari-games-on-atari-2600-battle-zone	Duel noop	Score: 37150.0
atari-games-on-atari-2600-battle-zone	Prior+Duel noop	Score: 35520.0
atari-games-on-atari-2600-battle-zone	DDQN (tuned) noop	Score: 31700.0
atari-games-on-atari-2600-beam-rider	Duel noop	Score: 12164.0
atari-games-on-atari-2600-beam-rider	Prior+Duel noop	Score: 30276.5
atari-games-on-atari-2600-beam-rider	DDQN (tuned) noop	Score: 13772.8
atari-games-on-atari-2600-beam-rider	Duel hs	Score: 14591.3
atari-games-on-atari-2600-berzerk	Duel noop	Score: 1472.6
atari-games-on-atari-2600-berzerk	Duel hs	Score: 910.6
atari-games-on-atari-2600-berzerk	DDQN (tuned) noop	Score: 1225.4
atari-games-on-atari-2600-berzerk	Prior+Duel noop	Score: 3409.0
atari-games-on-atari-2600-bowling	Duel noop	Score: 65.5
atari-games-on-atari-2600-bowling	Duel hs	Score: 65.7
atari-games-on-atari-2600-bowling	DDQN (tuned) noop	Score: 68.1
atari-games-on-atari-2600-bowling	Prior+Duel noop	Score: 46.7
atari-games-on-atari-2600-boxing	DDQN (tuned) noop	Score: 91.6
atari-games-on-atari-2600-boxing	Duel noop	Score: 99.4
atari-games-on-atari-2600-boxing	Duel hs	Score: 77.3
atari-games-on-atari-2600-boxing	Prior+Duel noop	Score: 98.9
atari-games-on-atari-2600-breakout	Prior+Duel noop	Score: 366.0
atari-games-on-atari-2600-breakout	Duel hs	Score: 411.6
atari-games-on-atari-2600-breakout	Duel noop	Score: 345.3
atari-games-on-atari-2600-breakout	DDQN (tuned) noop	Score: 418.5
atari-games-on-atari-2600-centipede	Duel hs	Score: 4881.0
atari-games-on-atari-2600-centipede	DDQN (tuned) noop	Score: 5409.4
atari-games-on-atari-2600-centipede	Duel noop	Score: 7561.4
atari-games-on-atari-2600-centipede	Prior+Duel noop	Score: 7687.5
atari-games-on-atari-2600-chopper-command	Duel noop	Score: 11215.0
atari-games-on-atari-2600-chopper-command	DDQN (tuned) noop	Score: 5809.0
atari-games-on-atari-2600-chopper-command	Prior+Duel noop	Score: 13185.0
atari-games-on-atari-2600-chopper-command	Duel hs	Score: 3784.0
atari-games-on-atari-2600-crazy-climber	Duel noop	Score: 143570.0
atari-games-on-atari-2600-crazy-climber	DDQN (tuned) noop	Score: 117282.0
atari-games-on-atari-2600-crazy-climber	Duel hs	Score: 124566.0
atari-games-on-atari-2600-crazy-climber	Prior+Duel noop	Score: 162224.0
atari-games-on-atari-2600-defender	Prior+Duel noop	Score: 41324.5
atari-games-on-atari-2600-defender	Duel noop	Score: 42214.0
atari-games-on-atari-2600-defender	Prior+Duel hs	Score: 34415.0
atari-games-on-atari-2600-demon-attack	Prior+Duel noop	Score: 72878.6
atari-games-on-atari-2600-demon-attack	Duel noop	Score: 60813.3
atari-games-on-atari-2600-demon-attack	Duel hs	Score: 56322.8
atari-games-on-atari-2600-demon-attack	DDQN (tuned) noop	Score: 58044.2
atari-games-on-atari-2600-double-dunk	Duel noop	Score: 0.1
atari-games-on-atari-2600-double-dunk	Prior+Duel noop	Score: -12.5
atari-games-on-atari-2600-double-dunk	DDQN (tuned) noop	Score: -5.5
atari-games-on-atari-2600-double-dunk	Duel hs	Score: -0.8
atari-games-on-atari-2600-enduro	Prior+Duel noop	Score: 2306.4
atari-games-on-atari-2600-enduro	Duel hs	Score: 2077.4
atari-games-on-atari-2600-enduro	Duel noop	Score: 2258.2
atari-games-on-atari-2600-enduro	DDQN (tuned) noop	Score: 1211.8
atari-games-on-atari-2600-fishing-derby	Duel hs	Score: -4.1
atari-games-on-atari-2600-fishing-derby	Duel noop	Score: 46.4
atari-games-on-atari-2600-fishing-derby	Prior+Duel noop	Score: 41.3
atari-games-on-atari-2600-fishing-derby	DDQN (tuned) noop	Score: 15.5
atari-games-on-atari-2600-freeway	Duel noop	Score: 0.0
atari-games-on-atari-2600-freeway	Duel hs	Score: 0.2
atari-games-on-atari-2600-freeway	Prior+Duel noop	Score: 33.0
atari-games-on-atari-2600-freeway	DDQN (tuned) noop	Score: 33.3
atari-games-on-atari-2600-frostbite	Duel hs	Score: 2332.4
atari-games-on-atari-2600-frostbite	Prior+Duel noop	Score: 7413.0
atari-games-on-atari-2600-frostbite	Duel noop	Score: 4672.8
atari-games-on-atari-2600-frostbite	DDQN (tuned) noop	Score: 1683.3
atari-games-on-atari-2600-gopher	Prior+Duel noop	Score: 104368.2
atari-games-on-atari-2600-gopher	Duel noop	Score: 15718.4
atari-games-on-atari-2600-gopher	DDQN (tuned) noop	Score: 14840.8
atari-games-on-atari-2600-gopher	Duel hs	Score: 20051.4
atari-games-on-atari-2600-gravitar	Duel noop	Score: 588.0
atari-games-on-atari-2600-gravitar	Duel hs	Score: 297.0
atari-games-on-atari-2600-gravitar	DDQN (tuned) noop	Score: 412.0
atari-games-on-atari-2600-gravitar	Prior+Duel noop	Score: 238.0
atari-games-on-atari-2600-hero	Prior+Duel noop	Score: 21036.5
atari-games-on-atari-2600-hero	DDQN (tuned) noop	Score: 20130.2
atari-games-on-atari-2600-hero	Duel hs	Score: 15207.9
atari-games-on-atari-2600-hero	Duel noop	Score: 20818.2
atari-games-on-atari-2600-ice-hockey	DDQN (tuned) noop	Score: -2.7
atari-games-on-atari-2600-ice-hockey	Prior+Duel noop	Score: -0.4
atari-games-on-atari-2600-ice-hockey	Duel noop	Score: 0.5
atari-games-on-atari-2600-ice-hockey	Duel hs	Score: -1.3
atari-games-on-atari-2600-james-bond	Duel noop	Score: 1312.5
atari-games-on-atari-2600-james-bond	Duel hs	Score: 835.5
atari-games-on-atari-2600-james-bond	Prior+Duel noop	Score: 812.0
atari-games-on-atari-2600-james-bond	DDQN (tuned) noop	Score: 1358.0
atari-games-on-atari-2600-kangaroo	Prior+Duel noop	Score: 1792.0
atari-games-on-atari-2600-kangaroo	DDQN (tuned) noop	Score: 12992.0
atari-games-on-atari-2600-kangaroo	Duel hs	Score: 10334.0
atari-games-on-atari-2600-kangaroo	Duel noop	Score: 14854.0
atari-games-on-atari-2600-krull	Prior+Duel noop	Score: 10374.4
atari-games-on-atari-2600-krull	Duel hs	Score: 8051.6
atari-games-on-atari-2600-krull	Duel noop	Score: 11451.9
atari-games-on-atari-2600-krull	DDQN (tuned) noop	Score: 7920.5
atari-games-on-atari-2600-kung-fu-master	DDQN (tuned) noop	Score: 29710.0
atari-games-on-atari-2600-kung-fu-master	Prior+Duel noop	Score: 48375.0
atari-games-on-atari-2600-kung-fu-master	Duel hs	Score: 24288.0
atari-games-on-atari-2600-kung-fu-master	Duel noop	Score: 34294.0
atari-games-on-atari-2600-montezumas-revenge	Duel hs	Score: 22.0
atari-games-on-atari-2600-ms-pacman	Prior+Duel noop	Score: 3327.3
atari-games-on-atari-2600-ms-pacman	DDQN (tuned) noop	Score: 2711.4
atari-games-on-atari-2600-ms-pacman	Duel noop	Score: 6283.5
atari-games-on-atari-2600-ms-pacman	Duel hs	Score: 2250.6
atari-games-on-atari-2600-name-this-game	Duel hs	Score: 11185.1
atari-games-on-atari-2600-name-this-game	Prior+Duel noop	Score: 15572.5
atari-games-on-atari-2600-name-this-game	Duel noop	Score: 11971.1
atari-games-on-atari-2600-name-this-game	DDQN (tuned) noop	Score: 10616.0
atari-games-on-atari-2600-phoenix	Prior+Duel hs	Score: 63597.0
atari-games-on-atari-2600-pong	DDQN (tuned) noop	Score: 20.9
atari-games-on-atari-2600-pong	Duel noop	Score: 21.0
atari-games-on-atari-2600-pong	Duel hs	Score: 18.8
atari-games-on-atari-2600-pong	Prior+Duel noop	Score: 20.9
atari-games-on-atari-2600-private-eye	Duel hs	Score: 292.6
atari-games-on-atari-2600-private-eye	Prior+Duel noop	Score: 206.0
atari-games-on-atari-2600-private-eye	Duel noop	Score: 103.0
atari-games-on-atari-2600-private-eye	DDQN (tuned) noop	Score: 129.7
atari-games-on-atari-2600-qbert	Prior+Duel noop	Score: 18760.3
atari-games-on-atari-2600-qbert	Duel hs	Score: 14175.8
atari-games-on-atari-2600-qbert	DDQN (tuned) noop	Score: 15088.5
atari-games-on-atari-2600-qbert	Duel noop	Score: 19220.3
atari-games-on-atari-2600-river-raid	Duel noop	Score: 21162.6
atari-games-on-atari-2600-river-raid	DDQN (tuned) noop	Score: 14884.5
atari-games-on-atari-2600-river-raid	Prior+Duel noop	Score: 20607.6
atari-games-on-atari-2600-river-raid	Duel hs	Score: 16569.4
atari-games-on-atari-2600-road-runner	Duel hs	Score: 58549.0
atari-games-on-atari-2600-road-runner	Duel noop	Score: 69524.0
atari-games-on-atari-2600-road-runner	Prior+Duel noop	Score: 62151.0
atari-games-on-atari-2600-road-runner	DDQN (tuned) noop	Score: 44127.0
atari-games-on-atari-2600-robotank	DDQN (tuned) noop	Score: 65.1
atari-games-on-atari-2600-robotank	Prior+Duel noop	Score: 27.5
atari-games-on-atari-2600-robotank	Duel hs	Score: 62.0
atari-games-on-atari-2600-robotank	Duel noop	Score: 65.3
atari-games-on-atari-2600-seaquest	Duel hs	Score: 37361.6
atari-games-on-atari-2600-seaquest	DDQN (tuned) noop	Score: 16452.7
atari-games-on-atari-2600-seaquest	Prior+Duel noop	Score: 931.6
atari-games-on-atari-2600-seaquest	Duel noop	Score: 50254.2
atari-games-on-atari-2600-space-invaders	Duel hs	Score: 5993.1
atari-games-on-atari-2600-space-invaders	Prior+Duel noop	Score: 15311.5
atari-games-on-atari-2600-space-invaders	DDQN (tuned) noop	Score: 2525.5
atari-games-on-atari-2600-space-invaders	Duel noop	Score: 6427.3
atari-games-on-atari-2600-star-gunner	Prior+Duel noop	Score: 125117.0
atari-games-on-atari-2600-star-gunner	Duel noop	Score: 89238.0
atari-games-on-atari-2600-star-gunner	Duel hs	Score: 90804.0
atari-games-on-atari-2600-star-gunner	DDQN (tuned) noop	Score: 60142.0
atari-games-on-atari-2600-tennis	Duel noop	Score: 5.1
atari-games-on-atari-2600-tennis	DDQN (tuned) noop	Score: -22.8
atari-games-on-atari-2600-tennis	Duel hs	Score: 4.4
atari-games-on-atari-2600-tennis	Prior+Duel noop	Score: 0.0
atari-games-on-atari-2600-time-pilot	DDQN (tuned) noop	Score: 8339.0
atari-games-on-atari-2600-time-pilot	Duel noop	Score: 11666.0
atari-games-on-atari-2600-time-pilot	Prior+Duel noop	Score: 7553.0
atari-games-on-atari-2600-time-pilot	Duel hs	Score: 6601.0
atari-games-on-atari-2600-tutankham	DDQN (tuned) noop	Score: 218.4
atari-games-on-atari-2600-tutankham	Prior+Duel noop	Score: 245.9
atari-games-on-atari-2600-tutankham	Duel hs	Score: 48.0
atari-games-on-atari-2600-tutankham	Duel noop	Score: 211.4
atari-games-on-atari-2600-up-and-down	DDQN (tuned) noop	Score: 22972.2
atari-games-on-atari-2600-up-and-down	Duel noop	Score: 44939.6
atari-games-on-atari-2600-up-and-down	Prior+Duel noop	Score: 33879.1
atari-games-on-atari-2600-up-and-down	Duel hs	Score: 24759.2
atari-games-on-atari-2600-venture	Prior+Duel noop	Score: 48.0
atari-games-on-atari-2600-venture	Duel noop	Score: 497.0
atari-games-on-atari-2600-venture	DDQN (tuned) noop	Score: 98.0
atari-games-on-atari-2600-venture	Duel hs	Score: 200.0
atari-games-on-atari-2600-video-pinball	Duel noop	Score: 98209.5
atari-games-on-atari-2600-video-pinball	DDQN (tuned) noop	Score: 309941.9
atari-games-on-atari-2600-video-pinball	Prior+Duel noop	Score: 479197.0
atari-games-on-atari-2600-video-pinball	Duel hs	Score: 110976.2
atari-games-on-atari-2600-wizard-of-wor	Prior+Duel noop	Score: 12352.0
atari-games-on-atari-2600-wizard-of-wor	DDQN (tuned) noop	Score: 7492.0
atari-games-on-atari-2600-wizard-of-wor	Duel hs	Score: 7054.0
atari-games-on-atari-2600-wizard-of-wor	Duel noop	Score: 7855.0
atari-games-on-atari-2600-zaxxon	Prior+Duel noop	Score: 13886.0
atari-games-on-atari-2600-zaxxon	DDQN (tuned) noop	Score: 10163.0
atari-games-on-atari-2600-zaxxon	Duel noop	Score: 12944.0
atari-games-on-atari-2600-zaxxon	Duel hs	Score: 10164.0
