Most-Cited Works
Human-level control through deep reinforcement learning.
Mnih, V., Kavukcuoglu, K., Silver, D. and 16 more
(2015) Nature, 518 (7540), pp. 529-533.
Trust region policy optimization.
Schulman, J., Levine, S., Moritz, P. and 2 more
(2015) 32nd International Conference on Machine Learning, ICML 2015, 3, pp. 1889-1897.
Deep reinforcement learning with double Q-Learning.
Van Hasselt, H., Guez, A., Silver, D.
(2016) 30th AAAI Conference on Artificial Intelligence, AAAI 2016, pp. 2094-2100.
Deterministic policy gradient algorithms.
Silver, D., Lever, G., Heess, N. and 3 more
(2014) 31st International Conference on Machine Learning, ICML 2014, 1, pp. 605-619.
Asynchronous methods for deep reinforcement learning.
Mnih, V., Badia, A.P., Mirza, M. and 5 more
(2016) 33rd International Conference on Machine Learning, ICML 2016, 4, pp. 2850-2869.