Wasserstein gradient flows policy optimization via input convex neural networks
10 November 2022
Yixuan Wang
Proceedings Volume 12348, 2nd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2022); 1234812 (2022) https://doi.org/10.1117/12.2641331
Event: 2nd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2022), 2022, Zhuhai, China
Abstract
Reinforcement learning (RL) is a widely used learning paradigm. Policy optimization, a common RL method, usually updates the policy parameters by maximizing the expected cumulative reward, together with other information gathered while interacting with the environment. To better understand RL and its learning theory, it has also been proposed that RL can be regarded as an optimal transport problem in a space of probability measures. On this basis, we obtain a large-scale Wasserstein gradient flow RL method by introducing input convex neural networks (ICNNs) to improve the Jordan-Kinderlehrer-Otto (JKO) scheme.
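For orientation, a JKO step with step size tau takes the next distribution as a minimizer of F(rho) + W_2^2(rho, rho_k) / (2 tau), where F is the energy functional and W_2 is the 2-Wasserstein distance. An ICNN is a network whose scalar output is convex in its input, and one common use in optimal transport is to represent a transport map as the gradient of such a convex potential. The abstract does not say how the paper wires these together, so the following is only a minimal NumPy sketch of a generic ICNN forward pass. The names `init_icnn` and `icnn`, the layer sizes, and the softplus activation are illustrative assumptions, not the author's architecture.

```python
import numpy as np


def softplus(v):
    # Convex, non-decreasing activation; logaddexp keeps it numerically stable.
    return np.logaddexp(0.0, v)


def init_icnn(dim, hidden, rng):
    # Hypothetical two-hidden-layer ICNN parameters (sizes are assumptions).
    return {
        "Wx0": rng.normal(size=(hidden, dim)),
        "b0": np.zeros(hidden),
        "Wz1": rng.normal(size=(hidden, hidden)),
        "Wx1": rng.normal(size=(hidden, dim)),
        "b1": np.zeros(hidden),
        "wz2": rng.normal(size=hidden),
        "wx2": rng.normal(size=dim),
    }


def icnn(params, x):
    # First layer: convex activation of an affine map of x.
    z = softplus(params["Wx0"] @ x + params["b0"])
    # Hidden-to-hidden weights are made non-negative with abs(), so each unit
    # stays convex in x; the direct skip connection from x is unconstrained.
    z = softplus(np.abs(params["Wz1"]) @ z + params["Wx1"] @ x + params["b1"])
    # Output: non-negative combination of convex units plus a linear term.
    return float(np.abs(params["wz2"]) @ z + params["wx2"] @ x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_icnn(dim=3, hidden=8, rng=rng)
    a, b = rng.normal(size=3), rng.normal(size=3)
    mid = icnn(params, 0.5 * (a + b))
    avg = 0.5 * (icnn(params, a) + icnn(params, b))
    print("midpoint value <= average of endpoints:", mid <= avg + 1e-8)
```

Convexity here comes from two constraints: the weights applied to the previous hidden layer are non-negative, and the activation is convex and non-decreasing. Taking abs() of the raw weights is just one simple way to enforce the first constraint in a sketch; clipping or a softplus reparameterization would serve the same purpose.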
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yixuan Wang "Wasserstein gradient flows policy optimization via input convex neural networks", Proc. SPIE 12348, 2nd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2022), 1234812 (10 November 2022); https://doi.org/10.1117/12.2641331
KEYWORDS
Neural networks
Stochastic processes
Diffusion
Particles
Probability theory
Differential equations
Numerical analysis