Offline reinforcement learning is a pivotal area within the broader field of reinforcement learning. Its central objective is to train an agent exclusively from previously collected behavioral data, eliminating any need for online interaction. However, relying solely on an offline dataset often leads to ineffective solutions, primarily due to the mismatch between the learned policy and the actual underlying environment. Recent research has tended to approach this challenge with an overly pessimistic mindset, potentially compromising the agent's robustness when encountering unseen states. We introduce a self-supervised framework tailored to mitigate this issue. Drawing inspiration from contrastive techniques in self-supervised learning, we treat the original data as positive samples and generate synthetic data from highly uncertain regions as negative samples. To simulate these regions, we employ modified Generative Adversarial Networks (GANs) to produce samples that mirror the distribution of previous experiences while introducing a significant degree of uncertainty with respect to the behavior policy. To bolster the policy's robustness, we penalize overconfident behavior on this negative data. Comprehensive experiments on multiple public offline reinforcement learning benchmarks demonstrate the practicality and efficacy of our framework.
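To make the idea concrete, the following is a minimal sketch of the described scheme: a GAN generator proposes synthetic (negative) state-action pairs from uncertain regions, and the critic update fits the offline (positive) data while penalizing overconfident Q-values on the negatives. This is an illustration under assumptions, not the authors' implementation; all module names, the simplified SARSA-style target, and the penalty weight are hypothetical choices made for brevity.

```python
# Illustrative sketch only: positive samples come from the offline dataset,
# negative samples come from a GAN generator, and large Q-values on negatives
# are penalized. Names and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn

state_dim, action_dim, noise_dim = 17, 6, 32  # example dimensions

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

q_net = mlp(state_dim + action_dim, 1)               # critic Q(s, a)
generator = mlp(noise_dim, state_dim + action_dim)   # GAN generator of synthetic (s, a) pairs
penalty_weight = 1.0                                  # assumed trade-off coefficient

def critic_loss(batch, gamma=0.99):
    """TD loss on offline (positive) data plus a penalty that pushes down
    Q-values on GAN-generated (negative) samples from uncertain regions."""
    s, a, r, s_next, a_next = batch  # tensors drawn from the offline dataset
    with torch.no_grad():
        # Simplified SARSA-style target using the dataset's next action.
        target = r + gamma * q_net(torch.cat([s_next, a_next], dim=-1))
    td_loss = ((q_net(torch.cat([s, a], dim=-1)) - target) ** 2).mean()

    z = torch.randn(s.shape[0], noise_dim)
    fake_sa = generator(z)                            # synthetic, high-uncertainty pairs
    overconfidence_penalty = q_net(fake_sa).mean()    # discourage large Q on negatives
    return td_loss + penalty_weight * overconfidence_penalty

# Usage example with random placeholder tensors.
batch = (torch.randn(8, state_dim), torch.randn(8, action_dim), torch.randn(8, 1),
         torch.randn(8, state_dim), torch.randn(8, action_dim))
loss = critic_loss(batch)
loss.backward()
```

In practice the generator would be trained adversarially against the offline data so that its outputs stay near the dataset distribution while remaining uncertain under the behavior policy, as described above.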