Self-supervised boundary offline reinforcement learning
27 March 2024
Jiahao Shen
Proceedings Volume 13105, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023); 131052O (2024) https://doi.org/10.1117/12.3026355
Event: 3rd International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023), 2023, Qingdao, China
Abstract
Offline reinforcement learning is an important and growing area of reinforcement learning. Its central objective is to train an agent exclusively from previously collected behavioral data, with no online interaction. However, relying solely on offline datasets often yields ineffective solutions, primarily because of the mismatch between what the learned policy infers from the data and the actual underlying environment. Recent work has tended to address this challenge in an overly pessimistic manner, potentially compromising the agent's robustness when it encounters unseen states. We introduce a self-supervised framework designed to mitigate this issue. Drawing inspiration from contrastive techniques in self-supervised learning, we treat the original data as positive samples and generate synthetic data from highly uncertain regions as negative samples. To simulate these regions, we employ modified Generative Adversarial Networks (GANs) to produce samples that mirror the distribution of previous experiences while introducing a significant degree of uncertainty with respect to the behavioral policy. To bolster the policy's robustness, we penalize overconfident behavior on negative data. Comprehensive experiments on multiple public offline reinforcement learning benchmarks demonstrate the practicality and efficacy of our framework.
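The abstract does not specify the form of the overconfidence penalty, so the following is only an illustrative sketch of the general idea and not the paper's method. It assumes a diagonal Gaussian policy, treats low policy entropy as the measure of overconfidence, and uses a hinge penalty on negative samples. The names `gaussian_entropy`, `boundary_loss`, `min_entropy`, and `penalty_weight` are hypothetical, and the negative samples are assumed to come from some separate generator (the GAN itself is not shown).

```python
import numpy as np

def gaussian_entropy(log_std):
    # Entropy of a diagonal Gaussian, summed over action dimensions.
    return np.sum(log_std + 0.5 * np.log(2.0 * np.pi * np.e), axis=-1)

def boundary_loss(pos_mean, pos_log_std, pos_actions, neg_log_std,
                  min_entropy=0.0, penalty_weight=1.0):
    # Positive samples (original dataset): negative log-likelihood of the
    # logged actions under the policy, i.e. a behavior-cloning term.
    var = np.exp(2.0 * pos_log_std)
    nll = 0.5 * np.sum(
        (pos_actions - pos_mean) ** 2 / var
        + 2.0 * pos_log_std
        + np.log(2.0 * np.pi),
        axis=-1,
    )
    # Negative samples (assumed to be generated in uncertain regions):
    # hinge penalty that is active only when the policy's entropy falls
    # below min_entropy, i.e. when the policy is overconfident.
    overconfidence = np.maximum(0.0, min_entropy - gaussian_entropy(neg_log_std))
    return nll.mean() + penalty_weight * overconfidence.mean()
```

Under these assumptions, a policy that outputs a narrow action distribution on negative states incurs a larger loss than one that stays uncertain there, while the term on positive samples is unaffected.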
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jiahao Shen "Self-supervised boundary offline reinforcement learning", Proc. SPIE 13105, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023), 131052O (27 March 2024); https://doi.org/10.1117/12.3026355
KEYWORDS
Machine learning
Data modeling
Adversarial training
Performance modeling
Statistical modeling