22 May 2023 Task-based dialogue policy learning based on decision transformer
Zhenyou Zhou, Zhibin Liu, Yuhan Liu
Proceedings Volume 12640, International Conference on Internet of Things and Machine Learning (IoTML 2022); 126401O (2023) https://doi.org/10.1117/12.2673682
Event: International Conference on Internet of Things and Machine Learning (IoTML 2022), 2022, Harbin, China
Abstract
Online reinforcement learning for dialogue agents is expensive because training requires interaction with real users. A user simulator is a commonly used alternative. However, a simulated user environment is not identical to a real one, and it cannot provide the atypical, more varied conversational behavior that is a hallmark of human spontaneity. We combine offline reinforcement learning with a Transformer to cast dialogue policy learning as a sequence modeling problem, modeling the joint distribution of state, action, and reward sequences to generate optimal dialogue actions. An evaluation on the MultiWOZ dataset shows that the Decision Transformer (DT) improves both the efficiency and the dialogue robustness of deep reinforcement learning (DRL) dialogue agents.
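As a rough illustration of the sequence-modeling view described in the abstract, the sketch below turns one logged dialogue into the interleaved token stream used by the original Decision Transformer formulation: return-to-go, state, action for each turn. The abstract does not give this paper's exact tokenization, so the return-to-go conditioning, the function names `returns_to_go` and `interleave`, and the example states, actions, and reward values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each turn to the end of the dialogue."""
    rtg: List[float] = []
    running = 0.0
    # Accumulate from the last turn backwards, then restore turn order.
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(
    states: Sequence[str],
    actions: Sequence[str],
    rewards: Sequence[float],
) -> List[Tuple[str, object]]:
    """Build the (return_to_go, state, action) token stream for one dialogue."""
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions and rewards must have equal length")
    tokens: List[Tuple[str, object]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.append(("return_to_go", g))
        tokens.append(("state", s))
        tokens.append(("action", a))
    return tokens


# Hypothetical three-turn dialogue: a small per-turn penalty and a final
# success bonus. These values are illustrative, not taken from the paper.
example_states = ["user_requests_hotel", "user_gives_area", "user_confirms"]
example_actions = ["request_area", "offer_hotel", "book_hotel"]
example_rewards = [-1.0, -1.0, 20.0]

example_tokens = interleave(example_states, example_actions, example_rewards)
```

In an offline setting, a causal Transformer would be trained on such streams to predict each action token from the tokens that precede it. At inference time the first return-to-go is set to a desired target return instead of being computed from logged rewards.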
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhenyou Zhou, Zhibin Liu, and Yuhan Liu "Task-based dialogue policy learning based on decision transformer", Proc. SPIE 12640, International Conference on Internet of Things and Machine Learning (IoTML 2022), 126401O (22 May 2023); https://doi.org/10.1117/12.2673682
KEYWORDS
Transformers
Education and training
Systems modeling
Modeling
Autoregressive models
Model-based design