Sample-Observed Soft Actor-Critic Learning for Path Following of a Biomimetic Underwater Vehicle
Ma, Ruichen [2,3]; Wang, Yu [2]; Wang, Shuo [2,3]; Cheng, Long [2]; Wang, Rui [2]; Tan, Ming [1]
Journal: IEEE Transactions on Automation Science and Engineering
Date: 2023-04
Pages: 1-10
DOI: 10.1109/TASE.2023.3264237
Abstract

This paper presents a learning-based path following control scheme for a biomimetic underwater vehicle (BUV) driven by undulatory fins. A dynamic line-of-sight (DLOS) guidance system is designed, which uses a virtual ball with a dynamic radius to detect the reference path. The DLOS system guides the BUV during path following and extracts the essential information for the Markov decision process (MDP) of the control task. A deep reinforcement learning (DRL) algorithm, sample-observed soft actor-critic (SOSAC), is proposed. SOSAC trains a control policy with greater cumulative reward and a higher success rate by using two techniques: sample observation and sample diversification. The control scheme is built on the DLOS system, the MDP of the control task, and a multilayer perceptron (MLP) trained by SOSAC. Experiments show that the BUV successfully achieves path following control in an indoor pool environment using this control scheme.

Note to Practitioners: The motivation of this paper is to design a practical end-to-end path following control scheme for a BUV driven by undulatory fins, and to verify this scheme in a real-world environment. Unlike common autonomous underwater vehicles (AUVs), which use axial propellers, BUVs use biomimetic propulsors such as the undulatory fin. The undulatory fin can implement multimodal wave patterns, which generate nonlinear thrust and lateral force simultaneously. This propulsive feature makes the driving forces in different directions of the BUV strongly coupled, so it is complicated to convert the outputs of a common controller into waveform parameters of the undulatory fins. Therefore, in this paper, we propose an end-to-end learning-based path following controller, which observes environmental information and directly generates waveform parameters to control the BUV. Experiments suggest that the control scheme is practical and valid.
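The abstract describes DLOS only at a high level: a virtual ball with a dynamic radius that detects the reference path. The following is a minimal planar sketch of that general idea, not the authors' method. It assumes a 2D piecewise-linear path and an illustrative rule that grows the radius geometrically until the circle touches the path; the names `dlos_target` and `heading_error` and the parameters `r_min`, `r_max`, and `growth` are hypothetical and do not come from the paper.

```python
import math


def circle_segment_intersections(center, radius, p0, p1):
    """Points where a circle crosses segment p0->p1, ordered from p0 to p1."""
    cx, cy = center
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    fx, fy = p0[0] - cx, p0[1] - cy
    a = dx * dx + dy * dy
    if a == 0.0:  # degenerate segment
        return []
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:  # circle misses the infinite line
        return []
    root = math.sqrt(disc)
    ts = sorted({(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)})
    return [(p0[0] + t * dx, p0[1] + t * dy) for t in ts if 0.0 <= t <= 1.0]


def dlos_target(position, path, r_min=0.5, r_max=5.0, growth=1.25):
    """Grow the virtual circle until it touches the path (assumed rule).

    Returns (target_point, radius), or (None, radius) if the path is
    not detected within r_max.
    """
    radius = r_min
    while radius <= r_max:
        # Scan segments from the end of the path backwards so that the
        # crossing furthest along the path is chosen as the target.
        for i in range(len(path) - 2, -1, -1):
            hits = circle_segment_intersections(
                position, radius, path[i], path[i + 1]
            )
            if hits:
                return hits[-1], radius
        radius *= growth
    return None, radius


def heading_error(position, yaw, target):
    """Signed angle from the current yaw to the line of sight, in (-pi, pi]."""
    desired = math.atan2(target[1] - position[1], target[0] - position[0])
    err = desired - yaw
    return math.atan2(math.sin(err), math.cos(err))


path = [(0.0, 0.0), (10.0, 0.0)]
target, radius = dlos_target((2.0, 1.0), path)
error = heading_error((2.0, 1.0), 0.0, target)
```

In a learning-based scheme like the one the abstract outlines, quantities such as the heading error and the detection radius could plausibly serve as part of the MDP observation, but the actual state definition is not given in this record.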

Language: English
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/52359
Collection: Intelligent Robotic Systems Research
Corresponding author: Wang, Yu
Author affiliations:
1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
2. State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
3. University of Chinese Academy of Sciences, Beijing 100049, China
Recommended citation
GB/T 7714
Ma, Ruichen,Wang, Yu,Wang, Shuo,et al. Sample-Observed Soft Actor-Critic Learning for Path Following of a Biomimetic Underwater Vehicle[J]. IEEE Transactions on Automation Science and Engineering,2023:1-10.
APA Ma, Ruichen,Wang, Yu,Wang, Shuo,Cheng, Long,Wang, Rui,&Tan, Ming.(2023).Sample-Observed Soft Actor-Critic Learning for Path Following of a Biomimetic Underwater Vehicle.IEEE Transactions on Automation Science and Engineering,1-10.
MLA Ma, Ruichen,et al."Sample-Observed Soft Actor-Critic Learning for Path Following of a Biomimetic Underwater Vehicle".IEEE Transactions on Automation Science and Engineering (2023):1-10.