Article Abstract
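LIU Feng (刘峰), WEI Ruixuan (魏瑞轩), DING Chao (丁超), JIANG Longting (姜龙亭), LI Tian (李天). Design of an Att-MADDPG Hunting Control Method for Multi-UAV Cooperation [J]. Journal of Air Force Engineering University (Natural Science Edition), 2021, 22(3): 9-14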
Design of an Att-MADDPG Hunting Control Method for Multi-UAV Cooperation
  
DOI:
Keywords: cooperative hunting; reinforcement learning; MADDPG; intelligence emergence
Funding: Ministry of Science and Technology "New Generation Artificial Intelligence" Key Project (2018AAA0102403)
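Authors and affiliation
LIU Feng, WEI Ruixuan, DING Chao, JIANG Longting, LI Tian. Aeronautics Engineering College, Air Force Engineering University, Xi'an 710051, China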
Abstract:
      Hunting a dynamic target with multiple UAVs is an important problem in UAV swarm operations. To address swarm hunting of dynamic targets, this paper analyzes the shortcomings of the hunting mechanism based on the MADDPG algorithm and, drawing on the attention mechanism used by the Google machine translation team, introduces attention into the hunting process, designs an attention-based cooperative hunting strategy, and constructs the corresponding hunting algorithm. MADDPG is improved within the actor-critic (AC) framework. First, an Attention module is added to the Critic network, which processes the information of all hunting UAVs according to their different attention weights. Then, an Attention module is added to the Actor network to prompt the other UAVs to hunt cooperatively. Simulation experiments show that, compared with MADDPG, the Att-MADDPG algorithm improves training stability by 8.9% and reduces task completion time by 19.12%. After learning, the hunting UAVs cooperate so that the swarm exhibits emergent, more intelligent hunting behavior.
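The abstract states that the Critic network weights the information of all hunting UAVs by attention, but it does not give the exact form of the module. The following is a minimal sketch only, assuming scaled dot-product attention; the function names (`softmax`, `attend`, `critic_input`) and the way features are concatenated are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query (assumed form).

    query:  encoding of the UAV whose Critic is being evaluated
    keys:   one encoding per other hunting UAV
    values: one feature vector per other hunting UAV
    Returns (weights, context); context is the weighted sum of values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


def critic_input(own_features, query, keys, values):
    """Hypothetical Critic input: own features plus the attended teammate summary."""
    weights, context = attend(query, keys, values)
    return own_features + context, weights


# Toy usage: the first teammate's key matches the query, so it gets more weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
features, weights = critic_input([0.5], query, keys, values)
```

In this sketch the attention weights always sum to 1, so the Critic sees a fixed-size summary of its teammates regardless of how many UAVs take part in the hunt. A trained implementation would learn the query, key, and value projections rather than take them as given.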