Abstract: Given the decisive impact of missiles on air combat, the continuous and multidimensional state space, and the weakness of traditional approaches that ignore the opponent's strategy, reinforcement learning is applied to one-versus-one beyond-visual-range (BVR) air combat maneuvering decisions. First, a new reinforcement learning framework is built to decide the maneuvers of both sides. In this framework, an ε-Nash equilibrium strategy is proposed for action selection, and the reward function is revised with a missile attack zone scoring function. Then, using a memory base (experience replay) and a target network, the Q-network is trained, forming a "value network" for BVR air combat maneuvering decisions. Finally, the Q-network reinforcement learning model is designed, and the overall maneuvering decision process is divided into a learning part and a strategy-forming part. In the simulations, two cases are considered: one in which the enemy adopts a fixed maneuver, and one in which both sides are agents. In the former case the agent wins, and in the latter the agent with the situational advantage wins, verifying that the agent can perceive the air combat situation and make reasonable BVR air combat maneuvers.
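The abstract describes training a Q-network with a memory base and a target network, i.e. a DQN-style value-learning loop. Below is a minimal, generic sketch of such a loop in PyTorch; the network size, state and action dimensions, the plain ε-greedy stand-in for the paper's ε-Nash action selection, and all names are illustrative assumptions, not the authors' implementation.

```python
# Minimal DQN-style sketch: replay memory ("memory base") + target network.
# Dimensions, names, and the action-selection rule are assumptions for
# illustration only, not the paper's actual model.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 8, 7        # assumed state / maneuver-action sizes
GAMMA, BATCH, SYNC_EVERY = 0.99, 64, 200

def make_net():
    # Small MLP mapping an air-combat state vector to Q-values per maneuver.
    return nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                         nn.Linear(128, N_ACTIONS))

q_net, target_net = make_net(), make_net()
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
memory = deque(maxlen=10_000)      # the "memory base" (experience replay)

def select_action(state, eps):
    # Plain epsilon-greedy choice over Q-values; stands in for the
    # epsilon-Nash equilibrium strategy described in the abstract.
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.as_tensor(state, dtype=torch.float32)).argmax())

def train_step(step):
    if len(memory) < BATCH:
        return
    s, a, r, s2, done = zip(*random.sample(memory, BATCH))
    s = torch.as_tensor(s, dtype=torch.float32)
    a = torch.as_tensor(a).unsqueeze(1)
    r = torch.as_tensor(r, dtype=torch.float32)
    s2 = torch.as_tensor(s2, dtype=torch.float32)
    done = torch.as_tensor(done, dtype=torch.float32)

    q = q_net(s).gather(1, a).squeeze(1)           # Q(s, a) of stored actions
    with torch.no_grad():                          # bootstrap from target net
        q_next = target_net(s2).max(1).values
        target = r + GAMMA * (1.0 - done) * q_next
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % SYNC_EVERY == 0:                     # periodic target sync
        target_net.load_state_dict(q_net.state_dict())
```

In such a setup the reward stored in the memory base would carry the missile-attack-zone scoring term mentioned in the abstract; the sketch leaves that term abstract.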