To address the complexity of the network security situation prediction task and the high noise level of data collected in real environments, a network security situation prediction method based on empirical mode decomposition (EMD) and an improved temporal Transformer (ITTransformer) is proposed. First, the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method is used to de-noise and pre-process real-environment network security situation data through “decomposition-reconstruction”. Then, the ITTransformer model is proposed: its Temporal Transformer module extracts deep global temporal features from the network security situation data sequences, and an Attention Fusion mechanism adaptively fuses these temporal features so that the prediction task is completed with a more robust form of feature fusion. Experimental results show that the proposed method achieves higher prediction accuracy than the other methods compared, with a coefficient of determination of 0.997860, indicating a good fit.
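
For readers unfamiliar with the evaluation metric, the coefficient of determination reported above is conventionally defined as R² = 1 − SS_res / SS_tot, where SS_res is the residual sum of squares and SS_tot is the total sum of squares around the mean of the observed values. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `r2_score` and the sample values are hypothetical and are not taken from the paper's experiments.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot.

    Hypothetical helper for illustration only; it follows the standard
    textbook definition, not any code published with the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / len(y_true)

    # Residual sum of squares: error between observations and predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    return 1.0 - ss_res / ss_tot


# Illustrative (made-up) situation values and predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]
score = r2_score(observed, predicted)
```

A value close to 1 means the predictions explain almost all of the variance in the observed sequence, while a value of 0 corresponds to a model that does no better than always predicting the mean.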