Abstract: With the rapid development of the information society, video sensors have become a primary front end for acquiring information, and using pedestrian re-identification algorithms to find specific individuals effectively is of great significance for protecting people’s lives and property. This paper applies deep learning to person re-identification and embeds a multi-scale attention fusion module into the neural network for multi-scale feature extraction and representation, improving the recognition performance that the attention mechanism brings to deep learning networks. Specifically, the paper proposes a multi-scale channel attention fusion module based on the SE block and combines it with the ResNet50 convolutional neural network to extract features; a bidirectional LSTM network then extracts context information from the feature sequence. This improves the model’s ability to extract important image features while reducing the attention paid to redundant ones. Finally, the network is trained jointly with a cascaded hard-sampling triplet loss function and a cross-entropy loss function, which clusters the samples in the high-dimensional feature space and further improves recognition accuracy. The proposed algorithm is evaluated on the Market1501 and CUHK03 datasets and compared with other attention-module algorithms under the same conditions. An ablation experiment is also performed to verify the contribution of each module. The experimental results show that the proposed method can be effectively applied to person re-identification.
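As an illustration of the joint training objective mentioned above, the following is a minimal sketch of a hard-sampling triplet loss combined with a cross-entropy loss. It assumes the common batch-hard mining scheme (hardest positive and hardest negative per anchor) and does not attempt to reproduce the cascading described in the paper. The function names, the margin of 0.3, and the `triplet_weight` parameter are hypothetical choices made for this example, not values taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss (assumed mining scheme, not the paper's cascade).

    For each anchor, take the farthest same-identity sample and the closest
    different-identity sample, then average max(0, margin + d_pos - d_neg).
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [euclidean(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        if not pos or not neg:
            continue  # anchor has no valid positive or negative in this batch
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def joint_loss(embeddings, logits, labels, margin=0.3, triplet_weight=1.0):
    """Mean cross-entropy plus weighted batch-hard triplet loss (hypothetical weighting)."""
    ce = sum(cross_entropy(z, y) for z, y in zip(logits, labels)) / len(labels)
    return ce + triplet_weight * batch_hard_triplet_loss(embeddings, labels, margin)
```

In this sketch the triplet term pulls features of the same identity together and pushes different identities apart in the embedding space, while the cross-entropy term supervises identity classification. In practice both terms would be computed on network outputs inside a deep learning framework rather than on plain lists.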