
Cloth-Changing Person Re-identification with Self-Attention

Bansal V.;Foresti G. L.;Martinel N.
2022-01-01

Abstract

A basic assumption in the standard person re-identification (ReID) problem is that the clothing of the target identities remains constant over long periods. This assumption leads to errors in real-world deployments. In addition, most existing ReID methods rely on CNN-based networks and have had limited success because CNNs can exploit only local dependencies and lose information through downsampling operations. In this paper, we focus on the more challenging, realistic scenario of long-term cloth-changing ReID (CC-ReID). We aim to learn robust and unique feature representations that are invariant to clothing changes. To overcome the limitations of CNNs, we propose a Vision Transformer (ViT)-based framework. We further propose to exploit unique soft-biometric discriminative information, such as gait features, and pair it with the ViT feature representations, allowing the model to capture the long-range structural and contextual relationships that are crucial for re-identification in the long-term scenario. To evaluate the proposed approach, we perform experiments on two recent CC-ReID datasets, PRCC and LTCC. The experimental results show that the proposed approach achieves state-of-the-art results on the CC-ReID task.
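The pairing described in the abstract — ViT appearance features combined with soft-biometric gait features — can be illustrated as a simple late-fusion step followed by similarity matching. This is a minimal NumPy sketch, not the authors' implementation: the embedding dimensions, the normalize-then-concatenate fusion strategy, and the cosine-similarity ranking are all assumptions made for illustration.

```python
import numpy as np

def fuse_features(vit_feat, gait_feat):
    """Late fusion sketch: L2-normalize each modality, then concatenate.

    vit_feat:  appearance embedding from a Vision Transformer
               (e.g. the class token), shape (d_vit,)
    gait_feat: soft-biometric gait embedding, shape (d_gait,)
    Both dimensions are illustrative, not taken from the paper.
    """
    v = vit_feat / (np.linalg.norm(vit_feat) + 1e-12)
    g = gait_feat / (np.linalg.norm(gait_feat) + 1e-12)
    return np.concatenate([v, g])

def match(query, gallery):
    """Rank gallery descriptors by cosine similarity to the query.

    query:   fused descriptor, shape (d,)
    gallery: stacked fused descriptors, shape (n, d)
    Returns gallery indices sorted best-match first.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    G = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(G @ q))
```

Because the fused descriptor carries gait cues alongside appearance, two images of the same person wearing different clothes can still rank close under this matching scheme, which is the intuition behind using soft biometrics for CC-ReID.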
Year: 2022
ISBN: 978-1-6654-5824-5
Files for this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1223894
Citations
  • Scopus: 12