Aggregating Deep Pyramidal Representations for Person Re-Identification

IRIS

Learning discriminative, view-invariant, and multi-scale representations of person appearance at different semantic levels is of paramount importance for person Re-Identification (Re-ID). The community has devoted considerable effort to learning deep Re-ID models that capture a holistic feature representation at a single semantic level. To improve on these results, additional visual attributes and body part-driven models have been considered. However, these require extensive human annotation labor or additional computational effort. We argue that a pyramid-inspired method capturing multi-scale information may overcome such requirements. Specifically, multi-scale stripes that represent the visual information of a person can be used by a novel architecture that factorizes them into latent discriminative factors at multiple semantic levels. A multi-task loss is combined with a curriculum learning strategy to learn a discriminative and invariant person representation, which is exploited for triplet-similarity learning. Results on three benchmark Re-ID datasets demonstrate that better performance than existing methods is achieved (e.g., more than 90% accuracy on the Duke-MTMC dataset).

Martinel N. (first author); Foresti G. L.; Micheloni C.

2019-01-01

Year: 2019
ISBN: 978-1-7281-2506-0
Appears in categories: 4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
Martinel_Aggregating_Deep_Pyramidal_Representations_for_Person_Re-Identification_CVPRW_2019_paper.pdf	Open access	Publisher's version (PDF)	Creative Commons	15.06 MB	Adobe PDF
foresti2.pdf	Not available	Publisher's version (PDF)	Not public	1.16 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1187047
