VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

Mishra P.; Verk R.; Fornasier D.; Piciarelli C.; Foresti G. L.

2021-01-01

Abstract

We present a transformer-based image anomaly detection and localization network. Our proposed model combines a reconstruction-based approach with patch embedding. Using a transformer network helps preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. In addition, we publish BTAD, a real-world industrial anomaly dataset. We compare our results with other state-of-the-art algorithms on publicly available datasets such as MNIST and MVTec.
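
As a rough illustration of the localization idea in the abstract, the sketch below splits an image into non-overlapping patches and scores each patch by its negative log-likelihood under a Gaussian mixture, giving a coarse anomaly map on the patch grid. It is a minimal sketch under strong assumptions and not the authors' implementation: it uses raw pixel patches in place of transformer embeddings, and a fixed diagonal-covariance mixture in place of a learned mixture density network. The function names `split_into_patches`, `gmm_neg_log_likelihood`, and `anomaly_map` are hypothetical.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C)
    and the patch grid shape (rows, cols), so the spatial layout is kept.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, -1), (gh, gw)


def gmm_neg_log_likelihood(x, weights, means, variances):
    """Per-row negative log-likelihood under a diagonal Gaussian mixture.

    x: (N, D), weights: (K,), means: (K, D), variances: (K, D).
    Assumption: the mixture parameters are given; in a mixture density
    network they would be predicted by the model instead.
    """
    diff = x[:, None, :] - means[None, :, :]  # (N, K, D)
    log_prob = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )  # (N, K)
    log_weighted = log_prob + np.log(weights)
    # Log-sum-exp over mixture components, for numerical stability.
    m = log_weighted.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_weighted - m).sum(axis=1))
    return -log_lik


def anomaly_map(image, patch_size, weights, means, variances):
    """Score every patch and arrange the scores on the patch grid."""
    features, grid = split_into_patches(image, patch_size)
    scores = gmm_neg_log_likelihood(features, weights, means, variances)
    return scores.reshape(grid)
```

Higher values in the returned map mark patches that the mixture considers unlikely; thresholding that map would give a coarse localization mask at patch resolution.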

Year: 2021
ISBN: 978-1-7281-9023-5
Appears in categories: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1215631
