Sliced Vision Transformers for Fine-Grained Anomaly Detection and Localization
Iqbal N.;Zanier M.;Vernier M.;Micheloni C.;Martinel N.
2025-01-01
Abstract
Accurately detecting small, narrow defects is crucial for identifying and localizing anomalies in industrial imaging. Vision Transformer (ViT)-based networks for image anomaly detection and localization have improved markedly in recent years, but their conventional square patch embedding may be suboptimal for fine-grained anomaly detection, where anomalies typically have elongated or irregular shapes. To address this problem, we introduce an approach that replaces the square patches of a ViT with slice-shaped patches. This increases spatial resolution and yields more detailed feature representations. State-of-the-art results on two existing industrial anomaly detection benchmarks show that our model effectively captures the morphological details and spatial dependencies of intricate anomaly patterns.
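The difference between square and slice-shaped patching can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extract_patches`, the toy 8x8 image, and the 4x4 and 2x8 patch sizes (including the horizontal slice orientation) are assumptions chosen only for illustration. It shows that both schemes can produce the same number of tokens of the same length, while the slices cover fewer rows each.

```python
def extract_patches(image, patch_h, patch_w):
    """Split a 2D image (a list of rows) into non-overlapping
    patch_h x patch_w patches, each flattened row-major into one token."""
    height, width = len(image), len(image[0])
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            token = [image[r][c]
                     for r in range(top, top + patch_h)
                     for c in range(left, left + patch_w)]
            tokens.append(token)
    return tokens


# Toy 8x8 single-channel image; each pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]

square = extract_patches(image, 4, 4)  # conventional square patches
sliced = extract_patches(image, 2, 8)  # hypothetical horizontal slices

# Both schemes give 4 tokens of 16 pixels, but each slice spans
# only 2 rows instead of 4.
print(len(square), len(square[0]))
print(len(sliced), len(sliced[0]))
```

In this toy setting, a one-pixel-high horizontal defect spanning the full width fills 8 of the 16 pixels of a single slice token, but only 4 of the 16 pixels in each of two square tokens. This is one intuition for why elongated patches might suit elongated anomalies.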


