Automated Feature Extraction for Testing Deep Learning Systems Through Illumination Search

Riccio V.
2026-01-01

Abstract

The opacity of deep neural networks (DNNs) poses challenges in understanding the causes of their misbehaviours. Illumination search characterizes the inputs of a DNN by means of relevant features and explores the resulting feature map extensively. This facilitates the interpretation of misbehaviour-inducing inputs based on the regions they occupy in the feature map. However, current illumination-based approaches necessitate human expert involvement for the definition of the features, limiting broad applicability. In this paper, we address these limitations with DeepTheia, our fully automated illumination-based test generator that automatically extracts the features and explores the feature space using cutting-edge diffusion models. Experimental results show that DeepTheia consistently extracts highly discriminative features. Independent human assessors certified that DeepTheia is able to group misbehaviour-inducing inputs in a way that is understandable to humans in over 78% of the cases. Moreover, the inputs generated by DeepTheia were useful in significantly improving the ability of the original DL systems to handle inputs with critical feature combinations through fine-tuning.

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1324718