Large Language Models as Assessors: On the Impact of Relevance Scales

Zamolo, Riccardo; Lunardi, Riccardo; Soprano, Michael; Demartini, Gianluca; Mizzaro, Stefano; Roitero, Kevin
2026-01-01

Abstract

Traditionally, relevance judgments have relied on human annotators, but recent advances in Large Language Models (LLMs) have prompted growing interest in their use as a proxy for relevance judgments. In this setting, a key yet underexplored factor is the choice of relevance scale. Relevance scales range from binary to fine-grained ones, and their impact on the effectiveness of LLM-based judgments, the effects of scale conversions, and their role in the presence of potential data contamination remain unclear. We systematically investigate how different scales, and their conversions, affect LLMs’ ability to provide reliable relevance judgments across multiple prompting strategies and model sizes. Using a popular TREC collection, we compare model outputs with both crowd and expert annotations, analyzing alignment, stability, and signs of potential data contamination.
Year: 2026
ISBN: 9783032212993; 9783032213006

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1326264