Impersonating the Crowd: Evaluating LLMs' Ability to Replicate Human Judgment in Misinformation Assessment
David La Barbera; Riccardo Lunardi; Kevin Roitero
2025-01-01
Abstract
Large Language Models (LLMs) are increasingly used to replicate human decision-making in subjective tasks. In this work, we investigate whether LLMs can effectively impersonate real crowd workers when evaluating political misinformation statements. We assess (i) the agreement between LLM-generated assessments and human judgments and (ii) whether impersonation skews LLM assessments, impacting accuracy. Using publicly available misinformation assessment datasets, we prompt LLMs to impersonate real crowd workers based on their demographic profiles and have them evaluate the same statements. Through comparative analysis, we measure agreement rates and discrepancies in classification patterns. Our findings suggest that while some LLMs align moderately with crowd assessments, their impersonation ability remains inconsistent. Impersonation does not uniformly improve accuracy and often reinforces systematic biases, highlighting limitations in replicating human judgment.
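The setup described in the abstract — prompting an LLM to adopt a crowd worker's demographic profile and then comparing its ratings with that worker's ratings — can be sketched as follows. This is an illustrative assumption, not the paper's exact protocol: the profile fields, the 0-5 truthfulness scale, and the simple agreement metric are all hypothetical placeholders.

```python
def persona_prompt(profile: dict, statement: str) -> str:
    """Build a prompt asking the LLM to judge a statement as a given
    crowd worker would. Profile keys are illustrative assumptions."""
    return (
        f"You are a {profile['age']}-year-old person from {profile['region']} "
        f"with {profile['education']} education and "
        f"{profile['political_leaning']} political views.\n"
        f"Rate the truthfulness of this statement on a 0-5 scale "
        f"(0 = definitely false, 5 = definitely true):\n"
        f"{statement}\n"
        f"Answer with a single number."
    )


def agreement_rate(llm_labels: list, human_labels: list) -> float:
    """Fraction of statements on which the impersonating LLM and the
    impersonated worker gave the same rating (raw percent agreement)."""
    matches = sum(a == b for a, b in zip(llm_labels, human_labels))
    return matches / len(llm_labels)
```

In practice one would send each `persona_prompt` to the model under study, parse the numeric answer, and compare per-worker label vectors; chance-corrected measures (e.g. Cohen's kappa) are commonly used alongside raw agreement for this kind of analysis.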
| File | Type | License | Size | Format |
|---|---|---|---|---|
| 3731120.3744581.pdf (open access) | Publisher's version (PDF) | Creative Commons | 1.07 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


