A Comparative Analysis of Retrieval-Augmented Generation and Crowdsourcing for Fact-Checking
Mizzaro S.; Roitero K.
2025-01-01
Abstract
In countering misinformation, both automated approaches using Large Language Models (LLMs) and human-based crowdsourcing have been explored. We enhance LLMs with a retrieval component to create a Retrieval-Augmented Generation (RAG) fact-checking system. Using state-of-the-art LLMs and a popular dataset, we empirically evaluate the system's effectiveness and compare it with crowdsourced annotations. Our findings indicate that while RAG systems excel at detecting clear misinformation, they struggle with subtler distinctions where human judgment is more discerning. Interestingly, RAG systems show higher effectiveness on newer statements, suggesting adaptability to emerging misinformation.


