Assessing Source Reliability on Wikipedia
Our research paper “Language-Agnostic Modeling of Source Reliability on Wikipedia” has been published in ACM Transactions on the Web
How can the reliability of a source in Wikipedia be assessed? In this collaboration with Universitat Pompeu Fabra, Universitat Oberta de Catalunya, and ISI Foundation, we address this question. We present a language-agnostic model that leverages editing behavior patterns to evaluate the reliability of web domains used as references across multiple Wikipedia language editions. Our approach analyzes features such as how long sources remain cited in articles, the type of editors who add or remove them, and their usage patterns across controversial topics like Climate Change and COVID-19. Our model achieves a macro-averaged F1 score of approximately 0.80 for English and other high-resource languages, and shows that combining data from multiple languages can improve performance for low-resource languages, bridging the gap across Wikipedia’s more than 300 language communities.
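For readers unfamiliar with the metric, here is a minimal sketch of how a macro-averaged F1 score is computed: F1 is calculated separately for each class and then averaged with equal weight, so a rare class counts as much as a common one. The function name `macro_f1`, the labels "reliable" and "unreliable", and the toy predictions are illustrative assumptions, not data or code from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels for four web domains (made up for this example)
y_true = ["reliable", "reliable", "unreliable", "unreliable"]
y_pred = ["reliable", "unreliable", "unreliable", "unreliable"]

score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case the per-class F1 values are 2/3 for "reliable" and 4/5 for "unreliable", so the macro average is 11/15.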