Search results: Wikipedia

20 results found

Web2Wiki: Characterizing Wikipedia Linking Across the Web

arXiv

Wikipedia is one of the most visited websites globally, yet its role beyond its own platform remains largely unexplored. In this paper, we present the first large-scale analysis of how Wikipedia is referenced across the Web. Using a dataset from Common Crawl, we identify over 90 million Wikipedia links spanning 1.68% of Web domains and examine their distribution, context, and function. Our analysis of English Wikipedia reveals three key findings: (1) Wikipedia is most frequently cited by news and science websites for informational purposes, while commercial websites reference it less often. (2) The majority of Wikipedia links appear within the main content rather than in boilerplate or user-generated sections, highlighting their role in structured knowledge presentation. (3) Most links (95%) serve as explanatory references rather than as evidence or attribution, reinforcing Wikipedia's function as a background knowledge provider. While this study focuses on English Wikipedia, our publicly released Web2Wiki dataset includes links from multiple language editions, supporting future research on Wikipedia's global influence on the Web.

Factual Inconsistencies in Multilingual Wikipedia Tables

arXiv, 2025-07-24. Authors: Silvia Cappa, Lingxiao Kong, Pille-Riin Peet

Wikipedia serves as a globally accessible knowledge source with content in over 300 languages. Despite covering the same topics, the different versions of Wikipedia are written and updated independently. This leads to factual inconsistencies that can impact the neutrality and reliability of the encyclopedia and AI systems, which often rely on Wikipedia as a main training source. This study investigates cross-lingual inconsistencies in Wikipedia's structured content, with a focus on tabular data. We developed a methodology to collect, align, and analyze tables from Wikipedia multilingual articles, defining categories of inconsistency. We apply various quantitative and qualitative metrics to assess multilingual alignment using a sample dataset. These insights have implications for factual verification, multilingual knowledge interaction, and design for reliable AI systems leveraging Wikipedia content.

Search results: Wikipedia

Web2Wiki: Characterizing Wikipedia Linking Across the Web

Factual Inconsistencies in Multilingual Wikipedia Tables

Wikipedia Citations: Reproducible Citation Extraction from Multilingual Wikipedia

Recommended Practices for NPOV Research on Wikipedia

Wikipedia in the Era of LLMs: Evolution and Risks

Exploring Wikipedia Gender Diversity Over Time – The Wikipedia Gender Dashboard (WGD)

Can we cite Wikipedia? What if Wikipedia was more reliable than its detractors ?

An Open Multilingual System for Scoring Readability of Wikipedia

Research Citations Building Trust in Wikipedia

TWikiL -- The Twitter Wikipedia Link Dataset

Estimating Gender Completeness in Wikipedia

Wikipedia Text Reuse: Within and Without

WikipediaBot: Automated Adversarial Manipulation of Wikipedia Articles

Polarization and reliability of news sources in Wikipedia

Orphan Articles: The Dark Matter of Wikipedia

Princ-wiki-a Mathematica: Wikipedia editing and mathematics

Why We Read Wikipedia

A Map of Science in Wikipedia

Improving Wikipedia Verifiability with AI

How much is Wikipedia Lagging Behind News?