Generative AI can adversely impact news publishers by lowering consumer demand. It can also reduce demand for newsroom employees, and increase the creation of news "slop." However, it can also serve as a source of traffic referrals and an information-discovery channel that increases demand. We use high-frequency granular data to analyze the strategic response of news publishers to the introduction of Generative AI. Many publishers strategically blocked LLM access to their websites using the robots.txt file standard. Using a difference-in-differences approach, we find that large publishers that block GenAI bots experience reduced website traffic compared with those that do not block. In addition, we find that large publishers shift toward richer content that is harder for LLMs to replicate, without increasing text volume. Finally, we find that the share of new editorial and content-production job postings rises over time. Together, these findings illustrate the levers that publishers choose to use to strategically respond to competitive Generative AI threats, and their consequences.
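The blocking mechanism mentioned in this abstract can be illustrated with a minimal sketch that uses Python's standard `urllib.robotparser` to test whether a robots.txt file disallows a given crawler. The robots.txt content, the `example.com` URL, the helper name `is_blocked`, and the choice of `GPTBot` as the user-agent token are assumptions made for illustration; the abstract does not list the bots or rules the authors examined.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one GenAI crawler and allows all others.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"  # hypothetical URL
    for agent in ("GPTBot", "Googlebot"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, agent, page))
```

A check like this, repeated over archived robots.txt snapshots, is one plausible way to date when a publisher started blocking, although the authors' actual pipeline may differ.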
Global scholarly publishing has been dominated by a small number of publishers for several decades. We aimed to revisit the debate on corporate control of scholarly publishing by analyzing the relative shares of major publishers and smaller, independent publishers. Using the Web of Science, Dimensions and OpenAlex, we managed to retrieve twice as many articles indexed in Dimensions and OpenAlex, compared to the rather selective Web of Science. As a result of excluding smaller publishers, the 'oligopoly' of scholarly publishers persists, at least in appearance, according to the Web of Science. However, both Dimensions' and OpenAlex' inclusive indexing revealed the share of smaller publishers has been growing rapidly, especially since the onset of large-scale online publishing around 2000, resulting in a current cumulative dominance of smaller publishers. While the expansion of small publishers was most pronounced in the social sciences and humanities, the natural and medical sciences showed a similar trend. A major geographical divergence is also revealed, with some countries, mostly Anglo-Saxon and/or located in northwestern Europe, relying heavily on major publishers for the disse
Political news on social media rarely circulates in isolation: audiences actively engage, react, and clash. Whether these interactions reflect agreement or conflict may depend on the ideological discrepancy between publishers and the news content they share. This study investigates this relationship using Facebook posts linking to political news during a Brazilian presidential election. We analyze five dimensions of engagement: ideological discrepancy between publishers and content, emotional responses, audience consensus, toxicity in posts, and content topics. Our results show that ideological discrepancy is associated with differences in engagement, exhibiting a nonlinear pattern: consensus declines under conditions of very high ideological mismatch and, in our data, also under very high alignment, while toxicity increases primarily under extreme mismatch. A statistical model indicates that emotional valence, toxicity, and ideological discrepancy are the factors most strongly associated with consensus. Among highly partisan publishers, higher toxicity is associated with increased audience consensus, suggesting that hostile discourse may co-occur with in-group agreement in strongl
Retractions are the primary mechanism for correcting the scholarly record, yet publishers differ markedly in how they use them. We present a bibliometric analysis of 46,087 retractions across 10 major publishers using data from the Retraction Watch database (1997-2026), examining retraction rates, reasons, temporal trends, and geographic distributions, among other dimensions. Normalized retraction rates vary by two orders of magnitude, from Elsevier's 3.97 per 10,000 publications to Hindawi's 320.02. China-affiliated authors account for the largest share of retractions at every publisher. Retraction lags and reason profiles also vary widely across publishers. Among the ten publishers, ACM is an outlier in its retraction profile. ACM's normalized rate is mid-range (5.65), yet 98.3% of its 354 retractions are related to one incident. Seven of the ten most common global retraction reasons (including misconduct, plagiarism, and data concerns) are entirely absent from ACM's record. ACM's first retraction dates to 2020, despite a catalog dating to 1997. ACM self-describes its retraction threshold as "extremely high." We discuss this threshold in relation to the COPE retraction guidelines
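The normalization used in this abstract (retractions per 10,000 publications) can be sketched as follows. The function name and the sample counts are hypothetical; only the two reported rates (3.97 and 320.02) come from the text.

```python
def retractions_per_10k(retractions: int, publications: int) -> float:
    """Normalized retraction rate: retractions per 10,000 publications."""
    if publications <= 0:
        raise ValueError("publications must be positive")
    return 10_000 * retractions / publications


# Hypothetical counts, for illustration only.
example_rate = retractions_per_10k(retractions=50, publications=200_000)

# Ratio between the highest and lowest rates reported in the abstract.
fold_difference = 320.02 / 3.97

if __name__ == "__main__":
    print(f"example rate: {example_rate:.2f} per 10,000")
    print(f"highest/lowest reported rate: about {fold_difference:.0f}-fold")
```

Dividing the two reported rates gives a spread of roughly 80-fold, which is the sense in which the rates differ by about two orders of magnitude.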
Regulators and browsers increasingly restrict user tracking to protect users' privacy online. In two large-scale empirical studies, we study the economic implications for publishers relying on selling advertising space to finance their content. In our first study, we draw on 42 million ad impressions from 111 publishers covering EU desktop browsing traffic in 2016. In our second study, we use 218 million ad impressions from 10,526 publishers (i.e., apps) covering EU and US mobile in-app browsing traffic in 2023. The two studies differ in the share of trackable users (Study 1: 85%; Study 2: Apple: 17%, Android: 91%). Still, we find similar average ad impression price decreases (Study 1: 18% and Study 2: 23%) when user tracking is unavailable. More than 90% of the publishers realize lower prices when selling ad impressions for untrackable users. Publishers offering content on sports, cars, lifestyle & shopping, and news & information suffer the most. Premium publishers with high-quality edited content and strong reputations, thematic-focused (niche) publishers, and smaller publishers suffer less from the unavailability of user tracking. In contrast, non-premium publishers wit
Existing methods for assessing the trustworthiness of news publishers face high costs and scalability issues. The tool presented in this paper supports the efforts of specialized organizations by providing a solution that, starting from an online discussion, provides (i) trustworthiness ratings for previously unclassified news publishers and (ii) an interactive platform to guide annotation efforts and improve the robustness of the ratings. The system implements a novel framework for assessing the trustworthiness of online news publishers based on user interactions on social media platforms.
This study examines the effect of article processing charge (APC) waivers on the participation of Ukrainian researchers in fully Gold Open Access (Gold OA) journals published by the five largest academic publishers - Elsevier, SAGE, Springer Nature, Taylor & Francis, and Wiley - during the period 2019-2024. These publishers were selected because, in response to the full-scale war launched against Ukraine in 2022, all five introduced emergency 100% APC-waiver policies for Ukrainian authors. Using bibliometric data from the Web of Science Core Collection, the study analyses publication trends in Ukrainian-authored articles in fully Gold OA journals of these publishers before and after 2022. The results show a marked post-2022 increase in Ukraine's Gold OA output, particularly in journals published by Springer Nature and Elsevier. Disciplinary and publisher-specific patterns are evident, with especially strong growth in the medical and applied sciences. The findings underscore the potential of targeted support measures during times of crisis, while also illustrating the inherent limitations of APC-based publishing models in fostering equitable scholarly communication.
Retractions serve as an indicator of failures in research integrity, yet most analyses focus on absolute counts rather than risk per paper. We use one of the largest open bibliographic databases to develop incidence metrics normalized by population: retractions per publication and per active author annually. Applying an epidemiological framework that models counts with exposure, we find evidence of exponential growth in retraction incidence, with approximately a 5-year doubling time at both the paper and author levels. These patterns vary significantly across fields, publishers, and countries. While scientific output is becoming more democratized globally, retractions are concentrated in fewer countries, creating a "concentration" paradox that calls for targeted monitoring. Despite exponential growth, the absolute incidence remains low (0.12% in 2021), allowing for corrective intervention. Incidence-based monitoring provides a framework for evaluating policies that safeguard research integrity at scale.
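The relationship between a doubling time and the implied exponential growth rate can be made concrete with a short sketch. It assumes a simple continuous model, incidence(t) = incidence(0) * exp(r * t), which is a simplification and not necessarily the count model with exposure that the authors fit. The function names are hypothetical, and the projection is an arithmetic illustration of what "doubling" means, not a forecast.

```python
import math


def growth_rate_from_doubling_time(doubling_time_years: float) -> float:
    """Continuous annual growth rate r such that exp(r * T) == 2."""
    return math.log(2) / doubling_time_years


def project_incidence(initial: float, rate: float, years: float) -> float:
    """Incidence after `years` under constant exponential growth."""
    return initial * math.exp(rate * years)


if __name__ == "__main__":
    r = growth_rate_from_doubling_time(5.0)  # doubling time from the abstract
    print(f"continuous rate: {r:.4f} per year")
    print(f"year-over-year increase: {math.expm1(r):.1%}")
    # Starting from the reported 0.12% incidence, five more years at the
    # same rate would double it by construction.
    print(f"after 5 years: {project_incidence(0.12, r, 5):.2f}%")
```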
With the primary goal of raising readers' awareness of misinformation phenomena, extensive efforts have been made by both academic institutions and independent organizations to develop methodologies for assessing the trustworthiness of online news publishers. Unfortunately, existing approaches are costly and face critical scalability challenges. This study presents a novel framework for assessing the trustworthiness of online news publishers using user interactions on social media platforms. The proposed methodology provides a versatile solution that serves the dual purpose of i) identifying verifiable online publishers and ii) automatically performing an initial estimation of the trustworthiness of previously unclassified online news outlets.
We present the results of the Bibliometric Indicators for Publishers project (also known as BiPublishers). This project represents the first attempt to systematically develop bibliometric publisher rankings. The data for this project were derived from the Book Citation Index, and the study period was 2009-2013. We have developed 42 rankings: 4 by field and 38 by discipline. We display six indicators per publisher, divided into three types: output, impact and publisher's profile. The aim is to capture different characteristics of the research performance of publishers. In total, 254 publishers were processed and classified according to publisher type: commercial publishers and university presses. We present the main publishers by field. Then, we discuss the main challenges encountered when developing this type of tool. The BiPublishers ranking is an ongoing project which aims to develop and explore new data sources and indicators to better capture and define the research impact of publishers.
This study presents estimates of the global expenditure on article processing charges (APCs) paid to six publishers for open access between 2019 and 2023. APCs are fees charged for publishing in some fully open access journals (gold) and in subscription journals to make individual articles open access (hybrid). There is currently no way to systematically track institutional, national or global expenses for open access publishing due to a lack of transparency in APC prices, which articles they are paid for, and who pays them. We therefore curated and used an open dataset of annual APC list prices from Elsevier, Frontiers, MDPI, PLOS, Springer Nature, and Wiley in combination with the number of open access articles from these publishers indexed by OpenAlex to estimate that, globally, a total of \$8.349 billion (\$8.968 billion in 2023 US dollars) was spent on APCs between 2019 and 2023. We estimate that in 2023 MDPI (\$681.6 million), Elsevier (\$582.8 million) and Springer Nature (\$546.6 million) generated the most revenue with APCs. After adjusting for inflation, we also show that annual spending almost tripled from \$910.3 million in 2019 to \$2.538 billion in 2023, that hybrid exceed gol
The proliferation of low-quality online information in today's era has underscored the need for robust and automatic mechanisms to evaluate the trustworthiness of online news publishers. In this paper, we analyse the trustworthiness of online news media outlets by leveraging a dataset of 4033 news stories from 40 different sources. We aim to infer the trustworthiness level of the source based on the classification of individual articles' content. The trust labels are obtained from NewsGuard, a journalistic organization that evaluates news sources using well-established editorial and publishing criteria. The results indicate that the classification model is highly effective in classifying the trustworthiness levels of the news articles. This research has practical applications in alerting readers to potentially untrustworthy news sources, assisting journalistic organizations in evaluating new or unfamiliar media outlets and supporting the selection of articles for their trustworthiness assessment.
Here, we argue that academics, journals and publishers should no longer refer to the social media platform Twitter as such, but rather as X. Relying on Google Scholar, we found 16 examples of papers published in the last months of 2023 - essentially during the transition period between Twitter and X - that used Twitter and X, but in different ways. Although the dual form Twitter/X could reasonably be used in academic papers during that transition period, we suggest that papers should now refer to the platform only as X, because continuing to call it Twitter would be factually incorrect. The exception is historical studies of the platform.
This paper introduces a dataset of article processing charges (APCs) produced from the price lists of six large scholarly publishers - Elsevier, Frontiers, PLOS, MDPI, Springer Nature and Wiley - between 2019 and 2023. APC price lists were downloaded from publisher websites each year as well as via Wayback Machine snapshots to retrieve fees per journal per year. The dataset includes journal metadata, APC collection method, and annual APC price list information in several currencies (USD, EUR, GBP, CHF, JPY, CAD) for 8,712 unique journals and 36,618 journal-year combinations. The dataset was generated to allow for more precise analysis of APCs and can support library collection development and scientometric analysis estimating APCs paid in gold and hybrid OA journals.
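A dataset of journal-year APC list prices supports a simple spending estimate when combined with article counts. The sketch below assumes that spending is approximated as list price multiplied by the number of open access articles, summed per year. The record type, field names, and all numbers are hypothetical, and the approach ignores waivers, discounts, and currency conversion.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalYear:
    """One journal-year record (hypothetical schema)."""
    journal: str
    year: int
    apc_usd: float      # list price in USD
    oa_articles: int    # open access articles published that year


def estimate_spend_by_year(rows: list[JournalYear]) -> dict[int, float]:
    """Sum list price x article count per year (an upper-bound style estimate)."""
    totals: dict[int, float] = defaultdict(float)
    for row in rows:
        totals[row.year] += row.apc_usd * row.oa_articles
    return dict(totals)


# Made-up records, for illustration only.
SAMPLE = [
    JournalYear("Journal A", 2022, 2000.0, 100),
    JournalYear("Journal B", 2022, 1500.0, 40),
    JournalYear("Journal A", 2023, 2200.0, 120),
]

if __name__ == "__main__":
    for year, total in sorted(estimate_spend_by_year(SAMPLE).items()):
        print(year, f"${total:,.0f}")
```

Because list prices are used rather than invoiced amounts, an estimate of this kind is likely to overstate what was actually paid wherever waivers or institutional agreements apply.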
Here we describe the Bibliometric Indicators for Publishers Project, an initiative undertaken by EC3Metrics SL for the analysis and development of indicators based on books and book chapters. Its goal is to study and analyze the publication and citation patterns of books and book chapters considering academic publishers as the unit of analysis. It aims at developing new methodologies and indicators that can better capture and define the research impact of publishers. It is an ongoing project in which data sources and indicators are tested. We treat academic publishers as analogous to journals, focusing on them as the unit of analysis. In this working paper we present the http://bipublishers.es website where all findings derived from the project are displayed. We describe the data retrieval and normalization process and we show the main results. A total of 482,470 records have been retrieved and processed, identifying 342 publishers, of which 254 have been analyzed. Six indicators have then been calculated and displayed for each publisher across four fields and 38 disciplines.
We study a game-theoretic information retrieval model in which strategic publishers aim to maximize their chances of being ranked first by the search engine while maintaining the integrity of their original documents. We show that the commonly used Probability Ranking Principle (PRP) ranking scheme results in an unstable environment where games often fail to reach a pure Nash equilibrium. We propose two families of ranking functions that do not adhere to the PRP. We provide both theoretical and empirical evidence that these methods lead to a stable search ecosystem, by establishing positive results on the convergence of learning dynamics. We also define the publishers' and users' welfare, demonstrate a possible publisher-user trade-off, and provide means for a search system designer to control it. Finally, we show how instability harms long-term users' welfare.
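To make the contrast with the PRP concrete, the sketch below compares a deterministic PRP-style rule, where the highest-scoring document always takes the first rank, with a smooth randomized rule. The softmax rule is a stand-in chosen purely for illustration; it is not claimed to be one of the two families the authors propose, and the function names and scores are hypothetical.

```python
import math


def prp_winner(scores: list[float]) -> int:
    """PRP-style rule: the document with the highest relevance score ranks first."""
    return max(range(len(scores)), key=lambda i: scores[i])


def softmax_first_rank_probs(scores: list[float], temperature: float = 1.0) -> list[float]:
    """Illustrative non-PRP rule: probability of ranking first is softmax(score / T)."""
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = [0.50, 0.51]  # two publishers with nearly identical relevance
    print("PRP winner:", prp_winner(scores))
    print("softmax first-rank probabilities:", softmax_first_rank_probs(scores))
```

Under the deterministic rule, a tiny score edge captures the entire first-rank payoff, so small document changes can flip the outcome. Under the smooth rule, payoffs vary continuously with the scores, which gives one informal intuition for why non-PRP rules might be more stable.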
Numerous national research assessment policies set the goal of promoting "excellence" and incentivise scholars to publish their research in the most prestigious journals or with the most prestigious book publishers. We investigate the practicalities of the assessment of book outputs based on the prestige of book publishers in Denmark, Finland, Flanders, Lithuania, and Norway. Additionally, we test whether such judgments are transparent and yield consistent results. We show inconsistencies in the levelling of publishers, such as the same publisher being ranked as prestigious and not-so-prestigious in different states or in consecutive years within the same country. Likewise, we find that verification of compliance with the mandatory prerequisites is not always possible because of the lack of transparency. Our findings support doubts about whether the assessment of books based on a judgement about their publisher yields acceptable outcomes. Currently used rankings of publishers focus on evaluating the gatekeeping role of publishers but do not assess other essential stages in scholarly book publishing. Our suggestion for future research is to develop approaches to evaluate books by accounti
The rapid advancement of generative AI has introduced a new class of tools capable of producing publication-quality scientific figures, graphical abstracts, and data visualizations. However, academic publishers have responded with inconsistent and often ambiguous policies regarding AI-generated imagery. This paper surveys the current stance of major journals and publishers -- including Nature, Science, Cell Press, Elsevier, and PLOS -- on the use of AI-generated figures. We identify key concerns raised by publishers, including reproducibility, authorship attribution, and potential for visual misinformation. Drawing on practical examples from tools such as SciDraw, an AI-powered platform designed specifically for scientific illustration, we propose a set of best-practice guidelines for researchers seeking to use AI figure-generation tools in a compliant and transparent manner. Our findings suggest that, with appropriate disclosure and quality control, AI-generated figures can meaningfully accelerate scientific communication without compromising integrity.
The recent exceptional growth in the number of special issues has led to the largest delegation of editorial power in the history of scientific publishing. Has this power been used responsibly? In this article we provide the first systematic analysis of a particular form of abuse of power by guest editors: endogeny, the practice of publishing articles in one's own special issue. While moderate levels of endogeny are common in special issues, excessive endogeny is a blatant case of scientific misconduct. We define special issues containing more than 33% endogeny as Published in Support of Self (PISS). We build a dataset of over 100,000 special issues published between 2015 and 2025 by five leading publishers. The large majority of guest editors engage in endogeny responsibly, if at all. Nonetheless, despite endogeny policies by publishers and indexers, PISS is comparable in magnitude to scientific fraud. All journals heavily relying on special issues host PISS, and more than 1,000 PISS special issues are published each year, hosting tens of thousands of endogenous articles. Extreme PISS abuses are rare, as the majority of PISS occurs at moderate levels of endogeny. Since the scientif
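The endogeny measure and the 33% threshold defined in this abstract can be sketched in a few lines. The sketch assumes that an article counts as endogenous when at least one of its authors is a guest editor of the same special issue, and that endogeny is the share of such articles in the issue. The abstract does not spell out these operational details, and the function names and author labels are hypothetical.

```python
PISS_THRESHOLD = 0.33  # "more than 33% endogeny", per the abstract


def endogeny_rate(article_authors: list[set[str]], guest_editors: set[str]) -> float:
    """Share of articles in a special issue with at least one guest editor as author."""
    if not article_authors:
        return 0.0
    endogenous = sum(1 for authors in article_authors if authors & guest_editors)
    return endogenous / len(article_authors)


def is_piss(article_authors: list[set[str]], guest_editors: set[str]) -> bool:
    """Flag a special issue whose endogeny rate exceeds the threshold."""
    return endogeny_rate(article_authors, guest_editors) > PISS_THRESHOLD


if __name__ == "__main__":
    editors = {"editor_1", "editor_2"}  # hypothetical guest editors
    issue = [
        {"editor_1", "author_a"},
        {"author_b"},
        {"editor_2"},
        {"author_c", "author_d"},
        {"author_e"},
    ]
    print(f"endogeny: {endogeny_rate(issue, editors):.0%}, PISS: {is_piss(issue, editors)}")
```

In practice, matching authors to guest editors would need name disambiguation, which this sketch sidesteps by using exact identifiers.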
We aim to determine the extent and content of guidance for authors regarding the use of generative-AI (GAI), Generative Pretrained models (GPTs) and Large Language Models (LLMs) powered tools among the top 100 academic publishers and journals in science. The websites of these publishers and journals were screened between 19 and 20 May 2023. Among the largest 100 publishers, 17% provided guidance on the use of GAI, of which 12 (70.6%) were among the top 25 publishers. Among the top 100 journals, 70% provided guidance on GAI. Of those with guidance, 94.1% of publishers and 95.7% of journals prohibited the inclusion of GAI as an author. Four journals (5.7%) explicitly prohibited the use of GAI in the generation of a manuscript, while 3 (17.6%) publishers and 15 (21.4%) journals indicated their guidance exclusively applies to the writing process. When disclosing the use of GAI, 42.8% of publishers and 44.3% of journals included specific disclosure criteria. There was variability in guidance of where to disclose the use of GAI, including in the methods, acknowledgments, cover letter, or a new section. There was also variability in how to access GAI guidance and the linking o