Large language models (LLMs) have diffused rapidly into academic writing since late 2022. Using the complete population of 109,393 research articles published in \textit{PLOS ONE} between 2019 and 2025, we examine population-level structural publication indicators, including full-text manuscript length, authorship team size, reference volume, and cross-linguistic collaboration, before and after 2022. \textit{PLOS ONE}'s multidisciplinary scope and consistent editorial framework allow cross-field comparison under uniform conditions over an extended period. Manuscript length increased substantially, with gains ranging from 14.8\% among African-affiliated authors and 11.7\% among Asian-affiliated authors to 5.3\% among native English-speaking (NES) authors, cutting the word-count gap by 39\%. More strikingly, non-native English-speaking (NNES) authors reduced both authorship team size, from 6.54 to 6.06 authors, or 7.3\%, and collaboration with NES co-authors, from 17.8\% to 12.2\%, or 36\%, while NES authors remained stable in both team size and collaboration rates. Reference counts increased modestly and uniformly across groups. These findings suggest that post-2022 tools may be res
This study aims to evaluate the accuracy of authorship attributions in scientific publications, focusing on the fairness and precision of individual contributions within academic works. The study analyzes 81,823 publications from the journal PLOS ONE, covering the period from January 2018 to June 2023. It examines the authorship attributions within these publications to determine the prevalence of inappropriate authorship. It also investigates the demographic and professional profiles of affected authors, exploring trends and potential factors contributing to inaccuracies in authorship. Surprisingly, 9.14% of articles feature at least one author with inappropriate authorship, affecting over 14,000 individuals (2.56% of the sample). Inappropriate authorship is more concentrated in Asia, Africa, and specific European countries like Italy. Established researchers with significant publication records and those affiliated with companies or nonprofits show higher instances of potential monetary authorship. Our findings are based on contributions as declared by the authors, which implies a degree of trust in their transparency. However, this reliance on self-reporting may introduc
The Ward et al. (2016) PLOS ONE paper is an important, heavily cited paper in the decoupling literature. The authors present evidence of 1990-2015 growth in material and energy consumption and GDP at a world level, and for selected countries. They find only relative decoupling has occurred, leading to their central claim that future absolute decoupling is implausible. However, the authors have made two key errors in their collected data: GDP data is in current prices, which include inflation, and their global material use data is the total mass of fossil energy materials. Strictly, GDP data should be in constant prices to allow for its comparison over time, and material inputs to an economy should be the sum of mineral raw materials. Amending for these errors, we find much smaller levels of energy-GDP relative decoupling, and no materials-GDP decoupling at all at a global level. We check these new results by adding data for 1900-1990 to provide a longer time series, and find consistently low (and even no) levels of global relative decoupling of material use. The central claim for materials over the implausibility of future absolute decoupling therefore not only remains valid but is
Contributorship statements have been effective at recording granular author contributions in research articles and have been broadly used to understand how labor is divided across research teams. However, one major limitation in existing empirical studies is that two classification systems have been adopted, especially by their most important data source, journals published by the Public Library of Science (PLoS). This research aims to address this limitation by developing a mapping scheme between the two systems and using it to understand whether there are differences in the assignment of contribution by authors under the two systems. We use all research articles published in PLoS ONE between 2012 and 2020, divided into two five-year publication windows centered on the shift of the classification systems in 2016. Our results show that most tasks (except for writing- and resource-related tasks) are used similarly under the two systems. Moreover, notable differences between how researchers used the two systems are also examined and discussed. This research offers an important foundation for empirical research on division of labor in the future, by enabling a larger dataset that cross
As the importance of research data gradually grows in sciences, data sharing has come to be encouraged and even mandated by journals and funders in recent years. Following this trend, the data availability statement has been increasingly embraced by academic communities as a means of sharing research data as part of research articles. This paper presents a quantitative study of which mechanisms and repositories are used to share research data in PLOS ONE articles. We offer a dynamic examination of this topic from the disciplinary and temporal perspectives based on all statements in English-language research articles published between 2014 and 2020 in the journal. We find a slow yet steady growth in the use of data repositories to share data over time, as opposed to sharing data in the paper or supplementary materials; this indicates improved compliance with the journal's data sharing policies. We also find that multidisciplinary data repositories have been increasingly used over time, whereas some disciplinary repositories show a decreasing trend. Our findings can help academic publishers and funders to improve their data sharing policies and serve as an important baseline dataset
PLOS and Mozilla conducted a month-long pilot study in which professional developers performed code reviews on software associated with papers published in PLOS Computational Biology. While the developers felt the reviews were limited by (a) lack of familiarity with the domain and (b) lack of two-way contact with authors, the scientists appreciated the reviews, and both sides were enthusiastic about repeating the experiment.
A mysterious cosmic explosion has astronomers buzzing, as a strange event may hint at an entirely new kind of stellar cataclysm. After detecting ripples in space-time, scientists spotted a fast-fading red glow that initially looked like a rare kilonova, the kind of collision that forges gold and uranium. But just days later, the signal shifted, beha
To capture the effects of social ties in knowledge diffusion, this paper examines the publication network that emerges from the collaboration of researchers, using citation information as a means to estimate knowledge flow. For this purpose, we analyzed the papers published in the PLOS ONE journal, finding strong evidence that the closer two authors are in the co-authorship network, the larger the probability that knowledge flow will occur between them. Moreover, we also found that when it comes to knowledge diffusion, strong co-authorship proximity is a stronger determinant than geographic proximity.
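The notion of closeness in a co-authorship network can be illustrated with a minimal sketch. It assumes closeness means shortest-path length in an unweighted co-authorship graph, which the abstract does not specify; the function names and the toy author lists are hypothetical and are not taken from the paper.

```python
from collections import deque
from itertools import combinations


def build_coauthor_graph(papers):
    """Map each author to the set of authors they have published with."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def coauthor_distance(graph, source, target):
    """Shortest-path length between two authors (BFS); None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B"], ["B", "C"], ["C", "D"], ["E"]]
graph = build_coauthor_graph(papers)
```

With distances like these in hand, one could group author pairs by distance and compare how often one cites the other in each group, which is one possible way to probe the relationship the abstract reports.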
We analyzed the longitudinal activity of nearly 7,000 editors at the mega-journal PLOS ONE over the 10-year period 2006-2015. Using the article-editor associations, we develop editor-specific measures of power, activity, article acceptance time, citation impact, and editorial remuneration (an analogue to self-citation). We observe remarkably high levels of power inequality among the PLOS ONE editors, with the top-10 editors responsible for 3,366 articles -- corresponding to 2.4% of the 141,986 articles we analyzed. Such high inequality levels suggest the presence of unintended incentives, which may reinforce unethical behavior in the form of decision-level biases at the editorial level. Our results indicate that editors may become apathetic in judging the quality of articles and susceptible to modes of power-driven misconduct. We used the longitudinal dimension of editor activity to develop two panel regression models which test and verify the presence of editor-level bias. In the first model we analyzed the citation impact of articles, and in the second model we modeled the decision time between an article being submitted and ultimately accepted by the editor. We focused on two va
The particular day of the week when an event occurs seems to have unexpected consequences. For example, the day of the week when a paper is submitted to a peer-reviewed journal correlates with whether that paper is accepted. Using an econometric analysis (a mix of log-log and semi-log based on undated and panel structured data) we find that more papers are submitted to certain peer-reviewed journals on particular weekdays than others, with fewer papers being submitted on weekends. Seasonal effects, geographical information as well as potential changes over time are examined. This finding rests on a large (178 000) and reliable sample; the journals polled are broadly recognized (Nature, Cell, PLOS ONE and Physica A). The day-of-the-week effect in the submission of accepted papers should be of interest to many researchers, editors and publishers, and perhaps also to managers and psychologists.
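As a rough illustration of the descriptive step behind a day-of-the-week analysis, the sketch below tallies submissions per weekday and computes the weekend share. It is not the econometric model used in the paper; the function names and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def submissions_by_weekday(dates):
    """Count submissions per weekday; days with no submissions get 0."""
    counts = Counter(d.weekday() for d in dates)  # Monday is 0, Sunday is 6
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}


def weekend_share(dates):
    """Fraction of submissions made on Saturday or Sunday."""
    if not dates:
        return 0.0
    weekend = sum(1 for d in dates if d.weekday() >= 5)
    return weekend / len(dates)


# Hypothetical submission dates (1 January 2024 was a Monday).
sample = [date(2024, 1, 1), date(2024, 1, 2),
          date(2024, 1, 6), date(2024, 1, 8)]
by_day = submissions_by_weekday(sample)
share = weekend_share(sample)
```

Under a uniform null, each weekday would receive about one seventh of submissions, so the weekend share could be compared against 2/7 as a first sanity check before fitting any model.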
The paper of Little et al. (PLoS Comput Biol 2009 5(10) e1000539) outlined a system of reaction-diffusion equations that were used to describe induction of atherosclerotic disease. These were solved by considering an equilibrium solution and small perturbations around this equilibrium. Here we consider slight variant sets of assumptions that could be used to derive equilibrium solutions. In general they do not imply any change in the numerical results relating to monocyte chemo-attractant protein-1 (MCP-1) presented in that paper.
We consider a slight modification to the monocyte and T-lymphocyte boundary conditions of Little et al. (PLoS Comput Biol 2009 5(10) e1000539) and derive alternative parameter estimates. No changes to the results and conclusions of the paper of Little et al. (PLoS Comput Biol 2009 5(10) e1000539) are implied.
Purpose: Whereas citation counts allow the measurement of the impact of research on research itself, an important role in the measurement of the impact of research on other parts of society is ascribed to altmetrics. The present case study investigates the usefulness of altmetrics for measuring the broader impact of research. Methods: This case study is essentially based on a dataset with papers obtained from F1000. The dataset was augmented with altmetrics (such as Twitter counts) which were provided by PLOS (the Public Library of Science). In total, the case study covers 1,082 papers. Findings: The F1000 dataset contains tags on papers which were assigned intellectually by experts and which can characterise a paper. The most interesting tag for altmetric research is "good for teaching". This tag is assigned to papers which could be of interest to a wider circle of readers than the peers in a specialist area. Particularly on Facebook and Twitter, one could expect papers with this tag to be mentioned more often than those without this tag. With respect to the "good for teaching" tag, the results from regression models were able to confirm these expectations: Papers with
Despite its undisputed position as the biggest social media platform, Facebook has never entered the main stage of altmetrics research. In this study, we argue that the lack of attention by altmetrics researchers is due, in part, to the challenges in collecting Facebook data regarding activity that takes place outside of public pages and groups. We present a new method of collecting aggregate counts of shares, reactions, and comments across the platform (including users' personal timelines) and use it to gather data for all articles published between 2015 and 2017 in the journal PLOS ONE. We compare the gathered data with altmetrics collected and aggregated by Altmetric. The results show that 58.7% of papers shared on Facebook happen outside of public spaces and that, when collecting all shares, the volume of activity approximates patterns of engagement previously only observed for Twitter. Both results suggest that the role and impact of Facebook as a medium for science and scholarly communication have been underestimated. Furthermore, they emphasise the importance of openness and transparency around the collection and aggregation of altmetrics.
In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts concerning their position in the text and their age at the time of citing. By selecting the perspective of highly cited papers, we can distinguish them based on the context during citation even if we do not have any other information source or metrics. We describe the top cited references based on how, when and in which context they are cited. The focus of this study is on a time perspective to explain the nature of the reception of highly cited papers. We have found that these references are distinguishable by the IMRaD sections in which they are cited. Furthermore, we show that the section usage of highly cited papers is time-dependent: the longer the citation interval, the higher the probability that a reference is cited in a method section.
This study proposes a quantitative framework to enhance curriculum coherence through the systematic alignment of Course Learning Outcomes (CLOs) and Program Learning Outcomes (PLOs), contributing to continuous improvement in outcome-based education. Grounded in accreditation standards such as ABET and NCAAA, the model introduces mathematical tools that map exercises, assessment questions, teaching units (TUs), and student assessment components (SACs) to CLOs and PLOs. This dual-layer approach, combining micro-level analysis of assessment elements with macro-level curriculum evaluation, enables detailed tracking of learning outcomes and helps identify misalignments between instructional delivery, assessment strategies, and program objectives. The framework incorporates alignment matrices, weighted relationships, and practical indicators to quantify coherence and evaluate course or program performance. Application of this model reveals gaps in outcome coverage and underscores the importance of realignment, especially when specific PLOs are underrepresented or CLOs are not adequately supported by assessments. The proposed model is practical, adaptable, and scalable, making it suitable f
There are a number of errors in "mbtransfer: Microbiome intervention analysis using transfer functions and mirror statistics" PLOS Comp Bio (2024) spanning multiple aspects of the paper. The wrong inputs were provided to comparator methods for model training; when forecasting, one method was provided initial conditions in the wrong units; and performance metrics were calculated without proper unit conversion. The false discovery rate and power analysis conclusions provided in the text are not supported by theory or the empirical testing that was performed within the paper. The paper also has data leakage issues, equations are written down incorrectly, and incorrect definitions/terminology are used.
Real-world network datasets are typically obtained in ways that fail to capture all edges. The patterns of missing data are often non-uniform as they reflect biases and other shortcomings of different data collection methods. Nevertheless, uniform missing data is a common assumption made when no additional information is available about the underlying missing-edge pattern, and link prediction methods are frequently tested against uniformly missing edges. To investigate the impact of different missing-edge patterns on link prediction accuracy, we employ 9 link prediction algorithms from 4 different families to analyze 20 different missing-edge patterns that we categorize into 5 groups. Our comparative simulation study, spanning 250 real-world network datasets from 6 different domains, provides a detailed picture of the significant variations in the performance of different link prediction algorithms in these different settings. With this study, we aim to provide a guide for future researchers to help them select a link prediction algorithm that is well suited to their sampled network data, considering the data collection process and application domain.
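The evaluation setup described above can be sketched in miniature: hide some edges, then score unobserved node pairs with a link predictor. The sketch assumes a uniform missing-edge pattern and uses the common-neighbors score purely as a stand-in; the abstract does not name the 9 algorithms or the 20 patterns, so the function names, the seed, and the toy graph here are hypothetical.

```python
import random
from itertools import combinations


def remove_edges_uniform(edges, fraction, seed=0):
    """Hide a uniformly random fraction of edges; return (observed, hidden)."""
    rng = random.Random(seed)
    edges = sorted(edges)
    k = int(len(edges) * fraction)
    hidden = set(rng.sample(edges, k))
    observed = [e for e in edges if e not in hidden]
    return observed, sorted(hidden)


def adjacency(edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbor_scores(observed):
    """Score every non-adjacent pair of observed nodes by shared neighbors.

    Nodes that lose all their edges do not appear in the observed graph
    and are therefore not scored.
    """
    adj = adjacency(observed)
    present = {frozenset(e) for e in observed}
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if frozenset((u, v)) not in present:
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores


# Hypothetical toy graph.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
observed, hidden = remove_edges_uniform(edges, fraction=0.4, seed=0)
scores_full = common_neighbor_scores(edges)
```

A non-uniform pattern could be simulated by replacing the uniform sample with a weighted one, for example making edges at low-degree nodes more likely to be hidden, and then comparing how well the hidden edges rank among the scored pairs.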
Successive image generation using cyclic transformations is demonstrated by extending the CycleGAN model to transform images among three different categories. Repeated application of the trained generators produces sequences of images that transition among the different categories. The generated image sequences occupy a more limited region of the image space compared with the original training dataset. Quantitative evaluation using precision and recall metrics indicates that the generated images have high quality but reduced diversity relative to the training dataset. Such successive generation processes are characterized as chaotic dynamics in terms of dynamical system theory. Positive Lyapunov exponents estimated from the generated trajectories confirm the presence of chaotic dynamics, with the Lyapunov dimension of the attractor found to be comparable to the intrinsic dimension of the training data manifold. The results suggest that chaotic dynamics in the image space defined by the deep generative model contribute to the diversity of the generated images, constituting a novel approach for multi-class image generation. This model can be interpreted as an extension of classical a
We use topological data analysis as a tool to analyze the fit of mathematical models to experimental data. This study is built on data obtained from motion tracking groups of aphids in [Nilsen et al., PLOS One, 2013] and two random walk models that were proposed to describe the data. One model incorporates social interactions between the insects, and the second model is a control model that excludes these interactions. We compare data from each model to data from experiment by performing statistical tests based on three different sets of measures. First, we use time series of order parameters commonly used in collective motion studies. These order parameters measure the overall polarization and angular momentum of the group, and do not rely on a priori knowledge of the models that produced the data. Second, we use order parameter time series that do rely on a priori knowledge, namely average distance to nearest neighbor and percentage of aphids moving. Third, we use computational persistent homology to calculate topological signatures of the data. Analysis of the a priori order parameters indicates that the interactive model better describes the experimental data than the control m