With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to previously collected papers. However, researchers sometimes struggle to make sense of nuanced connections between recommended papers and their own research context, as existing systems only present paper titles and abstracts. To help researchers spot these connections, we present PaperWeaver, an enriched paper alerts system that provides contextualized text descriptions of recommended papers based on user-collected papers. PaperWeaver employs a computational method based on Large Language Models (LLMs) to infer users' research interests from their collected papers, extract context-specific aspects of papers, and compare recommended and collected papers on these aspects. Our user study (N=15) showed that participants using PaperWeaver were able to better understand the relevance of recommended papers and triage them more confidently when compared to a baseline that presented the related work sections from recommended papers.
We read twelve well-known LLM agent benchmark papers and recorded, dimension by dimension, what each paper actually says about how its evaluation was run. The motivation came from a familiar frustration: two papers will report results on the same benchmark with the same model name and disagree, and you cannot tell why -- the scaffold, the sampling settings, the subset, or the evaluator version. In many cases the published artifact does not let you answer. This paper is an implementation report on the attempt. We designed a small audit schema (five fields: benchmark identity, harness specification, inference settings, cost reporting, failure breakdown), wrote a scoring codebook with the boundary cases we hit during pilot scoring, applied it to twelve canonical papers (eight agent, four classical static), and recorded what we saw. We score the disclosure of an agent run, not its correctness, and make no claim that disclosure implies a trustworthy result. The mean audit score across the eight agent-benchmark papers is 0.38 (out of 1.0), and across the four classical static benchmarks 0.66; the largest gap is on cost (none of the eight agent benchmark papers disclose inference cost in
Automated paper reproduction -- generating executable code from academic papers -- is bottlenecked not by information retrieval but by the tacit knowledge that papers inevitably leave implicit. We formalize this challenge as the progressive recovery of three types of tacit knowledge -- relational, somatic, and collective -- and propose \method, a graph-based agent framework with a dedicated mechanism for each: node-level relation-aware aggregation recovers relational knowledge by analyzing implementation-unit-level reuse and adaptation relationships between the target paper and its citation neighbors; execution-feedback refinement recovers somatic knowledge through iterative debugging driven by runtime signals; and graph-level knowledge induction distills collective knowledge from clusters of papers sharing similar implementations. On an extended ReproduceBench spanning 3 domains, 10 tasks, and 40 recent papers, \method{} achieves an average performance gap of 10.04\% against official implementations, improving over the strongest baseline by 24.68\%. The code will be publicly released upon acceptance; the repository link will be provided in the final version.
This paper introduces the first systematic evaluation framework for quantifying the quality and risks of papers written by modern coding agents. While AI-driven paper writing has become a growing concern, rigorous evaluation of the quality and potential risks of AI-written papers remains limited, and a unified understanding of their reliability is still lacking. We introduce Paper Reconstruction Evaluation (PaperRecon), an evaluation framework in which an overview (overview.md) is created from an existing paper, after which an agent generates a full paper based on the overview and minimal additional resources, and the result is subsequently compared against the original paper. PaperRecon disentangles the evaluation of the AI-written papers into two orthogonal dimensions, Presentation and Hallucination, where Presentation is evaluated using a rubric and Hallucination is assessed via agentic evaluation grounded in the original paper source. For evaluation, we introduce PaperWrite-Bench, a benchmark of 51 papers from top-tier venues across diverse domains published after 2025. Our experiments reveal a clear trade-off: while both ClaudeCode and Codex improve with model advances, Claude
A reference-based classification system for individual Scopus publications is presented which takes into account the categories of the papers citing those references instead of the journals in which those cited papers are published. It supports multiple assignments of up to 5 categories within the Scopus ASJC structure, but eliminates the Multidisciplinary Area and the miscellaneous categories, and it allows for the reclassification of a greater number of publications (potentially 100%) than traditional reference-based systems. Twelve variants of the system were obtained by adjusting different parameters, which were applied to the more than 3.2 million citable papers from the active Scientific Journals in 2020 indexed in Scopus. The results were analyzed and compared with other classification systems such as the original journal-based Scopus ASJC, the 2-generation-reference based M3-AWC-0.8 (Álvarez-Llorente et al., 2024), and the corresponding authors' assignment based AAC (Álvarez-Llorente et al., 2023). The resulting variants of the classification give results that improve on those of the reference systems in multiple scientometric fields. The variation called U1-F-0.8 seems esp
To measure how HCI papers are cited across disciplinary boundaries, we collected a citation dataset of CHI, UIST, and CSCW papers published between 2010 and 2020. Our analysis indicates that HCI papers have become increasingly likely to be cited by HCI papers rather than by non-HCI papers.
In a letter to Weierstrass, Riemann asserted that the number $N_0(T)$ of zeros of $\zeta(s)$ on the critical line to height $T$ is approximately equal to the total number of zeros to this height $N(T)$. Siegel studied some posthumous papers of Riemann trying to find a proof of this. He found a function $\mathop{\mathcal R }(s)$ whose zeros are related to the zeros of the function $\zeta(s)$. Siegel concluded that Riemann's papers contained no ideas for a proof of his assertion, connected the position of the zeros of $\mathop{\mathcal R }(s)$ with the position of the zeros of $\zeta(s)$ and asked about the position of the zeros of $\mathop{\mathcal R }(s)$. This paper is a summary of several papers that we will soon upload to arXiv, in which we try to answer Siegel's question about the position of the zeros of $\mathop{\mathcal R }(s)$. The articles also contain improvements on Siegel's results and other possible ways to prove Riemann's assertion, but without achieving this goal.
With the introduction of the Visualization for Communication workshop (VisComm) at IEEE VIS and in light of the COVID-19 pandemic, there has been renewed interest in studying visualization as a medium of communication. However, the characteristics and definition of this line of study tend to vary from paper to paper and person to person. In this work, we examine the 37 papers accepted to VisComm from 2018 through 2022. Using grounded theory, we identify nuances in how VisComm defines visualization, common themes in the work in this area, and a noticeable gap in DEI practices.
This paper is a commentary and a reading guide to three papers by Herbert Busemann, "Über die Geometrien, in denen die 'Kreise mit unendlichem Radius' die kürzesten Linien sind" (On the geometries where circles of infinite radius are the shortest lines) (1932), "Paschsches Axiom und Zweidimensionalität" (Pasch's Axiom and Two-Dimensionality) (1933) and "Über Räume mit konvexen Kugeln und Parallelenaxiom" (On spaces with convex spheres and the parallel postulate) (1933). These are the first papers that Busemann wrote on the foundations of geometry and the axiomatic characterization of Minkowski spaces (finite-dimensional normed spaces). The subject of these papers followed Busemann for the rest of his life, and the three papers already contain several ideas and techniques that he developed later on, in his work on the subject which lasted several decades. The three papers were translated into English by Annette A'Campo. These translations, together with the final version of the present commentary, will be part of the forthcoming edition of Busemann's Collected Papers.
Subject categories of scholarly papers generally refer to the knowledge domain(s) to which the papers belong, examples being computer science or physics. Subject category information can be used for building faceted search for digital library search engines. This can significantly assist users in narrowing down their search space of relevant documents. Unfortunately, many academic papers do not have such information as part of their metadata. Existing methods for solving this task usually focus on unsupervised learning that often relies on citation networks. However, a complete list of papers citing the current paper may not be readily available. In particular, new papers that have few or no citations cannot be classified using such methods. Here, we propose a deep attentive neural network (DANN) that classifies scholarly papers using only their abstracts. The network is trained using 9 million abstracts from Web of Science (WoS). We also use the WoS schema that covers 104 subject categories. The proposed network consists of two bi-directional recurrent neural networks followed by an attention layer. We compare our model against baselines by varying the architecture and text repres
When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we introduce a novel interactive interface, Paper Plain, with four features powered by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guide readers to answering passages, and plain language summaries of the answering passages. We evaluate Paper Plain, finding that participants who use Paper Plain have an easier time reading and understanding research papers without a loss in paper comprehension compared to those who use a typical PDF reader. Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries, or "gists," alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers.
In an article written five years ago [arXiv:0809.0522], we described a method for predicting which scientific papers will be highly cited in the future, even if they are currently not highly cited. Applying the method to real citation data we made predictions about papers we believed would end up being well cited. Here we revisit those predictions, five years on, to see how well we did. Among the over 2000 papers in our original data set, we examine the fifty that, by the measures of our previous study, were predicted to do best and we find that they have indeed received substantially more citations in the intervening years than other papers, even after controlling for the number of prior citations. On average these top fifty papers have received 23 times as many citations in the last five years as the average paper in the data set as a whole, and 15 times as many as the average paper in a randomly drawn control group that started out with the same number of citations. Applying our prediction technique to current data, we also make new predictions of papers that we believe will be well cited in the next few years.
We study citation dynamics of the Physics, Economics, and Mathematics papers published in 1984 and focus on the fraction of uncited papers in these three collections. Our model of citation dynamics, which treats the citation process as an inhomogeneous Poisson process, captures this uncitedness ratio fairly well. It should be noted that all parameters and variables in our model are related to citations and their dynamics, while uncited papers appear as a byproduct of the citation process, and it is the Poisson statistics that make the cited and uncited papers inseparable. This indicates that most uncited papers constitute an inherent part of the scientific enterprise; namely, uncited papers are not unread.
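To make the Poisson argument concrete: if citations to a paper arrive as an inhomogeneous Poisson process with rate λ(t), the number of citations received up to time T is Poisson-distributed with mean Λ(T), the integral of λ(t) over [0, T], so the probability of still being uncited is exp(-Λ(T)). The sketch below is only an illustration of that standard identity, not the authors' calibrated model; the function names and the example rate functions are assumptions introduced here.

```python
import math
from typing import Callable, Sequence

Rate = Callable[[float], float]


def cumulative_rate(rate: Rate, horizon: float, steps: int = 1000) -> float:
    """Trapezoidal estimate of the integral of rate(t) over [0, horizon]."""
    dt = horizon / steps
    total = 0.5 * (rate(0.0) + rate(horizon))
    for k in range(1, steps):
        total += rate(k * dt)
    return total * dt


def uncited_probability(rate: Rate, horizon: float, steps: int = 1000) -> float:
    """P(zero citations by `horizon`) for a Poisson process with the given rate."""
    return math.exp(-cumulative_rate(rate, horizon, steps))


def uncited_fraction(rates: Sequence[Rate], horizon: float) -> float:
    """Expected uncited share of a collection whose papers have different rates."""
    return sum(uncited_probability(r, horizon) for r in rates) / len(rates)


if __name__ == "__main__":
    # Hypothetical rates (citations per year), chosen only for illustration.
    low = lambda t: 0.1 * math.exp(-t / 5.0)
    high = lambda t: 2.0 * math.exp(-t / 5.0)
    print(uncited_fraction([low, high], horizon=25.0))
```

Because every paper has a nonzero chance of drawing zero events, uncited papers come out of the same process that produces the cited ones, which is the sense in which the two groups are inseparable.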
This study attempts to detect papers originating from the Russia-based paper mill International Publisher LLC. A total of 1009 offers published during 2019-2021 on the 123mi.ru website were analysed. The study allowed us to identify at least 434 papers that are potentially linked to the paper mill including one preprint, a duplication paper and 15 republications of papers erroneously published in hijacked journals. Evidence of suspicious provenance from the paper mill is provided: matches in title, number of coauthorship slots, year of publication, country of the journal, country of a coauthorship slot and similarities of abstracts. These problematic papers are coauthored by scholars associated with at least 39 countries and submitted both to predatory and reputable journals. This study also demonstrates collaboration anomalies and the phenomenon of suspicious collaboration in questionable papers and examines the predictors of the Russia-based paper mill. The value of coauthorship slots offered by International Publisher LLC in 2019-2021 is estimated at $6.5 million. Since the study analysed a particular paper mill, it is likely that the number of papers with forged authorship is m
We demonstrate a comprehensive framework that accounts for citation dynamics of scientific papers and for the age distribution of references. We show that citation dynamics of scientific papers is nonlinear and this nonlinearity has far-reaching consequences, such as diverging citation distributions and runaway papers. We propose a nonlinear stochastic dynamic model of citation dynamics based on a link copying/redirection mechanism. The model is fully calibrated by empirical data and does not contain free parameters. This model can be a basis for quantitative probabilistic prediction of citation dynamics of individual papers and of the journal impact factor.
A multitude of factors are responsible for the overall quality of scientific papers, including readability, linguistic quality, fluency, semantic complexity, and of course domain-specific technical factors. These factors vary from one field of study to another. In this paper, we propose a measure and method for assessing the overall quality of the scientific papers in a particular field of study. We evaluate our method in the computer science domain, but it can be applied to other technical and scientific fields. Our method is based on the corpus linguistics technique. This technique enables the extraction of required information and knowledge associated with a specific domain. For this purpose, we have created a large corpus, consisting of papers from very high impact conferences. First, we analyze this corpus in order to extract rich domain-specific terminology and knowledge. Then we use the acquired knowledge to estimate the quality of scientific papers by applying our proposed measure. We examine our measure on high and low scientific impact test corpora. Our results show a significant difference in the measure scores of the high and low impact test corpora. Second, we develop a
This is a combination of the errata of seven papers published between 2008 and 2016 with Jiang-Tao Li (JTL) as the first author. All the problems are caused by two mistakes in the original scripts written by JTL used to calculate the physical parameters of the hot gas from X-ray spectral analysis with a thermal plasma code. The mistakes result in an overestimate of some parameters, such as the electron number density and hot gas mass, by a factor of $\sqrt{10}\approx3.162$, and an overestimate of the thermal pressure by a factor of $\approx2.725$. JTL apologizes to the community for the inconvenience caused by these mistakes. We present an update on the text, numbers, figures, and tables of all seven papers affected by these mistakes. Other papers led by JTL or co-authored papers are not affected.
Bibliometric information retrieval in databases can employ different strategies. Commonly, queries are performed by searching in title, abstract and/or author keywords (author vocabulary). More advanced queries employ database keywords to search in a controlled vocabulary. Queries based on search terms can be augmented with their citing papers if a research field cannot be curtailed by the search query alone. Here, we present another strategy to discover the most important papers of a research field. A marker paper is used to reveal the most important works for the relevant community. All papers co-cited with the marker paper are analyzed using reference publication year spectroscopy (RPYS). For demonstration of the marker paper approach, density functional theory (DFT) is used as a research field. Comparisons between a prior RPYS on a publication set compiled using a keyword-based search in a controlled vocabulary and three different co-citation RPYS (RPYS-CO) analyses show very similar results. Similarities and differences are discussed.
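The marker-paper idea can be sketched in a few lines. The example below assumes each citing paper is available as a list of (reference id, publication year) pairs, keeps only papers that cite the marker, and then builds a spectrum of reference counts per publication year together with the deviation from the five-year median, which is one common RPYS convention. The data layout, function names, and identifiers are hypothetical and are not taken from the abstract.

```python
from collections import Counter
from statistics import median
from typing import Dict, Iterable, List, Tuple

Reference = Tuple[str, int]  # (reference id, publication year)


def cocited_reference_years(
    citing_papers: Iterable[List[Reference]], marker_id: str
) -> List[int]:
    """Years of all references that appear alongside the marker paper."""
    years: List[int] = []
    for refs in citing_papers:
        if any(ref_id == marker_id for ref_id, _ in refs):
            years.extend(year for ref_id, year in refs if ref_id != marker_id)
    return years


def rpys_spectrum(reference_years: List[int]) -> Dict[int, Tuple[int, float]]:
    """Map each year to (reference count, deviation from the 5-year median)."""
    counts = Counter(reference_years)
    spectrum: Dict[int, Tuple[int, float]] = {}
    for year in range(min(counts), max(counts) + 1):
        # Window covers year-2 .. year+2; years without references count as 0.
        window = [counts.get(y, 0) for y in range(year - 2, year + 3)]
        spectrum[year] = (counts.get(year, 0), counts.get(year, 0) - median(window))
    return spectrum


if __name__ == "__main__":
    # Toy data: "M" is the hypothetical marker paper.
    papers = [
        [("M", 1965), ("A", 1964), ("B", 1930)],
        [("M", 1965), ("A", 1964)],
        [("C", 1990)],  # does not cite the marker, so it is ignored
    ]
    for year, (n, dev) in rpys_spectrum(cocited_reference_years(papers, "M")).items():
        if n:
            print(year, n, dev)
```

Years with a large positive deviation are the peaks that RPYS reads as candidate seminal works for the community around the marker paper.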
The accelerating pace of scientific publishing makes it increasingly difficult for researchers to stay current. We present Paper Espresso, an open-source platform that automatically discovers, summarizes, and analyzes trending arXiv papers. The system uses large language models (LLMs) to generate structured summaries with topical labels and keywords, and provides multi-granularity trend analysis at daily, weekly, and monthly scales through LLM-driven topic consolidation. Over 35 months of continuous deployment, Paper Espresso has processed over 13,300 papers and publicly released all structured metadata, revealing rich dynamics in the AI research landscape: a mid-2025 surge in reinforcement learning for LLM reasoning, non-saturating topic emergence (6,673 unique topics), and a positive correlation between topic novelty and community engagement (2.0x median upvotes for the most novel papers). A live demo is available at https://huggingface.co/spaces/Elfsong/Paper_Espresso.
In this study, we investigated a phenomenon that one would intuitively assume does not exist: self-citations at the paper level. In fact, papers citing themselves do exist in the Web of Science (WoS) database. In total, we obtained 44,857 papers that have self-citation relations in the WoS raw dataset. In part, they are database artefacts but in part they are due to papers citing themselves in the conclusion or appendix. We also found cases where paper self-citations occur due to publisher-made highlights promoting and citing the paper. We analyzed the self-citing papers according to selected metadata. We observed accumulations of the number of self-citing papers across publication years. We found a skewed distribution across countries, journals, authors, fields, and document types. Finally, we discuss the implications of paper self-citations for bibliometric indicators.