We compare the network of aggregated journal-journal citation relations provided by the Journal Citation Reports (JCR) 2012 of the Science and Social Science Citation Indexes (SCI and SSCI) with similar data based on Scopus 2012. First, global maps were developed for the two sets separately; sets of documents can then be compared using overlays to both maps. Using fuzzy-string matching and ISSN numbers, we were able to match 10,524 journal names between the two sets; that is, 96.4% of the 10,936 journals contained in JCR or 51.2% of the 20,554 journals covered by Scopus. Network analysis was then pursued on the set of journals shared between the two databases and the two sets of unique journals. Citations among the shared journals are more comprehensively covered in JCR than in Scopus, so the network in JCR is denser and more connected than in Scopus. The ranking of shared journals in terms of indegree (that is, numbers of citing journals) or total citations is similar in both databases overall (Spearman's ρ > 0.97), but some individual journals rank very differently. Journals that are unique to Scopus seem to be less important--they are citing shared journals rather than bein
Computational models in chemistry rely on a number of approximations. The effect of such approximations on observables derived from them is often unpredictable. Therefore, it is challenging to quantify the uncertainty of a computational result, which, however, is necessary to assess the suitability of a computational model. Common performance statistics such as the mean absolute error are prone to failure as they do not distinguish the explainable (systematic) part of the errors from their unexplainable (random) part. In this paper, we discuss problems and solutions for performance assessment of computational models based on several examples from the quantum chemistry literature. For this purpose, we elucidate the different sources of uncertainty, the elimination of systematic errors, and the combination of individual uncertainty components to the uncertainty of a prediction.
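The distinction between systematic and random error made above can be illustrated with a minimal sketch. It assumes made-up reference and computed values (hypothetical data, not taken from the paper) and uses a linear least-squares correction as one possible way to remove a systematic trend; the abstract itself does not prescribe this particular correction. The helper names `mean_absolute_error` and `fit_line` are illustrative only.

```python
import math
from statistics import mean


def mean_absolute_error(pred, ref):
    """Mean absolute deviation between predictions and reference values."""
    return mean(abs(p - r) for p, r in zip(pred, ref))


def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical data: a model that overestimates the reference values
# with a roughly linear trend plus a small scatter.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
computed = [1.32, 2.38, 3.53, 4.57, 5.70]

# The raw MAE mixes the systematic trend and the random scatter.
mae_raw = mean_absolute_error(computed, reference)

# Remove the systematic (explainable) part with a linear correction.
a, b = fit_line(computed, reference)
corrected = [a + b * c for c in computed]
residuals = [r - c for r, c in zip(reference, corrected)]
mae_corrected = mean_absolute_error(corrected, reference)

# Residual standard deviation with n - 2 degrees of freedom
# (two fitted parameters) as an estimate of the random part.
random_spread = math.sqrt(sum(e * e for e in residuals) / (len(residuals) - 2))

print(f"raw MAE:         {mae_raw:.3f}")
print(f"corrected MAE:   {mae_corrected:.3f}")
print(f"residual spread: {random_spread:.3f}")
```

In this toy case most of the raw error is explainable by the linear trend, so a single MAE figure would overstate the uncertainty that remains after correction.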
Rankings of scholarly journals based on citation data are often met with skepticism by the scientific community. Part of the skepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of authors. This paper focuses on analysis of the table of cross-citations among a selection of Statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care in order to avoid potential over-interpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's Research Assessment Exercise shows strong correlation at aggregate level between assessed research quality and journal citation `export scores' within the discipline of Statistics.
Computational chemistry has become an indispensable tool for generating data and insights, pervading all branches of experimental chemistry. Its most central concept is the potential energy hypersurface, key to all chemistry and materials science, as it assigns an energy to a molecular structure, the necessary ingredient for reaction mechanism elucidation and reaction rate calculation. Density functional theory (DFT) has been the most important method in practice for obtaining such energies, which is mirrored in the use of high-performance computing hardware. In the last two decades, a new class of surrogate potential energy functions has been evolving with remarkable properties: quantum accuracy combined with force-field speed. Until very recently, their application was hampered by the fact that they needed to be trained on truly large system-specific data sets, generated before a computational chemistry study could be started (in sharp contrast to DFT, which, as a first-principles method, works out of the box, but at a far higher price of computational cost). Very recently, this roadblock has been overcome by so-called foundation machine learning interatomic potentials, which are
We introduce ChemPro, a progressive benchmark with 4100 natural language question-answer pairs in Chemistry, across 4 coherent sections of difficulty designed to assess the proficiency of Large Language Models (LLMs) in a broad spectrum of general chemistry topics. We include Multiple Choice Questions and Numerical Questions spread across fine-grained information recall, long-horizon reasoning, multi-concept questions, problem-solving with nuanced articulation, and straightforward questions in a balanced ratio, effectively covering Bio-Chemistry, Inorganic-Chemistry, Organic-Chemistry and Physical-Chemistry. ChemPro is carefully designed analogous to a student's academic evaluation for basic to high-school chemistry. A gradual increase in the question difficulty rigorously tests the ability of LLMs to progress from solving basic problems to solving more sophisticated challenges. We evaluate 45+7 state-of-the-art LLMs, spanning both open-source and proprietary variants, and our analysis reveals that while LLMs perform well on basic chemistry questions, their accuracy declines with different types and levels of complexity. These findings highlight the critical limitations of LLMs in
Using the Scopus dataset (1996-2007) a grand matrix of aggregated journal-journal citations was constructed. This matrix can be compared in terms of the network structures with the matrix contained in the Journal Citation Reports (JCR) of the Institute for Scientific Information (ISI). Since the Scopus database contains a larger number of journals and also covers the humanities, one would expect richer maps. However, the matrix is in this case sparser than in the case of the ISI data. This is due to (i) the larger number of journals covered by Scopus and (ii) the historical record of citations older than ten years contained in the ISI database. When the data is highly structured, as in the case of large journals, the maps are comparable, although one may have to vary a threshold (because of the differences in densities). In the case of interdisciplinary journals and journals in the social sciences and humanities, the new database does not add a lot to what is possible with the ISI databases.
A number of journal classification systems have been developed in bibliometrics since the launch of the Citation Indices by the Institute for Scientific Information (ISI) in the 1960s. These systems are used to normalize citation counts with respect to field-specific citation patterns. The best known system is the so-called "Web-of-Science Subject Categories" (WCs). In other systems, papers are classified by algorithmic solutions. Using the Journal Citation Reports 2014 of the Science Citation Index and the Social Science Citation Index (n of journals = 11,149), we examine options for developing a new system based on journal classifications into subject categories using aggregated journal-journal citation data. Combining routines in VOSviewer and Pajek, a tree-like classification is developed. At each level one can generate a map of science for all the journals subsumed under a category. Nine major fields are distinguished at the top level. Further decomposition of the social sciences is pursued for the sake of example with a focus on journals in information science (LIS) and science studies (STS). The new classification system improves on alternative options by avoiding the problem
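The sparseness comparison and the role of a threshold can be sketched as follows. This is a minimal illustration on two tiny, made-up citation matrices (hypothetical numbers, not Scopus or ISI data). Density is taken here as the share of non-zero off-diagonal cells, which is one common convention and an assumption of this sketch; the function names are illustrative.

```python
def density(matrix, include_diagonal=False):
    """Share of non-zero cells in a square citation matrix.

    By default the diagonal (journal self-citations) is ignored.
    """
    n = len(matrix)
    nonzero = sum(
        1
        for i in range(n)
        for j in range(n)
        if (include_diagonal or i != j) and matrix[i][j] > 0
    )
    cells = n * n if include_diagonal else n * (n - 1)
    return nonzero / cells


def apply_threshold(matrix, threshold):
    """Set all cells below the threshold to zero."""
    return [[v if v >= threshold else 0 for v in row] for row in matrix]


# Hypothetical matrices: rows cite columns.
small_dense = [
    [0, 5, 2],
    [3, 0, 1],
    [4, 6, 0],
]
large_sparse = [
    [0, 5, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 2, 0, 0],
]

print(density(small_dense))                      # all off-diagonal cells filled
print(density(large_sparse))                     # only 4 of 12 cells filled
print(density(apply_threshold(small_dense, 3)))  # thresholding lowers density
```

Raising the threshold on the denser matrix brings its density closer to that of the sparser one, which is the kind of adjustment one may need before comparing maps.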
Using three years of the Journal Citation Reports (2011, 2012, and 2013), indicators of transitions in 2012 (between 2011 and 2013) are studied using methodologies based on entropy statistics. Changes can be indicated at the level of journals using the margin totals of entropy production along the row or column vectors, but also at the level of links among journals by importing the transition matrices into network analysis and visualization programs (and using community-finding algorithms). Seventy-four journals are flagged in terms of discontinuous changes in their citations; but 3,114 journals are involved in "hot" links. Most of these links are embedded in a main component; 78 clusters (containing 172 journals) are flagged as potential "hot spots" emerging at the network level. An additional finding is that PLoS ONE introduced a new communication dynamics into the database. The limitations of the methodology are elaborated using an example. The results of the study indicate where developments in the citation dynamics can be considered as significantly unexpected. This can be used as heuristic information; but what a "hot spot" in terms of the entropy statistics of aggregated cit
Defining and measuring internationality as a function of influence diffusion of scientific journals is an open problem. There exists no metric to rank journals based on the extent or scale of internationality. The measurement of internationality is qualitative, vague, open to interpretation, and limited by vested interests. With the tremendous increase in the number of journals in various fields and the unflinching desire of academics across the globe to publish in "international" journals, it has become an absolute necessity to evaluate, rank and categorize journals based on internationality. In the current work, the authors define internationality as a measure of influence that transcends geographic boundaries. The authors raise concerns about unethical practices in the process of journal publication whereby the scholarly influence of a select few is artificially boosted, primarily by resorting to editorial maneuvers. To counter the impact of such tactics, the authors propose a new method that defines and measures internationality by eliminating such local effects when computing the influence of journals. A new metric, Non-Local Influence Quotient(NLI
To enhance large language models (LLMs) for chemistry problem solving, several LLM-based agents augmented with tools have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemToolAgent, an enhanced chemistry agent over ChemCrow, and conduct a comprehensive evaluation of its performance on both specialized chemistry tasks and general chemistry questions. Surprisingly, ChemToolAgent does not consistently outperform its base LLMs without tools. Our error analysis with a chemistry expert suggests that: For specialized chemistry tasks, such as synthesis prediction, we should augment agents with specialized tools; however, for general chemistry questions like those in exams, agents' ability to reason correctly with chemistry knowledge matters more, and tool augmentation does not always help.
Using "Analyze Results" at the Web of Science, one can directly generate overlays onto global journal maps of science. The maps are based on the 10,000+ journals contained in the Journal Citation Reports (JCR) of the Science and Social Science Citation Indices (2011). The disciplinary diversity of the retrieval is measured in terms of Rao-Stirling's "quadratic entropy." Since this indicator of interdisciplinarity is normalized between zero and one, the interdisciplinarity can be compared among document sets and across years, cited or citing. The colors used for the overlays are based on Blondel et al.'s (2008) community-finding algorithms operating on the relations among the journals included in the JCR. The results can be exported from VOSviewer with different options such as proportional labels, heat maps, or cluster density maps. The maps can also be web-started and/or animated (e.g., using PowerPoint). The "citing" dimension of the aggregated journal-journal citation matrix was found to provide a more comprehensive description than the matrix based on the cited archive. The relations between local and global maps and their different functions in studying the sciences in terms of journal lit
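As a minimal sketch of the Rao-Stirling measure mentioned above, the following computes the sum of p_i * p_j * d_ij over all ordered pairs of distinct categories. The proportions and the distance matrix are hypothetical values chosen for illustration; distances are assumed to lie between zero and one (in the literature they are often derived as one minus a cosine similarity, which is an assumption here and not stated in the abstract).

```python
def rao_stirling(proportions, distance):
    """Rao-Stirling diversity: sum of p_i * p_j * d_ij over ordered pairs i != j.

    proportions: shares of the document set per category (summing to 1)
    distance:    symmetric matrix of distances between categories in [0, 1]
    """
    n = len(proportions)
    return sum(
        proportions[i] * proportions[j] * distance[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Hypothetical example with three categories.
distance = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]
mixed_set = [0.5, 0.3, 0.2]    # documents spread over all three categories
focused_set = [1.0, 0.0, 0.0]  # documents in a single category

print(rao_stirling(mixed_set, distance))
print(rao_stirling(focused_set, distance))
```

A document set concentrated in one category scores zero, while a set spread over distant categories scores higher, so the measure rewards both balance and distance.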
The past decade has led to significant improvements in our understanding of the physical structure of the molecular cores of cold dark clouds. Observational efforts, in combination with improved knowledge of cloud structure, now provide clear evidence that the chemistry of dark clouds is dominated by the depletion of gaseous species onto grain surfaces. We outline the basis of these observational efforts and show how the abundance determinations have moved beyond single point analyses to the derivation of abundance profiles. We discuss the basic physics of the interaction between molecules and grain surfaces and show that when physics is coupled into a chemical model there is excellent agreement, for a limited set of species, between theory and observations. We discuss how our improved understanding of cloud chemistry can be used as a new tool for studies of the formation of stars and planetary systems.
Topological surface states, a new kind of electronic state of matter, have recently been observed on the cleaved surfaces of crystals of a handful of small band gap semiconductors. The underlying chemical factors that enable these states are crystal symmetry, the presence of strong spin-orbit coupling, and an inversion of the energies of the bulk electronic states that normally contribute to the valence and conduction bands. The goals of this review are to briefly introduce the physics of topological insulators to a chemical audience and to describe the chemistry, defect chemistry, and crystal structures of the compounds in this emergent field.
Astrochemistry lies at the nexus of astronomy, chemistry, and molecular physics. On the basis of precise laboratory data, a rich collection of more than 200 familiar and exotic molecules has been identified in the interstellar medium, the vast majority by their unique rotational fingerprint. Despite this large body of work and long-standing, sustained efforts during the past 50 years, there is scant evidence in the radio band for the basic building blocks of chemistry on Earth -- five- and six-membered rings. In contrast, a peculiar structural motif, highly unsaturated carbon in a chain-like arrangement, is instead quite common in space. The recent astronomical detection of cyanobenzene, the simplest aromatic nitrile, in the dark molecular cloud TMC-1, and soon afterwards in additional pre-stellar, and possibly protostellar sources, establishes that aromatic chemistry is likely widespread in the earliest stages of star formation. The subsequent discovery of cyanocyclopentadienes and even cyanonaphthalenes in TMC-1 provides further evidence that organic molecules of considerable complexity are readily synthesized in regions with high visual extinction but where the low tempe
The stereochemistry of the 6s2 lone pair (E) of divalent Pb and trivalent Bi (PbII and BiIII, designated M*) in the structurally related compounds PbO, PbFX (X = Cl, Br, I), BiOX (X = F, Cl, Br, I) and Bi2NbO5F is rationalized. The presence of the lone pair (LP) is described by its sphere of influence E, equal to that of oxygen or fluorine anions; locating the center of this sphere gives the M*-E directions and distances. A detailed description of the structural features of both elements in the title compounds, which are characterized by [PbEO]n and [BiEO]n layers, shows how the M*-E distance evolves with changes in the square pyramidal (SP) coordination polyhedra. All are different: in red PbO one finds a {PbEO4E4} square antiprism, in PbFX and BiOX a {[Bi.E]O4X4Xapical} monocapped square antiprism, and in Bi2NbO5F a {BiEO4F4} square antiprism. To analyze the crystal chemistry results, the electronic structures of these compounds were calculated within density functional theory (DFT). Real-space analyses of electron localization illustrate a full volume development of the lone pair on PbII within {PbEO4E4} in PbOE, {PbEF4X4} in PbFXE and Bi(III) within {BiEO4X4} square antiprisms, contrary to Bi(III) within {[Bi.E]O4F4Fapic
Dyads of journals related by citations can agglomerate into specialties through the mechanism of triadic closure. Using the Journal Citation Reports 2011, 2012, and 2013, we analyze triad formation as indicators of integration (specialty growth) and disintegration (restructuring). The strongest integration is found among the large journals that report on studies in different scientific specialties, such as PLoS ONE, Nature Communications, Nature, and Science. This tendency towards large-scale integration has not yet stabilized. Using the Islands algorithm, we also distinguish 51 local maxima of integration. We zoom into the cited articles that carry the integration for: (i) a new development within high-energy physics and (ii) an emerging interface between the journals Applied Mathematical Modeling and the International Journal of Advanced Manufacturing Technology. In the first case, integration is brought about by a specific communication reaching across specialty boundaries, whereas in the second, the dyad of journals indicates an emerging interface between specialties. These results suggest that integration picks up substantive developments at the specialty level. An advantage o
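The mechanism of triadic closure described above can be sketched in a few lines. This is a simplified illustration, not the procedure used in the study: citation relations are reduced to undirected, binary ties (the real data are directed and valued), the journal labels A to D are hypothetical, and the function name `closed_triads` is illustrative.

```python
from itertools import combinations


def closed_triads(edges):
    """Return the set of closed triads (triangles) in an undirected network."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    triads = set()
    for a, b, c in combinations(sorted(neighbors), 3):
        # A triad is closed when all three pairs are linked.
        if b in neighbors[a] and c in neighbors[a] and c in neighbors[b]:
            triads.add((a, b, c))
    return triads


# Hypothetical citation ties among four journals in two consecutive years.
ties_year_1 = [("A", "B"), ("B", "C"), ("C", "D")]
ties_year_2 = ties_year_1 + [("A", "C")]  # the open triad A-B-C closes

newly_closed = closed_triads(ties_year_2) - closed_triads(ties_year_1)
print(newly_closed)
```

Triads that appear in the later year but not in the earlier one can be read as integration, and triads that disappear as disintegration, under the simplifying assumptions stated above.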
Conference publications in computer science (CS) have attracted scholarly attention due to their unique status as a main research outlet, unlike other science fields where journals are the dominant channel for communicating research findings. One frequent research question has been how conference and journal publications differ, taking a paper as the unit of analysis. This study takes an author-based approach to analyze publishing patterns of 517,763 scholars who have published in both CS conferences and journals over the last 57 years, as recorded in DBLP. The analysis shows that the majority of CS scholars tend to make their scholarly debut, publish more papers, and collaborate with more coauthors in conferences than in journals. Importantly, conference papers seem to serve as a distinct channel of scholarly communication, not a mere preceding step to journal publications: coauthors and title words of authors across conferences and journals tend not to overlap much. This study corroborates findings of previous studies on this topic from a distinctive perspective and suggests that conference authorship in CS calls for more special attention from scholars and administrators
We introduce a novel methodology for mapping academic institutions based on their journal publication profiles. We believe that journals in which researchers from academic institutions publish their works can be considered as useful identifiers for representing the relationships between these institutions and establishing comparisons. However, when academic journals are used for research output representation, distinctions must be introduced between them, based on their value as institution descriptors. This leads us to the use of journal weights attached to the institution identifiers. Since a journal in which researchers from a large proportion of institutions published their papers may be a bad indicator of similarity between two academic institutions, it seems reasonable to weight it in accordance with how frequently researchers from different institutions published their papers in this journal. Cluster analysis can then be applied to group the academic institutions, and dendrograms can be provided to illustrate groups of institutions following agglomerative hierarchical clustering. In order to test this methodology, we use a sample of Spanish universities as a case study. We f
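The weighting idea can be sketched as follows. The text does not give the exact weighting formula, so this sketch assumes a weight analogous to inverse document frequency, log(N / n_j), where N is the number of institutions and n_j the number of institutions publishing in journal j. The institution names, journal names, and paper counts are hypothetical, and the similarity measure (a weighted cosine) is likewise an assumption; a clustering step would then operate on such similarities.

```python
import math


def journal_weights(profiles):
    """Weight each journal by log(N / n_j): journals used by every
    institution get weight 0, rarely shared journals get high weights."""
    n_institutions = len(profiles)
    counts = {}
    for journals in profiles.values():
        for journal in journals:
            counts[journal] = counts.get(journal, 0) + 1
    return {j: math.log(n_institutions / c) for j, c in counts.items()}


def weighted_cosine(p, q, weights):
    """Cosine similarity between two weighted publication profiles."""
    shared = set(p) & set(q)
    dot = sum(weights[j] ** 2 * p[j] * q[j] for j in shared)
    norm_p = math.sqrt(sum((weights[j] * v) ** 2 for j, v in p.items()))
    norm_q = math.sqrt(sum((weights[j] * v) ** 2 for j, v in q.items()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical profiles: institution -> {journal: number of papers}.
profiles = {
    "U1": {"J_general": 10, "J_chem": 5},
    "U2": {"J_general": 8, "J_chem": 4},
    "U3": {"J_general": 9, "J_law": 6},
}

weights = journal_weights(profiles)
sim_u1_u2 = weighted_cosine(profiles["U1"], profiles["U2"], weights)
sim_u1_u3 = weighted_cosine(profiles["U1"], profiles["U3"], weights)
print(weights)
print(sim_u1_u2, sim_u1_u3)
```

In this toy case the journal used by all three institutions carries no weight, so U1 and U3 are judged dissimilar even though both publish heavily in it, while U1 and U2 are grouped by their shared specialist journal.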
Publication patterns of 79 forest scientists awarded major international forestry prizes during 1990-2010 were compared with the journal classification and ranking promoted as part of the 'Excellence in Research for Australia' (ERA) by the Australian Research Council. The data revealed that these scientists exhibited an elite publication performance during the decade before and two decades following their first major award. An analysis of their 1703 articles in 431 journals revealed substantial differences between the journal choices of these elite scientists and the ERA classification and ranking of journals. Implications from these findings are that additional cross-classifications should be added for many journals, and there should be an adjustment to the ranking of several journals relevant to the ERA Field of Research classified as 0705 Forestry Sciences.
The Monte Carlo Computational Summit was held on the campus of the University of Notre Dame in South Bend, Indiana, USA on 25--26 October 2023. The goals of the summit were to discuss algorithmic and software alterations required for successfully porting respective code bases to exascale-class computing hardware, compare software engineering techniques used by various code teams, and consider the adoption of industry-standard benchmark problems to better facilitate code-to-code performance comparisons. A large portion of the meeting included candid discussions of direct experiences with approaches that have and have not worked. Participants reported that identifying and implementing suitable Monte Carlo algorithms for GPUs continues to be a sticking point. They also reported significant difficulty porting existing algorithms between GPU APIs (specifically Nvidia CUDA to AMD ROCm). To better compare code-to-code performance, participants decided to design a C5G7-like benchmark problem with a defined figure of merit, with the expectation of adding more benchmarks in the future. Problem specifications and results will eventually be hosted in a public repository and will be open to submi