Oscar Wilde said, "The difference between literature and journalism is that journalism is unreadable, and literature is not read." Unfortunately, the digitally archived journalism of Oscar Wilde's 19th century often has no or poor-quality Optical Character Recognition (OCR), reducing the accessibility of these archives and making them unreadable both figuratively and literally. This paper helps address the issue by performing OCR on "The Nineteenth Century Serials Edition" (NCSE), an 84k-page collection of 19th-century English newspapers and periodicals, using Pixtral 12B, a pre-trained image-to-text language model. The OCR capability of Pixtral was compared to 4 other OCR approaches, achieving a median character error rate of 1%, 5x lower than the next best model. The resulting NCSE v2.0 dataset features improved article identification, high-quality OCR, and text classified into four types and seventeen topics. The dataset contains 1.4 million entries and 321 million words. Example use cases demonstrate analysis of topic similarity, readability, and event tracking. NCSE v2.0 is freely available to encourage historical and sociological research. As a result, 21st-century readers c
The Harvard College Observatory was the preeminent astronomical data center of the early 20th century: it gathered and archived an enormous collection of glass photographic plates that became, and remains, the largest in the world. For nearly twenty years DASCH (Digital Access to a Sky Century @ Harvard) actively digitized this library using a one-of-a-kind plate scanner. In early 2024, after 470,000 scans, the DASCH project finished. Now, this unique analog dataset can be integrated into 21st-century digital analyses. The key DASCH data products include ~200 TB of plate images, ~16 TB of calibrated light curves, and a variety of supporting metadata and calibration outputs. Virtually every part of the sky is covered by thousands of DASCH images with a time baseline spanning more than 100 years; most stars brighter than B ~ 15 have hundreds or thousands of detections. DASCH Data Release 7, issued in late 2024, represents the culmination of the DASCH scanning project.
In English literature, the 19th century witnessed a significant transition in styles, themes, and genres. Consequently, the novels from this period display remarkable diversity. This paper explores these variations by examining the evolution of term usage in 19th century English novels through the lens of information retrieval. By applying a query expansion-based approach to a decade-segmented collection of fiction from the British Library, we examine how related terms vary over time. Our analysis employs multiple standard metrics including Kendall's tau, Jaccard similarity, and Jensen-Shannon divergence to assess overlaps and shifts in expanded query term sets. Our results indicate a significant degree of divergence in the related terms across decades as selected by the query expansion technique, suggesting substantial linguistic and conceptual changes throughout the 19th century novels.
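The three overlap measures named above can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's pipeline: the term lists and the query they are imagined to expand are hypothetical, Kendall's tau is computed in its simple tau-a form over the terms shared by both lists, and the Jensen-Shannon divergence uses base-2 logarithms so that it lies between 0 and 1.

```python
import math
from itertools import combinations


def jaccard(terms_a, terms_b):
    """Jaccard similarity between two expanded term sets."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def js_divergence(weights_p, weights_q):
    """Jensen-Shannon divergence (base 2) between two term -> weight dicts."""
    total_p, total_q = sum(weights_p.values()), sum(weights_q.values())
    jsd = 0.0
    for term in set(weights_p) | set(weights_q):
        p = weights_p.get(term, 0.0) / total_p
        q = weights_q.get(term, 0.0) / total_q
        m = 0.5 * (p + q)
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


def kendall_tau(ranked_a, ranked_b):
    """Kendall's tau-a over the terms common to both ranked lists (no ties)."""
    position_b = {term: i for i, term in enumerate(ranked_b)}
    common = [term for term in ranked_a if term in position_b]
    n = len(common)
    if n < 2:
        return float("nan")
    concordant = discordant = 0
    # Each pair (x, y) has x ranked before y in ranked_a.
    for x, y in combinations(common, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical expansions of one query in two different decades.
terms_1810s = ["machine", "wheel", "steam", "mill"]
terms_1890s = ["steam", "railway", "machine", "motor"]

overlap = jaccard(terms_1810s, terms_1890s)
rank_agreement = kendall_tau(terms_1810s, terms_1890s)
print(overlap, rank_agreement)
```

A low Jaccard score says the two decades share few expansion terms, while a negative tau says the shared terms are ranked in opposite order; the two measures therefore capture different aspects of divergence.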
During the century of existence of the notion of coherent states, either linear or nonlinear, several schemes for their construction, theoretical or experimental, have been developed. Generally, the mathematical structure of coherent states depends on the choice of ladder operators, and consequently on the structure constants. In this paper, we propose a way to construct generalized coherent states for anharmonic oscillators that is based on a diagonal operator ordering technique (DOOT) applied to generalized hypergeometric functions, that is, on some of the most general special functions. These states are generated by the action of a pair of ladder operators, the creation and the annihilation, whose ordered normal product is equal to the dimensionless Hamiltonian of the quantum system. In addition, the action of these operators is easy to find if the expression for the dimensionless energy eigenvalues is known.
The natural fluxes of CO2 and CH4 into the atmosphere from the territory of Russia in the 21st century have been analyzed using the results of calculations with the ensemble of global climate models of the international project CMIP6. Estimates of natural CO2 fluxes for Russian regions differ greatly for different models. Their values for the beginning of the 21st century range from -1 to 1 GtC/yr. In the 21st century the differences in model estimates of fluxes grow and at the end of the 21st century under the scenario with the largest anthropogenic impacts SSP5-8.5 are in the range from -2.5 to 2.5 GtC/year. Estimates of natural methane emissions to the atmosphere from the territory of Russia also differ greatly for different models. Present-day methane emissions are estimated in the range from 10 to 35 MtCH4/year, while the growth in the 21st century may reach 300%. Ensemble model calculations show general trends for changes in natural greenhouse gas fluxes. Most CMIP6 ensemble models are characterized by a maximum of CO2 uptake by terrestrial ecosystems and its further reduction by the end of the 21st century, while natural methane emissions to the atmosphere for all models and
The IPCC AR6 assessment of the impacts and risks associated with projected climate changes for the 21st century is both alarming and ambiguous. According to computer projections, the global surface may warm from 1.3 to 8.0 °C by 2100, depending on the global climate model (GCM) and the shared socioeconomic pathway (SSP) scenario used for the simulations. However, a substantial number of CMIP6 GCMs run "too hot" because they appear to be too sensitive to radiative forcing, and the high/extreme emission scenarios SSP3-7.0 and SSP5-8.5 must be rejected because they are judged to be "unlikely" and "highly unlikely", respectively. This paper examines the impacts and risks of "realistic" climate change projections for the 21st century generated by assessing the theoretical models and integrating them with the existing empirical knowledge on global warming and the various natural cycles of climate change that have been recorded by a variety of scientists and historians. This is achieved by combining the "realistic" SSP2-4.5 scenario and empirically optimized climate modeling. The obtained climate projections show that the expected global surface warming for the 21st century will likely be mild
The experiment performed by Henry Cavendish to measure the density of the earth is described in numerous textbooks as a measurement of the universal gravitational constant, G, even though we know that this was not the case. In this paper, a study of how common this "myth" is has been carried out on a total of 84 textbooks, based on the checklist developed by Leite. The prevalence of the myth in most textbooks throughout the 20th century indicates a focus on the contemporary interests of authors and the physics community in presenting the development of physics. An explanation of the prevalence of the myth and a different approach
Bose-Einstein Condensation is a phenomenon at the heart of many of the past century's most intriguing and fundamental manifestations, such as superfluidity and superconductivity. It was discovered theoretically some 100 years ago, and unequivocally experimentally demonstrated in the context of weakly interacting gases 30 years ago. Since then, it has spawned a revolution in our understanding of fundamental phases of matter and collective quantum dynamics extending across all physical scales and energies, with unforeseen implications and the potential for envisaged quantum technological applications.
Humans acquire and accumulate knowledge through language usage and eagerly exchange their knowledge for advancement. Although geographical barriers had previously limited communication, the emergence of information technology has opened new avenues for knowledge exchange. However, it is unclear which communication pathway is dominant in the 21st century. Here, we explore the dominant path of knowledge diffusion in the 21st century using Wikipedia, the largest communal dataset. We evaluate the similarity of shared knowledge between population groups, distinguished based on their language usage. When population groups are more engaged with each other, their knowledge structure is more similar, where engagement is indicated by socioeconomic connections, such as cultural, linguistic, and historical features. Moreover, geographical proximity is no longer a critical requirement for knowledge dissemination. Furthermore, we integrate our data into a mechanistic model to better understand the underlying mechanism and suggest that the knowledge "Silk Road" of the 21st century is based online.
KIC8462852 is a completely-ordinary F3 main sequence star, except that the light curve from Kepler shows episodes of unique and inexplicable day-long dips with up to 20% dimming. Here, I provide a light curve of 1338 Johnson B-band magnitudes from 1890 to 1989 taken from archival photographic plates at Harvard. KIC8462852 displays a secular dimming at an average rate of 0.164±0.013 magnitudes per century. From the early-1890s to the late-1980s, KIC8462852 faded by 0.193±0.030 mag. The decline is not an artifact because nearby check stars have closely flat light curves. This century-long dimming is unprecedented for any F-type main sequence star. Thus the Harvard light curve provides the first confirmation (past the several dips seen in the Kepler light curve alone) that KIC8462852 has anything unusual. The century-long dimming and the day-long dips are both just extreme ends of a spectrum of timescales for unique dimming events. By Ockham's Razor, two such unique and similar effects are very likely produced by one physical mechanism. This one mechanism does not appear as any isolated catastrophic event in the last century, but rather must be some ongoing process with continuous e
The twentieth century was a period of outstanding economic growth together with an unequal income distribution. This paper analyses the international distribution of growth rates and its dynamics during the twentieth century. We show that the whole century is characterized by a high heterogeneity in the distribution of GDP per capita growth rates, which is reflected in different shapes and a persistent asymmetry of the distributions at the regional level and for countries of different development levels. We find that in the context of the global conflicts that characterized the first half of the twentieth century and involved mainly large economies, the well-known negative scale relation between volatility and size of countries is not significant. After the year 1956, a redistribution of volatility leads to a significant negative scale relation, which has recently been considered a robust feature of the evolution of economic organizations. Our results contribute further empirical facts that call on traditional macroeconomic theories to better explain the underlying complexity of the growth process, and they shed light on its historical evolution.
Physical science has changed in the century since Lord Kelvin's celebrated essay on Nineteenth Century Clouds over the Dynamical Theory of Heat and Light, but some things are the same. Analogs in what was happening in physics then and what is happening in astronomy today serve to remind us why we can be confident the Virtual Observatory of the twenty-first century will have a rich list of challenges to explore.
Progress in science has advanced the development of human society across history, with dramatic revolutions shaped by information theory, genetic cloning, and artificial intelligence, among the many scientific achievements produced in the 20th century. However, the way that science advances itself is much less well-understood. In this work, we study the evolution of scientific development over the past century by presenting an anatomy of 89 million digitalized papers published between 1900 and 2015. We find that science has benefited from the shift from individual work to collaborative effort, with over 90% of the world-leading innovations generated by collaborations in this century, nearly four times higher than they were in the 1900s. We discover that rather than the frequent myopic- and self-referencing that was common in the early 20th century, modern scientists instead tend to look for literature further back and farther around. Finally, we also observe the globalization of scientific development from 1900 to 2015, including 25-fold and 7-fold increases in international collaborations and citations, respectively, as well as a dramatic decline in the dominant accumulation of ci
This paper aims to assess the effects of industrial pollution on infant mortality between 1850 and 1940 using full-count decennial censuses. In this period, the US economy experienced a tremendous rise in industrial activity, with significant variation among counties in absorbing manufacturing industries. Since manufacturing industries are shown to be the main source of pollution, we use the share of employment at the county level in this industry to proxy for space-time variation in industrial pollution. Since male embryos are more vulnerable to external stressors like pollution during prenatal development, they face a higher likelihood of fetal death. Therefore, we proxy infant mortality with different measures of gender ratio. We show that the upswing in industrial pollution during the late nineteenth and early twentieth centuries led to an increase in infant mortality. The results are consistent and robust across different scenarios, measures for our proxies, and aggregation levels. We find that infants, and more specifically male infants, paid the price of pollution during the upswing in industrial growth at the dawn of the 20th century. Contemporary datasets
The evolution of theoretical physics in the XX century differs significantly from that in the XVII-XIX centuries. While continuous progress is observed for theoretical physics in the XVII-XIX centuries, modern physics contains many questions that have not been resolved despite many decades of discussion. Based upon the analysis of works by the founders of XX-century physics, the conclusion is made that the roots of the "eternal" questions of XX-century theoretical physics lie in the philosophy used by its founders. The conclusion is made about the need to use the ideas of philosophy that guided C. Huygens, I. Newton, W. Thomson (Lord Kelvin), J. C. Maxwell, and the other great physicists of the XVII-XIX centuries, in all areas of theoretical physics.
This paper gives an overview of several key innovations in the 19th century which led to complex geometry in the 20th century. These include the creation of the complex plane, the work of Abel on addition theorems for generalized elliptic integrals, the theory of elliptic functions, holomorphic functions, and the creation of Riemann surfaces by Riemann in the mid-19th century. A number of the original papers which contain these new ideas are looked at in some detail, and a detailed set of references is included.
We give a few examples of Chebyshev polynomials that appeared in mathematical problems from the 16th and 17th centuries. The main example is the famous equation of Adrianus Romanus (Adriaan van Roomen) containing a polynomial of degree $45$.
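The link between a degree-45 polynomial and Chebyshev polynomials can be made concrete with a small sketch. It assumes the usual trigonometric reading of van Roomen's equation, in which the unknown is $x = 2\sin\theta$ and the polynomial evaluates to $2\sin 45\theta$; under that assumption the polynomial is $C_{45}(x) = 2\,T_{45}(x/2)$, where $T_n$ is the Chebyshev polynomial of the first kind. The function name `rescaled_chebyshev` is a hypothetical helper, and the recurrence $C_{n+1}(x) = x\,C_n(x) - C_{n-1}(x)$ follows from the standard one for $T_n$.

```python
import math


def rescaled_chebyshev(n):
    """Integer coefficients (lowest degree first) of C_n(x) = 2*T_n(x/2)."""
    previous, current = [2], [0, 1]  # C_0 = 2, C_1 = x
    if n == 0:
        return previous
    for _ in range(n - 1):
        following = [0] + current  # multiply C_k by x
        for i, coefficient in enumerate(previous):
            following[i] -= coefficient  # subtract C_{k-1}
        previous, current = current, following
    return current


coefficients = rescaled_chebyshev(45)

# Numerical check of the assumed identity C_45(2 sin t) = 2 sin(45 t).
t = 0.01
x = 2 * math.sin(t)
value = sum(c * x**k for k, c in enumerate(coefficients))
print(coefficients[1], coefficients[3], coefficients[45])
print(abs(value - 2 * math.sin(45 * t)) < 1e-9)
```

Working with $2\,T_n(x/2)$ rather than $T_n$ itself keeps every coefficient an integer with leading coefficient 1, which is the monic form a 16th-century equation would naturally take.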
If the use of the apostrophe in contemporary English often marks the Saxon genitive, it may also indicate the omission of one or more letters. Some writers (wrongly?) use it to mark the plural in symbols or abbreviations, visualised thanks to the isolation of the morpheme "s". This punctuation mark was imported from the Continent in the 16th century. During the 19th century its use was standardised. However the rules of its usage still seem problematic to many, including literate speakers of English. "All too often, the apostrophe is misplaced", or "errant apostrophes are springing up everywhere" is a complaint that Internet users frequently come across when visiting grammar websites. Many of them detail its various uses and misuses, and attempt to correct the most common mistakes about it, especially its misuse in the plural, called greengrocers' apostrophes and humorously misspelled "greengrocers apostrophe's". While studying English travel accounts published in the seventeenth century, we noticed that the different uses of this symbol may accompany various models of metaplasms. We were able to highlight the linguistic variations of some lexemes, and trace the origin of mo
Historical maps offer an invaluable perspective into territory evolution across past centuries--long before satellite or remote sensing technologies existed. Deep learning methods have shown promising results in segmenting historical maps, but publicly available datasets typically focus on a single map type or period, require extensive and costly annotations, and are not suited for nationwide, long-term analyses. In this paper, we introduce a new dataset of historical maps tailored for analyzing large-scale, long-term land use and land cover evolution with limited annotations. Spanning metropolitan France (548,305 km^2), our dataset contains three map collections from the 18th, 19th, and 20th centuries. We provide both comprehensive modern labels and 22,878 km^2 of manually annotated historical labels for the 18th and 19th century maps. Our dataset illustrates the complexity of the segmentation task, featuring stylistic inconsistencies, interpretive ambiguities, and significant landscape changes (e.g., marshlands disappearing in favor of forests). We assess the difficulty of these challenges by benchmarking three approaches: a fully-supervised model trained with historical labels,
We give a brief, incomplete, and idiosyncratic review of the early years of supergravity in superspace as our contribution to the book Half a Century of Supergravity edited by Anna Ceresole and Gianguido Dall'Agata.