This study introduces a new model of teaching Chinese as a foreign language from the perspective of integrating wisdom. The model focuses on the butterfly model of interpretation before translation and highlights a new method of bilingual thinking training. On the one hand, it applies the new theory of Chinese characters, the theory of the relationship between language and speech, and forward-looking research results of language science; on the other hand, it applies the new model of teaching Chinese as a foreign language, AI-empowered teaching and learning, and forward-looking research results of educational science. Together, these fully reflect the characteristics of the new model of teaching Chinese as a foreign language from the perspective of integrating wisdom. Its beneficial effects are: not only the old view of language and education, especially the old view of teaching Chinese as a foreign language, but also the old view of human-computer interaction. Its significance lies in that a series of great cross-border Rongzhixue such as language, knowledge, education and teaching, as well as new methods and new topics of bili
The evolving landscape of open access (OA) journal publishing holds significant importance for policymakers and stakeholders who seek to make informed decisions and develop strategies that foster sustainable growth and advancements in open access initiatives within China. This study addressed the shortcomings of the current journal evaluation system and recognized the necessity of researching the elasticity of annual publication capacity (PUB) in relation to the Journal Impact Factor (JIF). By constructing an economic model of elasticity, a comparative analysis of the characteristics and dynamics of international OA journals from China and overseas was conducted. The analysis categorized OA journals based on their respective elasticity values and provided specific recommendations tailored to each category. These recommendations offer valuable insights into the development and growth potential of both OA journals from China and overseas. Moreover, the findings underscore the importance of strategic decision-making to strike a balance between quantity and quality in OA journal management. By comprehending the dynamic nature of elasticity, China can enhance its OA journal landscape, e
The journal structure in the China Scientific and Technical Papers and Citations Database (CSTPCD) is analysed from three perspectives: the database level, the specialty level and the institutional level (i.e., university journals versus journals issued by the Chinese Academy of Sciences). The results are compared with those for (Chinese) journals included in the Science Citation Index. The frequency of journal-journal citation relations in the CSTPCD is an order of magnitude lower than in the SCI. Chinese journals, especially high-quality journals, prefer to cite international journals rather than domestic ones. However, Chinese journals do not get an equivalent reception from their international counterparts. The international visibility of Chinese journals is low, but varies among fields of science. Journals of the Chinese Academy of Sciences (CAS) have a better reception in the international scientific community than university journals.
This is one of the first studies to quantitatively examine the usage of English acronyms (e.g. WTO) in Chinese texts. Using newspaper corpora, I try to answer two questions: 1) for all instances of a concept that has an English acronym (e.g. World Trade Organization), what percentage is expressed with the English acronym (WTO), and what percentage with its Chinese translation (shijie maoyi zuzhi), and 2) what factors are at play in language users' choice between the English and Chinese forms? Results show that concepts differ in the percentage of instances expressed with the English acronym (PercentOfEn), ranging from 2% to 98%. Linear models show that PercentOfEn for individual concepts can be predicted by language economy (how long the Chinese translation is), concept frequency, and whether the first appearance of the concept in Chinese newspapers is the English acronym or its Chinese translation (all p < .05).
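As a minimal sketch of the PercentOfEn measure described above, the share of a concept's instances expressed with the English acronym can be computed from two corpus counts. The function name and the counts below are hypothetical illustrations, not values from the study.

```python
def percent_of_en(acronym_count: int, translation_count: int) -> float:
    """Percentage of a concept's instances that use the English acronym.

    acronym_count:     occurrences of the English acronym (e.g. "WTO")
    translation_count: occurrences of the Chinese translation
    """
    total = acronym_count + translation_count
    if total == 0:
        raise ValueError("concept does not occur in the corpus")
    return 100.0 * acronym_count / total


# Hypothetical counts chosen to reproduce the two ends of the reported range.
high = percent_of_en(49, 1)   # acronym strongly preferred
low = percent_of_en(1, 49)    # Chinese translation strongly preferred
print(high, low)
```

The per-concept values produced this way would then serve as the response variable in the linear models mentioned in the abstract.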
Based on the citation data of journals covered by the China Scientific and Technical Papers and Citations Database (CSTPCD), we obtained aggregated journal-journal citation environments by applying routines developed specifically for this purpose. The local citation impact of a journal is defined as its share of the total citations in a local citation environment, which is expressed as a ratio and can be visualized by the size of the nodes. The vertical size of the nodes varies proportionally to a journal's total citation share, while the horizontal size of the nodes is used to provide citation information after correction for the within-journal (self-) citations. In this study, we analyze citation impacts of three Chinese journals in mathematics and compare local citation impacts with impact factors. Local citation impacts reflect a journal's status and function better than (global) impact factors. We also found that authors in Chinese journals prefer international journals to domestic ones as sources for their citations.
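The definition of local citation impact above can be illustrated with a small sketch. It assumes the local citation environment is given as a citing-by-cited count matrix, and that the self-citation correction removes the diagonal from both numerator and denominator; the abstract does not spell out that normalization, so treat it as an assumption. The journal names and counts are hypothetical.

```python
from typing import Dict, Tuple


def local_citation_impact(
    matrix: Dict[str, Dict[str, int]],
) -> Dict[str, Tuple[float, float]]:
    """Return (share, share_without_self_citations) for each journal.

    matrix[citing][cited] is the number of citations from `citing` to `cited`
    within one local citation environment.
    """
    journals = sorted(set(matrix) | {j for row in matrix.values() for j in row})
    # Citations received by each journal (column sums).
    received = {
        j: sum(matrix.get(i, {}).get(j, 0) for i in journals) for j in journals
    }
    # Same, after dropping within-journal (self-) citations.
    received_no_self = {
        j: received[j] - matrix.get(j, {}).get(j, 0) for j in journals
    }
    total = sum(received.values())
    total_no_self = sum(received_no_self.values())
    return {
        j: (
            received[j] / total if total else 0.0,
            received_no_self[j] / total_no_self if total_no_self else 0.0,
        )
        for j in journals
    }


# Hypothetical two-journal environment.
env = {"A": {"A": 10, "B": 5}, "B": {"A": 15, "B": 20}}
impact = local_citation_impact(env)
print(impact)
```

In a map, the first value of each pair would drive the vertical node size and the second the horizontal node size.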
Contemporary language models are increasingly multilingual, but Chinese LLM developers must navigate complex political and business considerations of language diversity. Language policy in China aims at influencing the public discourse and governing a multi-ethnic society, and has gradually transitioned from a pluralist to a more assimilationist approach since 1949. We explore the impact of these influences on current language technology. We evaluate six open-source multilingual LLMs pre-trained by Chinese companies on 18 languages, spanning a wide range of Chinese, Asian, and Anglo-European languages. Our experiments show that Chinese LLMs' performance on diverse languages is indistinguishable from that of international LLMs. Similarly, the models' technical reports show a lack of consideration for pretraining data language coverage except for English and Mandarin Chinese. Examining Chinese AI policy, model experiments, and technical reports, we find no sign of any consistent policy, either for or against, language diversity in China's LLM development. This leaves a puzzling fact that while China regulates both the languages people use daily as well as language model development, they do not
The study of how science is discussed and how scholarly actors interact on social media has increasingly become popular in the field of scientometrics in recent years. While most prior studies focused on research outputs discussed on global platforms, such as Twitter or Facebook, the presence of scholarly journals on local platforms was seldom studied, especially in the Chinese social media context. To fill this gap, this study investigates the uptake of WeChat (a Chinese social network app) by the Chinese scholarly journals indexed by the Chinese Social Sciences Citation Index (CSSCI). The results show that 65.3% of CSSCI-indexed journals have created WeChat public accounts and posted over 193 thousand WeChat posts in total. At the journal level, bibliometric indicators (e.g., citations, downloads, and journal impact factors) and WeChat indicators (e.g., clicks, likes, replies, and recommendations) are weakly correlated with each other, reinforcing the idea of fundamentally differentiated dimensions of indicators between bibliometrics and social media metrics. Results also show that journals with WeChat public accounts slightly outperform those without WeChat public accounts in te
As large language models increasingly mediate access to information and facilitate decision-making, they are becoming instruments in soft power competitions between global actors such as the United States and China. So far, language models seem to be aligned with the values of Western countries, but evidence for this ethical bias comes mostly from models made by American companies. The current crop of state-of-the-art models includes several made in China, so we conducted the first large-scale investigation of how models made in China and the USA align with people from China and the USA. We elicited responses to the Moral Foundations Questionnaire 2.0 and the World Values Survey from ten Chinese models and ten American models, and we compared their responses to responses from thousands of Chinese and American people. We found that all models respond to both surveys more like American people than like Chinese people. This skew toward American values is only slightly mitigated when prompting the models in Chinese or imposing a Chinese persona on the models. These findings have important implications for a near future in which large language models generate much of the content people
Large language models (LLMs) have demonstrated remarkable capabilities, but their success heavily relies on the quality of pretraining corpora. For Chinese LLMs, the scarcity of high-quality Chinese datasets presents a significant challenge, often limiting their performance. To address this issue, we propose the OpenCSG Chinese Corpus, a series of high-quality datasets specifically designed for LLM pretraining, post-training, and fine-tuning. This corpus includes Fineweb-edu-chinese, Fineweb-edu-chinese-v2, Cosmopedia-chinese, and Smoltalk-chinese, each with distinct characteristics: Fineweb-edu datasets focus on filtered, high-quality content derived from diverse Chinese web sources; Cosmopedia-chinese provides synthetic, textbook-style data for knowledge-intensive training; and Smoltalk-chinese emphasizes stylistic and diverse chat-format data. The OpenCSG Chinese Corpus is characterized by its high-quality text, diverse coverage across domains, and scalable, reproducible data curation processes. Additionally, we conducted extensive experimental analyses, including evaluations on smaller parameter models, which demonstrated significant performance improvements in tasks such as C-
Rankings of scholarly journals based on citation data are often met with skepticism by the scientific community. Part of the skepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of authors. This paper focuses on analysis of the table of cross-citations among a selection of Statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care in order to avoid potential over-interpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's Research Assessment Exercise shows strong correlation at aggregate level between assessed research quality and journal citation `export scores' within the discipline of Statistics.
Tuberculosis (TB) is an infectious disease transmitted through the respiratory system. China is one of the countries with a high burden of TB. Since 2004, an average of more than 800,000 cases of active TB have been reported each year in China. Analyzing the case data from 2004-2018, we find significant differences in TB incidence by age group. Therefore, the effect of age-heterogeneous structure on TB transmission needs further study. We develop a model of TB to explore the role of age heterogeneity as a factor in TB transmission. The model is fitted numerically using the nonlinear least squares method to obtain its key parameters; the basic reproduction number Rv = 0.8017 is calculated, and a sensitivity analysis of Rv to the parameters is given. The simulation results show that reducing the number of new infections in the elderly population and increasing the recovery rate of elderly patients with the disease could significantly reduce the transmission of tuberculosis. Furthermore, the feasibility of achieving the goals of the WHO End TB Strategy in China is assessed, and we find that with existing TB control measures it will take another 30 years for China to
China has become the fifth leading nation in terms of its share of the world's scientific publications. The citation rate of papers with a Chinese address for the corresponding author also exhibits exponential growth. More specifically, China has become a major player in critical technologies like nanotechnology. Although it is difficult to delineate nanoscience and nanotechnology, we show that China has recently achieved a position second only to that of the USA. Funding for R&D has been growing exponentially, but since 1997 even more in terms of business expenditure than in terms of government expenditure. It seems that the Chinese government has effectively used the public-sector research potential to boost the knowledge-based economy of the country. Thus, China may be achieving the ("Lisbon") objectives of the transition to a knowledge-based economy more broadly and rapidly than its western counterparts. Because of the sustained increase in Chinese government funding and the virtually unlimited reservoir of highly-skilled human resources, one may expect a continuation of this growth pattern in the near future.
Using the Scopus dataset (1996-2007), a grand matrix of aggregated journal-journal citations was constructed. This matrix can be compared in terms of the network structures with the matrix contained in the Journal Citation Reports (JCR) of the Institute for Scientific Information (ISI). Since the Scopus database contains a larger number of journals and also covers the humanities, one would expect richer maps. However, the matrix is in this case sparser than in the case of the ISI data. This is due to (i) the larger number of journals covered by Scopus and (ii) the historical record of citations older than ten years contained in the ISI database. When the data is highly structured, as in the case of large journals, the maps are comparable, although one may have to vary a threshold (because of the differences in densities). In the case of interdisciplinary journals and journals in the social sciences and humanities, the new database does not add a lot to what is possible with the ISI databases.
In this paper, we use bibliometric methods and social network analysis to analyze the pattern of China-US scientific collaboration at the individual level in nanotechnology. Results show that Chinese-American scientists have been playing an important role in China-US scientific collaboration. We find that China-US collaboration in nanotechnology mainly occurs between Chinese and Chinese-American scientists. In the co-authorship network, Chinese-American scientists tend to have higher betweenness centrality. Moreover, the series of policies implemented by the Chinese government to recruit overseas experts seems to contribute a lot to China-US scientific collaboration.
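To make the betweenness-centrality finding concrete, the sketch below computes unnormalized betweenness on a toy undirected co-authorship graph using Brandes' algorithm. The abstract does not say which implementation or normalization the study used, and the node labels and edges here are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List


def betweenness_centrality(adj: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, float]:
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Toy network: "bridge" is the main link between a domestic group and a US group.
graph = {
    "cn1": ["cn2", "bridge"],
    "cn2": ["cn1", "bridge"],
    "bridge": ["cn1", "cn2", "us1"],
    "us1": ["bridge", "us2"],
    "us2": ["us1"],
}
scores = betweenness_centrality(graph)
print(scores)
```

An author who sits on many shortest paths between otherwise separate groups, like the hypothetical "bridge" node, receives the highest score.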
We argue that the communication structures in the Chinese social sciences have not yet been sufficiently reformed. Citation patterns among Chinese domestic journals in three subject areas -- political science and Marxism, library and information science, and economics -- are compared with their counterparts internationally. Like their colleagues in the natural and life sciences, Chinese scholars in the social sciences provide fewer references to journal publications than their international counterparts; like their international colleagues, social scientists provide fewer references than natural scientists. The resulting citation networks, therefore, are sparse. Nevertheless, the citation structures clearly suggest that the Chinese social sciences are far less specialized in terms of disciplinary delineations than their international counterparts. Marxism studies are more established than political science in China. In terms of the impact of the Chinese political system on academic fields, disciplines closely related to the political system are less specialized than those weakly related. In the discussion section, we explore reasons that may cause the current stagnation and provide p
A number of journal classification systems have been developed in bibliometrics since the launch of the Citation Indices by the Institute for Scientific Information (ISI) in the 1960s. These systems are used to normalize citation counts with respect to field-specific citation patterns. The best known system is the so-called "Web-of-Science Subject Categories" (WCs). In other systems, papers are classified by algorithmic solutions. Using the Journal Citation Reports 2014 of the Science Citation Index and the Social Sciences Citation Index (n of journals = 11,149), we examine options for developing a new system based on journal classifications into subject categories using aggregated journal-journal citation data. Combining routines in VOSviewer and Pajek, a tree-like classification is developed. At each level one can generate a map of science for all the journals subsumed under a category. Nine major fields are distinguished at the top level. Further decomposition of the social sciences is pursued for the sake of example with a focus on journals in information science (LIS) and science studies (STS). The new classification system improves on alternative options by avoiding the problem
Developing intelligent pediatric consultation systems offers promising prospects for improving diagnostic efficiency, especially in China, where healthcare resources are scarce. Despite recent advances in Large Language Models (LLMs) for Chinese medicine, their performance is sub-optimal in pediatric applications due to inadequate instruction data and vulnerable training procedures. To address these issues, this paper builds PedCorpus, a high-quality dataset of over 300,000 multi-task instructions from pediatric textbooks, guidelines, and knowledge graph resources to fulfil diverse diagnostic demands. Building on the well-designed PedCorpus, we propose PediatricsGPT, the first Chinese pediatric LLM assistant built on a systematic and robust training pipeline. In the continuous pre-training phase, we introduce a hybrid instruction pre-training mechanism to mitigate the internal-injected knowledge inconsistency of LLMs for medical domain adaptation. Next, full-parameter Supervised Fine-Tuning (SFT) is used to incorporate the general medical knowledge schema into the models. After that, we devise a direct following preference optimization to enhance the generation of pediatric
Insight-HXMT is the first Chinese X-ray astronomical mission, launched successfully on June 15, 2017, from China's Jiuquan Satellite Launch Center. Insight-HXMT was designed to have a broad energy coverage in X-rays, from 1-250 keV, with excellent timing and adequate energy resolution at soft X-rays, and the largest effective area at hard X-rays. Here, we present a collection of papers of the Journal of High Energy Astrophysics on the Early Results of China's 1st X-ray Astronomy Satellite Insight-HXMT. These papers cover the in-orbit performance, the background model, and all calibration results, together with several first results on observations of binaries and details on the Galactic plane survey.
The Chinese approach to developing a world-class science system includes a vigorous set of programmes to attract back Chinese researchers who have overseas training and work experience. No analysis is available to show the performance of these mobile researchers. This article attempts to close part of this gap. Using a novel bibliometric approach, we estimate the stocks of overseas Chinese and returnees from the perspective of their publication activities, albeit with some limitations. We show that the share of overseas Chinese scientists in the US is considerably larger than that in the EU. We also show that Chinese returnees publish higher-impact work, and continue to publish more, and more at the international level, than their domestic counterparts. Returnees not only tend to publish more, but they are also instrumental in linking China into the global network. Indeed, returnees actively co-publish with researchers in their former host system, showing the importance of scientific social capital. Future research will examine the impact of length of stay, among other factors, on such impact and integration.
Using "Analyze Results" at the Web of Science, one can directly generate overlays onto global journal maps of science. The maps are based on the 10,000+ journals contained in the Journal Citation Reports (JCR) of the Science and Social Science Citation Indices (2011). The disciplinary diversity of the retrieval is measured in terms of Rao-Stirling's "quadratic entropy." Since this indicator of interdisciplinarity is normalized between zero and one, the interdisciplinarity can be compared among document sets and across years, cited or citing. The colors used for the overlays are based on Blondel et al.'s (2008) community-finding algorithms operating on the relations among journals included in the JCR. The results can be exported from VOSviewer with different options such as proportional labels, heat maps, or cluster density maps. The maps can also be web-started and/or animated (e.g., using PowerPoint). The "citing" dimension of the aggregated journal-journal citation matrix was found to provide a more comprehensive description than the matrix based on the cited archive. The relations between local and global maps and their different functions in studying the sciences in terms of journal lit
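The Rao-Stirling diversity mentioned in the abstract above can be sketched as the sum of p_i * p_j * d_ij over all ordered pairs of distinct categories, where p_i is the share of a document set in category i and d_ij is a distance between categories i and j. The sketch assumes distances lie in [0, 1] and uses the ordered-pair convention; the abstract does not state the distance measure or the convention, and the category names and values below are hypothetical.

```python
from typing import Dict


def rao_stirling(
    proportions: Dict[str, float],
    distance: Dict[str, Dict[str, float]],
) -> float:
    """Rao-Stirling diversity: sum over ordered pairs i != j of p_i * p_j * d_ij."""
    total = 0.0
    for i, p_i in proportions.items():
        for j, p_j in proportions.items():
            if i != j:
                total += p_i * p_j * distance[i][j]
    return total


# Hypothetical document set split evenly over two maximally distant categories.
shares = {"physics": 0.5, "sociology": 0.5}
dist = {
    "physics": {"sociology": 1.0},
    "sociology": {"physics": 1.0},
}
diversity = rao_stirling(shares, dist)
print(diversity)
```

With distances bounded by one, the value stays between zero (all documents in one category) and one, which is what allows comparison across document sets and years.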