Found 20 results
The purpose of this study is to introduce a new model of teaching Chinese as a foreign language from the perspective of integrating wisdom. Its characteristics are as follows: focusing on the butterfly model of interpretation before translation, highlighting the new method of bilingual thinking training, on the one hand, applying the new theory of Chinese characters, the theory of the relationship between language and speech, and the forward-looking research results of language science; On the other hand, the application of the new model of teaching Chinese as a foreign language, AI empowering teaching and learning, and the forward-looking research results of educational science fully reflect a series of characteristics of the new model of teaching Chinese as a foreign language from the perspective of integrating wisdom. Its beneficial effects are: not only the old view of language and education, especially the old view of teaching Chinese as a foreign language, but also the old view of human-computer interaction. Its significance lies in that a series of great cross-border Rongzhixue such as language, knowledge, education and teaching, as well as new methods and new topics of bili
The clinical burden of spleen-stomach disorders is substantial. While large language models (LLMs) offer new potential for medical applications, they face three major challenges in the context of integrative Chinese and Western medicine (ICWM): a lack of high-quality data, the absence of models capable of effectively integrating the reasoning logic of traditional Chinese medicine (TCM) syndrome differentiation with that of Western medical (WM) disease diagnosis, and the shortage of a standardized evaluation benchmark. To address these interrelated challenges, we propose DongYuan, an ICWM spleen-stomach diagnostic framework. Specifically, three ICWM datasets (SSDF-Syndrome, SSDF-Dialogue, and SSDF-PD) were curated to fill the gap in high-quality data for spleen-stomach disorders. We then developed SSDF-Core, a core diagnostic LLM that acquires robust ICWM reasoning capabilities through a two-stage training regimen of supervised fine-tuning (SFT) and direct preference optimization (DPO), and complemented it with SSDF-Navigator, a pluggable consultation navigation model designed to optimize clinical inquiry strategies. Additionally, we established SSDF-Bench, a comprehensive e
The journal structure in the China Scientific and Technical Papers and Citations Database (CSTPCD) is analysed from three perspectives: the database level, the specialty level and the institutional level (i.e., university journals versus journals issued by the Chinese Academy of Sciences). The results are compared with those for (Chinese) journals included in the Science Citation Index. The frequency of journal-journal citation relations in the CSTPCD is an order of magnitude lower than in the SCI. Chinese journals, especially high-quality journals, prefer to cite international journals rather than domestic ones. However, Chinese journals do not get an equivalent reception from their international counterparts. The international visibility of Chinese journals is low, but varies among fields of science. Journals of the Chinese Academy of Sciences (CAS) have a better reception in the international scientific community than university journals.
Based on the citation data of journals covered by the China Scientific and Technical Papers and Citations Database (CSTPCD), we obtained aggregated journal-journal citation environments by applying routines developed specifically for this purpose. Local citation impact of journals is defined as the share of the total citations in a local citation environment, which is expressed as a ratio and can be visualized by the size of the nodes. The vertical size of the nodes varies proportionally to a journal's total citation share, while the horizontal size of the nodes is used to provide citation information after correction for the within-journal (self-) citations. In this study, we analyze citation impacts of three Chinese journals in mathematics and compare local citation impacts with impact factors. Local citation impacts reflect a journal's status and function better than (global) impact factors. We also found that authors in Chinese journals prefer international journals over domestic ones as sources for their citations.
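The share-based definition in this abstract can be illustrated with a small sketch. It assumes the local citation environment is given as a mapping from journal name to citations received, and that the self-citation correction simply subtracts within-journal citations before recomputing shares; the abstract does not spell out that step, so this is an assumption. The function names and the toy counts are hypothetical.

```python
from typing import Dict


def local_citation_shares(cited_counts: Dict[str, int]) -> Dict[str, float]:
    """Return each journal's share of all citations in the local environment."""
    total = sum(cited_counts.values())
    if total == 0:
        # No citations at all: every share is defined as zero here.
        return {journal: 0.0 for journal in cited_counts}
    return {journal: count / total for journal, count in cited_counts.items()}


def shares_without_self_citations(
    cited_counts: Dict[str, int], self_citations: Dict[str, int]
) -> Dict[str, float]:
    """Recompute shares after removing within-journal citations (assumed correction)."""
    corrected = {
        journal: count - self_citations.get(journal, 0)
        for journal, count in cited_counts.items()
    }
    return local_citation_shares(corrected)


# Hypothetical toy environment of three journals.
toy_counts = {"Journal A": 60, "Journal B": 30, "Journal C": 10}
toy_self = {"Journal A": 20}

total_shares = local_citation_shares(toy_counts)
corrected_shares = shares_without_self_citations(toy_counts, toy_self)
```

In this toy case Journal A holds 60 of 100 citations in total, but 40 of 80 once its 20 self-citations are removed, which is the kind of difference the two node dimensions are meant to convey.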
This study examines the social media uptake of scientific journals on two different platforms - X and WeChat - by comparing the adoption of X among journals indexed in the Science Citation Index-Expanded (SCIE) with the adoption of WeChat among journals indexed in the Chinese Science Citation Database (CSCD). The findings reveal substantial differences in platform adoption and user engagement, shaped by local contexts. While only 22.7% of SCIE journals maintain an X account, 84.4% of CSCD journals have a WeChat official account. Journals in Life Sciences & Biomedicine lead in uptake on both platforms, whereas those in Technology and Physical Sciences show high WeChat uptake but comparatively lower presence on X. User engagement on both platforms is dominated by low-effort interactions rather than more conversational behaviors. Correlation analyses indicate weak-to-moderate relationships between bibliometric indicators and social media metrics, confirming that online engagement reflects a distinct dimension of journal impact, whether on an international or a local platform. These findings underscore the need for broader social media metric frameworks that incorporate locally dom
We compare the network of aggregated journal-journal citation relations provided by the Journal Citation Reports (JCR) 2012 of the Science and Social Science Citation Indexes (SCI and SSCI) with similar data based on Scopus 2012. First, global maps were developed for the two sets separately; sets of documents can then be compared using overlays to both maps. Using fuzzy-string matching and ISSN numbers, we were able to match 10,524 journal names between the two sets; that is, 96.4% of the 10,936 journals contained in JCR or 51.2% of the 20,554 journals covered by Scopus. Network analysis was then pursued on the set of journals shared between the two databases and the two sets of unique journals. Citations among the shared journals are more comprehensively covered in JCR than Scopus, so the network in JCR is denser and more connected than in Scopus. The ranking of shared journals in terms of indegree (that is, numbers of citing journals) or total citations is similar in both databases overall (Spearman's ρ > 0.97), but some individual journals rank very differently. Journals that are unique to Scopus seem to be less important--they are citing shared journals rather than bein
The study of how science is discussed and how scholarly actors interact on social media has increasingly become popular in the field of scientometrics in recent years. While most prior studies focused on research outputs discussed on global platforms, such as Twitter or Facebook, the presence of scholarly journals on local platforms was seldom studied, especially in the Chinese social media context. To fill this gap, this study investigates the uptake of WeChat (a Chinese social network app) by the Chinese scholarly journals indexed by the Chinese Social Sciences Citation Index (CSSCI). The results show that 65.3% of CSSCI-indexed journals have created WeChat public accounts and posted over 193 thousand WeChat posts in total. At the journal level, bibliometric indicators (e.g., citations, downloads, and journal impact factors) and WeChat indicators (e.g., clicks, likes, replies, and recommendations) are weakly correlated with each other, reinforcing the idea of fundamentally differentiated dimensions of indicators between bibliometrics and social media metrics. Results also show that journals with WeChat public accounts slightly outperform those without WeChat public accounts in te
No previous work has studied the performance of Large Language Models (LLMs) in the context of Traditional Chinese Medicine (TCM), an essential and distinct branch of medical knowledge with a rich history. To bridge this gap, we present a TCM question dataset named TCM-QA, which comprises three question types: single choice, multiple choice, and true or false, to examine the LLM's capacity for knowledge recall and comprehensive reasoning within the TCM domain. In our study, we evaluate two settings of the LLM, zero-shot and few-shot, while also discussing the differences between English and Chinese prompts. Our results indicate that ChatGPT performs best on true or false questions, achieving its highest precision of 0.688, while its lowest precision, 0.241, occurs on multiple-choice questions. Furthermore, we observed that Chinese prompts outperformed English prompts in our evaluations. Additionally, we assess the quality of explanations generated by ChatGPT and their potential contribution to TCM knowledge comprehension. This paper offers valuable insights into the applicability of LLMs in specialized domains and paves the way for future research in leveraging th
Large language models (LLMs) have demonstrated remarkable capabilities, but their success heavily relies on the quality of pretraining corpora. For Chinese LLMs, the scarcity of high-quality Chinese datasets presents a significant challenge, often limiting their performance. To address this issue, we propose the OpenCSG Chinese Corpus, a series of high-quality datasets specifically designed for LLM pretraining, post-training, and fine-tuning. This corpus includes Fineweb-edu-chinese, Fineweb-edu-chinese-v2, Cosmopedia-chinese, and Smoltalk-chinese, each with distinct characteristics: Fineweb-edu datasets focus on filtered, high-quality content derived from diverse Chinese web sources; Cosmopedia-chinese provides synthetic, textbook-style data for knowledge-intensive training; and Smoltalk-chinese emphasizes stylistic and diverse chat-format data. The OpenCSG Chinese Corpus is characterized by its high-quality text, diverse coverage across domains, and scalable, reproducible data curation processes. Additionally, we conducted extensive experimental analyses, including evaluations on smaller parameter models, which demonstrated significant performance improvements in tasks such as C-
Rankings of scholarly journals based on citation data are often met with skepticism by the scientific community. Part of the skepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of authors. This paper focuses on analysis of the table of cross-citations among a selection of Statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care in order to avoid potential over-interpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's Research Assessment Exercise shows strong correlation at aggregate level between assessed research quality and journal citation `export scores' within the discipline of Statistics.
Interdisciplinary research is critical for innovation and addressing complex societal issues. We characterise the interdisciplinary knowledge structure of PubMed research articles in medicine as correlation networks of medical concepts and compare the interdisciplinarity of articles between high-ranking (impactful) and less high-ranking (less impactful) medical journals. We found that impactful medical journals tend to publish research that is less interdisciplinary than that of less impactful journals. Observing that they bridge distant knowledge clusters in the networks, we find that cancer-related research can be seen as one of the main drivers of interdisciplinarity in medical science. Using signed difference networks, we also investigate the clustering of deviations between high and low impact journal correlation networks. We generally find a mild tendency for strong link differences to be adjacent. Furthermore, we find topic clusters of deviations that shift over time. In contrast, topic clusters in the original networks are static over time and can be seen as the core knowledge structure in medicine. Overall, journals and policymakers should encourage initiatives to accommodate int
This paper explores the application of prompt engineering to enhance the performance of large language models (LLMs) in the domain of Traditional Chinese Medicine (TCM). We propose TCM-Prompt, a framework that integrates various pre-trained language models (PLMs), templates, tokenization, and verbalization methods, allowing researchers to easily construct and fine-tune models for specific TCM-related tasks. We conducted experiments on disease classification, syndrome identification, herbal medicine recommendation, and general NLP tasks, demonstrating the effectiveness and superiority of our approach compared to baseline methods. Our findings suggest that prompt engineering is a promising technique for improving the performance of LLMs in specialized domains like TCM, with potential applications in digitalization, modernization, and personalized medicine.
Interdisciplinary research, a process of knowledge integration, is vital for scientific advancements. It remains unclear whether prestigious journals that are highly impactful lead in disseminating interdisciplinary knowledge. In this paper, by constructing topic-level correlation networks based on publications, we evaluated the interdisciplinarity of more and less prestigious journals in medicine. We found research from prestigious medical journals tends to be less interdisciplinary than research from other medical journals. We also established that cancer-related research is the main driver of interdisciplinarity in medical science. Our results indicate a weak tendency for differences in topic correlations between more and less prestigious journals to be co-located. Accordingly, we identified that interdisciplinarity in prestigious journals mainly differs from interdisciplinarity in other journals in areas such as infections, nervous system diseases and cancer. Overall, our results suggest that interdisciplinarity in science could benefit from prestigious journals easing rigid disciplinary boundaries.
Traditional Chinese medicine (TCM) prescription is the most critical form of TCM treatment, and uncovering the complex nonlinear relationship between symptoms and TCM is of great significance for clinical practice and for assisting physicians in diagnosis and treatment. Although there have been some studies on TCM prescription generation, these studies consider a single factor and directly model the symptom-prescription generation problem mainly based on symptom descriptions, lacking guidance from TCM knowledge. To this end, we propose a RoBERTa and Knowledge Enhancement model for Prescription Generation of Traditional Chinese Medicine (RoKEPG). RoKEPG is first pre-trained on our constructed TCM corpus and then fine-tuned, and the model is guided to generate TCM prescriptions by introducing four classes of TCM knowledge through the attention mask matrix. Experimental results on the publicly available TCM prescription dataset show that RoKEPG improves the F1 metric by about 2% over the baseline model with the best results.
Traditional Chinese medicine (TCM) relies on natural medical products to treat symptoms and diseases. While clinical data have demonstrated the effectiveness of selected TCM-based treatments, the mechanistic root of how TCM herbs treat diseases remains largely unknown. More importantly, current approaches focus on single herbs or prescriptions, missing the high-level general principles of TCM. To uncover the mechanistic nature of TCM on a system level, in this work we establish a generic network medicine framework for TCM from the human protein interactome. Applying our framework reveals a network pattern between symptoms (diseases) and herbs in TCM. We first observe that genes associated with a symptom are not distributed randomly in the interactome, but cluster into localized modules; furthermore, a short network distance between two symptom modules is indicative of the symptoms' co-occurrence and similarity. Next, we show that the network proximity of a herb's targets to a symptom module is predictive of the herb's effectiveness in treating the symptom. We validate our framework with real-world hospital patient data by showing that (1) shorter network distance between symptoms o
Using three years of the Journal Citation Reports (2011, 2012, and 2013), indicators of transitions in 2012 (between 2011 and 2013) are studied using methodologies based on entropy statistics. Changes can be indicated at the level of journals using the margin totals of entropy production along the row or column vectors, but also at the level of links among journals by importing the transition matrices into network analysis and visualization programs (and using community-finding algorithms). Seventy-four journals are flagged in terms of discontinuous changes in their citations; but 3,114 journals are involved in "hot" links. Most of these links are embedded in a main component; 78 clusters (containing 172 journals) are flagged as potential "hot spots" emerging at the network level. An additional finding is that PLoS ONE introduced a new communication dynamics into the database. The limitations of the methodology are elaborated using an example. The results of the study indicate where developments in the citation dynamics can be considered as significantly unexpected. This can be used as heuristic information; but what a "hot spot" in terms of the entropy statistics of aggregated cit
The academic journal zoning system is central to evaluating research talent, funding, and institutions. The CAS journal partition system, one of East Asia's most widely used tools, will cease operation in March 2026, creating a policy gap. Existing alternatives have major limitations: JCR depends on paid databases and excludes conferences; Scimago/CiteScore relies on Elsevier proprietary data; expert-based rankings such as CCF and CORE lack quantitative foundations and update slowly. This paper proposes the General Science Ranking (GSR), a multidimensional bibliometric framework built entirely on open-source data. GSR covers 500 computer science venues (397 journals and 103 conferences) and 500 medical journals using OpenAlex and Semantic Scholar. Scores combine four indicators: field-weighted citation impact (FWCI), two-year impact factor (IF2), five-year h-index (h5), and citation CAGR. For CS conferences lacking citation time-series data, IF2-approx was estimated from calibration on 1.41 million OpenAlex journal papers. Rankings adopt fixed quotas: Q1 (1-50), Q2 (51-100), Q3 (101-200), and Q4 (201+). All code and data are open source. In CS rankings, conferences and journals eac
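The fixed quotas stated in this abstract (Q1 for ranks 1-50, Q2 for 51-100, Q3 for 101-200, Q4 for 201 and beyond) can be sketched as a simple lookup. The function names are hypothetical, and the `cagr` helper uses the standard compound annual growth rate formula, which may differ from how GSR actually computes its citation CAGR. How the four indicators are weighted into one score is not given in the abstract, so that step is left out.

```python
def quota_tier(rank: int) -> str:
    """Map a 1-based rank to the fixed quota tiers listed in the abstract."""
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    if rank <= 50:
        return "Q1"
    if rank <= 100:
        return "Q2"
    if rank <= 200:
        return "Q3"
    return "Q4"


def cagr(first_value: float, last_value: float, years: int) -> float:
    """Standard compound annual growth rate; an assumed stand-in for GSR's indicator."""
    if first_value <= 0 or years <= 0:
        raise ValueError("first_value and years must be positive")
    return (last_value / first_value) ** (1 / years) - 1


# Hypothetical usage: tiers at the quota boundaries, and growth from
# 100 to 121 citations over two years.
boundary_tiers = [quota_tier(r) for r in (1, 50, 51, 100, 101, 200, 201)]
growth = cagr(100, 121, 2)
```

Because the quotas are fixed counts rather than percentiles, the size of Q1 stays at 50 venues regardless of how many venues a field's ranking covers.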
Using the Scopus dataset (1996-2007) a grand matrix of aggregated journal-journal citations was constructed. This matrix can be compared in terms of the network structures with the matrix contained in the Journal Citation Reports (JCR) of the Institute of Scientific Information (ISI). Since the Scopus database contains a larger number of journals and covers also the humanities, one would expect richer maps. However, the matrix is in this case sparser than in the case of the ISI data. This is due to (i) the larger number of journals covered by Scopus and (ii) the historical record of citations older than ten years contained in the ISI database. When the data is highly structured, as in the case of large journals, the maps are comparable, although one may have to vary a threshold (because of the differences in densities). In the case of interdisciplinary journals and journals in the social sciences and humanities, the new database does not add a lot to what is possible with the ISI databases.
To analyze the bacteriostatic effect of traditional Chinese herbal medicines on E. coli, a total of 35 different preparations (decoction, volatile oil and distillate) of traditional Chinese herbal medicines were tested using the plate culture method. The results showed that 18 preparations had varying inhibitory effects on E. coli in vitro. The results also revealed that different processing methods and combinations affect the bacteriostatic effect, and that different medicines could be used singly or in combination to treat E. coli disease.
Large Language Models (LLMs) have made significant progress in a number of professional fields, including medicine, law, and finance. However, in traditional Chinese medicine (TCM), there are challenges such as the essential differences between its theory and modern medicine, the lack of specialized corpus resources, and the fact that relying only on supervised fine-tuning may lead to overconfident predictions. To address these challenges, we propose a two-stage training approach that combines continuous pre-training and supervised fine-tuning. A notable contribution of our study is the processing of a 2GB corpus dedicated to TCM, from which we construct pre-training and instruction fine-tuning datasets for TCM, respectively. In addition, we have developed Qibo-Benchmark, a tool that evaluates the performance of LLMs in TCM on multiple dimensions, including subjective, objective, and three TCM NLP tasks. The medical LLM trained with our pipeline, named Qibo, exhibits significant performance boosts. Compared to the baselines, the average subjective win rate is 63%, the average objective accuracy improved by 23% to 58%, and the Rouge-L scores for the three TCM NLP tasks are 0.72, 0.61,