共找到 20 条结果
The application of knowledge from quantum physics to computer science, which we call \doubleq{quantum informatics}, is driving the development of new technologies, such as quantum computing and quantum key distribution. Researchers in physics education have recognized the promise and significance of teaching quantum informatics in schools, and various teaching methods are being developed, researched and applied. Although quantum informatics is equally relevant to computer science education, little research has been done on how to teach it with a focus on computer science concepts and knowledge. In this study, we position quantum informatics within Denning's Great Principles of Computing and propose Quantum Informatics Standards for secondary schools.
Materials informatics is increasingly used to support modelling, analysis and design across the length scales of materials science, from atomistic simulations to microstructural characterisation and continuum descriptions. Despite rapid progress, the reliability and transferability of these approaches vary strongly with scale. Here we survey data-driven methods at the nanoscale, mesoscale, and micro-to-continuum levels, highlighting established capabilities as well as unresolved challenges. Machine-learning interatomic potentials, mesoscale surrogate and operator-learning models, and learning-based analysis of experimental microstructures are discussed, with emphasis on data quality, uncertainty, interpretability, and cross-scale consistency. We further examine the role of data standards, ontologies, and emerging tools, such as autonomous laboratories, where they directly affect multiscale workflows. This perspective clarifies what can be considered reliable today and identifies key obstacles to the broader integration of materials informatics across scales.
This perspective explores the evolution of materials informatics, from its foundational roots in physics and information theory to its maturation through artificial intelligence (AI). We trace the field's trajectory from early milestones to the transformative impact of the Materials Genome Initiative and the recent advent of large language models (LLMs). Rather than a mere toolkit, we present materials informatics as an evolving ecosystem, reviewing key methodologies such as Bayesian Optimization, Reinforcement Learning, and Transformers that drive inverse design and autonomous self-driving laboratories. We specifically address the practical challenges of LLM integration, comparing specialist versus generalist models and discussing solutions for uncertainty quantification. Looking forward, we assess the transition of AI from a predictive tool to a collaborative research partner. By leveraging active learning and retrieval-augmented generation (RAG), the field is moving toward a new era of autonomous materials science, increasingly characterized by "human-out-of-the-loop" discovery processes.
The advancement of polymer informatics has been significantly propelled by the integration of machine learning (ML) techniques, enabling the rapid prediction of polymer properties and expediting the discovery of high-performance polymeric materials. However, the field lacks a standardized workflow that encompasses prediction accuracy, uncertainty quantification, ML interpretability, and polymer synthesizability. In this study, we introduce POINT$^{2}$ (POlymer INformatics Training and Testing), a comprehensive benchmark database and protocol designed to address these critical challenges. Leveraging the existing labeled datasets and the unlabeled PI1M dataset, a collection of approximately one million virtual polymers generated via a recurrent neural network trained on the realistic polymers, we develop an ensemble of ML models, including Quantile Random Forests, Multilayer Perceptrons with dropout, Graph Neural Networks, and pretrained large language models. These models are coupled with diverse polymer representations such as Morgan, MACCS, RDKit, Topological, Atom Pair fingerprints, and graph-based descriptors to achieve property predictions, uncertainty estimations, model interp
This chapter provides a summary of the activities and results of the European Network For Gender Balance in Informatics (EUGAIN, EU COST Action CA19122). The main aim and objective of the network is to improve gender balance in informatics at all levels, from undergraduate and graduate studies to participation and leadership both in academia and industry, through the creation of a European network of colleagues working at the forefront of the efforts for gender balance in informatics in their countries and research communities.
This study aims to enrich and leverage data from the Informatics Europe Higher Education (IEHE) data portal to extract and analyze trends in female participation in Informatics across Europe. The research examines the proportion of female students, first-year enrollments, and degrees awarded to women in the field. The issue of low female participation in Informatics has long been recognized as a persistent challenge and remains a critical area of scholarly inquiry. Furthermore, existing literature indicates that socio-economic factors can unpredictably influence female participation, complicating efforts to address the gender gap. The analysis focuses on participation data from research universities at various academic levels, including Bachelors, Masters, and PhD programs, and seeks to uncover potential correlations between female participation and geographical or economic zones. The dataset was first enriched by integrating additional information, such as each country's GDP and relevant geographical data, sourced from various online repositories. Subsequently, the data was cleaned to ensure consistency and eliminate incomplete time series. A final set of complete time series was
Rapid and reliable qualification of advanced materials remains a bottleneck in industrial manufacturing, particularly for heterogeneous structures produced via non-conventional additive manufacturing processes. This study introduces a novel framework that links microstructure informatics with a range of expert characterization knowledge using customized and hybrid vision-language representations (VLRs). By integrating deep semantic segmentation with pre-trained multi-modal models (CLIP and FLAVA), we encode both visual microstructural data and textual expert assessments into shared representations. To overcome limitations in general-purpose embeddings, we develop a customized similarity-based representation that incorporates both positive and negative references from expert-annotated images and their associated textual descriptions. This allows zero-shot classification of previously unseen microstructures through a net similarity scoring approach. Validation on an additively manufactured metal matrix composite dataset demonstrates the framework's ability to distinguish between acceptable and defective samples across a range of characterization criteria. Comparative analysis reveals
In the modern world, our cities and societies face several technological and societal challenges, such as rapid urbanization, global warming & climate change, the digital divide, and social inequalities, increasing the need for more sustainable cities and societies. Addressing these challenges requires a multifaceted approach involving all the stakeholders, sustainable planning, efficient resource management, innovative solutions, and modern technologies. Like other modern technologies, social media informatics also plays its part in developing more sustainable and resilient cities and societies. Despite its limitations, social media informatics has proven very effective in various sustainable cities and society applications. In this paper, we review and analyze the role of social media informatics in sustainable cities and society by providing a detailed overview of its applications, associated challenges, and potential solutions. This work is expected to provide a baseline for future research in the domain.
Medical imaging informatics is a rapidly growing field that combines the principles of medical imaging and informatics to improve the acquisition, management, and interpretation of medical images. This chapter introduces the basic concepts of medical imaging informatics, including image processing, feature engineering, and machine learning. It also discusses the recent advancements in computer vision and deep learning technologies and how they are used to develop new quantitative image markers and prediction models for disease detection, diagnosis, and prognosis prediction. By covering the basic knowledge of medical imaging informatics, this chapter provides a foundation for understanding the role of informatics in medicine and its potential impact on patient care.
Literature in cyber security including cyber security in energy informatics are tecnocentric focuses that may miss the chances of understanding a bigger picture of cyber security measures. This research thus aims to conduct a literature review focusing on non-technical issues in cyber security in the energy informatics field. The findings show that there are seven non-technical issues have been discussed in literature, including education, awareness, policy, standards, human, and risks, challenges, and solutions. These findings can be valuable for not only researchers, but also managers, policy makers, and educators.
Dental informatics is a rapidly evolving field that combines dentistry with information technology to improve oral health care delivery, research, and education. Electronic health records (EHRs), telehealth, digital imaging, and other digital tools have revolutionised how dental professionals diagnose, treat, and manage oral health conditions. In this review article, we will explore dental informatics's current trends and future directions, focusing on its impact on clinical practice, research, and education. We will also discuss the challenges and opportunities associated with implementing dental informatics and highlight fundamental research studies and innovations in the field.
Materials informatics, data-enabled investigation, is a "fourth paradigm" in materials science research after the conventional empirical approach, theoretical science, and computational research. Materials informatics has two essential ingredients: fingerprinting materials proprieties and the theory of statistical inference and learning. We have researched the organic semiconductor's enigmas through the materials informatics approach. By applying diverse neural network topologies, logical axiom, and inferencing information science, we have developed data-driven procedures for novel organic semiconductor discovery for the semiconductor industry and knowledge extraction for the materials science community. We have reviewed and corresponded with various algorithms for the neural network design topology for the materials informatics dataset.
Citation network analysis has become one of methods to study how scientific knowledge flows from one domain to another. Health informatics is a multidisciplinary field that includes social science, software engineering, behavioral science, medical science and others. In this study, we perform an analysis of citation statistics from health informatics journals using data set extracted from CrossRef. For each health informatics journal, we extract the number of citations from/to studies related to computer science, medicine/clinical medicine and other fields, including the number of self-citations from the health informatics journal. With a similar number of articles used in our analysis, we show that the Journal of the American Medical Informatics Association (JAMIA) has more in-citations than the Journal of Medical Internet Research (JMIR); while JMIR has a higher number of out-citations and self-citations. We also show that JMIR cites more articles from health informatics journals and medicine related journals. In addition, the Journal of Medical Systems (JMS) cites more articles from computer science journals compared with other health informatics journals included in our analysi
This paper investigates foundation models tailored for music informatics, a domain currently challenged by the scarcity of labeled data and generalization issues. To this end, we conduct an in-depth comparative study among various foundation model variants, examining key determinants such as model architectures, tokenization methods, temporal resolution, data, and model scalability. This research aims to bridge the existing knowledge gap by elucidating how these individual factors contribute to the success of foundation models in music informatics. Employing a careful evaluation framework, we assess the performance of these models across diverse downstream tasks in music information retrieval, with a particular focus on token-level and sequence-level classification. Our results reveal that our model demonstrates robust performance, surpassing existing models in specific key metrics. These findings contribute to the understanding of self-supervised learning in music informatics and pave the way for developing more effective and versatile foundation models in the field. A pretrained version of our model is publicly available to foster reproducibility and future research.
In the evolving landscape of clinical informatics, the integration and utilization of software tools developed through governmental funding represent a pivotal advancement in research and application. However, the dispersion of these tools across various repositories, with no centralized knowledge base, poses significant challenges to leveraging their full potential. This study introduces an automated methodology to bridge this gap by systematically extracting GitHub repository URLs from academic papers indexed in arXiv, focusing on the field of clinical informatics. Our approach encompasses querying the arXiv API for relevant papers, cleaning extracted GitHub URLs, fetching comprehensive repository information via the GitHub API, and analyzing repository maturity based on defined metrics such as stars, forks, open issues, and contributors. The process is designed to be robust, incorporating error handling and rate limiting to ensure compliance with API constraints. Preliminary findings demonstrate the efficacy of this methodology in compiling a centralized knowledge base of NIH-funded software tools, laying the groundwork for an enriched understanding and utilization of these reso
Healthcare has witnessed an increased digitalization in the post-COVID world. Technologies such as the medical internet of things and wearable devices are generating a plethora of data available on the cloud anytime from anywhere. This data can be analyzed using advanced artificial intelligence techniques for diagnosis, prognosis, or even treatment of disease. This advancement comes with a major risk to protecting and securing protected health information (PHI). The prevailing regulations for preserving PHI are neither comprehensive nor easy to implement. The study first identifies twenty activities crucial for privacy and security, then categorizes them into five homogeneous categories namely: $\complement_1$ (Policy and Compliance Management), $\complement_2$ (Employee Training and Awareness), $\complement_3$ (Data Protection and Privacy Control), $\complement_4$ (Monitoring and Response), and $\complement_5$ (Technology and Infrastructure Security) and prioritizes these categories to provide a framework for the implementation of privacy and security in a wise manner. The framework utilized the Delphi Method to identify activities, criteria for categorization, and prioritization.
Large AI models, or foundation models, are models recently emerging with massive scales both parameter-wise and data-wise, the magnitudes of which can reach beyond billions. Once pretrained, large AI models demonstrate impressive performance in various downstream tasks. A prime example is ChatGPT, whose capability has compelled people's imagination about the far-reaching influence that large AI models can have and their potential to transform different domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for the design of methodologies. The scale of multi-modal data in the biomedical and health domain has been ever-expanding especially since the community embraced the era of deep learning, which provides the ground to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from background to their applications. We identify seven key sectors in which large AI models are applicable and might have substantial influence, including 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) pu
A key task in the emerging field of materials informatics is to use machine learning to predict a material's properties and functions. A fast and accurate predictive model allows researchers to more efficiently identify or construct a material with desirable properties. As in many fields, deep learning is one of the state-of-the art approaches, but fully training a deep learning model is not always feasible in materials informatics due to limitations on data availability, computational resources, and time. Accordingly, there is a critical need in the application of deep learning to materials informatics problems to develop efficient transfer learning algorithms. The Bayesian framework is natural for transfer learning because the model trained from the source data can be encoded in the prior distribution for the target task of interest. However, the Bayesian perspective on transfer learning is relatively unaccounted for in the literature, and is complicated for deep learning because the parameter space is large and the interpretations of individual parameters are unclear. Therefore, rather than subjective prior distributions for individual parameters, we propose a new Bayesian trans
This paper elucidates the challenges and opportunities inherent in integrating data-driven methodologies into geotechnics, drawing inspiration from the success of materials informatics. Highlighting the intricacies of soil complexity, heterogeneity, and the lack of comprehensive data, the discussion underscores the pressing need for community-driven database initiatives and open science movements. By leveraging the transformative power of deep learning, particularly in feature extraction from high-dimensional data and the potential of transfer learning, we envision a paradigm shift towards a more collaborative and innovative geotechnics field. The paper concludes with a forward-looking stance, emphasizing the revolutionary potential brought about by advanced computational tools like large language models in reshaping geotechnics informatics.
End-user informatics applications are Internet data web management automation solutions. These are mass modeling and mass management collaborative communal consensus solutions. They are made and maintained by managerial, professional, technical and specialist end-users. In end-user informatics the end-users are always right. So it becomes necessary for information technology professionals to understand information and informatics from the end-user perspective. End-user informatics starts with the observation that practical prose is a mass consensus communal modeling technology. This high technology is the mechanistic modeling medium we all use every day in all of our practical pursuits. Practical information flows are the lifeblood of modern capitalist communities. But what exactly is practical information? It's ultimately physical information, but the physics is highly emergent rather than elementary. So practical reality is just physical reality in deep disguise. Practical prose is the medium that we all use to model the everyday and elite mechanics of practical reality. So this is the medium that end-user informatics must automate and animate.