共找到 20 条结果
Numerous metascience studies and other initiatives have begun to monitor the prevalence of open science practices when it is more important to understand the 'downstream' effects or impacts of open science. PLOS and DataSeer have developed a new LLM-based indicator to measure an important effect of open science: the reuse of research data. Our results show a data reuse rate of 43%, which is higher than established bibliometric techniques. We show that data reuse can be measured at scale using LLMs and generative artificial intelligence. The positive effects of research data sharing and reuse may currently be underestimated.
AI for Science (AI4Science), particularly in the form of self-driving labs, has the potential to sideline human involvement and hinder scientific discovery within the broader community. While prior research has focused on ensuring the responsible deployment of AI applications, enhancing security, and ensuring interpretability, we also propose that promoting openness in AI4Science discoveries should be carefully considered. In this paper, we introduce the concept of AI for Open Science (AI4OS) as a multi-agent extension of AI4Science with the core principle of maximizing open knowledge translation throughout the scientific enterprise rather than a single organizational unit. We use the established principles of Knowledge Discovery and Data Mining (KDD) to formalize a language around AI4OS. We then discuss three principle stages of knowledge translation embedded in AI4Science systems and detail specific points where openness can be applied to yield an AI4OS alternative. Lastly, we formulate a theoretical metric to assess AI4OS with a supporting ethical argument highlighting its importance. Our goal is that by drawing attention to AI4OS we can ensure the natural consequence of AI4Scie
This paper reviews research literature on Diamond Open Access (DOA) journals - sometimes also called Platinum Open Access - that was produced after this journal segment started to become a priority in European research policy around 2020. It contextualizes the current science policy debate, critically examines different understandings of DOA, and reviews studies on the role of such journals in scholarly communication. Most existing research consists of quantitative studies focusing on aspects such as the number of DOA journals, their publication output, the diversity of the landscape in terms of subject areas, languages, publishing entities, indexing in major databases, awareness and perception among scholars, cost analyses, as well as insights into the internal operations of DOA journals. The review shows that research on DOA journals is partly influenced by the science policy discourse in at least two ways: first, through the normativity inherent in that discourse, and second, through the temporality of policy-driven research of practical relevance, which leaves important aspects of the phenomenon understudied. Moreover, research on the DOA journal landscape has implications beyo
We present the Open MatSci ML Toolkit: a flexible, self-contained, and scalable Python-based framework to apply deep learning models and methods on scientific data with a specific focus on materials science and the OpenCatalyst Dataset. Our toolkit provides: 1. A scalable machine learning workflow for materials science leveraging PyTorch Lightning, which enables seamless scaling across different computation capabilities (laptop, server, cluster) and hardware platforms (CPU, GPU, XPU). 2. Deep Graph Library (DGL) support for rapid graph neural network prototyping and development. By publishing and sharing this toolkit with the research community via open-source release, we hope to: 1. Lower the entry barrier for new machine learning researchers and practitioners that want to get started with the OpenCatalyst dataset, which presently comprises the largest computational materials science dataset. 2. Enable the scientific community to apply advanced machine learning tools to high-impact scientific challenges, such as modeling of materials behavior for clean energy applications. We demonstrate the capabilities of our framework by enabling three new equivariant neural network models for
Nowadays, protecting trust in social sciences also means engaging in open community dialogue, which helps to safeguard robustness and improve efficiency of research methods. The combination of open data, open review and open dialogue may sound simple but implementation in the real world will not be straightforward. However, in view of Begley and Ellis's (2012) statement that, "the scientific process demands the highest standards of quality, ethics and rigour," they are worth implementing. More importantly, they are feasible to work on and likely will help to restore plausibility to social sciences research. Therefore, I feel it likely that the triplet of open data, open review and open dialogue will gradually emerge to become policy requirements regardless of the research funding source.
Authorship of scientific articles has profoundly changed from early science until now. While once upon a time a paper was authored by a handful of authors, scientific collaborations are much more prominent on average nowadays. As authorship (and citation) is essentially the primary reward mechanism according to the traditional research evaluation frameworks, it turned out to be a rather hot-button topic from which a significant portion of academic disputes stems. However, the novel Open Science practices could be an opportunity to disrupt such dynamics and diversify the credit of the different scientific contributors involved in the diverse phases of the lifecycle of the same research effort. In fact, a paper and research data (or software) contextually published could exhibit different authorship to give credit to the various contributors right where it feels most appropriate. As a preliminary study, in this paper, we leverage the wealth of information contained in Open Science Graphs, such as OpenAIRE, and conduct a focused analysis on a subset of publications with supplementary material drawn from the European Marine Science (MES) research community. The results are promising an
It is widely recognised nowadays that there is no single, accepted, unified definition of Open Science, which motivates our proposal of an Open Science definition as a political and legal framework where research outputs are shared and disseminated in order to be rendered visible, accessible, reusable is developed, standing over the concepts enhanced by the Budapest Open Science Initiative (BOAI), and by the Free/Open Source Software (FOSS) and Open data movements. We elaborate this proposal through a detailed analysis of some selected EC policies and laws as well as of the function of research evaluation practices. The legal aspects considered in our examination include, in particular, the study of the role of licenses in the context of the dissemination of research outputs.
Let $(M,ξ)$ be a contact 3-manifold. We present two new algorithms, the first of which converts an open book $(Σ,Φ)$ supporting $(M,ξ)$ with connected binding into a contact surgery diagram. The second turns a contact surgery diagram for $(M,ξ)$ into a supporting open book decomposition. These constructions lead to a refinement of a result of Ding-Geiges, which states that every such $(M,ξ)$ may be obtained by contact surgery from $(S^{3},ξ_{std})$, as well as bounds on the support norm and genus of contact manifolds obtained by surgery in terms of classical link data. We then introduce Kirby moves called ribbon moves which use mapping class relations to modify contact surgery diagrams. Any two surgery diagrams of the same contact 3-manifold are related by a sequence of Legendrian isotopies and ribbon moves. As most of our results are computational in nature, a number of examples are analyzed.
Access to the work of others is something that is too often taken for granted, yet problematic and difficult to be obtained unless someone pays for it. Green and gold open access are claimed to be a solution to this problem. While open access is gaining momentum in some fields, there is a limited and seasoned knowledge about self-archiving in computer science. In particular, there is an inadequate understanding of author-based self-archiving awareness, practice, and inhibitors. This article reports an exploratory study of the awareness of self-archiving, the practice of self-archiving, and the inhibitors of self-archiving among authors in an Italian computer science faculty. Forty-nine individuals among interns, PhD students, researchers, and professors were recruited in a questionnaire (response rate of 72.8%). The quantitative and qualitative responses suggested that there is still work needed in terms of advocating green open access to computer science authors who seldom self-archive and when they do, they often infringe the copyright transfer agreements (CTAs) of the publishers. In addition, tools from the open-source community are needed to facilitate author-based self-archivi
GREX-PLUS (Galaxy Reionization EXplorer and PLanetary Universe Spectrometer) is a mission candidate for a JAXA strategic L-class mission to be launched in the 2030s. Its primary science goals are two-fold: galaxy formation and evolution, and planetary system formation and evolution. The GREX-PLUS spacecraft will carry a telescope with a 1 m primary mirror aperture cooled down to 50 K. The two science instruments will be onboard: a wide-field camera in the 2--8 $μ$m wavelength band and a high-resolution spectrometer with a wavelength resolution of 30,000 in the 10--18 $μ$m band. The GREX-PLUS wide-field camera aims to detect the first generation of galaxies at redshift $z>15$. The GREX-PLUS high-resolution spectrometer aims to identify the location of the water ``snowline'' in protoplanetary disks. Both instruments will provide unique datasets for a broad range of scientific topics, including galaxy mass assembly, the origin of supermassive blackholes, infrared background radiation, molecular spectroscopy in the interstellar medium, transit spectroscopy of exoplanet atmospheres, planetary atmospheres in the Solar System, and so on. This document is the second version of a collect
Mauve is a low-cost small satellite developed and operated by Blue Skies Space Ltd. The payload features a 13 cm telescope connected with a fibre that feeds into a UV-Vis spectrometer. The detector covers the 200-700 nm range in a single shot, obtaining low resolution spectra at R~20-65. Mauve has launched on 28th November 2025, reaching a 510 km Low-Earth Sun-synchronous orbit. The satellite will enable UV and visible observations of a variety of stellar objects in our Galaxy, filling the gaps in the ultraviolet space-based data. The researchers that have already joined the mission have defined the science themes, observational strategy and targets that Mauve will observe in the first year of operations. To date 10 science themes have been developed by the Mauve science collaboration for year 1, with observational strategies that include both long duration monitoring and short cadence snapshots. Here, we describe these themes and the science that Mauve will undertake in its first year of operations.
The Open Science Grid (OSG) includes work to enable new science, new scientists, and new modalities in support of computationally based research. There are frequently significant sociological and organizational changes required in transformation from the existing to the new. OSG leverages its deliverables to the large scale physics experiment member communities to benefit new communities at all scales through activities in education, engagement and the distributed facility. As a partner to the poster and tutorial at SciDAC 2008, this paper gives both a brief general description and some specific examples of new science enabled on the OSG. More information is available at the OSG web site: (http://www.opensciencegrid.org).
The large instantaneous sensitivity, a wide frequency coverage and flexible observation modes with large number of beams in the sky are the main features of the SKA observatory's two telescopes, the SKA-Low and the SKA-Mid, which are located on two different continents. Owing to these capabilities, the SKAO telescopes are going to be a game-changer for radio astronomy in general and pulsar astronomy in particular. The eleven articles in this special issue on pulsar science with the SKA Observatory describe its impact on different areas of pulsar science. In this lead article, a brief description of the two telescopes highlighting the relevant features for pulsar science is presented followed by an overview of each accompanying article, exploring the inter-relationship between different pulsar science use cases.
This paper reflects on a number of trends towards a more open and reproducible approach to geographic and spatial data science over recent years. In particular it considers trends towards Big Data, and the impacts this is having on spatial data analysis and modelling. It identifies a turn in academia towards coding as a core analytic tool, and away from proprietary software tools offering 'black boxes' where the internal workings of the analysis are not revealed. It is argued that this closed form software is problematic, and considers a number of ways in which issues identified in spatial data analysis (such as the MAUP) could be overlooked when working with closed tools, leading to problems of interpretation and possibly inappropriate actions and policies based on these. In addition, this paper and considers the role that reproducible and open spatial science may play in such an approach, taking into account the issues raised. It highlights the dangers of failing to account for the geographical properties of data, now that all data are spatial (they are collected somewhere), the problems of a desire for n=all observations in data science and it identifies the need for a critical
Concept erasure in text-to-image diffusion models is crucial for mitigating harmful content, yet existing methods often compromise generative quality. We introduce Semantic Surgery, a novel training-free, zero-shot framework for concept erasure that operates directly on text embeddings before the diffusion process. It dynamically estimates the presence of target concepts in a prompt and performs a calibrated vector subtraction to neutralize their influence at the source, enhancing both erasure completeness and locality. The framework includes a Co-Occurrence Encoding module for robust multi-concept erasure and a visual feedback loop to address latent concept persistence. As a training-free method, Semantic Surgery adapts dynamically to each prompt, ensuring precise interventions. Extensive experiments on object, explicit content, artistic style, and multi-celebrity erasure tasks show our method significantly outperforms state-of-the-art approaches. We achieve superior completeness and robustness while preserving locality and image quality (e.g., 93.58 H-score in object erasure, reducing explicit content to just 1 instance, and 8.09 H_a in style erasure with no quality degradation).
The CamCAN Lifespan Neuroimaging Dataset, Cambridge (UK) Centre for Ageing and Neuroscience, was acquired and processed beginning in December, 2016. The referee consensus solver deployed to the Open Science Grid was used for this task. The dataset includes demographic and screening measures, a high-resolution MRI scan of the brain, and whole-head magnetoencephalographic (MEG) recordings during eyes closed rest (560 sec), a simple task (540 sec), and passive listening/viewing (140 sec). The data were collected from 619 neurologically normal individuals, ages 18-87. The processed results from the resting recordings are completed and available online. These constitute 1.7 TBytes of data including the location within the brain (1 mm resolution), time stamp (1 msec resolution), and 80 msec time course for each of 3.7 billion validated neuroelectric events, i.e. mean 6.1 million events for each of the 619 participants. The referee consensus solver provides high yield (mean 11,000 neuroelectric currents/sec; standard deviation (sd): 3500/sec) high confidence (p < 10-12 for each identified current) measures of the neuroelectric currents whose magnetic fields are detected in the MEG reco
Data science and technology offer transformative tools and methods to science. This review article highlights latest development and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today. It is now becoming impractical for humans to analyze all the data manually. Therefore, it is imperative to train machines to analyze and interpret (eventually) such data as intelligently as humans but far more efficiently in quantity. Despite the recent impressive progress in applications of data science to plasma science and technology, the emerging field of DDPS is still in its infancy. Fueled by some of the most challenging problems such as fusion energy, plasma processing of materials, and fundamental understanding of the universe through observable plasma phenomena, it is expected that DDPS continues to benefit significantly from the interdisciplinary marriage between plasma science and data science into the foreseeable future.
We construct an open book decomposition compatible with a contact structure given by a rational contact surgery on a Legendrian link in the standard contact $S^3$. As an application we show that some rational contact surgeries on certain Legendrian knots induce overtwisted contact structures.
With the rise of Wikipedia as a first-stop source for scientific knowledge, it is important to compare its representation of that knowledge to that of the academic literature. Here we identify the 250 most heavily used journals in each of 26 research fields (4,721 journals, 19.4M articles in total) indexed by the Scopus database, and test whether topic, academic status, and accessibility make articles from these journals more or less likely to be referenced on Wikipedia. We find that a journal's academic status (impact factor) and accessibility (open access policy) both strongly increase the probability of it being referenced on Wikipedia. Controlling for field and impact factor, the odds that an open access journal is referenced on the English Wikipedia are 47% higher compared to paywall journals. One of the implications of this study is that a major consequence of open access policies is to significantly amplify the diffusion of science, through an intermediary like Wikipedia, to a broad audience.
The LIGO Open Science Center (LOSC) fulfills LIGO's commitment to release, archive, and serve LIGO data in a broadly accessible way to the scientific community and to the public, and to provide the information and tools necessary to understand and use the data. In August 2014, the LOSC published the full dataset from Initial LIGO's "S5" run at design sensitivity, the first such large-scale release and a valuable testbed to explore the use of LIGO data by non-LIGO researchers and by the public, and to help teach gravitational-wave data analysis to students across the world. In addition to serving the S5 data, the LOSC web portal (losc.ligo.org) now offers documentation, data-location and data-quality queries, tutorials and example code, and more. We review the mission and plans of the LOSC, focusing on the S5 data release.