In the early stages of scientific research, researchers rely on core scholarly judgments to identify relevant literature, assess credible evidence, and determine which directions merit pursuit. As AI tools become increasingly integrated into these early-stage workflows, the scholarly judgments that were once transparent and attributable to individual researchers become obscured, raising critical Responsible AI (RAI) concerns around accountability, transparency, and trust. Yet how these three dimensions manifest in real-time, in-situ scholarly practice remains largely unexplored. To address this gap, we conducted a think-aloud study with 15 researchers to examine how they used AI tools powered by large language models (LLMs) across early-stage research tasks, including literature exploration, synthesis, and research ideation. Our key findings address the tripartite constructs of accountability, transparency, and trust. First, the confident tone of AI outputs misrepresents epistemic uncertainty, making it more difficult for researchers (who are ultimately accountable) to identify which outputs require the greatest scrutiny. Second, opaque retrieval and content construction make prove
Reproducibility is a key requirement for scientific progress. It allows the reproduction of the works of others, and, as a consequence, to fully trust the reported claims and results. In this work, we argue that, by facilitating reproducibility of recommender systems experimentation, we indirectly address the issues of accountability and transparency in recommender systems research from the perspectives of practitioners, designers, and engineers aiming to assess the capabilities of published research works. These issues have become increasingly prevalent in recent literature. Reasons for this include societal movements around intelligent systems and artificial intelligence striving towards fair and objective use of human behavioral data (as in Machine Learning, Information Retrieval, or Human-Computer Interaction). Society has grown to expect explanations and transparency standards regarding the underlying algorithms making automated decisions for and around us. This work surveys existing definitions of these concepts, and proposes a coherent terminology for recommender systems research, with the goal to connect reproducibility to accountability. We achieve this by introducing seve
Many Machine Learning research studies use language that describes potential social benefits or technical affordances of new methods and technologies. Such language, which we call "social claims", can help garner substantial resources and influence for those involved in ML research and technology production. However, there exists a gap between social claims and reality (the claim-reality gap): ML methods often fail to deliver the claimed functionality or social impacts. This paper investigates the claim-reality gap and makes a normative argument for developing accountability mechanisms for it. In making the argument, we make three contributions. First, we show why the symptom - absence of social claim accountability - is problematic. Second, we coin dead zone of accountability - a lens that scholars and practitioners can use to identify opportunities for new forms of accountability. We apply this lens to the claim-reality gap and provide a diagnosis by identifying cognitive and structural resistances to accountability in the claim-reality gap. Finally, we offer a prescription - two potential collaborative research agendas that can help create the condition for social claim accounta
Software is at the core of most scientific discoveries today. Therefore, the quality of research results highly depends on the quality of the research software. Rigorous testing, as we know it from software engineering in the industry, could ensure the quality of the research software but it also requires a substantial effort that is often not rewarded in academia. Therefore, this research explores the effects of research software testing integrated into teaching on research software. In an in-vivo experiment, we integrated the engineering of a test suite for a large-scale network simulation as group projects into a course on software testing at the Blekinge Institute of Technology, Sweden, and qualitatively measured the effects of this integration on the research software. We found that the research software benefited from the integration through substantially improved documentation and fewer hardware and software dependencies. However, this integration was effortful and although the student teams developed elegant and thoughtful test suites, no code by students went directly into the research software since we were not able to make the integration back into the research software
Including children's images in datasets has raised ethical concerns, particularly regarding privacy, consent, data protection, and accountability. These datasets, often built by scraping publicly available images from the Internet, can expose children to risks such as exploitation, profiling, and tracking. Despite the growing recognition of these issues, approaches for addressing them remain limited. We explore the ethical implications of using children's images in AI datasets and propose a pipeline to detect and remove such images. As a use case, we built the pipeline on a Vision-Language Model under the Visual Question Answering task and tested it on the #PraCegoVer dataset. We also evaluate the pipeline on a subset of 100,000 images from the Open Images V7 dataset to assess its effectiveness in detecting and removing images of children. The pipeline serves as a baseline for future research, providing a starting point for more comprehensive tools and methodologies. While we leverage existing models trained on potentially problematic data, our goal is to expose and address this issue. We do not advocate for training or deploying such models, but instead call for urgent community r
This scientometric study analyzes Avian Influenza research from 2014 to 2023 using bibliographic data from the Web of Science database. We examined publication trends, sources, authorship, collaborative networks, document types, and geographical distribution to gain insights into the global research landscape. Results reveal a steady increase in publications, with high contributions from Chinese and American institutions. Journals such as PLoS One and the Journal of Virology published the highest number of studies, indicating their influence in this field. The most prolific institutions include the Chinese Academy of Sciences and the University of Hong Kong, while the College of Veterinary Medicine at South China Agricultural University emerged as the most productive department. China and the USA lead in publication volume, though developed nations like the United Kingdom and Germany exhibit a higher rate of international collaboration. "Articles" are the most common document type, constituting 84.6% of the total, while "Reviews" account for 7.6%. This study provides a comprehensive view of global trends in Avian Influenza research, emphasizing the need for collaborative efforts ac
This paper presents multi- and interdisciplinary approaches for finding the appropriate AI technologies for research information. Professional research information management (RIM) is becoming increasingly important as an expressly data-driven tool for researchers. It is not only the basis of scientific knowledge processes, but also related to other data. A concept and a process model of the elementary phases from the start of the project to the ongoing operation of the AI methods in the RIM is presented, portraying the implementation of an AI project, meant to enable universities and research institutions to support their researchers in dealing with incorrect and incomplete research information, while it is being stored in their RIMs. Our aim is to show how research information harmonizes with the challenges of data literacy and data quality issues, related to AI, also wanting to underline that any project can be successful if the research institutions and various departments of universities, involved work together and appropriate support is offered to improve research information and data management.
Recent advances in artificial intelligence, particularly generative AI (GenAI) and large language models (LLMs), are fundamentally transforming accounting research, creating both opportunities and competitive threats for scholars. This paper proposes a framework that classifies AI-accounting research along two dimensions: research focus (accounting-centric versus AI-centric) and methodological approach (AI-based versus traditional methods). We apply this framework to papers from the IJAIS special issue and recent AI-accounting research published in leading accounting journals to map existing studies and identify research opportunities. Using this same framework, we analyze how accounting researchers can leverage their expertise through strategic positioning and collaboration, revealing where accounting scholars' strengths create the most value. We further examine how GenAI and LLMs transform the research process itself, comparing the capabilities of human researchers and AI agents across the entire research workflow. This analysis reveals that while GenAI democratizes certain research capabilities, it simultaneously intensifies competition by raising expectations for higher-order c
Demographic data collection is essential in education research, as demographic data allows researchers to better describe the participant population they study and to contextualize findings. However, current research practices for neurodiversity demographics often rely on prescriptive methods (e.g., requiring participants to report official diagnoses) rather than allowing participants to self-identify. This approach can: a) not allow participants to express their intersecting identities in ways that are authentic; and b) limit trustworthiness and reliability of the data and interpretation. In addition, inconsistent dissemination and representation of demographic data across studies hinder the accessibility and usability of this work. Through a literature review of neurodivergent student experiences with learning and performing STEM, we identified widespread discrepancies in how demographic information is collected and reported. This paper explores how neurodivergent identities can be more accurately and inclusively represented in education research. We present findings of a thematic analysis on the ways neurodivergent demographic data collection is done in the literature using data
This paper presents a scientometric analysis of research output from the University of Lagos, focusing on the two decades spanning 2004 to 2023. Using bibliometric data retrieved from the Web of Science, we examine trends in publication volume, collaboration patterns, citation impact, and the most prolific authors, departments, and research domains at the university. The study reveals a consistent increase in research productivity, with the highest publication output recorded in 2023. Health Sciences, Engineering, and Social Sciences are identified as dominant fields, reflecting the university's interdisciplinary research strengths. Collaborative efforts, both locally and internationally, show a positive correlation with higher citation impact, with the United States and the United Kingdom being the leading international collaborators. Notably, open-access publications account for a significant portion of the university's research output, enhancing visibility and citation rates. The findings offer valuable insights into the university's research performance over the past two decades, providing a foundation for strategic planning and policy formulation to foster research excellence
A series of high-profile tragedies involving companion chatbots has triggered an unusually rapid regulatory response. Several jurisdictions, including Australia, California, and New York, have introduced enforceable regulation, while regulators elsewhere have signaled growing concern about risks posed by companion chatbots, particularly to children. In parallel, leading providers, notably OpenAI, appear to have strengthened their self-regulatory approaches. Drawing on legal textual analysis and insights from regulatory theory, psychology, and information systems research, this paper critically examines these recent interventions. We examine what is regulated and who is regulated, identifying regulatory targets, scope, and modalities. We classify interventions by method and priority, showing how emerging regimes combine "locks and blocks", such as access gating and content moderation, with measures addressing toxic relationship features and process-based accountability requirements. We argue that effective regulation of companion chatbots must integrate all three dimensions. More, however, is required. Current regimes tend to focus on discrete harms, narrow conceptions of vulnerabil
Research is facing a reproducibility crisis, in which the results and findings of many studies are difficult or even impossible to reproduce. This is also the case in machine learning (ML) and artificial intelligence (AI) research. Often, this is the case due to unpublished data and/or source-code, and due to sensitivity to ML training conditions. Although different solutions to address this issue are discussed in the research community such as using ML platforms, the level of reproducibility in ML-driven research is not increasing substantially. Therefore, in this mini survey, we review the literature on reproducibility in ML-driven research with three main aims: (i) reflect on the current situation of ML reproducibility in various research fields, (ii) identify reproducibility issues and barriers that exist in these research fields applying ML, and (iii) identify potential drivers such as tools, practices, and interventions that support ML reproducibility. With this, we hope to contribute to decisions on the viability of different solutions for supporting ML reproducibility.
The present study attempts to highlight the research output generated in Russia in coronary artery disease (CAD) research during the period 1990-2019 to understand the distribution of research output, top journals for publications, and most prolific authors, authorship pattern, and citation pattern. This study is based on secondary data extracted from the Science Citation Index (SCI), which is an integral component of the Web of Science. Descriptive and inferential statistical techniques were applied in the study. There were 5058 articles by Russian scholars in coronary artery disease during 1990-2019; they preferred to publish in Russian journals. The research contributions were in the form of research articles, meeting abstracts and reviews with a consistent drop in the number of editorial material and article; proceedings paper with time. Co-authorship was the norm in coronary artery disease research, with a steady increase in the number of multi-author documents in recent years.
As Engineering Education Research (EER) develops as a discipline it is necessary for EER scholars to contribute to the development of learning theory rather than simply being informed by it. It has been suggested that to do this effectively will require partnerships between Engineering scholars and psychologists, education researchers, including other social scientists. The formation of such partnerships is particularly important when considering the introduction of business-related skills into engineering curriculum designed to prepare 21st Century Engineering Students for workplace challenges. In order to encourage scholars beyond Engineering to engage with EER, it is necessary to provide an introduction to the complexities of EER. With this aim in mind, this paper provides an outline review of what is considered rigorous research from an EER perspective as well as highlighting some of the core methodological traditions of EER. The paper aims to facilitate further discussion between EER scholars and researchers from other disciplines, ultimately leading to future collaboration on innovative and rigorous EER.
Drawing on 1,178 safety and reliability papers from 9,439 generative AI papers (January 2020 - March 2025), we compare research outputs of leading AI companies (Anthropic, Google DeepMind, Meta, Microsoft, and OpenAI) and AI universities (CMU, MIT, NYU, Stanford, UC Berkeley, and University of Washington). We find that corporate AI research increasingly concentrates on pre-deployment areas -- model alignment and testing & evaluation -- while attention to deployment-stage issues such as model bias has waned. Significant research gaps exist in high-risk deployment domains, including healthcare, finance, misinformation, persuasive and addictive features, hallucinations, and copyright. Without improved observability into deployed AI, growing corporate concentration could deepen knowledge deficits. We recommend expanding external researcher access to deployment data and systematic observability of in-market AI behaviors.
Social accountability refers to promoting good governance by making ruling elites more responsive. In Bangladesh, where bureaucracy and legislature operate with little effective accountability or checks and balances, traditional horizontal or vertical accountability proved to be very blunt and weak. In the presence of such faulty mechanisms, ordinary citizens access to information is frequently denied, and their voices are kept mute. It impasses the formation of an enabling environment, where activists and civil society institutions representing the ordinary peoples interest are actively discouraged. They become vulnerable to retribution. Social accountability, on the other hand, provides an enabling environment for activists and civil society institutions to operate freely. Thus, leaders and administration become more accountable to people. An enabling environment means providing legal protection, enhancing the availability of information and increasing citizen voice, strengthening institutional and public service capacities and directing incentives that foster accountability. Donors allocate significant shares of resources to encouraging civil society to partner with elites rathe
Physics education researchers (PER) often analyze student data with single-level regression models (e.g., linear and logistic regression). However, education datasets can have hierarchical structures, such as students nested within courses, that single-level models fail to account for. The improper use of single-level models to analyze hierarchical datasets can lead to biased findings. Hierarchical models (a.k.a., multi-level models) account for this hierarchical nested structure in the data. In this publication, we outline the theoretical differences between how single-level and multi-level models handle hierarchical datasets. We then present analysis of a dataset from 112 introductory physics courses using both multiple linear regression and hierarchical linear modeling to illustrate the potential impact of using an inappropriate analytical method on PER findings and implications. Research can leverage multi-institutional datasets to improve the field's understanding of how to support student success in physics. There is no post hoc fix, however, if researchers use inappropriate single-level models to analyze multi-level datasets. To continue developing reliable and generalizable
AI coding assistants and autonomous agents are becoming integral to software development workflows, reshaping how code is produced, reviewed, and maintained. While recent research has focused mainly on the capabilities and impacts of productivity of these systems, much less attention has been paid to accountability: who is responsible when agents generate, modify, or recommend code? In practice, accountability is defined through the Terms of Service (ToS) and related policy documents that govern the use of AI-powered development tools. In this vision paper, we present a comparative analysis of the Terms of Service for widely used AI coding assistants and agent-enabled development tools. We examine how these documents allocate ownership, responsibility, liability, and disclosure obligations between tool providers and software developers, and we identify common patterns and divergences between providers. Our analysis reveals a consistent tendency to shift responsibility for correctness, safety, and legal compliance onto users, as well as substantial variation in how providers address issues such as indemnification, data reuse, and acceptable use. Based on these findings, we argue tha
Academic and policy proposals on algorithmic accountability often seek to understand algorithmic systems in their socio-technical context, recognising that they are produced by 'many hands'. Increasingly, however, algorithmic systems are also produced, deployed, and used within a supply chain comprising multiple actors tied together by flows of data between them. In such cases, it is the working together of an algorithmic supply chain of different actors who contribute to the production, deployment, use, and functionality that drives systems and produces particular outcomes. We argue that algorithmic accountability discussions must consider supply chains and the difficult implications they raise for the governance and accountability of algorithmic systems. In doing so, we explore algorithmic supply chains, locating them in their broader technical and political economic context and identifying some key features that should be understood in future work on algorithmic governance and accountability (particularly regarding general purpose AI services). To highlight ways forward and areas warranting attention, we further discuss some implications raised by supply chains: challenges for a
This report by the CRA Working Group on Socially Responsible Computing outlines guidelines for ethical and responsible research practices in computing conferences. Key areas include avoiding harm, responsible vulnerability disclosure, ethics board review, obtaining consent, accurate reporting, managing financial conflicts of interest, and the use of generative AI. The report emphasizes the need for conference organizers to adopt clear policies to ensure responsible computing research and publication, highlighting the evolving nature of these guidelines as understanding and practices in the field advance.