This study explores human perceptions of intelligent agents by comparing interactions with a humanoid robot and a virtual human avatar, both utilizing GPT-3 for response generation. The study aims to understand how physical and virtual embodiments influence perceptions of anthropomorphism, animacy, likeability, and perceived intelligence. The uncanny valley effect was also investigated in the scope of this study based on the two agents' human-likeness and affinity. Conducted with ten participants from Sabanci University, the experiment involved tasks that sought advice, followed by assessments using the Godspeed Questionnaire Series and structured interviews. Results revealed no significant difference in anthropomorphism between the humanoid robot and the virtual human avatar, but the humanoid robot was perceived as more likable and slightly more intelligent, highlighting the importance of physical presence and interactive gestures. These findings suggest that while virtual avatars can achieve high human-likeness, physical embodiment enhances likeability and perceived intelligence. However, the study's scope was insufficient to claim the existence of the uncanny valley effect in th
The rapid rise of AI-generated art has sparked debate about potential biases in how audiences perceive and evaluate such works. This study investigates how composer information and listener characteristics shape the perception of AI-generated music, adopting a mixed-method approach. Using a diverse set of stimuli across various genres from two AI music models, we examine effects of perceived authorship on liking and emotional responses, and explore how attitudes toward AI, personality traits, and music-related variables influence evaluations. We further assess the influence of perceived humanness and analyze open-ended responses to uncover listener criteria for judging AI-generated music. Attitudes toward AI proved to be the best predictor of both liking and emotional intensity of AI-generated music. This quantitative finding was complemented by qualitative themes from our thematic analysis, which identified ethical, cultural, and contextual considerations as important criteria in listeners' evaluations of AI-generated music. Our results offer a nuanced view of how people experience music created by AI tools and point to key factors and methodological considerations for future rese
Understanding human perceptions of robot performance is crucial for designing socially intelligent robots that can adapt to human expectations. Current approaches often rely on surveys, which can disrupt ongoing human-robot interactions. As an alternative, we explore predicting people's perceptions of robot performance using non-verbal behavioral cues and machine learning techniques. We contribute the SEAN TOGETHER Dataset consisting of observations of an interaction between a person and a mobile robot in Virtual Reality, together with perceptions of robot performance provided by users on a 5-point scale. We then analyze how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (like facial expression and spatial behavior features). Our results suggest that facial expressions alone provide useful information, but in the navigation scenarios that we considered, reasoning about spatial features in context is critical for the prediction task. Also, supervised learning techniques outperformed humans' predictions in most cases. Further, when predicting robot performance as a binary classification task on unseen users'
Positive human-perception of robots is critical to achieving sustained use of robots in shared environments. One key factor affecting human-perception of robots are their sounds, especially the consequential sounds which robots (as machines) must produce as they operate. This paper explores qualitative responses from 182 participants to gain insight into human-perception of robot consequential sounds. Participants viewed videos of different robots performing their typical movements, and responded to an online survey regarding their perceptions of robots and the sounds they produce. Topic analysis was used to identify common properties of robot consequential sounds that participants expressed liking, disliking, wanting or wanting to avoid being produced by robots. Alongside expected reports of disliking high pitched and loud sounds, many participants preferred informative and audible sounds (over no sound) to provide predictability of purpose and trajectory of the robot. Rhythmic sounds were preferred over acute or continuous sounds, and many participants wanted more natural sounds (such as wind or cat purrs) in-place of machine-like noise. The results presented in this paper suppor
As AI technology continues to advance, the importance of human-AI collaboration becomes increasingly evident, with numerous studies exploring its potential in various fields. One vital field is data science, including feature engineering (FE), where both human ingenuity and AI capabilities play pivotal roles. Despite the existence of AI-generated recommendations for FE, there remains a limited understanding of how to effectively integrate and utilize humans' and AI's knowledge. To address this gap, we design a readily-usable prototype, human\&AI-assisted FE in Jupyter notebooks. It harnesses the strengths of humans and AI to provide feature suggestions to users, seamlessly integrating these recommendations into practical workflows. Using the prototype as a research probe, we conducted an exploratory study to gain valuable insights into data science practitioners' perceptions, usage patterns, and their potential needs when presented with feature suggestions from both humans and AI. Through qualitative analysis, we discovered that the Creator of the feature (i.e., AI or human) significantly influences users' feature selection, and the semantic clarity of the suggested feature gre
There is a wide variety of music similarity detection algorithms, while discussions about music plagiarism in the real world are often based on audience perceptions. Therefore, we aim to conduct a study to examine the key criteria of human perception of music plagiarism, focusing on the three commonly used musical features in similarity analysis: melody, rhythm, and chord progression. After identifying the key features and levels of variation humans use in perceiving musical similarity, we propose a LLM-as-a-judge framework that applies a systematic, step-by-step approach, drawing on modules that extract such high-level attributes.
Humans are routinely asked to evaluate the performance of other individuals, separating success from failure and affecting outcomes from science to education and sports. Yet, in many contexts, the metrics driving the human evaluation process remain unclear. Here we analyse a massive dataset capturing players' evaluations by human judges to explore human perception of performance in soccer, the world's most popular sport. We use machine learning to design an artificial judge which accurately reproduces human evaluation, allowing us to demonstrate how human observers are biased towards diverse contextual features. By investigating the structure of the artificial judge, we uncover the aspects of the players' behavior which attract the attention of human judges, demonstrating that human evaluation is based on a noticeability heuristic where only feature values far from the norm are considered to rate an individual's performance.
With the development of generative AI technology, a vast array of AI-generated paintings (AIGP) have gone viral on social media like TikTok. However, some negative news about AIGP has also emerged. For example, in 2022, numerous painters worldwide organized a large-scale anti-AI movement because of the infringement in generative AI model training. This event reflected a social issue that, with the development and application of generative AI, public feedback and feelings towards it may have been overlooked. Therefore, to investigate public interactions and perceptions towards AIGP on social media, we analyzed user engagement level and comment sentiment scores of AIGP using human painting videos as a baseline. In analyzing user engagement, we also considered the possible moderating effect of the aesthetic quality of Paintings. Utilizing topic modeling, we identified seven reasons, including hyperrealistic quality, ambivalent reactions, perceived theft of art, etc., leading to negative public perceptions of AIGP. Our work may provide instructive suggestions for future generative AI technology development and avoid potential crises in human-AI collaboration.
Which factors influence the human assessment of creativity exhibited by a computational system is a core question of computational creativity (CC) research. Recently, the system's embodiment has been put forward as such a factor, but empirical studies of its effect are lacking. To this end, we propose an experimental framework which isolates the effect of embodiment on the perception of creativity from its effect on creativity per se. We not only manipulate the system's embodiment, but also the perceptual evidence as the basis for the human creativity assessment. We motivate the core framework with embodiment and perceptual evidence as independent and the creative process as controlled variable, and we provide recommendations on measuring the assessment of creativity as dependent variable. We hope the framework will inspire others to study the human perception of embodied CC in a principled manner.
Current discourse on Artificial Intelligence (AI) ethics, dominated by "trustworthy" and "responsible" AI, overlooks a more fundamental human-computer interaction (HCI) crisis: the erosion of human agency. This paper argues that the primary challenge of high-stakes AI systems is not trust, but the preservation of human causal control. We posit that "bad AI" will function as "bad UI," a metaphor for catastrophic interface failures that misrepresent system state and lead to human error. Applying Marshall McLuhan's media theory, AI can be framed as a technology of "augmentation" that simultaneously "amputates" the user's direct perception of causality. This places the interface as the critical locus where a "double uncertainty"--that of the human user and that of the probabilistic model--must be mediated. We critique current Explainable AI (XAI) for its correlational focus and failure to represent uncertainty. We conclude by proposing a rigorous, nested Causal-Agency Framework (CAF) that integrates causal models, uncertainty quantification, and human-centered evaluation to restore agency at the interface.
Using "Analyze Results" at the Web of Science, one can directly generate overlays onto global journal maps of science. The maps are based on the 10,000+ journals contained in the Journal Citation Reports (JCR) of the Science and Social Science Citation Indices (2011). The disciplinary diversity of the retrieval is measured in terms of Rao-Stirling's "quadratic entropy." Since this indicator of interdisciplinarity is normalized between zero and one, the interdisciplinarity can be compared among document sets and across years, cited or citing. The colors used for the overlays are based on Blondel et al.'s (2008) community-finding algorithms operating on the relations journals included in JCRs. The results can be exported from VOSViewer with different options such as proportional labels, heat maps, or cluster density maps. The maps can also be web-started and/or animated (e.g., using PowerPoint). The "citing" dimension of the aggregated journal-journal citation matrix was found to provide a more comprehensive description than the matrix based on the cited archive. The relations between local and global maps and their different functions in studying the sciences in terms of journal lit
Although artificial intelligence (AI) has achieved many feats at a rapid pace, there still exist open problems and fundamental shortcomings related to performance and resource efficiency. Since AI researchers benchmark a significant proportion of performance standards through human intelligence, cognitive sciences-inspired AI is a promising domain of research. Studying cognitive science can provide a fresh perspective to building fundamental blocks in AI research, which can lead to improved performance and efficiency. In this review paper, we focus on the cognitive functions of perception, which is the process of taking signals from one's surroundings as input, and processing them to understand the environment. Particularly, we study and compare its various processes through the lens of both cognitive sciences and AI. Through this study, we review all current major theories from various sub-disciplines of cognitive science (specifically neuroscience, psychology and linguistics), and draw parallels with theories and techniques from current practices in AI. We, hence, present a detailed collection of methods in AI for researchers to build AI systems inspired by cognitive science. Fur
Using the Scopus dataset (1996-2007) a grand matrix of aggregated journal-journal citations was constructed. This matrix can be compared in terms of the network structures with the matrix contained in the Journal Citation Reports (JCR) of the Institute of Scientific Information (ISI). Since the Scopus database contains a larger number of journals and covers also the humanities, one would expect richer maps. However, the matrix is in this case sparser than in the case of the ISI data. This is due to (i) the larger number of journals covered by Scopus and (ii) the historical record of citations older than ten years contained in the ISI database. When the data is highly structured, as in the case of large journals, the maps are comparable, although one may have to vary a threshold (because of the differences in densities). In the case of interdisciplinary journals and journals in the social sciences and humanities, the new database does not add a lot to what is possible with the ISI databases.
Rankings of scholarly journals based on citation data are often met with skepticism by the scientific community. Part of the skepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of authors. This paper focuses on analysis of the table of cross-citations among a selection of Statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care in order to avoid potential over-interpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's Research Assessment Exercise shows strong correlation at aggregate level between assessed research quality and journal citation `export scores' within the discipline of Statistics.
Human-robot collaboration has significant potential in recycling due to the wide variation in the composition of recyclable products. Six participants performed a recyclable item sorting task collaborating with a robot arm equipped with a vision system. The effect of three different levels of human involvement or assistance to the robot (Level 1- occlusion removal; Level 2- optimal spacing; Level 3- optimal grip) on performance metrics such as robot accuracy, task time and subjective fluency were assessed. Results showed that human involvement had a remarkable impact on the robot's accuracy, which increased with human involvement level. Mean accuracy values were 33.3% for Level 1, 69% for Level 2 and 100% for Level 3. The results imply that for sorting processes involving diverse materials that vary in size, shape, and composition, human assistance could improve the robot's accuracy to a significant extent while also being cost-effective.
A number of journal classification systems have been developed in bibliometrics since the launch of the Citation Indices by the Institute of Scientific Information (ISI) in the 1960s. These systems are used to normalize citation counts with respect to field-specific citation patterns. The best known system is the so-called "Web-of-Science Subject Categories" (WCs). In other systems papers are classified by algorithmic solutions. Using the Journal Citation Reports 2014 of the Science Citation Index and the Social Science Citation Index (n of journals = 11,149), we examine options for developing a new system based on journal classifications into subject categories using aggregated journal-journal citation data. Combining routines in VOSviewer and Pajek, a tree-like classification is developed. At each level one can generate a map of science for all the journals subsumed under a category. Nine major fields are distinguished at the top level. Further decomposition of the social sciences is pursued for the sake of example with a focus on journals in information science (LIS) and science studies (STS). The new classification system improves on alternative options by avoiding the problem
Publication patterns of 79 forest scientists awarded major international forestry prizes during 1990-2010 were compared with the journal classification and ranking promoted as part of the 'Excellence in Research for Australia' (ERA) by the Australian Research Council. The data revealed that these scientists exhibited an elite publication performance during the decade before and two decades following their first major award. An analysis of their 1703 articles in 431 journals revealed substantial differences between the journal choices of these elite scientists and the ERA classification and ranking of journals. Implications from these findings are that additional cross-classifications should be added for many journals, and there should be an adjustment to the ranking of several journals relevant to the ERA Field of Research classified as 0705 Forestry Sciences.
Human-robot collaboration in surgery represents a significant area of research, driven by the increasing capability of autonomous robotic systems to assist surgeons in complex procedures. This systematic review examines the advancements and persistent challenges in the development of autonomous surgical robotic assistants (ASARs), focusing specifically on scenarios where robots provide meaningful and active support to human surgeons. Adhering to the PRISMA guidelines, a comprehensive literature search was conducted across the IEEE Xplore, Scopus, and Web of Science databases, resulting in the selection of 32 studies for detailed analysis. Two primary collaborative setups were identified: teleoperation-based assistance and direct hands-on interaction. The findings reveal a growing research emphasis on ASARs, with predominant applications currently in endoscope guidance, alongside emerging progress in autonomous tool manipulation. Several key challenges hinder wider adoption, including the alignment of robotic actions with human surgeon preferences, the necessity for procedural awareness within autonomous systems, the establishment of seamless human-robot information exchange, and th
Saliency maps can explain how deep neural networks classify images. But are they actually useful for humans? The present systematic review of 68 user studies found that while saliency maps can enhance human performance, null effects or even costs are quite common. To investigate what modulates these effects, the empirical outcomes were organised along several factors related to the human tasks, AI performance, XAI methods, images to be classified, human participants and comparison conditions. In image-focused tasks, benefits were less common than in AI-focused tasks, but the effects depended on the specific cognitive requirements. Moreover, benefits were usually restricted to incorrect AI predictions in AI-focused tasks but to correct ones in image-focused tasks. XAI-related factors had surprisingly little impact. The evidence was limited for image- and human-related factors and the effects were highly dependent on the comparison conditions. These findings may support the design of future user studies.
The aim of this paper is to promote quantum logic as one of the basic tools for analyzing human reasoning. We compare it with classical (Boolean) logic and highlight the role of violation of the distributive law for conjunction and disjunction. It is well known that nondistributivity is equivalent to incompatibility of logical variables -- the impossibility to assign jointly the two-valued truth values to these variables. A natural question arises as to whether quantum logical nondistributivity in human logic can be tested experimentally. We show that testing the response replicability effect (RRE) in cognitive psychology is equivalent to testing nondistributivity -- under the prevailing conjecture that the mental state update generated by observation is described as orthogonal projection of the mental state vector (the projective update conjecture of Wang and Busemeyer). A simple test of RRE is suggested. In contrast to the previous works in quantum-like modeling, we proceed in the state-dependent framework; in particular, distributivity, compatibility, and RRE are considered in a fixed mental state. In this framework, we improve the previous result on the impossibility to combine