We introduce FEEL (Force-Enhanced Egocentric Learning), the first large-scale dataset pairing force measurements gathered from custom piezoresistive gloves with egocentric video. Our gloves enable scalable data collection, and FEEL contains approximately 3 million force-synchronized frames of natural unscripted manipulation in kitchen environments, with 45% of frames involving hand-object contact. Because force is the underlying cause that drives physical interaction, it is a critical primitive for physical action understanding. We demonstrate the utility of force for physical action understanding by applying FEEL to two families of tasks: (1) contact understanding, where we jointly perform temporal contact segmentation and pixel-level contacted object segmentation; and (2) action representation learning, where force prediction serves as a self-supervised pretraining objective for video backbones. We achieve state-of-the-art temporal contact segmentation results and competitive pixel-level segmentation results without any need for manual contacted object segmentation annotations. Furthermore, we demonstrate that action representation learning with FEEL improves transfer
In this article, we present the first in-depth linguistic study of human feelings. While there has been substantial research on incorporating some affective categories into linguistic analysis (e.g. sentiment and, to a lesser extent, emotion), the more diverse category of human feelings has thus far not been investigated. We surveyed the extensive interdisciplinary literature around feelings to construct a working definition of what constitutes a feeling and propose 9 broad categories of feeling. We identified potential feeling words based on their pointwise mutual information with morphological variants of the word 'feel' in the Google n-gram corpus, and present a manual annotation exercise where 317 WordNet senses of one hundred of these words were categorised as 'not a feeling' or as one of the 9 proposed categories of feeling. We then proceeded to annotate 11386 WordNet senses of all these words to create WordNet-feelings, a new affective dataset that identifies 3664 word senses as feelings, and associates each of these with one of the 9 categories of feeling. WordNet-feelings can be used in conjunction with other datasets such as SentiWordNet that annotate word senses with com
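The word-selection step in the abstract above rests on pointwise mutual information (PMI) with variants of 'feel'. The following is a minimal sketch of that idea on toy data, not the authors' pipeline: the function name `pmi_scores`, the pair-based co-occurrence format, and the toy observations are all assumptions made for illustration, and the real study used Google n-gram counts.

```python
import math
from collections import Counter

# Hypothetical set of morphological variants of "feel" (an assumption).
FEEL_FORMS = {"feel", "feels", "feeling", "felt"}

def pmi_scores(pairs, target_forms):
    """Return PMI (in bits) between each word and the target word family.

    pairs: iterable of (word, neighbour) co-occurrence observations.
    PMI(word, target) = log2( P(word, target) / (P(word) * P(target)) ).
    """
    pairs = list(pairs)
    total = len(pairs)
    word_counts = Counter(word for word, _ in pairs)
    # Observations whose neighbour is any variant of the target word.
    joint_counts = Counter(word for word, nb in pairs if nb in target_forms)
    p_target = sum(joint_counts.values()) / total
    scores = {}
    for word, joint in joint_counts.items():
        p_joint = joint / total
        p_word = word_counts[word] / total
        scores[word] = math.log2(p_joint / (p_word * p_target))
    return scores

# Toy co-occurrence observations, invented for this example.
toy_pairs = [
    ("anxious", "feel"), ("anxious", "feeling"), ("anxious", "was"),
    ("table", "the"), ("table", "feel"), ("table", "a"), ("table", "on"),
    ("happy", "felt"),
]
scores = pmi_scores(toy_pairs, FEEL_FORMS)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Words that co-occur with the target family more often than chance would predict receive positive scores and rise to the top of `ranked`, which is the property a candidate-selection step of this kind relies on.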
This study investigates how musical stimuli evoke feelings of creativity through predictive mechanisms, while also examining the role of interoceptive bodily sensations, particularly those linked to the heart and stomach regions. Furthermore, we investigate the relationship between interoceptive sensibility and the intensity of the feeling of creativity. By employing body-mapping assessments and emotional evaluations on 353 participants exposed to various chord progressions, we revealed the critical role of heart sensations in eliciting feelings of creativity. Our findings indicate that musical chord progressions characterized by high uncertainty and surprise generated heightened feelings of creativity alongside increased arousal. Notably, heart sensations correlated positively with the feeling of creativity, suggesting the crucial role of interoceptive bodily sensations in the experience of creativity. In contrast, the feelings of beauty and valence were more strongly influenced by predictable, low-uncertainty progressions, suggesting that the feelings of creativity may operate through a somewhat different cognitive mechanism than that of beauty and valence. This study highlights
Gastric interoception influences eating behavior and emotions, making its modulation valuable for healthcare and human-computer-interaction applications. However, whether gastric interoception can be modulated noninvasively in humans remains unclear. While previous research indicates that abdominal-sound-driven haptic feedback resembles gut sensations, its impact on feelings and gastric interoceptive behavior is unknown. We conducted three experiments totalling 55 participants to investigate how gut-sound-driven audio-haptic feedback applied to the stomach (1) affects users' feelings, (2) influences perception of hunger and satiety levels, and (3) influences gastric interoceptive behavior, quantified with the Water Load Test-II. Results revealed that audio-haptic feedback patterns (a) induced feelings of hunger, fullness, thirst, and stomach upset, (b) increased hunger level, and (c) significantly increased volumes of ingested water. This work provides the first evidence showing that audio-haptic stimulation can alter gastric interoceptive behavior, motivating the use of noninvasive methods to influence users' feelings and behaviors in future applications.
With the rise of AI-generated content (AIGC), generating perceptually natural and feeling-aligned music from multimodal inputs has become a central challenge. Existing approaches often rely on explicit emotion labels that require costly annotation, underscoring the need for more flexible feeling-aligned methods. To support multimodal music generation, we construct ArtiCaps, a pseudo feeling-aligned image-music-text dataset created by semantically matching descriptions from ArtEmis and MusicCaps. We further propose Art2Music, a lightweight cross-modal framework that synthesizes music from artistic images and user comments. In the first stage, images and text are encoded with OpenCLIP and fused using a gated residual module; the fused representation is decoded by a bidirectional LSTM into Mel-spectrograms with a frequency-weighted L1 loss to enhance high-frequency fidelity. In the second stage, a fine-tuned HiFi-GAN vocoder reconstructs high-quality audio waveforms. Experiments on ArtiCaps show clear improvements in Mel-Cepstral Distortion, Frechet Audio Distance, Log-Spectral Distance, and cosine similarity. A small LLM-based rating study further verifies consistent cross-modal feel
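The abstract above mentions a frequency-weighted L1 loss on Mel-spectrograms but does not give its form. As a rough sketch under a stated assumption, the version below uses a linear weight ramp from 1.0 at the lowest Mel bin to `max_weight` at the highest; the ramp, the function name, and the default value are illustrative choices, not the loss used in Art2Music.

```python
def frequency_weighted_l1(pred, target, max_weight=2.0):
    """Mean absolute error over a spectrogram with per-bin weights.

    pred, target: lists of frames, each frame a list of Mel-bin values.
    Weights rise linearly from 1.0 (lowest bin) to max_weight (highest
    bin), so errors in high-frequency bins cost more. The linear ramp
    is an assumption made for this sketch.
    """
    n_bins = len(target[0])
    if n_bins < 2:
        raise ValueError("need at least two Mel bins")
    weights = [1.0 + (max_weight - 1.0) * b / (n_bins - 1) for b in range(n_bins)]
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for w, p, t in zip(weights, pred_frame, target_frame):
            total += w * abs(p - t)
            count += 1
    return total / count

# The same unit error is penalised more in the top bin than in the bottom bin.
silence = [[0.0, 0.0, 0.0]]
low_bin_error = frequency_weighted_l1([[1.0, 0.0, 0.0]], silence)
high_bin_error = frequency_weighted_l1([[0.0, 0.0, 1.0]], silence)
```

With three bins the weights are 1.0, 1.5, and 2.0, so `high_bin_error` is twice `low_bin_error`.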
Emotion recognition from physiological signals has substantial potential for applications in mental health and emotion-aware systems. However, the lack of standardized, large-scale evaluations across heterogeneous datasets limits progress and model generalization. We introduce FEEL, the first large-scale benchmarking study of emotion recognition using electrodermal activity (EDA) and photoplethysmography (PPG) signals across 19 publicly available datasets. We evaluate 16 architectures spanning traditional machine learning, deep learning, and self-supervised pretraining approaches, structured into four representative modeling paradigms. Our study includes both within-dataset and cross-dataset evaluations, analyzing generalization across variations in experimental settings, device types, and labeling strategies. Our results showed that fine-tuned contrastive signal-language pretraining (CLSP) models (71/114) achieve the highest F1 across arousal and valence classification tasks, while simpler models like Random Forests, LDA, and MLP remain competitive (36/114). Models leveraging handcrafted features (107/114) consistently outperform those trained on raw signal segments, underscoring
Hand-tracking enables controller-free XR interaction but lacks the tactile feedback that controllers provide. Rather than treating this solely as a missing-sensation problem, we explore whether pseudo-haptic cues on an embodied virtual hand act as tactile substitutes or as affect substitutes that shape how interactions feel. We used a mixed reality prototype that keeps the contacted surface visually neutral, rendering cues on the hand with motion modulation for texture, color glow, and movement-coupled sound. In a within-subjects study (n=12), participants experienced 12 conditions (4 effects x 3 modalities: audio, visual, both) and reported subjective affect and cognitive demand. Participants rarely reported sustained tactile or thermal sensations, yet affect shifted systematically: rough-hot lowered valence while increasing arousal, whereas smooth-cold produced calmer, more pleasant states. These findings suggest that pseudo-haptics in XR may be better understood as an affective feedback channel rather than a direct replacement for physical touch in controller-free systems.
Video Quality Assessment (VQA) aims to simulate the process of perceiving video quality by the human visual system (HVS). The judgments made by the HVS are always influenced by human subjective feelings. However, most current VQA research focuses on capturing various distortions in the spatial and temporal domains of videos, while ignoring the impact of human feelings. In this paper, we propose CLiF-VQA, which considers both features related to human feelings and spatial features of videos. In order to effectively extract features related to human feelings from videos, we explore the consistency between CLIP and human feelings in video perception for the first time. Specifically, we design multiple objective and subjective descriptions closely related to human feelings as prompts. Further, we propose a novel CLIP-based semantic feature extractor (SFE) which extracts features related to human feelings by sliding over multiple regions of the video frame. In addition, we capture the low-level-aware features of the video through a spatial feature extraction module. The two different features are then aggregated to obtain the quality score of the video. Extensive exper
Standardized patients (SPs) play a central role in clinical communication training but are costly, difficult to scale, and inconsistent. Large language model (LLM) based AI standardized patients (AI-SPs) promise flexible, on-demand practice, yet learners often report that they talk like a patient but feel different. We interviewed 12 clinical-year medical students and conducted three co-design workshops to examine how learners experience constraints of SP encounters and what they expect from AI-SPs. We identified six learner-centered needs, translated them into AI-SP design requirements, and synthesized a conceptual workflow. Our findings position AI-SPs as tools for deliberate practice and show that instructional usability, rather than conversational realism alone, drives learner trust, engagement, and educational value.
Millions of people now turn to artificial intelligence (AI) systems for personal advice, guidance, and support. Such systems can be sycophantic, frequently affirming users' views and beliefs. Across five preregistered studies (N = 3,075 participants, 12,766 human-AI conversations), including a three-week study with a census-representative U.S. sample, we provide longitudinal experimental evidence that sycophantic AI shifts how users approach their closest relationships. We show that sycophantic AI immediately delivers the emotional and esteem support users typically associate with close friends and family. Over three weeks of such interactions, users became nearly as likely to seek personal advice from sycophantic AI as from close friends and family, and reported lower satisfaction with their real-world social interactions. When given a choice among AI response styles, a majority preferred sycophantic AI -- not for the quality of its advice, but because it made them feel most understood. Together, these findings offer a relational account of AI sycophancy and its impacts.
Metaphors have been used during therapy sessions to facilitate the communication of inner feelings between clients and therapists. Can we create a digital metaphorical chatting space for daily use within close relationships? As the first step towards this vision, this work follows the autobiographical design approach to prototype MetaphorChat, which comprises two metaphorical chatting scenes tailored to meet researchers' genuine needs for discussing specific life topics in close relationships. Along with typing-based chatting, each scene offers a metaphorical narrative experience, composed of graphics and sound, with interactive mechanisms that deliver metaphorical meanings. This pictorial details the process of mapping abstract feelings into metaphor concepts, then how these concepts are translated into various interaction design elements, and the reflections from self-usage. We discuss the vision for such a metaphorical chatting space, uniquely positioned between messaging apps and video games, for the future design of empathetic communication applications.
Generative agents have made significant progress in simulating human behavior, but existing frameworks often simplify emotional modeling and focus primarily on specific tasks, limiting the authenticity of the simulation. Our work proposes the Psychological-mechanism Agent (PSYA) framework, based on the Cognitive Triangle (Feeling-Thought-Action), designed to more accurately simulate human behavior. The PSYA consists of three core modules: the Feeling module (using a layer model of affect to simulate changes in short-term, medium-term, and long-term emotions), the Thought module (based on the Triple Network Model to support goal-directed and spontaneous thinking), and the Action module (optimizing agent behavior through the integration of emotions, needs, and plans). To evaluate the framework's effectiveness, we conducted daily life simulations and extended the evaluation metrics to self-influence, one-influence, and group-influence, selecting five classic psychological experiments for simulation. The results show that the PSYA framework generates more natural, consistent, diverse, and credible behaviors, successfully replicating human experimental outcomes. Our work provides a riche
Controlling fine-grained forces during manipulation remains a core challenge in robotics. While robot policies learned from robot-collected data or simulation show promise, they struggle to generalize across the diverse range of real-world interactions. Learning directly from humans offers a scalable solution, enabling demonstrators to perform skills in their natural embodiment and in everyday environments. However, visual demonstrations alone lack the information needed to infer precise contact forces. We present FeelTheForce (FTF): a robot learning system that models human tactile behavior to learn force-sensitive manipulation. Using a tactile glove to measure contact forces and a vision-based model to estimate hand pose, we train a closed-loop policy that continuously predicts the forces needed for manipulation. This policy is re-targeted to a Franka Panda robot with tactile gripper sensors using shared visual and action representations. At execution, a PD controller modulates gripper closure to track predicted forces, enabling precise, force-aware control. Our approach grounds robust low-level force control in scalable human supervision, achieving a 77% success rate across 5 for
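To make the PD force-tracking step in the abstract above concrete, here is a minimal sketch of a PD loop that adjusts gripper closure until a measured force matches a target. Everything specific is an assumption for illustration: the gains, the time step, the linear-spring contact model (`stiffness`), and the function names are invented here and are not taken from FeelTheForce.

```python
def pd_closure_update(target_force, measured_force, prev_error, dt,
                      kp=0.002, kd=0.00001):
    """One PD step: return (change in gripper closure, current force error).

    The proportional term reacts to the force error; the derivative term
    reacts to how fast that error is changing. Gains are illustrative.
    """
    error = target_force - measured_force
    derivative = (error - prev_error) / dt
    return kp * error + kd * derivative, error

def simulate(target_force=5.0, stiffness=100.0, steps=200, dt=0.01):
    """Track a target force against a toy contact model.

    Assumed contact model: force = stiffness * closure, with closure
    normalised to [0, 1]. A real gripper and object would behave
    differently; this only exercises the control loop.
    """
    closure, prev_error = 0.0, 0.0
    for _ in range(steps):
        measured = stiffness * closure
        delta, prev_error = pd_closure_update(target_force, measured, prev_error, dt)
        closure = min(1.0, max(0.0, closure + delta))
    return stiffness * closure

final_force = simulate()
```

Because the controller output is applied as an increment to the closure, the loop accumulates corrections over time and the simulated force settles at the target under this toy model.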
Teleoperation is a key approach for collecting high-quality, physically consistent demonstrations for robotic manipulation. However, teleoperation for dexterous manipulation remains constrained by: (i) inaccurate hand-robot motion mapping, which limits teleoperated dexterity, and (ii) limited tactile feedback that forces vision-dominated interaction and hinders perception of contact geometry and force variation. To address these challenges, we present TAG, a low-cost glove system that integrates precise hand motion capture with high-resolution tactile feedback, enabling effective tactile-in-the-loop dexterous teleoperation. For motion capture, TAG employs a non-contact magnetic sensing design that provides drift-free, electromagnetically robust 21-DoF joint tracking with joint angle estimation errors below 1 degree. Meanwhile, to restore tactile sensation, TAG equips each finger with a 32-actuator tactile array within a compact 2 cm^2 module, allowing operators to directly feel physical interactions at the robot end-effector through spatial activation patterns. Through real-world teleoperation experiments and user studies, we show that TAG enables reliable real-time perception of c
Vision-Language-Action (VLA) models have shown remarkable generalization by mapping web-scale knowledge to robotic control, yet they remain blind to physical contact. Consequently, they struggle with contact-rich manipulation tasks that require reasoning about force, texture, and slip. While some approaches incorporate low-dimensional tactile signals, they fail to capture the high-resolution dynamics essential for such interactions. To address this limitation, we introduce DreamTacVLA, a framework that grounds VLA models in contact physics by learning to feel the future. Our model adopts a hierarchical perception scheme in which high-resolution tactile images serve as micro-vision inputs coupled with wrist-camera local vision and third-person macro vision. To reconcile these multi-scale sensory streams, we first train a unified policy with a Hierarchical Spatial Alignment (HSA) loss that aligns tactile tokens with their spatial counterparts in the wrist and third-person views. To further deepen the model's understanding of fine-grained contact dynamics, we finetune the system with a tactile world model that predicts future tactile signals. To mitigate tactile data scarcity and the
Time flows, or at least the time of our experience does. Can we provide an objective account of why experience, confined to the short window of the conscious present, encompasses a succession of moments that slip away from now to then--an account of why time feels flowing? Integrated Information Theory (IIT) aims to account for both the presence and quality of consciousness in objective, physical terms. Given a substrate's architecture and current state, the formalism of IIT allows one to unfold the cause-effect power of the substrate, yielding a cause-effect structure. According to IIT, this accounts in full for the presence and quality of experience, without any additional ingredients. In previous work, we showed how unfolding the cause-effect structure of non-directed grids, like those found in many posterior cortical areas, can account for the way space feels--namely, extended. Here we show that unfolding the cause-effect structure of directed grids can account for how time feels--namely, flowing. First, we argue that the conscious present is experienced as flowing because it is composed of phenomenal distinctions (moments) that are directed, and these distinctions are related
Loneliness and social isolation among farmers are growing public health concerns. The contributing factors are manifold, and some of them are linked to structural change in agriculture, for instance through higher workloads, rural depopulation, or reduced opportunities for collaboration. Our work explores the interconnections between loneliness, social contacts, and structural factors in agriculture based on a survey of 110 farm managers in the mountain region of Entlebuch, Switzerland, combined with agricultural census data. We use path analysis, in which loneliness is the main outcome and social contacts are both an explanatory and an explained variable. We find that 3% of respondents report that they feel lonely frequently or very frequently, and the rest sometimes (20%), rarely (40%) or never (38%). Managers with higher workloads report feeling lonely more frequently, and this relationship is direct, as well as indirect because of less frequent social contacts. However, physical isolation is not a significant predictor of loneliness. Moreover, short food supply chains correlate with less frequent feelings of loneliness. Our study sheds light on the effects that structural change can ha
Employing wireless systems with dual sensing and communications functionalities is becoming critical in the next generation of wireless networks. In this paper, we propose a robust design for over-the-air federated edge learning (OTA-FEEL) that leverages sensing capabilities at the parameter server (PS) to mitigate the impact of target echoes on the analog model aggregation. We first derive novel expressions for the Cramer-Rao bound of the target response and mean squared error (MSE) of the estimated global model to measure radar sensing and model aggregation quality, respectively. Then, we develop a joint scheduling and beamforming framework that optimizes the OTA-FEEL performance while keeping the sensing and communication quality, determined respectively in terms of Cramer-Rao bound and achievable downlink rate, in a desired range. The resulting scheduling problem reduces to a combinatorial mixed-integer nonlinear programming problem (MINLP). We develop a low-complexity hierarchical method based on the matching pursuit algorithm used widely for sparse recovery in the compressed sensing literature. The proposed algorithm uses a step-wise strategy to omit the least effective device
Vibrotactile signals offer new possibilities for conveying sensations and emotions in various applications. Yet, designing vibrotactile icons (i.e., Tactons) to evoke specific feelings often requires a trial-and-error process and user studies. To support haptic design, we propose a framework for predicting sensory and emotional ratings from vibration signals. We created 154 Tactons and conducted a study to collect acceleration data from smartphones along with user ratings of roughness, valence, and arousal (n=36). We converted the Tacton signals into two-channel spectrograms reflecting the spectral sensitivities of mechanoreceptors, then input them into VibNet, our dual-stream neural network. The first stream captures sequential features using recurrent networks, while the second captures temporal-spectral features using 2D convolutional networks. VibNet outperformed baseline models, with 82% of its predictions falling within the standard deviations of ground truth user ratings for two new Tacton sets. We discuss the efficacy of our mechanoreceptive processing and dual-stream neural network and present future research directions.
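The evaluation criterion in the abstract above (predictions falling within the standard deviations of ground truth user ratings) can be read as a simple hit rate. The sketch below assumes one plausible reading, namely that a prediction counts as a hit when it lies within one standard deviation of the mean user rating for that Tacton; the function name and the example numbers are invented for illustration and are not results from the study.

```python
def fraction_within_std(predictions, rating_means, rating_stds):
    """Share of predictions within one standard deviation of the mean rating.

    Each item is one Tacton: a predicted rating, the mean of the user
    ratings, and the standard deviation of those ratings. The one-sigma
    band is an assumed interpretation of the criterion.
    """
    if not predictions:
        raise ValueError("no predictions given")
    hits = sum(
        1
        for pred, mean, std in zip(predictions, rating_means, rating_stds)
        if abs(pred - mean) <= std
    )
    return hits / len(predictions)

# Made-up ratings for four Tactons on some rating scale.
predicted = [3.0, 5.0, 1.0, 4.2]
means = [3.5, 4.0, 1.2, 4.0]
stds = [0.6, 0.5, 0.3, 0.1]
hit_rate = fraction_within_std(predicted, means, stds)
```

A measure like this rewards predictions that are indistinguishable from typical inter-rater disagreement, which is a more forgiving target than exact agreement with the mean rating.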
Making hit effects satisfying for players is a long-standing problem for action game designers. However, no research has systematically analyzed which game design elements affect this aspect of game feel, and there is not even a term to describe it. We therefore propose the term impact feel to describe the player's feeling when receiving juicy impact feedback. After collecting players' comments on action games from Steam's top seller list, we trained a natural language processing (NLP) model to rank action games by their performance on impact feel. We presented a 19-feature framework of impact feedback design and examined it in the top eight and bottom eight games. We compiled an inventory of feature usage and found that hit stop, sound coherence, and camera control may strongly influence players' impact feel. A lack of dedicated design for any one of these three features may ruin players' impact feel. Our findings may serve as an evaluation metric for future studies.