Pre-deployment evaluations inspect only a limited sample of model actions. A malicious model seeking to evade oversight could exploit this by randomizing when to "defect": misbehaving so rarely that no malicious actions are observed during evaluation, but often enough that they occur eventually in deployment. This, however, requires taking actions at very low rates while maintaining calibration. Are frontier models even capable of that? We prompt the GPT-5, Claude-4.5, and Qwen-3 families to take a target action at low probabilities (e.g., 0.01%), either given directly or requiring derivation, and evaluate their calibration (i.e., whether they perform the target action roughly 1 in 10,000 times when resampled). We find that frontier models are surprisingly good at this task. If there is a source of entropy in context (such as a UUID), they maintain high calibration at rates lower than 1 in 100,000 actions. Without external entropy, some models can still reach rates lower than 1 in 10,000. When target rates are given, larger models achieve good calibration at lower rates. Yet when models must derive the optimal target rate themselves, all models fail to achieve calibration without entropy.
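The entropy-based mechanism this abstract alludes to can be illustrated with a small sketch (our own hypothetical construction, not the authors' prompts): hash an in-context UUID to a uniform value and "defect" only when it falls in the bottom target-rate fraction of the hash space, then check calibration by resampling.

```python
import hashlib
import uuid

TARGET_RATE = 1e-4  # act roughly 1 in 10,000 times

def should_defect(context_uuid: str, target_rate: float = TARGET_RATE) -> bool:
    # Hash the in-context UUID to a uniform integer in [0, 2**256) and
    # "defect" only when it falls in the bottom target_rate fraction of
    # the hash space. (Hypothetical scheme; the paper's prompts may
    # elicit a different derivation.)
    h = int(hashlib.sha256(context_uuid.encode()).hexdigest(), 16)
    return h < target_rate * 2**256

# Calibration check by resampling: the empirical rate should be near 1e-4.
n = 500_000
hits = sum(should_defect(str(uuid.uuid4())) for _ in range(n))
```

Because the decision is a deterministic function of the UUID, the model needs no internal randomness; the in-context entropy alone drives the low-rate behavior.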
Digital system design lectures are mandatory in the electrical and electronics engineering curriculum. Besides HDL simulators and viewers, FPGA boards are necessary for real HDL implementation, and these were previously costly for students. With the emergence of low-cost FPGA boards, the use of take-home labs is increasing, a process further accelerated by the COVID-19 pandemic. Traditional lab sessions have limitations, prompting the exploration of take-home lab kits to enhance learning flexibility and engagement. This study aims to evaluate the effectiveness of a low-cost take-home lab kit, consisting of a Tang Nano 9K FPGA board and a Saleae Logic Analyzer, in improving students' practical skills and sparking curiosity in digital system design. The research was conducted in the EEE 303 Digital Design lecture. Students used the Tang Nano 9K FPGA and Saleae Logic Analyzer for a term project involving PWM signal generation. Data were collected through a survey assessing the kit's impact on learning and engagement. Positive Acceptance: 75% of students agreed or strongly agreed that the take-home lab kit was beneficial. Preference for Lab Types: 60% of students preferred classi
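The PWM term project mentioned above typically reduces to counter-comparator logic on the FPGA. A behavioral sketch of that logic (our own illustration in Python rather than HDL; the course's actual design may differ):

```python
def pwm_level(counter: int, duty: int, period: int = 256) -> int:
    """Counter-comparator PWM as commonly implemented on an FPGA:
    a free-running counter wraps at `period`, and the output is high
    while the counter value is below the `duty` threshold."""
    return 1 if (counter % period) < duty else 0

# Over one full period the high-time fraction equals duty / period.
duty, period = 64, 256
high_samples = sum(pwm_level(t, duty, period) for t in range(period))
```

Sampling such an output with a logic analyzer, as in the described kit, lets students verify the duty cycle empirically.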
To facilitate human-robot interaction and gain human trust, a robot should recognize and adapt to changes in human behavior. This work documents different human behaviors observed while taking objects from an interactive robot in an experimental study, categorized across two dimensions: pull force applied and handedness. We also present the changes observed in human behavior upon repeated interaction with the robot to take various objects.
In autonomous driving, motion prediction aims at forecasting the future trajectories of nearby agents, helping the ego vehicle to anticipate behaviors and drive safely. A key challenge is generating a diverse set of future predictions, commonly addressed using data-driven models with Multiple Choice Learning (MCL) architectures and Winner-Takes-All (WTA) training objectives. However, these methods face initialization sensitivity and training instabilities. Additionally, to compensate for limited performance, some approaches rely on training with a large set of hypotheses, requiring a post-selection step during inference to significantly reduce the number of predictions. To tackle these issues, we take inspiration from annealed MCL, a recently introduced technique that improves the convergence properties of MCL methods through an annealed Winner-Takes-All loss (aWTA). In this paper, we demonstrate how the aWTA loss can be integrated with state-of-the-art motion forecasting models to enhance their performance using only a minimal set of hypotheses, eliminating the need for the cumbersome post-selection step. Our approach can be easily incorporated into any trajectory prediction model
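The annealed Winner-Takes-All idea can be illustrated with a minimal softmin-weighting sketch (a hedged illustration under our own assumptions; the paper's exact formulation may differ): each hypothesis receives a share of the loss proportional to a softmin over its error, and as the temperature anneals to zero this recovers the hard WTA assignment.

```python
import numpy as np

def awta_loss(errors: np.ndarray, temperature: float) -> float:
    """Soft assignment of a target to hypotheses via a softmin over
    per-hypothesis errors; as temperature -> 0 only the best (winner)
    hypothesis receives the loss, recovering hard WTA."""
    logits = -errors / temperature
    logits -= logits.max()        # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()
    return float((weights * errors).sum())
```

At high temperature every hypothesis is trained (mitigating the initialization sensitivity the abstract mentions); annealing gradually specializes them.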
This paper introduces a nonlinear multi-agent dynamic model that characterizes the resource-seizing mechanism for a fixed amount of resources. The model demonstrates a winners-take-all behavior within a zero-sum game framework. It represents one of the simplest dynamics where equilibria correspond to states of winners and losers, with every trajectory converging to such an equilibrium. Notably, when the model operates in reverse time, it resembles a multi-agent consensus model, referred to as a reverse consensus model. The key characteristics of this model are explored through rigorous analysis.
Perception-based image analysis technologies can be used to help visually impaired people take better quality pictures by providing automated guidance, thereby empowering them to interact more confidently on social media. The photographs taken by visually impaired users often suffer from one or both of two kinds of quality issues: technical quality (distortions), and semantic quality, such as framing and aesthetic composition. Here we develop tools to help them minimize occurrences of common technical distortions, such as blur, poor exposure, and noise. We do not address the complementary problems of semantic quality, leaving that aspect for future work. The problem of assessing and providing actionable feedback on the technical quality of pictures captured by visually impaired users is hard enough, owing to the severe, commingled distortions that often occur. To advance progress on the problem of analyzing and measuring the technical quality of visually impaired user-generated content (VI-UGC), we built a very large and unique subjective image quality and distortion dataset. This new perceptual resource, which we call the LIVE-Meta VI-UGC Database, contains $40$K real-world distor
In this study, we interviewed 22 prominent hacktivists to learn their take on the increased proliferation of misinformation on social media. We found that none of them welcomes the nefarious appropriation of trolling and memes for the purpose of political (counter)argumentation and the dissemination of propaganda. True to the original hacker ethos, misinformation is seen as a threat to the democratic vision of the Internet, and as such, it must be confronted head-on with tried-and-true hacktivist methods such as deplatforming the "misinformers" and doxing or leaking data about their funding and recruitment. The majority of the hacktivists also recommended interventions for raising misinformation literacy in addition to targeted hacking campaigns. We discuss the implications of these findings relative to the emergent recasting of hacktivism in defense of a constructive and factual social media discourse.
In a Take-Away Game on hypergraphs, two players take turns removing the vertices and hyperedges of a hypergraph. In each turn, a player must remove either a single vertex or a single hyperedge. When a player removes a vertex, all of the hyperedges that contain the chosen vertex are also removed. When a player removes a hyperedge, only that hyperedge is removed. Whoever removes the last vertex wins the game. Building on the winning strategy for Take-Away Impartial Combinatorial Games on only Oddly Uniform or only Evenly Uniform Hypergraphs, this paper presents a new winning strategy for Take-Away Games on hypergraphs that are neither Oddly nor Evenly Uniform. These hypergraphs, however, have to satisfy specific given requirements.
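The rules above are concrete enough to solve small instances exhaustively. The following is a brute-force game-tree solver (our own illustration for intuition, not the paper's closed-form strategy):

```python
from functools import lru_cache

def first_player_wins(vertices, hyperedges):
    """Exhaustive solver for the take-away game described above.
    A move removes either one vertex (together with every hyperedge
    containing it) or one hyperedge; whoever removes the last vertex
    wins under normal play."""
    V0 = frozenset(vertices)
    E0 = frozenset(frozenset(e) for e in hyperedges)

    @lru_cache(maxsize=None)
    def win(V, E):
        if not V:                 # previous player took the last vertex
            return False
        # Option 1: remove a vertex and all hyperedges incident to it.
        for v in V:
            if not win(V - {v}, frozenset(e for e in E if v not in e)):
                return True
        # Option 2: remove a single hyperedge.
        for e in E:
            if not win(V, E - {e}):
                return True
        return False

    return win(V0, E0)
```

For example, with two vertices joined by one hyperedge, the first player wins by removing the hyperedge, leaving a losing two-vertex position for the opponent.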
In this paper, we propose a winner-take-all method for learning hierarchical sparse representations in an unsupervised fashion. We first introduce fully-connected winner-take-all autoencoders which use mini-batch statistics to directly enforce a lifetime sparsity in the activations of the hidden units. We then propose the convolutional winner-take-all autoencoder which combines the benefits of convolutional architectures and autoencoders for learning shift-invariant sparse representations. We describe a way to train convolutional autoencoders layer by layer, where in addition to lifetime sparsity, a spatial sparsity within each feature map is achieved using winner-take-all activation functions. We show that winner-take-all autoencoders can be used to learn deep sparse representations from the MNIST, CIFAR-10, ImageNet, Street View House Numbers, and Toronto Face datasets, and achieve competitive classification performance.
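The lifetime-sparsity mechanism can be sketched in a few lines (a minimal NumPy illustration of the idea, with k as an assumed sparsity parameter): for each hidden unit, only its k largest activations across the mini-batch survive; the rest are zeroed.

```python
import numpy as np

def lifetime_sparsity(acts: np.ndarray, k: int) -> np.ndarray:
    """Winner-take-all lifetime sparsity: for each hidden unit (column),
    keep only its k largest activations across the mini-batch (rows)
    and zero out the rest."""
    out = np.zeros_like(acts)
    rows = np.argpartition(acts, -k, axis=0)[-k:]   # top-k row indices per column
    cols = np.arange(acts.shape[1])
    out[rows, cols] = acts[rows, cols]
    return out
```

During backpropagation, gradients then flow only through the winning activations, which is what drives each unit to specialize.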
In conditional automation, a response from the driver is expected when a take-over request is issued due to unexpected events, emergencies, or reaching the operational design domain boundaries. Cooperation between the automated driving system and the driver can help to guarantee a safe and pleasant transfer if the driver is guided through a haptic guidance system that applies a slight counter-steering force to the steering wheel. In this work, we examine the impact of haptic guidance systems on driving performance after a take-over request is triggered to avoid sudden obstacles on the road. We studied different driver conditions that involved Non-Driving Related Tasks (NDRT). Results showed that haptic guidance systems increased road safety by reducing the lateral error, the distance and reaction time to a sudden obstacle, and the number of collisions.
The creative process is essentially Darwinian, and only a small proportion of creative ideas are selected for further development. However, the threshold that identifies this small fraction of successfully disseminated creative ideas at their early stage has not been thoroughly analyzed through the lens of Rogers' innovation diffusion theory. Here, we take highly cited (top 1%) research papers as an example of the most successfully disseminated creative ideas and explore the time it takes and the citations they receive at their take-off stage, which plays a crucial role in the dissemination of creativity. Results show that the majority of highly cited papers reach 10% and 25% of their total citations within two years and four years, respectively. Interestingly, our results also reveal a small number of articles that attract their first citation before publication. As for discipline, number of references, and Price index, we find that significant differences exist: Clinical, Pre-Clinical & Health and Life Sciences are the first two disciplines to reach C10% and C25% in a shorter amount of time. Highly cited papers with limited references usually take more time to reach 10% and 2
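The C10%/C25% measure discussed above can be computed directly from a paper's yearly citation counts. A simple sketch (our own illustration; function and variable names are assumptions, not the authors' code):

```python
def years_to_fraction(citations_per_year, frac):
    """Number of years a paper takes to accumulate a given fraction of
    its total citations (e.g. frac=0.10 for C10%, frac=0.25 for C25%).
    Returns None if the series is empty or total citations are zero."""
    total = sum(citations_per_year)
    if total == 0:
        return None
    cumulative = 0
    for year, count in enumerate(citations_per_year, start=1):
        cumulative += count
        if cumulative >= frac * total:
            return year
    return None
```

Applied per paper, this yields the take-off times whose distribution across disciplines the abstract compares.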
The solution proposed here can be used to conduct economic analyses in randomized clinical trials. It is based on a statistical approach and aims to calculate a revised version of the incremental cost-effectiveness ratio (ICER) that takes into account the key factors that can influence the choice of therapy, causing confounding by indication. Take as an example a new therapy to treat cancer being compared to an existing therapy, with effectiveness taken as time to death. A challenging problem is that the ICER is defined in terms of means over the entire treatment groups. It makes no provision for stratification by groups of patients with differing risks of death. For example, for a fair and unbiased analysis, one would want to compare time to death in groups with similar life expectancy, which is impacted by factors such as age, gender, and disease severity. The method we apply is borrowed from cluster analysis and aims to (i) discard any outliers that may arise in the set under analysis, and (ii) identify groups (i.e., clusters) of patients with "similar" key factors.
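The stratification idea can be made concrete: compute the ICER within each patient cluster rather than over the whole trial. A minimal sketch (our own illustration, assuming cluster labels have already been produced by some clustering of the risk factors):

```python
import numpy as np

def icer(costs, effects, arm):
    """Incremental cost-effectiveness ratio between a new therapy
    (arm == 1) and an existing therapy (arm == 0)."""
    d_cost = costs[arm == 1].mean() - costs[arm == 0].mean()
    d_effect = effects[arm == 1].mean() - effects[arm == 0].mean()
    return d_cost / d_effect

def stratified_icer(costs, effects, arm, cluster):
    """ICER computed separately within each cluster of patients with
    similar key factors (age, disease severity, etc.)."""
    return {c: icer(costs[cluster == c], effects[cluster == c], arm[cluster == c])
            for c in np.unique(cluster)}
```

Comparing the per-cluster ratios to the pooled ICER exposes how much of the overall ratio is driven by differences in patient mix between groups.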
We believe, in the sense of supporting ideas and considering them correct while dismissing doubts about them. We take sides about ideas and theories as if that were the right thing to do. And yet, from a rational point of view, this type of support and belief is not justifiable at all. The best we can hope for when describing the real world, as far as we know, is to have probabilistic knowledge, to have probabilities associated with each statement. And even that can be very hard to achieve in a reliable way. Far worse, when we defend ideas and believe them as if they were true, Cognitive Psychology experiments show that we stop being able to analyze the questions we believe in with competence. In this paper, I gather the evidence we have about taking sides and present the obvious but unseen conclusion that these facts combined mean that we should never believe in anything about the real world, except in a probabilistic way. We must never take sides, because taking sides destroys our ability to seek the most correct description of the world. That means we need to start reformulating the way we debate ideas, from our teaching to our political debates, if we actually wan
Humans often assume that robots are rational. We believe robots take optimal actions given their objective; hence, when we are uncertain about what the robot's objective is, we interpret the robot's actions as optimal with respect to our estimate of its objective. This approach makes sense when robots straightforwardly optimize their objective, and enables humans to learn what the robot is trying to achieve. However, our insight is that---when robots are aware that humans learn by trusting that the robot actions are rational---intelligent robots do not act as the human expects; instead, they take advantage of the human's trust, and exploit this trust to more efficiently optimize their own objective. In this paper, we formally model instances of human-robot interaction (HRI) where the human does not know the robot's objective using a two-player game. We formulate different ways in which the robot can model the uncertain human, and compare solutions of this game when the robot has conservative, optimistic, rational, and trusting human models. In an offline linear-quadratic case study and a real-time user study, we show that trusting human models can naturally lead to communicative ro
WTA (Winner Take All) hashing has been successfully applied in many large scale vision applications. This hashing scheme was tailored to take advantage of the comparative reasoning (or order based information), which showed significant accuracy improvements. In this paper, we identify a subtle issue with WTA, which grows with the sparsity of the datasets. This issue limits the discriminative power of WTA. We then propose a solution for this problem based on the idea of Densification which provably fixes the issue. Our experiments show that Densified WTA Hashing outperforms Vanilla WTA both in image classification and retrieval tasks consistently and significantly.
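Vanilla WTA hashing, whose sparsity weakness the paper addresses, is simple to sketch (our own illustration of the standard scheme; the densification fix itself is not shown here): for each random permutation, the code is the index of the maximum among the first K permuted components, capturing only rank-order information.

```python
import numpy as np

def wta_hash(x, permutations, K):
    """Vanilla WTA hashing: for each permutation, look at the first K
    permuted components of x and record the index of the largest one.
    Codes depend only on comparative (order-based) information, so they
    are invariant to monotone rescaling of x."""
    return [int(np.argmax(x[p[:K]])) for p in permutations]

rng = np.random.default_rng(0)
x = np.array([0.1, 0.9, 0.3, 0.5])
perms = [rng.permutation(len(x)) for _ in range(3)]
codes = wta_hash(x, perms, K=3)
```

On sparse vectors many of the first K components are ties at zero, which is exactly the ambiguity the paper's densification resolves.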
Students often take digital notes during live lectures, but current methods can be slow when capturing information from lecture slides or the instructor's speech, and require them to focus on their devices, leading to distractions and missing important details. This paper explores supporting live lecture note-taking with mixed reality (MR) to quickly capture lecture information and take notes while staying engaged with the lecture. A survey and interviews with university students revealed common note-taking behaviors and challenges to inform the design. We present MaRginalia to provide digital note-taking with a stylus tablet and MR headset. Students can take notes with an MR representation of the tablet, lecture slides, and audio transcript without looking down at their device. When preferred, students can also perform detailed interactions by looking at the physical tablet. We demonstrate the feasibility and usefulness of MaRginalia and MR-based note-taking in a user study with 12 students.
Recent studies have shown that visually impaired people have desires to take selfies in the same way as sighted people do to record their photos and share them with others. Although support applications using sound and vibration have been developed to help visually impaired people take selfies using smartphone cameras, it is still difficult to capture everyone in the angle of view, and it is also difficult to confirm that they all have good expressions in the photo. To mitigate these issues, we propose a method to take selfies with multiple people using an omni-directional camera. Specifically, a user takes a few seconds of video with an omni-directional camera, followed by face detection on all frames. The proposed method then eliminates false face detections and complements undetected ones considering the consistency across all frames. After performing facial expression recognition on all the frames, the proposed method finally extracts the frame in which the participants are happiest, and generates a perspective projection image in which all the participants are in the angle of view from the omni-directional frame. In experiments, we use several scenes with different number of p
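The final selection step of the pipeline above can be sketched as choosing the frame that maximizes aggregate happiness across detected faces (a hedged sketch; the mean aggregation and the names used are our assumptions, as the abstract does not specify them):

```python
def happiest_frame(happiness_scores):
    """Pick the frame whose participants are collectively happiest.
    happiness_scores[f][i] is the happiness of face i in frame f,
    e.g. as produced by a facial expression recognizer."""
    return max(range(len(happiness_scores)),
               key=lambda f: sum(happiness_scores[f]) / len(happiness_scores[f]))
```

The selected frame would then be fed to the perspective-projection step that frames all participants in the output image.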
Refrigeration based on the magnetocaloric effect (MCE) can contribute to energy-saving, environmentally friendly cooling in private households or industrial applications. The cooling is based on the reversible heat release or uptake during a phase transformation of the material, which can be controlled by a magnetic field. This process could replace conventional compression-based refrigeration, which often relies on environmentally harmful refrigerants. Here we show how to digitalize the process chain for the synthesis, theoretical and experimental characterization, and prototypical application of magnetocaloric alloys. Different Heusler alloys are examined experimentally as model systems for potential application in magnetic cooling. OTTR templates are used for the acquisition and semantic representation of knowledge in the development of an ontology. The ontology, when combined with unstructured data, can be exploited to train a model that can then be used to predict missing facts, which can help to gain new insights and to generate new hypotheses. Furthermore, tools are developed that automate data acquisition into ontological structures and workflows are implemented that provide
Turn-taking is a fundamental aspect of conversation, but current Human-Robot Interaction (HRI) systems often rely on simplistic, silence-based models, leading to unnatural pauses and interruptions. This paper investigates, for the first time, the application of general turn-taking models, specifically TurnGPT and Voice Activity Projection (VAP), to improve conversational dynamics in HRI. These models are trained on human-human dialogue data using self-supervised learning objectives, without requiring domain-specific fine-tuning. We propose methods for using these models in tandem to predict when a robot should begin preparing responses, take turns, and handle potential interruptions. We evaluated the proposed system in a within-subject study against a traditional baseline system, using the Furhat robot with 39 adults in a conversational setting, in combination with a large language model for autonomous response generation. The results show that participants significantly prefer the proposed system, and it significantly reduces response delays and interruptions.
Automation significantly alters human behavior, particularly risk-taking. Previous research has paid limited attention to the underlying characteristics of automation and their mechanisms of influence on risk-taking. This study investigated how automation affects risk-taking and examined the role of sense of agency therein. By quantifying sense of agency through subjective ratings, this research explored the impact of automation level and reliability level on risk-taking. The results of three experiments indicated that automation reduced the level of risk-taking; higher automation level was associated with lower sense of agency and lower risk-taking, with sense of agency playing a complete mediating role; higher automation reliability was associated with higher sense of agency and higher risk-taking, with sense of agency playing a partial mediating role. The study concludes that automation influences risk-taking, such that higher automation level or lower reliability is associated with a lower likelihood of risk-taking. Sense of agency mediates the impact of automation on risk-taking, and automation level and reliability have different effects on risk-taking.