This paper investigates the problem of leadership development for an external influencer using the Friedkin-Johnsen (FJ) opinion dynamics model, where the influencer is modeled as a fully stubborn agent and leadership is quantified by social power. The influencer seeks to maximize her social power by strategically adding a limited number of links to regular agents. This optimization problem is shown to be equivalent to maximizing the absorbing probability to the influencer in an augmented Markov chain. The resulting objective function is both monotone and submodular, enabling the use of a greedy algorithm to compute an approximate solution. To handle large-scale networks efficiently, a random walk sampling over the Markov chain is employed to reduce computational complexity. Analytical characterizations of the solution are provided for both low and high stubbornness of regular agents. Specific network topologies are also examined: for complete graphs with rank-one weight matrices, the problem reduces to a hyperbolic 0-1 programmming problem, which is solvable in polynomial time; for symmetric ring graphs with circulant weight matrices and uniform agent stubbornness, the optimal str
Information-seeking agents have emerged as a powerful paradigm for solving knowledge-intensive tasks. Existing information-seeking agents are typically specialized for open web, documents, or local knowledge bases, which constrains scalability and cross-domain generalization. In this work, we investigate how to consolidate heterogeneous information-seeking agents into a single foundation agentic model. We study two complementary consolidation strategies: data-level consolidation, which jointly trains a unified model on a mixture of domain-specific datasets, and parameter-level consolidation, which merges independently trained agent models at the parameter level. Our analysis compares these approaches in terms of performance retention, cross-domain generalization, and interference across information-seeking behaviors. Our results show that data-level consolidation remains a strong and stable baseline, while parameter-level consolidation offers a promising, efficient alternative but suffers from interference and robustness challenges. We further identify key design factors for effective agent consolidation at the parameter level, including fine-grained merging granularity, awareness
This paper proposes a discrete-time event-triggered extremum seeking control scheme for real-time optimization of nonlinear systems. Unlike conventional discrete-time implementations relying on periodic updates, the proposed approach updates the control input only when a state-dependent triggering condition is satisfied, reducing unnecessary actuation and communication. The resulting closed-loop system combines extremum seeking with an event-triggering mechanism that adaptively determines the input update instants. Using discrete-time averaging and Lyapunov analysis, we establish practical convergence of the trajectories to a neighborhood of the unknown extremum point and show exponential stability of the associated average dynamics. The proposed method preserves the optimization capability of classical extremum seeking while significantly reducing the number of input updates. Simulation results illustrate the effectiveness of the approach for resource-aware real-time optimization.
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. I show how the argument from power-seeking rests on a strong version of a claim known as the instrumental convergence thesis. I explore leading defenses of the instrumental convergence thesis and argue that none establishes the thesis in a strong enough form to ground the argument from power-seeking. I discuss implications for longtermism, the governance of artificial intelligence, and the methodology of studying risks posed by artificial agents.
Scaling video generation from seconds to minutes faces a critical bottleneck: while short-video data is abundant and high-fidelity, coherent long-form data is scarce and limited to narrow domains. To address this, we propose a training paradigm where Mode Seeking meets Mean Seeking, decoupling local fidelity from long-term coherence based on a unified representation via a Decoupled Diffusion Transformer. Our approach utilizes a global Flow Matching head trained via supervised learning on long videos to capture narrative structure, while simultaneously employing a local Distribution Matching head that aligns sliding windows to a frozen short-video teacher via a mode-seeking reverse-KL divergence. This strategy enables the synthesis of minute-scale videos that learns long-range coherence and motions from limited long videos via supervised flow matching, while inheriting local realism by aligning every sliding-window segment of the student to a frozen short-video teacher, resulting in a few-step fast long video generator. Evaluations show that our method effectively closes the fidelity-horizon gap by jointly improving local sharpness, motion and long-range consistency. Project website
Generative AI (GenAI) tools offer increasing opportunities for augmenting human cognitive tasks. Among these tasks, information seeking is being rapidly reshaped by GenAI tools, with potentially profound implications for learning and knowledge acquisition. To investigate these implications, we conducted a between-subjects field experiment in which participants pursued informal learning by seeking information through either ChatGPT or Google Search over a span of 8 days. Using a daily diary protocol, we gathered in-situ data on their information-seeking processes. Our findings show that participants in the ChatGPT group experienced diminished agency in their information-seeking processes, as they offloaded much of the information selection to AI, and consequently experienced greater meta-cognitive load arising from this reduced sense of control. We further highlight two sources of distortion in information access when using ChatGPT: biases in ChatGPT outputs, particularly towards providing solution-oriented artifacts over principled knowledge; and systematic shifts in users' information-seeking behaviors, whereby the conversational and socially-oriented interaction paradigm of curre
The use of mobile robotics in radioactive source seeking has become an important part of modern radiation-safety practices, supporting timely mitigation of contamination risks and helping protect public health. However, measuring radiation is often time-consuming, rendering traditional gradient-based source-seeking methods less effective due to lower sample efficiency. This paper proposes a sample-efficient Bayesian-Optimisation source-seeking strategy that utilises a heteroscedastic Gaussian process surrogate to balance exploration and exploitation. Excessive inter-sample travel is discouraged through a movement switching cost. The strategy is shown to generate sublinear regret in the source-seeking task, while simulations demonstrate its effectiveness in localising radioactive sources.
This white paper was submitted to NASA's Search For Life Science Analysis Group (SFL-SAG): To ensure that the first mission designed to seek signs of extant life since 1976 is able to produce an unambiguous biological interpretation, the SFL-SAG is tasked with identifying the most high-confidence, agnostic biosignatures which are targetable, detectable, and measurable in Martian subsurface mid-latitude ice. To aid in this effort, this white paper highlights three examples of target materials or phenomena, along with associated instrument concepts, which the SFL-SAG shall prioritize in its efforts to define the appropriate astrobiological strategy. These include 1) polyelectrolyte informational biopolymers, 2) macromolecular biological homochirality, and 3) chiral-specific metabolic reactions. The Agnostic Life Finding Association (ALFA) and University of Florida (UF) support the development of instrumentation that seeks these high-confidence biosignatures.
Addressing intricate real-world problems necessitates in-depth information seeking and multi-step reasoning. Recent progress in agentic systems, exemplified by Deep Research, underscores the potential for autonomous multi-step research. In this work, we present a cohesive paradigm for building end-to-end agentic information seeking agents from a data-centric and training-stage perspective. Our approach consists of four key stages: (1) browsing data construction, (2) trajectories sampling, (3) supervised fine-tuning for effective cold start, and (4) reinforcement learning for enhanced generalisation. We instantiate this framework in a web agent based on the ReAct, WebDancer. Empirical evaluations on the challenging information seeking benchmarks, GAIA and WebWalkerQA, demonstrate the strong performance of WebDancer, achieving considerable results and highlighting the efficacy of our training paradigm. Further analysis of agent training provides valuable insights and actionable, systematic pathways for developing more capable agentic models. The codes and demo will be released in https://github.com/Alibaba-NLP/WebAgent.
Schelling games use a game-theoretic approach to study the phenomenon of residential segregation as originally modeled by Schelling. Inspired by the recent increase in the number of people and businesses preferring and promoting diversity, we propose swap games under three diversity-seeking utility functions: the binary utility of an agent is 1 if it has a neighbor of a different type, and 0 otherwise; the difference-seeking utility of an agent is equal to the number of its neighbors of a different type; the variety-seeking utility of an agent is equal to the number of types different from its own in its neighborhood. We consider four global measures of diversity: degree of integration, number of colorful edges, neighborhood variety, and evenness, and prove asymptotically tight or almost tight bounds on the price of anarchy with respect to these measures on both general graphs, as well as on cycles, cylinders, and tori that model residential neighborhoods. We complement our theoretical results with simulations of our swap games starting either from random placements of agents, or from segregated placements. Our simulation results are generally consistent with our theoretical result
This paper introduces an event-triggered source seeking control (ET-SSC) for autonomous vehicles modeled as the nonholonomic unicycle. The classical source seeking control is enhanced with static-triggering conditions to enable aperiodic and less frequent updates of the system's input signals, offering a resource-aware control design. Our convergence analysis is based on time-scaling combined with Lyapunov and averaging theories for systems with discontinuous right-hand sides. ET-SSC ensures exponentially stable behavior for the resulting average system, leading to practical asymptotic convergence to a small neighborhood of the source point. We guarantee the avoidance of Zeno behavior by establishing a minimum dwell time to prevent infinitely fast switching. The performance optimization is aligned with classical continuous-time source seeking algorithms while balancing system performance with actuation resource consumption. Our ET-SSC algorithm, the first of its kind, allows for arbitrarily large inter-sampling times, overcoming the limitations of classical sampled-data implementations for source seeking control.
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding responses with retrieved information. As an emerging paradigm, Agentic RAG further enhances this process by introducing autonomous LLM agents into the information seeking process. However, existing benchmarks fall short in evaluating such systems, as they are confined to a static retrieval environment with a fixed, limited corpus} and simple queries that fail to elicit agentic behavior. Moreover, their evaluation protocols assess information seeking effectiveness by pre-defined gold sets of documents, making them unsuitable for the open-ended and dynamic nature of real-world web environments. To bridge this gap, we present InfoDeepSeek, a new benchmark with challenging questions designed for assessing agentic information seeking in real-world, dynamic web environments. We propose a systematic methodology for constructing challenging queries satisfying the criteria of determinacy, difficulty, and diversity. Based on this, we develop the first evaluation framework tailored to dynamic agentic information seeking, including fine-grained metrics about the accuracy, utility, and compactness of informa
This paper introduces a new method to achieve stable convergence to Nash equilibrium in duopoly noncooperative games. Inspired by the recent fixed-time Nash Equilibrium seeking (NES) as well as prescribed-time extremum seeking (ES) and source seeking schemes, our approach employs a distributed sliding mode control (SMC) scheme, integrating extremum seeking with sinusoidal perturbation signals to estimate the pseudogradients of quadratic payoff functions. Notably, this is the first attempt to address noncooperative games without relying on models, combining classical extremum seeking with relay components instead of proportional control laws. We prove finite-time convergence of the closed-loop average system to Nash equilibrium using stability analysis techniques such as time-scaling, Lyapunov's direct method, and averaging theory for discontinuous systems. Additionally, we quantify the size of residual sets around the Nash equilibrium and validate our theoretical results through simulations.
Our goal is to model and experimentally assess trust evolution to predict future beliefs and behaviors of human-robot teams in dynamic environments. Research suggests that maintaining trust among team members in a human-robot team is vital for successful team performance. Research suggests that trust is a multi-dimensional and latent entity that relates to past experiences and future actions in a complex manner. Employing a human-robot collaborative task, we design an optimal assistance-seeking strategy for the robot using a POMDP framework. In the task, the human supervises an autonomous mobile manipulator collecting objects in an environment. The supervisor's task is to ensure that the robot safely executes its task. The robot can either choose to attempt to collect the object or seek human assistance. The human supervisor actively monitors the robot's activities, offering assistance upon request, and intervening if they perceive the robot may fail. In this setting, human trust is the hidden state, and the primary objective is to optimize team performance. We execute two sets of human-robot interaction experiments. The data from the first experiment are used to estimate POMDP par
The complexities of table structures and question logic make table-based question answering (TQA) tasks challenging for Large Language Models (LLMs), often requiring task simplification before solving. This paper reveals that the reasoning process during task simplification may be more valuable than the simplified tasks themselves and aims to improve TQA performance by leveraging LLMs' reasoning capabilities. We propose a Seek-and-Solve pipeline that instructs the LLM to first seek relevant information and then answer questions, integrating these two stages at the reasoning level into a coherent Seek-and-Solve Chain of Thought (SS-CoT). Additionally, we distill a single-step TQA-solving prompt from this pipeline, using demonstrations with SS-CoT paths to guide the LLM in solving complex TQA tasks under In-Context Learning settings. Our experiments show that our approaches result in improved performance and reliability while being efficient. Our findings emphasize the importance of eliciting LLMs' reasoning capabilities to handle complex TQA tasks effectively.
We investigate the question: if an AI agent is known to be safe in one setting, is it also safe in a new setting similar to the first? This is a core question of AI alignment--we train and test models in a certain environment, but deploy them in another, and we need to guarantee that models that seem safe in testing remain so in deployment. Our notion of safety is based on power-seeking--an agent which seeks power is not safe. In particular, we focus on a crucial type of power-seeking: resisting shutdown. We model agents as policies for Markov decision processes, and show (in two cases of interest) that not resisting shutdown is "stable": if an MDP has certain policies which don't avoid shutdown, the corresponding policies for a similar MDP also don't avoid shutdown. We also show that there are natural cases where safety is _not_ stable--arbitrarily small perturbations may result in policies which never shut down. In our first case of interest--near-optimal policies--we use a bisimulation metric on MDPs to prove that small perturbations won't make the agent take longer to shut down. Our second case of interest is policies for MDPs satisfying certain constraints which hold for vario
This paper discusses the design of an extremum seeking controller that relies on a monitoring function for a class of SISO uncertain nonlinear systems characterized by arbitrary and uncertain relative degree. Our demonstration illustrates the feasibility of achieving an arbitrarily small proximity to the desired optimal point through output feedback. The core concept involves integrating a monitoring function with a norm state observer for the unitary relative degree case and its expansion to arbitrary relative degrees by means of the employment of a time-scaling technique. Significantly, our proposed scheme attains the extremum of an unknown nonlinear mapping across the entire domain of initial conditions, ensuring global convergence and stability for the real-time optimization algorithm. Furthermore, we provide tuning rules to ensure convergence to the global maximum in the presence of local extrema. To validate the effectiveness of the proposed approach, we present a numerical example and apply it to a source-seeking problem involving a cart-track linear positioning servomechanism. Notably, the cart lacks the ability to sense its velocity or the source's position, but can detect
Help-seeking is a critical way for students to learn new concepts, acquire new skills, and get unstuck when problem-solving in their computing courses. The recent proliferation of generative AI tools, such as ChatGPT, offers students a new source of help that is always available on-demand. However, it is unclear how this new resource compares to existing help-seeking resources along dimensions of perceived quality, latency, and trustworthiness. In this paper, we investigate the help-seeking preferences and experiences of computing students now that generative AI tools are available to them. We collected survey data (n=47) and conducted interviews (n=8) with computing students. Our results suggest that although these models are being rapidly adopted, they have not yet fully eclipsed traditional help resources. The help-seeking resources that students rely on continue to vary depending on the task and other factors. Finally, we observed preliminary evidence about how help-seeking with generative AI is a skill that needs to be developed, with disproportionate benefits for those who are better able to harness the capabilities of LLMs. We discuss potential implications for integrating g
The desire and ability to seek new information strategically are fundamental to human learning but often overlooked in current language agent evaluation. We analyze a popular web shopping task designed to test language agents' ability to perform strategic exploration and discover that it can be reformulated and solved as a single-turn retrieval task without the need for interactive information seeking. This finding encourages us to rethink realistic constraints on information access that would necessitate strategic information seeking. We then redesign the task to introduce a notion of task ambiguity and the role of a shopper, serving as a dynamic party with whom the agent strategically interacts in an open-ended conversation to make informed decisions. Our experiments demonstrate that the proposed task can effectively evaluate the agent's ability to explore and gradually accumulate information through multi-turn interactions. Additionally, we show that large language model-simulated shoppers serve as a good proxy for real human shoppers, revealing similar error patterns in agents.