Scientific discovery pipelines typically involve complex, rigid, and time-consuming processes, from data preparation to analyzing and interpreting findings. Recent advances in AI have the potential to transform such pipelines in a way that domain experts can focus on interpreting and understanding findings, rather than debugging rigid pipelines or manually annotating data. As part of an active collaboration between data science/AI researchers and behavioral neuroscientists, we showcase an example AI-enhanced pipeline, specifically designed to transform and accelerate the way that the domain experts in the team are able to gain insights out of experimental data. The application at hand is in the domain of behavioral neuroscience, studying fear generalization in mice, an important problem whose progress can advance our understanding of clinically significant and often debilitating conditions such as PTSD (Post-Traumatic Stress Disorder). We identify the emerging paradigm of "In-Context Learning" (ICL) as a suitable interface for domain experts to automate parts of their pipeline without the need for or familiarity with AI model training and fine-tuning, and showcase its remarkable ef
Reinforcement learning methods have recently been very successful at performing complex sequential tasks like playing Atari games, Go and Poker. These algorithms have outperformed humans in several tasks by learning from scratch, using only scalar rewards obtained through interaction with their environment. While there certainly has been considerable independent innovation to produce such results, many core ideas in reinforcement learning are inspired by phenomena in animal learning, psychology and neuroscience. In this paper, we comprehensively review a large number of findings in both neuroscience and psychology that evidence reinforcement learning as a promising candidate for modeling learning and decision making in the brain. In doing so, we construct a mapping between various classes of modern RL algorithms and specific findings in both neurophysiological and behavioral literature. We then discuss the implications of this observed relationship between RL, neuroscience and psychology and its role in advancing research in both AI and brain science.
Modern AI models, such as large language models, are usually trained once on a huge corpus of data, potentially fine-tuned for a specific task, and then deployed with fixed parameters. Their training is costly, slow, and gradual, requiring billions of repetitions. In stark contrast, animals continuously adapt to the ever-changing contingencies in their environments. This is particularly important for social species, where behavioral policies and reward outcomes may frequently change in interaction with peers. The underlying computational processes are often marked by rapid shifts in an animal's behaviour and rather sudden transitions in neuronal population activity. Such computational capacities are of growing importance for AI systems operating in the real world, like those guiding robots or autonomous vehicles, or for agentic AI interacting with humans online. Can AI learn from neuroscience? This Perspective explores this question, integrating the literature on continual and in-context learning in AI with the neuroscience of learning on behavioral tasks with shifting rules, reward probabilities, or outcomes. We will outline an agenda for how specifically insights from neuroscienc
Understanding how decision making changes across the lifespan is a central challenge for neuroscience, yet research on cognitive aging has remained largely disconnected from the theoretical and computational advances that now shape modern systems neuroscience. Over the past two decades, theoretical frameworks have transformed how we study cognition in young, healthy brains, providing principled tools to model latent decision states, neural dynamics, population codes, and interareal communication. In contrast, aging research has often relied on single metric behavioral readouts, cross sectional comparisons, and descriptive neural analyses, limiting our ability to explain fundamental differences in individual aging trajectories. This gap represents a missed opportunity because aging offers a powerful platform for testing theories of neural computation, stability, and flexibility under changing biological constraints. Here, we argue that closer integration between aging research and contemporary theoretical neuroscience can move the field beyond descriptive accounts toward more mechanistic explanations of decision making across the lifespan. To this end, we outline how recent advances
Reinforcement learning (RL) has a rich history in neuroscience, from early work on dopamine as a reward prediction error signal (Schultz et al., 1997) to recent work proposing that the brain could implement a form of 'distributional reinforcement learning' popularized in machine learning (Dabney et al., 2020). There has been a close link between theoretical advances in reinforcement learning and neuroscience experiments throughout this literature, and the theories describing the experimental data have therefore become increasingly complex. Here, we provide an introduction and mathematical background to many of the methods that have been used in systems neroscience. We start with an overview of the RL problem and classical temporal difference algorithms, followed by a discussion of 'model-free', 'model-based', and intermediate RL algorithms. We then introduce deep reinforcement learning and discuss how this framework has led to new insights in neuroscience. This includes a particular focus on meta-reinforcement learning (Wang et al., 2018) and distributional RL (Dabney et al., 2020). Finally, we discuss potential shortcomings of the RL formalism for neuroscience and highlight open q
We introduce CheeseBench, a benchmark that evaluates large language models (LLMs) on nine classical behavioral neuroscience paradigms (Morris water maze, Barnes maze, T-maze, radial arm maze, star maze, operant chamber, shuttle box, conditioned place preference, and delayed non-match to sample), spanning six cognitive dimensions. Each task is grounded in peer-reviewed rodent protocols with approximate animal baselines. The agent receives a unified system prompt with no task-specific instructions and must discover goals purely from ASCII text observations and reward signals, much like a rodent placed into an unfamiliar apparatus. We evaluate six open-weight LLMs (3B to 72B parameters) on text-based ASCII renderings and compare against both a random baseline and a graph-based reinforcement learning agent. Our best model (Qwen2.5-VL-7B) reaches 52.6% average success on ASCII input, compared to 32.1% for random agents and 78.9% for approximate rodent baselines. We find that (1) scaling beyond 7B yields diminishing returns, (2) longer context history degrades performance, (3) chain-of-thought prompting hurts rather than helps, and (4) a vision-language architecture provides an advantage
This paper provides a perspective on applying the concepts of information thermodynamics, developed recently in non-equilibrium statistical physics, to problems in theoretical neuroscience. Historically, information and energy in neuroscience have been treated separately, in contrast to physics approaches, where the relationship of entropy production with heat is a central idea. It is argued here that also in neural systems information and energy can be considered within the same theoretical framework. Starting from basic ideas of thermodynamics and information theory on a classic Brownian particle, it is shown how noisy neural networks can infer its probabilistic motion. The decoding of the particle motion by neurons is performed with some accuracy and it has some energy cost, and both can be determined using information thermodynamics. In a similar fashion, we also discuss how neural networks in the brain can learn the particle velocity, and maintain that information in the weights of plastic synapses from a physical point of view. Generally, it is shown how the framework of stochastic and information thermodynamics can be used practically to study neural inference, learning, and
In recent years, Artificial Intelligence (AI) shows a spectacular ability of insertion inside a variety of disciplines which use it for scientific advancements and which sometimes improve it for their conceptual and methodological needs. According to the transverse science framework originally conceived by Shinn and Joerges, AI can be seen as an instrument which is progressively acquiring a universal character through its diffusion across science. In this paper we address empirically one aspect of this diffusion, namely the penetration of AI into a specific field of research. Taking neuroscience as a case study, we conduct a scientometric analysis of the development of AI in this field. We especially study the temporal egocentric citation network around the articles included in this literature, their represented journals and their authors linked together by a temporal collaboration network. We find that AI is driving the constitution of a particular disciplinary ecosystem in neuroscience which is distinct from other subfields, and which is gathering atypical scientific profiles who are coming from neuroscience or outside it. Moreover we observe that this AI community in neuroscienc
The exponential growth of neuroscience literature presents a significant challenge for researchers seeking to efficiently access and utilize relevant information. To address this issue, we introduce the Brain Knowledge Engine (BrainKnow), an automated system designed to extract, link, and synthesize neuroscience knowledge from scientific publications. BrainKnow constructs a comprehensive knowledge graph encompassing 3,626,931 relationships across 37,011 neuroscience concepts, derived from 1,817,744 articles. This vast repository of knowledge is accessible through a user-friendly web interface, facilitating efficient navigation and data retrieval. BrainKnow employs advanced graph network algorithms, specifically Node2Vec, to enhance knowledge recommendation and visualization. This enables users to explore semantic relationships between concepts, predict potential new relationships, and gain a deeper understanding of the interconnectedness within neuroscience. Additionally, BrainKnow ensures real-time updates by synchronizing with PubMed, providing researchers with access to the most current information. BrainKnow serves as a valuable resource for neuroscience researchers, offering a
Behavioral evaluation is the dominant paradigm for assessing alignment in large language models (LLMs). In current practice, observed compliance under finite evaluation protocols is treated as evidence of latent alignment. However, the inference from bounded behavioral evidence to claims about global latent properties is rarely analyzed as an identifiability problem. In this paper, we study alignment evaluation through the lens of statistical identifiability under partial observability. We allow agent policies to condition their behavior on observable signals correlated with the evaluation regime, a phenomenon we term evaluation awareness. Within this framework, we formalize the Alignment Verifiability Problem and introduce Normative Indistinguishability, which arises when distinct latent alignment hypotheses induce identical distributions over evaluator-accessible observations. Our main theoretical contribution is a conditional impossibility result: under finite behavioral evaluation and evaluation-aware policies, observed compliance does not uniquely identify latent alignment, but only membership in an equivalence class of conditionally compliant policies, under explicit assumpti
Data aggregators collect large amount of information about individual users and create detailed online behavioral profiles of individuals. Behavioral profiles benefit users by improving products and services. However, they have also raised concerns regarding user privacy, transparency of collection practices and accuracy of data in the profiles. To improve transparency, some companies are allowing users to access their behavioral profiles. In this work, we investigated behavioral profiles of users by utilizing these access mechanisms. Using in-person interviews (n=8), we analyzed the data shown in the profiles, elicited user concerns, and estimated accuracy of profiles. We confirmed our interview findings via an online survey (n=100). To assess the claim of improving transparency, we compared data shown in profiles with the data that companies have about users. More than 70% of the participants expressed concerns about collection of sensitive data such as credit and health information, level of detail and how their data may be used. We found a large gap between the data shown in profiles and the data possessed by companies. A large number of profiles were inaccurate with as much as
The proliferation and refinement of affordable virtual reality (VR) technologies and wearable sensors have opened new frontiers in cognitive and behavioral neuroscience. This chapter offers a broad overview of VR for anyone interested in leveraging it as a research tool. In the first section, it examines the fundamental functionalities of VR and outlines important considerations that inform the development of immersive content that stimulates the senses. In the second section, the focus of the discussion shifts to the implementation of VR in the context of the neuroscience lab. Practical advice is offered on adapting commercial, off-theshelf devices to specific research purposes. Further, methods are explored for recording, synchronizing, and fusing heterogeneous forms of data obtained through the VR system or add-on sensors, as well as for labeling events and capturing game play.
Pharmacies are critical in healthcare systems, particularly in low- and middle-income countries. Procuring pharmacists with the right behavioral interventions or nudges can enhance their skills, public health awareness, and pharmacy inventory management, ensuring access to essential medicines that ultimately benefit their patients. We introduce a reinforcement learning operational system to deliver personalized behavioral interventions through mobile health applications. We illustrate its potential by discussing a series of initial experiments run with SwipeRx, an all-in-one app for pharmacists, including B2B e-commerce, in Indonesia. The proposed method has broader applications extending beyond pharmacy operations to optimize healthcare delivery.
Recent advances and reflections on reproducible human neuroscience, especially brain-wide association studies (BWAS) leveraging large datasets, have led to divergent and sometimes opposing views on research practices and priorities. The debates span multiple dimensions. Shifts along these axes have fractured consensus and further fragmented an already heterogeneous field of cognitive neuroscience. Here, we sketch a holistic and integrative response grounded in population neuroscience, organized around a closed-loop "design-analysis-interpretation" research cycle that aims to build consensus while bridging these divides. Our central claim is that population neuroscience offers a unique population-level vantage point for identifying general principles, characterizing inter-individual variabilities, and benchmarking intra-individual changes, thereby providing a supportive framework for small-scale, mechanism-focused studies at the individual level and allowing them to co-evolve with population-level studies. Population neuroscience is not simply about providing larger N for BWAS; its deeper goal is to accumulate a family of cross-scale priors and shared infrastructures that can suppor
Autonomous vehicles (AVs) have significantly advanced in real-world deployment in recent years, yet safety continues to be a critical barrier to widespread adoption. Traditional functional safety approaches, which primarily verify the reliability, robustness, and adequacy of AV hardware and software systems from a vehicle-centric perspective, do not sufficiently address the AV's broader interactions and behavioral impact on the surrounding traffic environment. To overcome this limitation, we propose a paradigm shift toward behavioral safety, a comprehensive approach focused on evaluating AV responses and interactions within traffic environment. To systematically assess behavioral safety, we introduce a third-party AV safety assessment framework comprising two complementary evaluation components: Driver Licensing Test and Driving Intelligence Test. The Driver Licensing Test evaluates AV's reactive behaviors under controlled scenarios, ensuring basic behavioral competency. In contrast, the Driving Intelligence Test assesses AV's interactive behaviors within naturalistic traffic conditions, quantifying the frequency of safety-critical events to deliver statistically meaningful safety
We investigate the use of sequence analysis for behavior modeling, emphasizing that sequential context often outweighs the value of aggregate features in understanding human behavior. We discuss framing common problems in fields like healthcare, finance, and e-commerce as sequence modeling tasks, and address challenges related to constructing coherent sequences from fragmented data and disentangling complex behavior patterns. We present a framework for sequence modeling using Ensembles of Hidden Markov Models, which are lightweight, interpretable, and efficient. Our ensemble-based scoring method enables robust comparison across sequences of different lengths and enhances performance in scenarios with imbalanced or scarce data. The framework scales in real-world scenarios, is compatible with downstream feature-based modeling, and is applicable in both supervised and unsupervised learning settings. We demonstrate the effectiveness of our method with results on a longitudinal human behavior dataset.
Like many scientific disciplines, neuroscience has increasingly attempted to confront pervasive gender imbalances within the field. While much of the conversation has centered around publishing and conference participation, recent research in other fields has called attention to the prevalence of gender bias in citation practices. Because of the downstream effects that citations can have on visibility and career advancement, understanding and eliminating gender bias in citation practices is vital for addressing inequity in a scientific community. In this study, we sought to determine whether there is evidence of gender bias in the citation practices of neuroscientists. Using data from five top neuroscience journals, we find that reference lists tend to include more papers with men as first and last author than would be expected if gender were not a factor in referencing. Importantly, we show that this overcitation of men and undercitation of women is driven largely by the citation practices of men, and is increasing over time as the field becomes more diverse. We develop a co-authorship network to assess homophily in researchers' social networks, and we find that men tend to overci
Digital traces of daily activities, such as e-commerce (EC) purchase histories, provide scalable signals for public health surveillance, yet their epidemiological validity remains unclear. This study validates a behavioral proxy for disease onset, defined as transitions from regular to therapeutic diets, by comparing large-scale EC data (N=55,645) against independent insurance-derived clinical records. Using feline lower urinary tract disease (FLUTD) as a case study, the proxy showed strong agreement with clinical data for ingredient-level risk patterns (r=0.74) and seasonal dynamics (r=0.82). Furthermore, analysis using EC data alone reproduced the established protective association of wet food consumption. These results demonstrate that validated behavioral signals from EC data can serve as cost-effective complements to traditional surveillance, with potential applicability to monitoring lifestyle-related diseases in human populations.
Current AI alignment paradigms rely on behavioral correction: external supervisors (e.g., RLHF) observe outputs, judge against preferences, and adjust parameters. This paper argues that behavioral correction is structurally analogous to an economy without property rights, where order requires perpetual policing and does not scale. Drawing on institutional economics (Coase, Alchian, Cheung), capability mutual exclusivity, and competitive cost discovery, we propose alignment as institutional design: the designer specifies internal transaction structures (module boundaries, competition topologies, cost-feedback loops) such that aligned behavior emerges as the lowest-cost strategy for each component. We identify three irreducible levels of human intervention (structural, parametric, monitorial) and show that this framework transforms alignment from a behavioral control problem into a political-economy problem. No institution eliminates self-interest or guarantees optimality; the best design makes misalignment costly, detectable, and correctable. We conclude that the proper goal is institutional robustness-a dynamic, self-correcting process under human oversight, not perfection. This wo
To learn how cognition is implemented in the brain, we must build computational models that can perform cognitive tasks, and test such models with brain and behavioral experiments. Cognitive science has developed computational models of human cognition, decomposing task performance into computational components. However, its algorithms still fall short of human intelligence and are not grounded in neurobiology. Computational neuroscience has investigated how interacting neurons can implement component functions of brain computation. However, it has yet to explain how those components interact to explain human cognition and behavior. Modern technologies enable us to measure and manipulate brain activity in unprecedentedly rich ways in animals and humans. However, experiments will yield theoretical insight only when employed to test brain-computational models. It is time to assemble the pieces of the puzzle of brain computation. Here we review recent work in the intersection of cognitive science, computational neuroscience, and artificial intelligence. Computational models that mimic brain information processing during perceptual, cognitive, and control tasks are beginning to be deve