Skull-induced aberrations remain a major drawback of transcranial ultrasound localization microscopy (ULM), degrading sensitivity and spatial accuracy through microbubble mislocalization, false detections, and imaging artifacts, such as disconnected or duplicated vessels. Here, we present a differentiable beamforming framework for automatic aberration correction in transcranial Doppler and ULM. Our approach uses spatially distributed delay-based parameterization of the aberration that is optimized in a closed-loop manner using angular coherence as an objective function. We demonstrate robust improvements of transcranial ULM, in vivo, with enhanced resolution of both mouse and nonhuman primate (NHP) brains. We also extended differentiable beamforming to functional measurements, with improvements in the sensitivity of transcranial functional ultrasound (fUS) and ULM based hemodynamic quantification. Extending this approach to 3D transcranial ULM imaging in NHPs, we show efficient correction of skull induced aberrations and removal of artifacts, such as vessel duplications. By providing a fully automated and generalizable solution for aberration correction, this work lowers a major te
This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis. SMAS effectively combines gene selection based on statistical significance and expression changes, employing linear classifiers such as logistic regression to accurately differentiate between RT-qPCR positive and negative NHP samples. A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6 and IFI27, genes, including MX1, OAS1, and ISG15, were significantly upregulated, highlighting their essential roles in the immune response to EBOV. Our results underscore the efficacy of the SMAS method in revealing complex genetic interactions and response mechanisms during EBOV infection. This
Non-human primates (NHPs), especially rhesus macaques, have played a significant role in our current understanding of the neural computations underlying human vision. Apart from the established homologies in the visual brain areas between these two species, and our extended abilities to probe detailed neural mechanisms in monkeys at multiple scales, one major factor that makes NHPs an extremely appealing animal model of human-vision is their ability to perform human-like visual behavior. Traditionally, such behavioral studies have been conducted in controlled laboratory settings. Such in-lab studies offer the experimenter a tight control over many experimental variables like overall luminance, eye movements (via eye tracking), auditory interference etc. However, there are several constraints related to such experiments. These include, 1) limited total experimental time, 2) requirement of dedicated human experimenters for the NHPs, 3) requirement of additional lab-space for the experiments, 4) NHPs often need to undergo invasive surgeries for a head-post implant, 5) additional time and training required for chairing and head restraints of monkeys. To overcome these limitations, many
This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates (NHPs) infected with Ebola virus (EBOV). We performed a detailed comparison of both platforms, demonstrating a strong correlation between them, with Spearman coefficients for 56 out of 62 samples ranging from 0.78 to 0.88, with a mean of 0.83 and a median of 0.85. Bland-Altman analysis further confirmed high consistency, with most measurements falling within 95% confidence limits. A machine learning approach, using the Supervised Magnitude-Altitude Scoring (SMAS) method trained on NanoString data, identified OAS1 as a key marker for distinguishing RT-qPCR positive from negative samples. Remarkably, when applied to RNA-Seq data, OAS1 also achieved 100% accuracy in differentiating infected from uninfected samples using logistic regression, demonstrating its robustness across platforms. Further differential expression analysis identified 12 common genes including ISG15, OAS1, IFI44, IFI27, IFIT2, IFIT3, IFI44L, MX1, MX2, OAS2, RSAD2, and OASL which demonstrated the highest levels of statistical significance and biological relevance across
This Dissertation is comprised of two main projects, addressing questions in neuroscience through applications of generative modeling. Project #1 (Chapter 4) explores how neurons encode features of the external world. I combine Helmholtz's "Perception as Unconscious Inference" -- paralleled by modern generative models like variational autoencoders (VAE) -- with the hierarchical structure of the visual cortex. This combination leads to the development of a hierarchical VAE model, which I test for its ability to mimic neurons from the primate visual cortex in response to motion stimuli. Results show that the hierarchical VAE perceives motion similar to the primate brain. Additionally, the model identifies causal factors of retinal motion inputs, such as object- and self-motion, in a completely unsupervised manner. Collectively, these results suggest that hierarchical inference underlines the brain's understanding of the world, and hierarchical VAEs can effectively model this understanding. Project #2 (Chapter 5) investigates the spatiotemporal structure of spontaneous brain activity and its reflection of brain states like rest. Using simultaneous fMRI and wide-field Ca2+ imaging data
If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical. To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background. We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse. Because peripheral recognition is both
This study investigates the prevalence and implications of nestedness within primate social networks, examining its relationship with cognitive and structural factors. We analysed data from 51 primate groups across 21 species, employing network analysis to evaluate nestedness and its correlation with modularity, neocortex ratio, and group size. We used Bayesian mixed effects modelling to investigate nestedness in primate social networks, controlling for phylogenetic dependencies and exploring various factors like neocortex ratio and group size. Our findings reveal a significant occurrence of nestedness in 66% of the species studied, exceeding chance expectations. This nestedness was more pronounced in groups with less steep dominance hierarchies, contrary to traditional assumptions linking it to hierarchical social structures. A notable inverse relationship between nestedness and modularity was observed, suggesting a structural trade-off in network formation. This pattern persisted even after controlling for species-specific social behaviours, indicating a general structural feature of primate networks. Surprisingly, our analysis showed no significant correlation between nestedness
Using drones to track multiple individuals simultaneously in their natural environment is a powerful approach for better understanding group primate behavior. Previous studies have demonstrated that it is possible to automate the classification of primate behavior from video data, but these studies have been carried out in captivity or from ground-based cameras. To understand group behavior and the self-organization of a collective, the whole troop needs to be seen at a scale where behavior can be seen in relation to the natural environment in which ecological decisions are made. This study presents a novel dataset from drone videos for baboon detection, tracking, and behavior recognition. The baboon detection dataset was created by manually annotating all baboons in drone videos with bounding boxes. A tiling method was subsequently applied to create a pyramid of images at various scales from the original 5.3K resolution images, resulting in approximately 30K images used for baboon detection. The tracking dataset is derived from the detection dataset, where all bounding boxes are assigned the same ID throughout the video. This process resulted in half an hour of very dense tracking
Visual object recognition -- the behavioral ability to rapidly and accurately categorize many visually encountered objects -- is core to primate cognition. This behavioral capability is algorithmically impressive because of the myriad identity-preserving viewpoints and scenes that dramatically change the visual image produced by the same object. Until recently, the brain mechanisms that support that capability were deeply mysterious. However, over the last decade, this scientific mystery has been illuminated by the discovery and development of brain-inspired, image-computable, artificial neural network (ANN) systems that rival primates in this behavioral feat. Apart from fundamentally changing the landscape of artificial intelligence (AI), modified versions of these ANN systems are the current leading scientific hypotheses of an integrated set of mechanisms in the primate ventral visual stream that support object recognition. What separates brain-mapped versions of these systems from prior conceptual models is that they are Sensory-computable, Mechanistic, Anatomically Referenced, and Testable (SMART). Here, we review and provide perspective on the brain mechanisms that the current
We present a new method of primate face recognition, and evaluate this method on several endangered primates, including golden monkeys, lemurs, and chimpanzees. The three datasets contain a total of 11,637 images of 280 individual primates from 14 species. Primate face recognition performance is evaluated using two existing state-of-the-art open-source systems, (i) FaceNet and (ii) SphereFace, (iii) a lemur face recognition system from literature, and (iv) our new convolutional neural network (CNN) architecture called PrimNet. Three recognition scenarios are considered: verification (1:1 comparison), and both open-set and closed-set identification (1:N search). We demonstrate that PrimNet outperforms all of the other systems in all three scenarios for all primate species tested. Finally, we implement an Android application of this recognition system to assist primate researchers and conservationists in the wild for individual recognition of primates.
Primates exhibit a robust deviation from canonical allometric scaling: at fixed body mass, their lifespans exceed those of non-primate mammals by factors of two to three. A rhesus macaque (8 kg) lives 25-40 years, whereas a cat of similar mass rarely exceeds 18 years. This statistically significant clade-level excess cannot be explained by standard metabolic or ecological models. We provide a thermodynamic explanation within the Principle of Biological Time Equivalence (PBTE), where lifespan is determined by a finite cycle budget governed by entropy production. We show that primates reduce entropy production per physiological cycle through increased neural energy allocation. The neural power fraction acts as a control parameter, extending the effective lifetime cycle count. Three mechanisms, predictive regulation, enhanced repair, and behavioral buffering, jointly suppress dissipation. This yields a quantitative neuro-metabolic multiplier that explains primate longevity and provides testable predictions linking brain energetics, entropy production, and lifespan.
In primates, loci associated with adaptive trait variation often fall in non-coding regions. Understanding the mechanisms linking these regulatory variants to fitness-relevant phenotypes remains challenging, but can be addressed using functional genomic data. However, such data are rarely generated at scale in non-human primates. When they are, only select tissues, cell types, developmental stages, and cellular environments are typically considered, despite appreciation that adaptive variants often exhibit context-dependent effects. In this review, we 1) discuss why context-dependent regulatory loci might be especially evolutionarily relevant in primates, 2) explore challenges and emerging solutions for mapping such context-dependent variation, and 3) discuss the scientific questions these data could address. We argue that filling this gap will provide critical insights into evolutionary processes, human disease, and regulatory adaptation.
Understanding how neural activity gives rise to perception is a central challenge in neuroscience. We address the problem of decoding visual information from high-density intracortical recordings in primates, using the THINGS Ventral Stream Spiking Dataset. We systematically evaluate the effects of model architecture, training objectives, and data scaling on decoding performance. Results show that decoding accuracy is mainly driven by modeling temporal dynamics in neural signals, rather than architectural complexity. A simple model combining temporal attention with a shallow MLP achieves up to 70% top-1 image retrieval accuracy, outperforming linear baselines as well as recurrent and convolutional approaches. Scaling analyses reveal predictable diminishing returns with increasing input dimensionality and dataset size. Building on these findings, we design a modular generative decoding pipeline that combines low-resolution latent reconstruction with semantically conditioned diffusion, generating plausible images from 200 ms of brain activity. This framework provides principles for brain-computer interfaces and semantic neural decoding.
Progress has led to a detailed understanding of the neural mechanisms that underlie decision making in primates. However, less is known about why such mechanisms are present in the first place. Theory suggests that primate decision making mechanisms, and their resultant behavioral abilities, emerged to maximize reward in the face of noisy, temporally evolving information. To test this theory, we trained an end-to-end deep recurrent neural network using reinforcement learning on a noisy perceptual discrimination task. Networks learned several key abilities of primate-like decision making including trading off speed for accuracy, and flexibly changing their mind in the face of new information. Internal dynamics of these networks suggest that these abilities were supported by similar decision mechanisms as those observed in primate neurophysiological studies. These results provide experimental support for key pressures that gave rise to the primate ability to make flexible decisions.
A major goal of computational neuroscience has been to explain how the primate ventral visual stream (VVS) transforms visual input into temporally evolving neural representations that support robust visual perception. Historically, most modeling efforts have assumed static conditions: monkeys fixate a dot, images are briefly flashed, and neural responses are analyzed through time-averaged metrics. Feedforward deep networks trained on static object recognition tasks outperform prior work in approximating these static snapshot-driven VVS responses. However, mounting neurophysiological evidence demonstrates that VVS responses are rich dynamical signals shaped not only by the retinal input but also by intrinsic circuit dynamics, recurrent interactions, and widespread top-down modulation. Moreover, real-world vision is inherently dynamic: objects move, the observer moves, and the eyes actively sample the environment. Here, we review recent progress in modeling dynamic responses in the macaque ventral stream across three domains: (1) intrinsic dynamics elicited by static images, (2) dynamics evoked by dynamic visual stimuli, and (3) dynamics generated by active sensing during eye movemen
Video recordings of nonhuman primates in their natural habitat are a common source for studying their behavior in the wild. We fine-tune pre-trained video-text foundational models for the specific domain of capuchin monkeys, with the goal of developing useful computational models to help researchers to retrieve useful clips from videos. We focus on the challenging problem of training a model based solely on raw, unlabeled video footage, using weak audio descriptions sometimes provided by field collaborators. We leverage recent advances in Multimodal Large Language Models (MLLMs) and Vision-Language Models (VLMs) to address the extremely noisy nature of both video and audio content. Specifically, we propose a two-folded approach: an agentic data treatment pipeline and a fine-tuning process. The data processing pipeline automatically extracts clean and semantically aligned video-text pairs from the raw videos, which are subsequently used to fine-tune a pre-trained Microsoft's X-CLIP model through Low-Rank Adaptation (LoRA). We obtained an uplift in $Hits@5$ of $167\%$ for the 16 frames model and an uplift of $114\%$ for the 8 frame model on our domain data. Moreover, based on $NDCG@K
Non-human primates (NHPs) serve as critical models for understanding human brain function and neurological disorders due to their close evolutionary relationship with humans. Accurate brain tissue segmentation in NHPs is critical for understanding neurological disorders, but challenging due to the scarcity of annotated NHP brain MRI datasets, the small size of the NHP brain, the limited resolution of available imaging data and the anatomical differences between human and NHP brains. To address these challenges, we propose a novel approach utilizing STU-Net with transfer learning to leverage knowledge transferred from human brain MRI data to enhance segmentation accuracy in the NHP brain MRI, particularly when training data is limited. The combination of STU-Net and transfer learning effectively delineates complex tissue boundaries and captures fine anatomical details specific to NHP brains. Notably, our method demonstrated improvement in segmenting small subcortical structures such as putamen and thalamus that are challenging to resolve with limited spatial resolution and tissue contrast, and achieved DSC of over 0.88, IoU over 0.8 and HD95 under 7. This study introduces a robust m
Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. We address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data curation pipeline. We continue pretraining V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier. Across four benchmark datasets, ChimpACT, PanAf500, BaboonLand, and ChimpBehave, our approach consistently outperforms prior work, including fully finetuned baselines, and scales favorably with fewer labels. These results demonstrate for the first time that domain-level pretraining, where pretraining is conducted on similar data but not t
Neural signals related to movement can be measured from intracranial recordings and used in brain-machine interface devices (BMI) to restore physical function in impaired patients. In this study, we explore the use of more abstract neural signals related to economic value in a BMI context. Using data collected from the orbitofrontal cortex in non-human primates, we develop deep learning-based neural decoders that can predict the monkey's choice in a value-based decision-making task. Out-of-sample performance was improved by augmenting the training set with synthesized data, showing the feasibility of using limited training data. We further demonstrate that we can predict the monkey's choice sooner using a neural forecasting module that is equipped with task-related information. These findings support the feasibility of user preference-informed neuroengineering devices that leverage abstract cognitive signals.
Independent component analysis (ICA) is widely used to separate mixed signals and recover statistically independent components. However, in non-human primate neuroimaging studies, most ICA-recovered spatial maps are often dense. To extract the most relevant brain activation patterns, post-hoc thresholding is typically applied-though this approach is often imprecise and arbitrary. To address this limitation, we employed the Sparse ICA method, which enforces both sparsity and statistical independence, allowing it to extract the most relevant activation maps without requiring additional post-processing. Simulation experiments demonstrate that Sparse ICA performs competitively against 11 classical linear ICA methods. We further applied Sparse ICA to real non-human primate neuroimaging data, identifying several independent component networks spanning different brain networks. These spatial maps revealed clearly defined activation areas, providing further evidence that Sparse ICA is effective and reliable in practical applications.