This paper presents a systematic evaluation of the privacy behaviors and attributes of eight recent, popular browser agents. Browser agents are software that automate Web browsing using large language models and ancillary tooling. However, the automated capabilities that make browser agents powerful also make them high-risk points of failure. Both the kinds of tasks browser agents are designed to execute, along with the kinds of information browser agents are entrusted with to fulfill those tasks, mean that vulnerabilities in these tools can result in enormous privacy harm. This work presents a framework of five broad factors (totaling 15 distinct measurements) to measure the privacy risks in browser agents. Our framework assesses i. vulnerabilities in the browser agent's components, ii. how the browser agent protects against website behaviors, iii. whether the browser agent prevents cross-site tracking, iv. how the agent responds to privacy-affecting prompts, and v. whether the tool leaks personal information to sites. We apply our framework to eight browser agents and identify 30 vulnerabilities, ranging from disabled browser privacy features to "autocompleting" sensitive persona
Large language models (LLMs) are increasingly being integrated into web browsers to create agentic browsing systems that execute actions on behalf of the user. Prior work considering the security of agentic browsers focuses exclusively on indirect prompt-injection attacks. However, by failing to consider traditional web attacks, previous agentic browser threat models have a blind spot to web social engineering attacks originally designed to trick humans. In this paper, we propose the first web-focused threat model for agentic browsers and use it to derive a taxonomy of 20 attacks across both the web and LLM space, and implement 18 of the attacks. Our threat model extends the original See$\rightarrow$Act browser agent model to account for all components of a browser, and frames the agent as a confused deputy unable to distinguish task steps from traditional web attacks. We show that 10 web threats can reemerge often in amplified forms once an agent can be influenced by untrusted page content. We further conduct a generalizability study on 14 of the 20 attacks, showing that our attacks reproduce across 4 major LLM models spanning multiple vendors. We show that agentic browsers exhibi
Web browsers provide the security foundation for our online experiences. Significant research has been done into the security of browsers themselves, but relatively little investigation has been done into how they interact with the operating system or the file system. In this work, we provide the first systematic security study of browser profiles, the on-disk persistence layer of browsers, used for storing everything from users' authentication cookies and browser extensions to certificate trust decisions and device permissions. We show that, except for the Tor Browser, all modern browsers store sensitive data in home directories with little to no integrity or confidentiality controls. We show that security measures like password and cookie encryption can be easily bypassed. In addition, HTTPS can be sidestepped entirely by deploying malicious root certificates within users' browser profiles. The Public Key Infrastructure (PKI), the backbone of the secure Web. HTTPS can be fully bypassed with the deployment of custom potentially malicious root certificates. More worryingly, we show how these powerful attacks can be fully mounted directly from web browsers themselves, through the Fi
Modern web browsers have effectively become the new operating system for business applications, yet their security posture is often under-scrutinized. This paper presents a novel, comprehensive Browser Security Posture Analysis Framework[1], a browser-based client-side security assessment toolkit that runs entirely in JavaScript and WebAssembly within the browser. It performs a battery of over 120 in-browser security tests in situ, providing fine-grained diagnostics of security policies and features that network-level or os-level tools cannot observe. This yields insights into how well a browser enforces critical client-side security invariants. We detail the motivation for such a framework, describe its architecture and implementation, and dive into the technical design of numerous test modules (covering the same-origin policy, cross-origin resource sharing, content security policy, sandboxing, XSS protection, extension interference via WeakRefs, permissions audits, garbage collection behavior, cryptographic APIs, SSL certificate validation, advanced web platform security features like SharedArrayBuffer, Content filtering controls ,and internal network accessibility). We then pres
For safety reasons, large language models (LLMs) are trained to refuse harmful user instructions, such as assisting dangerous activities. We study an open question in this work: does the desired safety refusal, typically enforced in chat contexts, generalize to non-chat and agentic use cases? Unlike chatbots, LLM agents equipped with general-purpose tools, such as web browsers and mobile devices, can directly influence the real world, making it even more crucial to refuse harmful instructions. In this work, we primarily focus on red-teaming browser agents, LLMs that manipulate information via web browsers. To this end, we introduce Browser Agent Red teaming Toolkit (BrowserART), a comprehensive test suite designed specifically for red-teaming browser agents. BrowserART is consist of 100 diverse browser-related harmful behaviors (including original behaviors and ones sourced from HarmBench [Mazeika et al., 2024] and AirBench 2024 [Zeng et al., 2024b]) across both synthetic and real websites. Our empirical study on state-of-the-art browser agents reveals that, while the backbone LLM refuses harmful instructions as a chatbot, the corresponding agent does not. Moreover, attack methods
Browser fingerprinting is the identification of a browser through the network traffic captured during communication between the browser and server. This can be done using the HTTP protocol, browser extensions, and other methods. This paper discusses browser fingerprinting using the HTTPS over TLS 1.3 protocol. The study observed that different browsers use a different number of messages to communicate with the server, and the length of messages also varies. To conduct the study, a network was set up using a UTM hypervisor with one virtual machine as the server and another as a VM with a different browser. The communication was captured, and it was found that there was a 30\%-35\% dissimilarity in the behavior of different browsers.
As the use of web browsers continues to grow, the potential for cybercrime and web-related criminal activities also increases. Digital forensic investigators must understand how different browsers function and the critical areas to consider during web forensic analysis. Web forensics, a subfield of digital forensics, involves collecting and analyzing browser artifacts, such as browser history, search keywords, and downloads, which serve as potential evidence. While existing research has provided valuable insights, many studies focus on individual browsing modes or limited forensic scenarios, leaving gaps in understanding the full scope of data retention and recovery across different modes and browsers. This paper addresses these gaps by defining four browsing scenarios and critically analyzing browser artifacts across normal, private, and portable modes using various forensic tools. We define four browsing scenarios to perform a comprehensive evaluation of popular browsers -- Google Chrome, Mozilla Firefox, Brave, Tor, and Microsoft Edge -- by monitoring changes in key data storage areas such as cache files, cookies, browsing history, and local storage across different browsing mod
Browser extensions are additional tools developed by third parties that integrate with web browsers to extend their functionality beyond standard capabilities. However, the browser extension platform is increasingly being exploited by hackers to launch sophisticated cyber threats. These threats encompass a wide range of malicious activities, including but not limited to phishing, spying, Distributed Denial of Service (DDoS) attacks, email spamming, affiliate fraud, malvertising, and payment fraud. This paper examines the evolving threat landscape of malicious browser extensions in 2025, focusing on Mozilla Firefox and Chrome. Our research successfully bypassed security mechanisms of Firefox and Chrome, demonstrating that malicious extensions can still be developed, published, and executed within the Mozilla Add-ons Store and Chrome Web Store. These findings highlight the persisting weaknesses in browser's vetting process and security framework. It provides insights into the risks associated with browser extensions, helping users understand these threats while aiding the industry in developing controls and countermeasures to defend against such attacks. All experiments discussed in
Web client fingerprinting has become a widely used technique for uniquely identifying users, browsers, operating systems, and devices with high accuracy. While it is beneficial for applications such as fraud detection and personalized experiences, it also raises privacy concerns by enabling persistent tracking and detailed user profiling. This paper introduces an advanced fingerprinting method using WebAssembly (Wasm) - a low-level programming language that offers near-native execution speed in modern web browsers. With broad support across major browsers and growing adoption, WebAssembly provides a strong foundation for developing more effective fingerprinting methods. In this work, we present a new approach that leverages WebAssembly's computational capabilities to identify returning devices-such as smartphones, tablets, laptops, and desktops across different browsing sessions. Our method uses subtle differences in the WebAssembly JavaScript API implementation to distinguish between Chromium-based browsers like Google Chrome and Microsoft Edge, even when identifiers such as the User-Agent are completely spoofed, achieving a false-positive rate of less than 1%. The fingerprint is
Browser-using agents (BUAs) are an emerging class of AI agents that interact with web browsers in human-like ways, including clicking, scrolling, filling forms, and navigating across pages. While these agents help automate repetitive online tasks, they are vulnerable to prompt injection attacks that trick an agent into performing undesired actions, such as leaking private information or issuing unintended state-changing requests. We propose ceLLMate, a browser-level sandboxing framework that restricts the agent's ambient authority and reduces the blast radius of prompt injections. We address the semantic gap challenge that is fundamental to BUAs -- writing and enforcing security policies for low-level UI tools like clicks and keystrokes is brittle and error-prone. Our core insight is to perform sandboxing at the HTTP layer because all side-effecting UI operations will result in network communication to the website's backend. We implement ceLLMate as an agent-agnostic browser extension and demonstrate how it enables sandboxing policies that block prompt injection attacks in the WASP benchmark with 7.25--15% latency overhead.
The integration of Large Language Models (LLMs) into browser extensions has revolutionized web browsing, enabling sophisticated functionalities like content summarization, intelligent translation, and context-aware writing assistance. However, these AI-powered extensions introduce unprecedented challenges in testing and reliability assurance. Traditional browser extension testing approaches fail to address the non-deterministic behavior, context-sensitivity, and complex web environment integration inherent to LLM-powered extensions. Similarly, existing LLM testing methodologies operate in isolation from browser-specific contexts, creating a critical gap in effective evaluation frameworks. To bridge this gap, we present ASSURE, a modular automated testing framework specifically designed for AI-powered browser extensions. ASSURE comprises three principal components: (1) a modular test case generation engine that supports plugin-based extension of testing scenarios, (2) an automated execution framework that orchestrates the complex interactions between web content, extension processing, and AI model behavior, and (3) a configurable validation pipeline that systematically evaluates beh
A fundamental tension exists between the demand for sophisticated AI assistance in web search and the need for user data privacy. Current centralized models require users to transmit sensitive browsing data to external services, which limits user control. In this paper, we present a browser extension that provides a viable in-browser alternative. We introduce a hybrid architecture that functions entirely on the client side, combining two components: (1) an adaptive probabilistic model that learns a user's behavioral policy from direct feedback, and (2) a Small Language Model (SLM), running in the browser, which is grounded by the probabilistic model to generate context-aware suggestions. To evaluate this approach, we conducted a three-week longitudinal user study with 18 participants. Our results show that this privacy-preserving approach is highly effective at adapting to individual user behavior, leading to measurably improved search efficiency. This work demonstrates that sophisticated AI assistance is achievable without compromising user privacy or data control.
Browser agents enable autonomous web interaction but face critical reliability and security challenges in production. This paper presents findings from building and operating a production browser agent. The analysis examines where current approaches fail and what prevents safe autonomous operation. The fundamental insight: model capability does not limit agent performance; architectural decisions determine success or failure. Security analysis of real-world incidents reveals prompt injection attacks make general-purpose autonomous operation fundamentally unsafe. The paper argues against developing general browsing intelligence in favor of specialized tools with programmatic constraints, where safety boundaries are enforced through code instead of large language model (LLM) reasoning. Through hybrid context management combining accessibility tree snapshots with selective vision, comprehensive browser tooling matching human interaction capabilities, and intelligent prompt engineering, the agent achieved approximately 85% success rate on the WebGames benchmark across 53 diverse challenges (compared to approximately 50% reported for prior browser agents and 95.7% human baseline).
Web browsers have been used widely by users to conduct various online activities, such as information seeking or online shopping. To improve user experience and extend the functionality of browsers, practitioners provide mechanisms to allow users to install third-party-provided plugins (i.e., extensions) on their browsers. However, little is known about the performance implications caused by such extensions. In this paper, we conduct an empirical study to understand the impact of extensions on the user-perceived performance (i.e., energy consumption and page load time) of Google Chrome, the most popular browser. We study a total of 72 representative extensions from 11 categories (e.g., Developer Tools and Sports). We observe that browser performance can be negatively impacted by the use of extensions, even when the extensions are used in unintended circumstances (e.g., when logging into an extension is not granted but required, or when an extension is not used for designated websites). We also identify a set of factors that significantly influence the performance impact of extensions, such as code complexity and privacy practices (i.e., collection of user data) adopted by the exten
Digital Twin (DT) technology holds immense potential for surgical planning and personalized medicine. However, generating interactive, patient-specific anatomical twins currently relies on computationally heavy Server-Side Rendering (SSR) or expensive local workstations, creating significant barriers to deployment, especially in resource-constrained settings (RCS). This paper presents a decentralized, client-side WebGPU architecture that democratizes access to high-fidelity anatomical Digital Twins. By bypassing standard server-side rendering pipelines, the framework executes deterministic single-pass raymarching and morphological gradient calculations directly on low-cost integrated edge GPUs. Eliminating the network latency inherent to cloud-rendered solutions, the system achieves a Time to First Pixel (TTFP) of under 920.0ms and maintains stable interactivity at >= 82.0 FPS. Continuous Interaction Fidelity is maintained via uniform buffers, enabling zero-latency manipulation of tissue parameters for dynamic clinical decision-making. By proving that complex 3D medical simulations of patient-specific MRI scan can be executed natively in the browser without deep learning or exte
Securing browsers in mobile devices is very challenging, because these browser apps usually provide browsing services to other apps in the same device. A malicious app installed in a device can potentially obtain sensitive information through a browser app. In this paper, we identify four types of attacks in Android, collectively known as FileCross, that exploits the vulnerable file:// to obtain users' private files, such as cookies, bookmarks, and browsing histories. We design an automated system to dynamically test 115 browser apps collected from Google Play and find that 64 of them are vulnerable to the attacks. Among them are the popular Firefox, Baidu and Maxthon browsers, and the more application-specific ones, including UC Browser HD for tablet users, Wikipedia Browser, and Kids Safe Browser. A detailed analysis of these browsers further shows that 26 browsers (23%) expose their browsing interfaces unintentionally. In response to our reports, the developers concerned promptly patched their browsers by forbidding file:// access to private file zones, disabling JavaScript execution in file:// URLs, or even blocking external file:// URLs. We employ the same system to validate t
Mobile web browsing has recently surpassed desktop browsing both in term of popularity and traffic. Following its desktop counterpart, the mobile browsers ecosystem has been growing from few browsers (Chrome, Firefox, and Safari) to a plethora of browsers, each with unique characteristics (battery friendly, privacy preserving, lightweight, etc.). In this paper, we introduce a browser benchmarking pipeline for Android browsers encompassing automation, in-depth experimentation, and result analysis. We tested 15 Android browsers, using Cappuccino a novel testing suite we built for third party Android applications. We perform a battery-centric analysis of such browsers and show that: 1) popular browsers tend also to consume the most, 2) adblocking produces significant battery savings (between 20 and 40% depending on the browser), and 3) dark mode offers an extra 10% battery savings on AMOLED screens. We exploit this observation to build AttentionDim, a screen dimming mechanism driven by browser events. Via integration with the Brave browser and 10 volunteers, we show potential battery savings up to 30%, on both devices with AMOLED and LCD screens.
Browser fingerprinting can be used to identify and track users across the Web, even without cookies, by collecting attributes from users' devices to create unique "fingerprints". This technique and resulting privacy risks have been studied for over a decade. Yet further research is limited because prior studies used data not publicly available. Additionally, data in prior studies lacked user demographics. Here we provide a first-of-its-kind dataset to enable further research. It includes browser attributes with users' demographics and survey responses, collected with informed consent from 8,400 US study participants. We use this dataset to demonstrate how fingerprinting risks differ across demographic groups. For example, we find lower income users are more at risk, and find that as users' age increases, they are both more likely to be concerned about fingerprinting and at real risk of fingerprinting. Furthermore, we demonstrate an overlooked risk: user demographics, such as gender, age, income level and race, can be inferred from browser attributes commonly used for fingerprinting, and we identify which browser attributes most contribute to this risk. Our data collection process a
We present a simple yet potentially devastating and hard-to-detect threat, called Gummy Browsers, whereby the browser fingerprinting information can be collected and spoofed without the victim's awareness, thereby compromising the privacy and security of any application that uses browser fingerprinting. The idea is that attacker A first makes the user U connect to his website (or to a well-known site the attacker controls) and transparently collects the information from U that is used for fingerprinting purposes. Then, A orchestrates a browser on his own machine to replicate and transmit the same fingerprinting information when connecting to W, fooling W to think that U is the one requesting the service rather than A. This will allow the attacker to profile U and compromise U's privacy. We design and implement the Gummy Browsers attack using three orchestration methods based on script injection, browser settings and debugging tools, and script modification, that can successfully spoof a wide variety of fingerprinting features to mimic many different browsers (including mobile browsers and the Tor browser). We then evaluate the attack against two state-of-the-art browser fingerprint
Browser fingerprinting consists in collecting attributes from a web browser to build a browser fingerprint. In this work, we assess the adequacy of browser fingerprints as an authentication factor, on a dataset of 4,145,408 fingerprints composed of 216 attributes. It was collected throughout 6 months from a population of general browsers. We identify, formalize, and assess the properties for browser fingerprints to be usable and practical as an authentication factor. We notably evaluate their distinctiveness, their stability through time, their collection time, and their size in memory. We show that considering a large surface of 216 fingerprinting attributes leads to an unicity rate of 81% on a population of 1,989,365 browsers. Moreover, browser fingerprints are known to evolve, but we observe that between consecutive fingerprints, more than 90% of the attributes remain unchanged after nearly 6 months. Fingerprints are also affordable. On average, they weigh a dozen of kilobytes, and are collected in a few seconds. We conclude that browser fingerprints are a promising additional web authentication factor.