Vehicle computing represents a fundamental shift in how autonomous vehicles are designed and deployed, transforming them from isolated transportation systems into mobile computing platforms that support both safety-critical, real-time driving and data-centric services. In this setting, vehicles simultaneously support real-time driving pipelines and a growing set of data-driven applications, placing increased responsibility on the vehicle operating system to coordinate computation, data movement, storage, and access. These demands highlight recurring system considerations related to predictable execution, data and execution protection, efficient handling of high-rate sensor data, and long-term system evolvability, commonly summarized as Safety, Security, Efficiency, and Extensibility (SSEE). Existing vehicle operating systems and runtimes address these concerns in isolation, resulting in fragmented software stacks that limit coordination between autonomy workloads and vehicle data services. This paper presents DAVOS, the Dependable Autonomous Vehicle Operating System, a unified vehicle operating system architecture designed for the vehicle computing context. DAVOS provides a cohesiv
The growing complexity of embedded systems creates tension between rich functionality and strict resource and real-time constraints. Traditional monolithic operating system and hypervisor designs suffer from resource bloat and unpredictable scheduling, making them unsuitable for time-critical workloads where low latency and low jitter are essential. We propose TenonOS, a demand-driven, self-generating, lightweight operating system framework for time-critical embedded systems that rethinks both hypervisor and operating system architectures. TenonOS introduces a LibOS-on-LibOS model that decomposes hypervisor and operating system functionality into fine-grained, reusable micro-libraries. A generative orchestration engine dynamically composes these libraries to synthesize a customized runtime tailored to each application's criticality, timing requirements, and resource profile. TenonOS consists of two core components: Mortise, a minimalist micro-hypervisor, and Tenon, a real-time library operating system. Mortise provides lightweight isolation and removes the usual double-scheduler overhead in virtualized setups, while Tenon provides precise and deterministic task management. By gener
We introduce OSVBench, a new benchmark for evaluating Large Language Models (LLMs) on the task of generating complete formal specifications for verifying the functional correctness of operating system kernels. This benchmark is built upon a real-world operating system kernel, Hyperkernel, and consists of 245 complex specification generation tasks in total, each of which is a long-context task of about 20k-30k tokens. The benchmark formulates the specification generation task as a program synthesis problem confined to a domain for specifying states and transitions. This formulation is provided to LLMs through a programming model. The LLMs must be able to understand the programming model and verification assumptions before delineating the correct search space for syntax and semantics and generating formal specifications. Guided by the operating system's high-level functional description, the LLMs are asked to generate a specification that fully describes all correct states and transitions for a potentially buggy code implementation of the operating system. Experimental results with 12 state-of-the-art LLMs indicate limited performance of existing LLMs on the specification generation
The operating system (OS) is the backbone of modern computing, providing essential services and managing resources for computer hardware and software. This review paper offers an in-depth analysis of operating systems' evolution, current state, and prospects. We begin with an overview of the concept and significance of operating systems in the digital era. In the second section, we delve into the existing released operating systems, examining their architectures, functionalities, and the ecosystems they support. We then explore recent advances in OS evolution, highlighting innovations in real-time processing, distributed computing, and security. The third section focuses on the new era of operating systems, discussing emerging trends like the Internet of Things (IoT), cloud computing, and artificial intelligence (AI) integration. We also consider the challenges and opportunities presented by these developments. This review concludes with a synthesis of the current landscape and a forward-looking discussion on the future trajectories of operating systems, including open issues and areas ripe for further research and innovation. Finally, we put forward a new OS architecture.
Modern deep learning workloads often consist of many small tensor operations, especially in inference, attention, and micro-batched training. In these settings, kernel launch overhead can become a major bottleneck, sometimes exceeding the actual computation time. We present GPUOS, a GPU runtime JIT system that reduces launch overhead using a persistent kernel architecture with runtime operator injection. GPUOS runs a single long-lived GPU kernel that continuously processes tasks from a host-managed work queue, eliminating repeated kernel launches. To support diverse operations, GPUOS uses NVIDIA NVRTC to just-in-time compile operators at runtime and inject them into the running kernel through device function pointer tables. This design enables operator updates without restarting the kernel or recompiling the system. GPUOS introduces four key ideas: (1) a persistent worker kernel with atomic task queues, (2) a runtime operator injection mechanism based on NVRTC and relocatable device code, (3) a dual-slot aliasing scheme for safe concurrent operator updates, and (4) transparent PyTorch integration through TorchDispatch that batches micro-operations into unified submissions. The syst
Vulnerabilities in open-source operating systems (OSs) pose substantial security risks to software systems, making their detection crucial. While fuzzing has been an effective vulnerability detection technique in various domains, OS fuzzing (OSF) faces unique challenges due to OS complexity and multi-layered interaction, and has not been comprehensively reviewed. Therefore, this work systematically surveys the state-of-the-art OSF techniques, categorizes them based on the general fuzzing process, and investigates challenges specific to kernel, file system, driver, and hypervisor fuzzing. Finally, future research directions for OSF are discussed.
Real-time operating systems employ spatial and temporal isolation to guarantee predictability and schedulability of real-time systems on multi-core processors. Any unbounded and uncontrolled cross-core performance interference poses a significant threat to system time safety. However, the current Linux kernel has a number of interference issues and represents a primary source of interference. Unfortunately, existing research does not systematically and deeply explore the cross-core performance interference issue within the OS itself. This paper presents our industry practice for mitigating cross-core performance interference in Linux over the past 6 years. We have fixed dozens of interference issues in different Linux subsystems. Compared to the version without our improvements, our enhancements reduce the worst-case jitter by a factor of 8.7, resulting in a maximum 11.5x improvement over system schedulability. For the worst-case latency in the Core Flight System and the Robot Operating System 2, we achieve a 1.6x and 1.64x reduction over RT-Linux. Based on our development experience, we summarize the lessons we learned and offer our suggestions to system developers for systematica
Delivering hands-on practice laboratories for introductory courses on operating systems is a difficult task. One of the main sources of the difficulty is the sheer size and complexity of the operating systems software. Consequently, some of the solutions adopted in the literature to teach operating systems laboratory consider smaller and simpler systems, generally referred to as instructional operating systems. This work continues in the same direction and is threefold. First, it considers a simpler hardware platform. Second, it argues that a minimal operating system is a viable option for delivering laboratories. Third, it presents a laboratory teaching platform, whereby students build a minimal operating system for an embedded hardware platform. The proposed platform is called MiniOS. An important aspect of MiniOS is that it is sufficiently supported with additional technical and pedagogic material. Finally, the effectiveness of the proposed approach to teach operating systems laboratories is illustrated through the experience of using it to deliver laboratory projects in the Operating Systems course at the University of Northern British Columbia. Finally, we discuss experimental
The rise in computing hardware choices is driving a reevaluation of operating systems. The traditional role of an operating system controlling the execution of its own hardware is evolving toward a model whereby the controlling processor is distinct from the compute engines that are performing most of the computations. In this context, an operating system can be viewed as software that brokers and tracks the resources of the compute engines and is akin to a database management system. To explore the idea of using a database in an operating system role, this work defines key operating system functions in terms of rigorous mathematical semantics (associative array algebra) that are directly translatable into database operations. These operations possess a number of mathematical properties that are ideal for parallel operating systems by guaranteeing correctness over a wide range of parallel operations. The resulting operating system equations provide a mathematical specification for a Tabular Operating System Architecture (TabulaROSA) that can be implemented on any platform. Simulations of forking in TabularROSA are performed using an associative array implementation and compared to
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.
Operating systems are vital system software that, without them, humans would not be able to manage and use computer systems. In essence, an operating system is a collection of software programs whose role is to manage computer resources and provide an interface for client applications to interact with the different computer hardware. Most of the commercial operating systems available today on the market have buggy code and they exhibit security flaws and vulnerabilities. In effect, building a trusted operating system that can mostly resist attacks and provide a secure computing environment to protect the important assets of a computer is the goal of every operating system manufacturer. This paper deeply investigates the various security features of the two most widespread and successful operating systems, Microsoft Windows and Linux. The different security features, designs, and components of the two systems are to be covered elaborately, pin-pointing the key similarities and differences between them. In due course, a head-to-head comparison is to be drawn for each security aspect, exposing the advantage of one system over the other.
LLM-based intelligent agents face significant deployment challenges, particularly related to resource management. Allowing unrestricted access to LLM or tool resources can lead to inefficient or even potentially harmful resource allocation and utilization for agents. Furthermore, the absence of proper scheduling and resource management mechanisms in current agent designs hinders concurrent processing and limits overall system efficiency. To address these challenges, this paper proposes the architecture of AIOS (LLM-based AI Agent Operating System) under the context of managing LLM-based agents. It introduces a novel architecture for serving LLM-based agents by isolating resources and LLM-specific services from agent applications into an AIOS kernel. This AIOS kernel provides fundamental services (e.g., scheduling, context management, memory management, storage management, access control) for runtime agents. To enhance usability, AIOS also includes an AIOS SDK, a comprehensive suite of APIs designed for utilizing functionalities provided by the AIOS kernel. Experimental results demonstrate that using AIOS can achieve up to 2.1x faster execution for serving agents built by various ag
The end of Moore's Law has ushered in a diversity of hardware not seen in decades. Operating system (and system software) portability is accordingly becoming increasingly critical. Simultaneously, there has been tremendous progress in program synthesis. We set out to explore the feasibility of using modern program synthesis to generate the machine-dependent parts of an operating system. Our ultimate goal is to generate new ports automatically from descriptions of new machines. One of the issues involved is writing specifications, both for machine-dependent operating system functionality and for instruction set architectures. We designed two domain-specific languages: Alewife for machine-independent specifications of machine-dependent operating system functionality and Cassiopea for describing instruction set architecture semantics. Automated porting also requires an implementation. We developed a toolchain that, given an Alewife specification and a Cassiopea machine description, specializes the machine-independent specification to the target instruction set architecture and synthesizes an implementation in assembly language with a customized symbolic execution engine. Using this ap
The C programming language was developed in the 1970s as a fairly unconventional systems and operating systems development tool, but has, through the course of the ISO Standards process, added many attributes of more conventional programming languages and become less suitable for operating systems development. Operating system programming continues to be done in non-ISO dialects of C. The differences provide a glimpse of operating system requirements for programming languages.
This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of program execution amidst variable energy availability and maintaining reliable real-time time behavior during power disruptions. To address these challenges, CARTOS introduces a mixed-preemption scheduling model that classifies tasks into computational and peripheral tasks, and ensures their efficient and timely execution by adopting just-in-time checkpointing for divisible computation tasks and uninterrupted execution for indivisible peripheral tasks. CARTOS also supports processing chains of tasks with precedence constraints and adapts its scheduling in response to environmental changes to offer continuous execution under diverse conditions. CARTOS is implemented with new APIs and components added to FreeRTOS but is designed for portability to other embedded RTOSs. Through real hardware e
This research analyzed the performance and consistency of four synchronization mechanisms-reentrant locks, semaphores, synchronized methods, and synchronized blocks-across three operating systems: macOS, Windows, and Linux. Synchronization ensures that concurrent processes or threads access shared resources safely, and efficient synchronization is vital for maintaining system performance and reliability. The study aimed to identify the synchronization mechanism that balances efficiency, measured by execution time, and consistency, assessed by variance and standard deviation, across platforms. The initial hypothesis proposed that mutex-based mechanisms, specifically synchronized methods and blocks, would be the most efficient due to their simplicity. However, empirical results showed that reentrant locks had the lowest average execution time (14.67ms), making them the most efficient mechanism, but with the highest variability (standard deviation of 1.15). In contrast, synchronized methods, blocks, and semaphores exhibited higher average execution times (16.33ms for methods and 16.67ms for blocks) but with greater consistency (variance of 0.33). The findings indicated that while reen
The Internet of Things (IoT) is becoming an integral part of our modern lives as we converge towards a world surrounded by ubiquitous connectivity. The inherent complexity presented by the vast IoT ecosystem ends up in an insufficient understanding of individual system components and their interactions, leading to numerous security challenges. In order to create a secure IoT platform from the ground up, there is a need for a unifying operating system (OS) that can act as a cornerstone regulating the development of stable and secure solutions. In this paper, we present a classification of the security challenges stemming from the manifold aspects of IoT development. We also specify security requirements to direct the secure development of an unifying IoT OS to resolve many of those ensuing challenges. Survey of several modern IoT OSs confirm that while the developers of the OSs have taken many alternative approaches to implement security, we are far from engineering an adequately secure and unified architecture. More broadly, the study presented in this paper can help address the growing need for a secure and unified platform to base IoT development on and assure the safe, secure, a
Operating system is a bridge between system and user. An operating system (OS) is a software program that manages the hardware and software resources of a computer. The OS performs basic tasks, such as controlling and allocating memory, prioritizing the processing of instructions, controlling input and output devices, facilitating networking, and managing files. It is difficult to present a complete as well as deep account of operating systems developed till date. So, this paper tries to overview only a subset of the available operating systems and its different categories. OS are being developed by a large number of academic and commercial organizations for the last several decades. This paper, therefore, concentrates on the different categories of OS with special emphasis to those that had deep impact on the evolution process. The aim of this paper is to provide a brief timely commentary on the different categories important operating systems available today.
This note concerns a search for publications in which one can find statements that explain the concept of an operating system, reasons for introducing operating systems, a formalization of the concept of an operating system or theory about operating systems based on such a formalization. It reports on the way in which the search has been carried out and the outcome of the search. The outcome includes not only what the search was meant for, but also some added bonuses.
Tock began 10 years ago as a research operating system developed by academics to help other academics build urban sensing applications. By leveraging a new language (Rust) and new hardware protection mechanisms, Tock enabled Multiprogramming a 64 kB Computer Safely and Efficiently. Today, it is an open source project with a vibrant community of users and contributors. It is deployed on root of trust hardware in data center servers and on millions of laptops; it is used to develop automotive and space products, wearable electronics, and hardware security tokens--all while remaining a platform for operating systems research. This paper focuses on the impact of Tock's technical design on its adoption, the challenges and unexpected benefits of using a type safe language (Rust)--particularly in security sensitive settings--and the experience of supporting a production open4source operating system from academia.