We present SQL Query Engine, an open-source, self-hosted service that translates natural language questions into validated PostgreSQL queries through a two-stage LLM pipeline. The first stage performs automatic schema introspection and SQL generation; a multi-strategy response parser extracts SQL from any LLM output format (JSON, code blocks, or raw text) without requiring structured output APIs. The second stage executes the query against PostgreSQL and, upon failure or empty results, enters an iterative self-healing loop in which the LLM diagnoses the error using full SQLSTATE codes and PostgreSQL diagnostic messages. Two mechanisms prevent regressions: early-accept returns successful queries immediately without LLM re-evaluation, and best-result tracking preserves the best partial result across retries. Schema context is cached per session in Redis, progress events stream via Redis Pub/Sub and SSE, and an OpenAI-compatible /v1/chat/completions endpoint lets existing tools work without modification. All database connections are read-only at the driver level. We evaluate across five LLM backends on a synthetic benchmark (75 questions, three databases) where the self-healing loop y
Filtered Vector Search (FVS) is critical for supporting semantic search and GenAI applications in modern database systems. However, existing research most often evaluates algorithms in specialized libraries, making optimistic assumptions that do not align with enterprise-grade database systems. Our work challenges this premise by demonstrating that in a production-grade database system, commonly made assumptions do not hold, leading to performance characteristics and algorithmic trade-offs that are fundamentally different from those observed in isolated library settings. This paper presents the first in-depth analysis of filter-agnostic FVS algorithms within a production PostgreSQL-compatible system. We systematically evaluate post-filtering and inline-filtering strategies across a wide range of selectivities and correlations. Our central finding is that the optimal algorithm is not dictated by the cost of distance computations alone, but that system-level overheads that come from both distance computations and filter operations (like page accesses and data retrieval) play a significant role. We demonstrate that graph-based approaches (such as NaviX/ACORN) can incur prohibitive num
Documentation has long guided computer system tuning by distilling expert knowledge into per-parameter recommendations. Yet such guides capture only what experts conclude, discarding how they reason. This fundamental gap manifests in three concrete deficiencies: documentation grows stale as software evolves, fails under heterogeneous workloads, and ignores inter-parameter dependencies. We propose shifting from static documentation to dynamic action for system tuning. We introduce PerfEvolve, which translates expert tuning methodologies into executable skills that equip LLM-based agents to perform version-consistency verification, workload-specific profiling, and multi-parameter joint optimization. Evaluated on PostgreSQL under TPC-C and TPC-H benchmarks, PerfEvolve outperforms state-of-the-art documentation-driven tuning baselines by up to 35.2%. The tool is available at https://github.com/ISCAS-OSLab/PerfEvolve.
The NVMeVirt paper analyzes the implication of storage performance on database engine performance to promote the tunable performance of NVMeVirt. They perform analysis on two very popular database engines, MariaDB and PostgreSQL. The result shows that MariaDB is more efficient when the storage is slow, but PostgreSQL outperforms MariaDB as I/O bandwidth increases. Although this verifies that NVMeVirt can support advanced storage bandwidth configurations, the paper does not provide a clear explanation of why two database engines react very differently to the storage performance. To understand why the above two database engines have different performance characteristics, we conduct a study of the database engine's internals. We focus on three major differences in Multi-version concurrency control (MVCC) implementations: version storage, garbage collection, and index management. We also evaluated each scheme's I/O overhead using OLTP workload. Our analysis identifies the reason why MariaDB outperforms PostgreSQL when the bandwidth is low.
Query optimizers are essential components of relational database management systems that directly impact query performance as they transform input queries into efficient execution plans. While users can obtain the final execution plan using the EXPLAIN command and leverage existing visualization tools for intuitive understanding, the internal decision-making processes of query optimizers are hidden from users, making it difficult to understand how the plan is constructed. To address this challenge, we present Jovis, an interactive visualization tool designed to explore the query optimization process in PostgreSQL. Jovis provides a comprehensive view of the entire optimization workflow through tailored visualization for each optimization strategy. It also includes features that allow users to participate in optimization by providing hints, tuning parameters, and reusing prior optimization results. Jovis serves as both an educational tool for learners and a practical resource for database professionals, helping users understand and improve query optimization by guiding the optimizer to make better decisions or consider previously unexplored plans. The source code, data, and/or other
PostgreSQL is an object-relational database (ORDBMS) that was introduced into the database community and has been avidly used for a variety of information extraction use cases. It is also known to be an advanced SQL-compliant open source Object RDBMS. However, users have not yet resolved to PostgreSQL due to the fact that it is still under the layers and the complexity of its persistent textual environment for an amateur user. Hence, there is a dire need to provide an easy environment for users to comprehend the procedure and standards with which databases are created, tables and the relationships among them, manipulating queries and their flow based on conditions in PostgreSQL. As such, this project identifies the dominant features offered by Postgresql, analyzes the constraints that exist in the database user community in migrating to PostgreSQL and based on the scope and constraints identified, develop a system that will serve as a query generation platform as well as a learning tool that will provide an interactive environment to cognitively learn PostgreSQL query building. This is achieved using a visual editor incorporating a textual editor for a well-versed user. By providin
In January 2026, torrential rains killed 200-300 people across Southern Africa, exposing a critical reality: 60% of the continent lacks effective early warning systems due to infrastructure costs. Traditional radar stations exceed USD 1 million each, leaving Africa with an 18x coverage deficit compared to the US and EU. We present a production-grade architecture for deploying NVIDIA Earth-2 AI weather models at USD 1,430-1,730/month for national-scale deployment - enabling coverage at 2,000-4,545x lower cost than radar. The system generates 15-day global atmospheric forecasts, cached in PostgreSQL to enable user queries under 200 milliseconds without real-time inference. Deployed in South Africa in February 2026, our system demonstrates three technical contributions: (1) a ProcessPoolExecutor-based event loop isolation pattern that resolves aiobotocore session lifecycle conflicts in async Python applications; (2) a database-backed serving architecture where the GPU writes global forecasts directly to PostgreSQL, eliminating HTTP transfer bottlenecks for high-resolution tensors; and (3) an automated coordinate management pattern for multi-step inference across 61 timesteps. Forecast
As autonomous LLM agents increasingly hold real credentials and operate infrastructure without a human in the loop, operators have no standard way to tell an agent that a resource is off-limits. Access controls either let the agent in (it has valid credentials) or hard-fail it (indistinguishable from any other client). We propose a third mode: a lightweight, published in-band deny signal -- the Recuse Signal -- that a server emits over a protocol's existing channels (an SSH banner, a PostgreSQL NOTICE) asking a connecting automated agent to voluntarily withdraw. This is a cooperative governance control, the robots.txt analogue for live access; it is explicitly not a security boundary. Its value is entirely empirical and, to our knowledge, unmeasured: do compliant LLM agents actually honor such a signal? We define the signal as an open mini-standard, implement three zero- or low-footprint adapters (an SSH banner/PAM hook, a PostgreSQL wire-protocol proxy, and a Kubernetes admission webhook), deploy them on a live production host, and run a controlled experiment in which fresh agents are given a benign operations task and observed for recusal. In a pilot (SSH; OpenAI GPT-4o and GPT-4
Cardinality estimators are built to support arbitrary schemas and workloads, forcing them to rely on generic statistics even when the schema and workload is known in advance, leaving optimizers prone to large errors and poor plans. We present Bespoke-Card, an agent-driven system that synthesizes workload-specific cardinality estimators as executable code: a planning agent designs the estimators strategies, a coding agent implements them, and a validator scores the estimates against true cardinalities and PostgreSQL estimates, forming a robust and deterministic harness. Going beyond naive prompting, Bespoke-Card uses structured q-error feedback, regression analysis, concrete outlier subplans, a curriculum isolating join-only, filter-only, and full-subplan errors, and archival selection of the best implementation. Injecting its estimates into the optimizer cuts total PostgreSQL runtime on JOB by 33% and reduces median q-error over all JOB subplans by 41%, while synthesizing a strong estimator in under one hour for less than $10. Bespoke-Card is opening a new avenue for cardinality estimation next to classical generic estimators and learned estimator architectures.
Temporal range filtering is critical in large-scale search systems, particularly location-based services filtering businesses by operating hours. Traditional approaches suffer from poor query performance (scope filtering), index size explosion (minute-level indexing), or reduced precision (coarse-grained indexing). PostgreSQL TSRANGE with GiST indexing offers exact semantics but imposes P50 latencies of 15-224 ms at 100K-1M scale, prohibitive for interactive search, and cannot embed within inverted index pipelines. We present Timehash, a hierarchical time indexing algorithm achieving over 97% reduction in index size versus minute-level indexing while maintaining 100% precision. Timehash uses a flexible multi-resolution strategy that integrates seamlessly into inverted index infrastructure. Through analysis of 12.6 million records from a production location search service deployed for 18 months, we demonstrate a domain-informed hierarchy-selection methodology via boundary-distribution analysis, with cross-dataset validation on the Yelp Open Dataset (127K US/CA businesses), where the same 5-level hierarchy reduces total terms to 0.77% of the 1-minute baseline (vs. 2.17% on the produc
This paper analyzes execution instability in traditional cost-based database management systems (DBMS) and identifies a structural timing misalignment between optimization and execution stages that contributes to tail-latency amplification. Beyond estimation accuracy and raw execution throughput, we argue that decision timing and the availability of runtime signals materially affect robustness under uncertainty. In conventional DBMS architectures, the optimizer relies on historical statistics, the executor observes runtime data distributions and resource states, and accelerators impose up-front transfer costs and amortization constraints. This temporal asynchrony can lead to rigid early-bound decisions that fail under input-scale shifts or stale statistics. We propose a cross-layer decision timing orchestration framework that shifts final decision authority from the compile-time optimizer to the runtime executor via selective late binding of operator-level choices. A Unified Risk Signal (URS) integrates optimizer uncertainty, execution-time observations, and accelerator cost signals without collapsing them into a single static cost model. Experiments on a modified PostgreSQL protot
Data is critical for the operation of any organization and needs to be protected, especially against attacks that compromise the state of the database. In this paper, we explore an approach based on Byzantine-fault tolerant replicated state machines, built on top of a deterministic extension of PostgreSQL. Each replica deterministically executes transactions recorded in a shared log/blockchain. Our focus is on creating a practical system that is designed for efficient and quick detection of corruption, as well as quick repair concurrent with execution of transactions. We also present a performance study showing the efficiency and practicality of our approach. We believe our work lays the foundations for the practical use of the BFT replicated state machine approach in the context of databases.
Enterprise AI agents are useful for internal analysis, audit, compliance review, and operational investigation, but they create a difficult authorization problem. A manager or data owner may approve a business task, while the agent later generates open-ended SQL below the application layer. Existing systems help identify agents, delegate authority, govern data products, or enforce database policy, but they do not directly turn an approved enterprise task into a bounded database execution context. SessionBound fills this gap. It turns approved enterprise tasks into short-lived, budgeted, and auditable database sessions for AI agents. A control plane defines task templates, accepts task applications, records approvals, assigns budgets, and issues signed task tokens. A database runtime, SessionBoundDB, binds a token to a session and enforces safe views, row scope, denied fields, operation limits, query budgets, disclosure budgets, and receipts. The database does not rely on an LLM to decide whether a query is safe. The agent may generate SQL freely, but each attempt must stay inside the approved boundary. A PostgreSQL prototype passed a 24-scenario validation suite. Microbenchmarks sh
DBOS (DataBase Operating System) is a novel capability that integrates web services, operating system functions, and database features to significantly reduce web-deployment effort while increasing resilience. Integration of high performance network sensing enables DBOS web services to collaboratively create a shared awareness of their network environments to enhance their collective resilience and security. Network sensing is added to DBOS using GraphBLAS hypersparse traffic matrices via two approaches: (1) Python-GraphBLAS and (2) OneSparse PostgreSQL. These capabilities are demonstrated using the workflow and analytics from the IEEE/MIT/Amazon Anonymized Network Sensing Graph Challenge. The system was parallelized using pPython and benchmarked using 64 compute nodes on the MIT SuperCloud. The web request rate sustained by a single DBOS instance was ${>}10^5$, well above the required maximum, indicating that network sensing can be added to DBOS with negligible overhead. For collaborative awareness, many DBOS instances were connected to a single DBOS aggregator. The Python-GraphBLAS and OneSparse PostgreSQL implementations scaled linearly up to 64 and 32 nodes respectively. The
Accurate query runtime prediction is a critical component of effective query optimization in modern database systems. Traditional cost models, such as those used in PostgreSQL, rely on static heuristics that often fail to reflect actual query performance under complex and evolving workloads. This remains an active area of research, with recent work exploring machine learning techniques to replace or augment traditional cost estimators. In this paper, we present a machine learning-based framework for predicting SQL query runtimes using execution plan features extracted from PostgreSQL. Our approach integrates scalar and structural features from execution plans and semantic representations of SQL queries to train predictive models. We construct an automated pipeline for data collection and feature extraction using parameterized TPC-H queries, enabling systematic evaluation of multiple modeling techniques. Unlike prior efforts that focus either on cardinality estimation or on synthetic cost metrics, we model the actual runtimes using fine-grained plan statistics and query embeddings derived from execution traces, to improve the model accuracy. We compare baseline regressors, a refined
For applications that store structured data in relational databases, there is an impedance mismatch between the flat representations encouraged by relational data models and the deeply nested information that applications expect to receive. In this work, we present the graph-relational database model, which provides a flexible, compositional, and strongly-typed solution to this "object-relational mismatch." We formally define the graph-relational database model and present a static and dynamic semantics for queries. In addition, we discuss the realization of the graph-relational database model in EdgeQL, a general-purpose SQL-style query language, and the Gel system, which compiles EdgeQL schemas and queries into PostgreSQL queries. Gel facilitates the kind of object-shaped data manipulation that is frequently provided inefficiently by object-relational mapping (ORM) technologies, while achieving most of the efficiency that comes from writing complex PostgreSQL queries directly.
The increasing demand for deep neural inference within database environments has driven the emergence of AI-native DBMSs. However, existing solutions either rely on model-centric designs requiring developers to manually select, configure, and maintain models, resulting in high development overhead, or adopt task-centric AutoML approaches with high computational costs and poor DBMS integration. We present MorphingDB, a task-centric AI-native DBMS that automates model storage, selection, and inference within PostgreSQL. To enable flexible, I/O-efficient storage of deep learning models, we first introduce specialized schemas and multi-dimensional tensor data types to support BLOB-based all-in-one and decoupled model storage. Then we design a transfer learning framework for model selection in two phases, which builds a transferability subspace via offline embedding of historical tasks and employs online projection through feature-aware mapping for real-time tasks. To further optimize inference throughput, we propose pre-embedding with vectoring sharing to eliminate redundant computations and DAG-based batch pipelines with cost-aware scheduling to minimize the inference time. Implemente
Efficient data processing is increasingly vital, with query optimizers playing a fundamental role in translating SQL queries into optimal execution plans. Traditional cost-based optimizers, however, often generate suboptimal plans due to flawed heuristics and inaccurate cost models, leading to the emergence of Learned Query Optimizers (LQOs). To address challenges in existing LQOs, such as the inconsistency and suboptimality inherent in pairwise ranking methods, we introduce CARPO, a generic framework leveraging listwise learning-to-rank for context-aware query plan optimization. CARPO distinctively employs a Transformer-based model for holistic evaluation of candidate plan sets and integrates a robust hybrid decision mechanism, featuring Out-Of-Distribution (OOD) detection with a top-k fallback strategy to ensure reliability. Furthermore, CARPO can be seamlessly integrated with existing plan embedding techniques, demonstrating strong adaptability. Comprehensive experiments on TPC-H and STATS benchmarks demonstrate that CARPO significantly outperforms both native PostgreSQL and Lero, achieving a Top-1 Rate of 74.54% on the TPC-H benchmark compared to Lero's 3.63%, and reducing the
The performance of modern DBMSs such as MySQL and PostgreSQL heavily depends on the configuration of performance-critical knobs. Manual tuning these knobs is laborious and inefficient due to the complex and high-dimensional nature of the configuration space. Among the automated tuning methods, reinforcement learning (RL)-based methods have recently sought to improve the DBMS knobs tuning process from several different perspectives. However, they still encounter challenges with slow convergence speed during offline training. In this paper, we mainly focus on how to leverage the valuable tuning hints contained in various textual documents such as DBMS manuals and web forums to improve the offline training of RL-based methods. To this end, we propose an efficient DBMS knobs tuning framework named DemoTuner via a novel LLM-assisted demonstration reinforcement learning method. Specifically, to comprehensively and accurately mine tuning hints from documents, we design a structured chain of thought prompt to employ LLMs to conduct a condition-aware tuning hints extraction task. To effectively integrate the mined tuning hints into RL agent training, we propose a hint-aware demonstration re
This paper presents a novel AI-powered framework designed to streamline database management and query optimization for PostgreSQL systems. Structured in three phases: Natural Language to SQL Translation, Query Execution and Analysis, and Insight Generation, the approach empowers users with intuitive, intelligent interaction with databases. By leveraging advanced natural language processing and language models, the system enables non-technical users to extract meaningful insights from complex datasets, reducing the dependency on specialized DBAs. The framework also introduces iterative refinement of queries, automatic report generation, and support for temporal data forecasting. Experimental results demonstrate improved productivity, reduced query latency, and enhanced accuracy, validating the system's effectiveness across diverse business use cases. The solution was developed and evaluated at ABV-IIITM Gwalior, under the guidance of Prof. Dr. Arun Kuma