Found 20 results
This is a recollection of the UC Berkeley Postgres project, which was led by Mike Stonebraker from the mid-1980's to the mid-1990's. The article was solicited for Stonebraker's Turing Award book, as one of many personal/historical recollections. As a result it focuses on Stonebraker's design ideas and leadership. But Stonebraker was never a coder, and he stayed out of the way of his development team. The Postgres codebase was the work of a team of brilliant students and the occasional university "staff programmers" who had little more experience (and only slightly more compensation) than the students. I was lucky to join that team as a student during the latter years of the project. I got helpful input on this writeup from some of the more senior students on the project, but any errors or omissions are mine. If you spot any such, please contact me and I will try to fix them.
Replicating data over a cluster of workstations is a powerful tool to increase performance and provide fault tolerance for demanding database applications. The big challenge in such systems is to combine replica control (keeping the copies consistent) with concurrency control. Most of the research so far has focused on providing the traditional correctness criterion, serializability. However, more and more database systems, e.g., Oracle and PostgreSQL, use multi-version concurrency control, which provides the snapshot isolation level. In this paper, we present Postgres-R(SI), an extension of PostgreSQL offering transparent replication. Our replication tool is designed to work smoothly with PostgreSQL's concurrency control, providing snapshot isolation for the entire replicated system. We present a detailed description of the replica control algorithm and how it is combined with PostgreSQL's concurrency control component. Furthermore, we discuss some challenges we encountered when implementing the protocol. Our performance analysis based on the TPC-W benchmark shows that this approach exhibits excellent performance for real-life applications even if they are update intensive.
This research focuses on monitoring and transferring logs of operations performed on a relational database, specifically PostgreSQL, in real time using an event-driven approach. The logs generated from database operations are transferred using Apache Kafka, an open-source message queuing system, and Debezium running on Kafka, to Redis, a non-relational (NoSQL) key-value database. Time-consuming query operations and read operations are performed on Redis, which operates in memory, instead of on the primary database, PostgreSQL. This approach has significantly improved query execution performance, data processing time, and backend service performance. The study showcases the practical application of an event-driven approach using Debezium, Kafka, Redis, and relational databases for real-time data processing and querying.
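The change-data-capture flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain Python dict stands in for Redis, the events are hand-written rather than consumed from Kafka, and the function name `apply_change_event` and the `key_field` parameter are hypothetical. The only detail taken from Debezium is the general shape of its change-event envelope (`op`, `before`, `after`), simplified here to a flat dict.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change_event(cache: Dict[str, Row], event: Dict[str, Any],
                       key_field: str = "id") -> None:
    """Apply one simplified Debezium-style change event to a key-value cache.

    Assumption: `cache` is a dict standing in for Redis, and `key_field`
    names the primary-key column of the captured table.
    """
    op = event["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = event["after"]
        cache[str(row[key_field])] = row
    elif op == "d":  # delete: only the "before" image identifies the row
        row = event["before"]
        cache.pop(str(row[key_field]), None)
    else:
        # Other op codes are deliberately not handled in this sketch.
        raise ValueError(f"unhandled op code: {op!r}")


# Hand-written events in the order a consumer might receive them.
cache: Dict[str, Row] = {}
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "alice"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "bob"}},
    {"op": "u", "before": {"id": 1, "name": "alice"},
     "after": {"id": 1, "name": "alicia"}},
    {"op": "d", "before": {"id": 2, "name": "bob"}, "after": None},
]
for event in events:
    apply_change_event(cache, event)
print(cache)
```

Reads that would otherwise hit PostgreSQL can then be served from the cache by key, which is the read path the abstract attributes to Redis.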
Autonomous AI agents increasingly issue side-effect-bearing actions: database mutations, refunds, payments, external commitments. We propose the Actuarial Action Interface (AAI), a deterministic runtime contract that prices each such action against a contractually fixed safe default under a time-consistent risk mapping, and gates execution against a per-boundary reserve capital budget. We then develop the Authority Frontier, an evaluation primitive measuring how much autonomous authority the runtime releases at each level of reserve capital. The framework provides (i) a deterministic quote-bind-commit protocol with toll-bounded capability tokens; (ii) a universal seven-class action taxonomy mapping heterogeneous tool calls to comparable authority units; (iii) replay determinism and pathwise reserve coverage under alpha-spending; (iv) cross-domain normalization via full reserve demand C_full and capital metrics Capital@k. We instantiate AAI across four agentic environments (database mutation, customer-service refund, and the public tau-bench retail and airline tool-use traces) and report a live Postgres panel in which three Azure-hosted models propose actions through the same contra
Colony-forming unit (CFU) detection is critical in pharmaceutical manufacturing, serving as a key component of Environmental Monitoring programs and ensuring compliance with stringent quality standards. Manual counting is labor-intensive and error-prone, while deep learning (DL) approaches, though accurate, remain vulnerable to sample quality variations and artifacts. Building on our earlier CNN-based framework (Beznik et al., 2020), we evaluated YOLOv5, YOLOv7, and YOLOv8 for CFU detection; however, these achieved only 97.08 percent accuracy, insufficient for pharmaceutical-grade requirements. A custom Detectron2 model trained on GSK's dataset of over 50,000 Petri dish images achieved 99 percent detection rate with 2 percent false positives and 0.6 percent false negatives. Despite high validation accuracy, Detectron2 performance degrades on outlier cases including contaminated plates, plastic artifacts, or poor optical clarity. To address this, we developed a multi-agent framework combining DL with vision-language models (VLMs). The VLM agent first classifies plates as valid or invalid. For valid samples, both DL and VLM agents independently estimate colony counts. When prediction
Cardinality estimation is the problem of estimating the size of the output of a query, without actually evaluating the query. The cardinality estimator is a critical piece of a query optimizer, and is often the main culprit when the optimizer chooses a poor plan. This paper introduces LpBound, a pessimistic cardinality estimator for multijoin queries (acyclic or cyclic) with selection predicates and group-by clauses. LpBound computes a guaranteed upper bound on the size of the query output using simple statistics on the input relations, consisting of $\ell_p$-norms of degree sequences. The bound is the optimal solution of a linear program whose constraints encode data statistics and Shannon inequalities. We introduce two optimizations that exploit the structure of the query in order to speed up the estimation time and make LpBound practical. We experimentally evaluate LpBound against a range of traditional, pessimistic, and machine learning-based estimators on the JOB, STATS, and subgraph matching benchmarks. Our main finding is that LpBound can be orders of magnitude more accurate than traditional estimators used in mainstream open-source and commercial database systems. Yet it ha
Autotuning plays a pivotal role in optimizing the performance of systems, particularly in large-scale cloud deployments. One of the main challenges in performing autotuning in the cloud arises from performance variability. We first investigate the extent to which noise slows autotuning and find that as little as $5\%$ noise can lead to a $2.5$x slowdown in converging to the best-performing configuration. We measure the magnitude of noise in cloud computing settings and find that while some components (CPU, disk) have almost no performance variability, there are still sources of significant variability (caches, memory). Furthermore, variability leads to autotuning finding unstable configurations. As many as $63.3\%$ of the configurations selected as "best" during tuning can have their performance degrade by $30\%$ or more when deployed. Using this as motivation, we propose a novel approach to improve the efficiency of autotuning systems by (a) detecting and removing outlier configurations and (b) using ML-based approaches to provide a more stable true signal of de-noised experiment results to the optimizer. The resulting system, TUNA (Tuning Unstable and Noisy Cloud Applications) en
Query scheduling is a critical task that directly impacts query performance in database management systems (DBMS). Deeply integrated schedulers, which require changes to DBMS internals, are usually customized for a specific engine and can take months to implement. In contrast, non-intrusive schedulers make coarse-grained decisions, such as controlling query admission and re-ordering query execution, without requiring modifications to DBMS internals. They require much less engineering effort and can be applied across a wide range of DBMS engines, offering immediate benefits to end users. However, most existing non-intrusive scheduling systems rely on simplified cost models and heuristics that cannot accurately model query interactions under concurrency and different system states, possibly leading to suboptimal scheduling decisions. This work introduces IconqSched, a new, principled non-intrusive scheduler that optimizes the execution order and timing of queries to improve the total end-to-end runtime as experienced by the user: query queuing time plus system runtime. Unlike previous approaches, IconqSched features a novel fine-grained predictor, Iconq, which treats the DBMS as a black b
Query optimizers are crucial for the performance of database systems. Recently, many learned query optimizers (LQOs) have demonstrated significant performance improvements over traditional optimizers. However, most of them operate under a limited assumption: a static query environment. This limitation prevents them from effectively handling complex, dynamic query environments in real-world scenarios. Extensive retraining can lead to the well-known catastrophic forgetting problem, which reduces the LQO generalizability over time. In this paper, we address this limitation and introduce LIMAO (Lifelong Modular Learned Query Optimizer), a framework for lifelong learning of plan cost prediction that can be seamlessly integrated into existing LQOs. LIMAO leverages a modular lifelong learning technique, an attention-based neural network composition architecture, and an efficient training paradigm designed to retain prior knowledge while continuously adapting to new environments. We implement LIMAO in two LQOs, showing that our approach is agnostic to underlying engines. Experimental results show that LIMAO significantly enhances the performance of LQOs, achieving up to a 40% improvement i
Migrating systems from on-site premises to the cloud has been a fundamental endeavor for many industrial institutions. A crucial component of such cloud migrations is the transition of databases to be hosted online. In this work, we consider the difficulties of this migration for SQL databases. While SQL is one of the prominent methods for storing database procedures, there is a plethora of different SQL dialects (e.g., MySQL, Postgres, etc.), which can complicate migrations when the on-premise SQL dialect differs from the dialect hosted on the cloud. Tools from common cloud providers such as AWS and Azure exist to aid in translating between dialects in order to mitigate the majority of the difficulties. However, these tools do not successfully translate $100\%$ of the code. Consequently, software engineers must manually convert the remainder of the untranslated database. For large organizations, this task quickly becomes intractable and so more innovative solutions are required. We consider this challenge a novel yet vital industrial research problem for any large corporation that is considering cloud migrations. Furthermore, we introduce potential avenues of research to tackle this
A DBMS allows trading consistency for efficiency through the allocation of isolation levels that are strictly weaker than serializability. The robustness problem asks whether, for a given set of transactions and a given allocation of isolation levels, every possible interleaved execution of those transactions that is allowed under the provided allocation, is always safe. In the literature, safe is interpreted as conflict-serializable (to which we refer here as conflict-robustness). In this paper, we study the view-robustness problem, interpreting safe as view-serializable. View-serializability is a more permissive notion that allows for a greater number of schedules to be serializable and aligns more closely with the intuitive understanding of what it means for a database to be consistent. However, view-serializability is more complex to analyze (e.g., conflict-serializability can be decided in polynomial time whereas deciding view-serializability is NP-complete). While conflict-robustness implies view-robustness, the converse does not hold in general. In this paper, we provide a sufficient condition for isolation levels guaranteeing that conflict- and view-robustness coincide and
We introduce the General Index of Software Engineering Papers, a dataset of fulltext-indexed papers from the most prominent scientific venues in the field of Software Engineering. The dataset includes both complete bibliographic information and indexed n-grams (sequences of contiguous words after removal of stopwords and non-words, for a total of 577 276 382 unique n-grams in this release) with length 1 to 5 for 44 581 papers retrieved from 34 venues over the 1971-2020 period. The dataset serves use cases in the field of meta-research, making it possible to introspect the output of software engineering research even when access to papers or scholarly search engines is not possible (e.g., due to contractual reasons). The dataset also contributes to making such analyses reproducible and independently verifiable, as opposed to what happens when they are conducted using third-party and non-open scholarly indexing services. The dataset is available as a portable Postgres database dump and released as open data.
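The n-gram definition in this abstract (contiguous words after removal of stopwords and non-words, lengths 1 to 5) can be sketched as follows. This is an illustration under stated assumptions, not the dataset's actual pipeline: the stopword list is a tiny hypothetical one, "non-words" are approximated as anything that is not a run of ASCII letters, and the names `tokenize` and `ngrams` are made up for the example.

```python
import re
from typing import List, Tuple

# Assumption: a tiny illustrative stopword list, not the one used by the dataset.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "is", "as"})


def tokenize(text: str) -> List[str]:
    """Lowercase the text, keep alphabetic words only, and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def ngrams(tokens: List[str], max_n: int = 5) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of length 1..max_n, shortest first."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


tokens = tokenize("The dataset is a portable Postgres database dump")
grams = ngrams(tokens, max_n=2)
print(tokens)
print(grams)
```

Because stopwords are removed before the windows are formed, words that were separated only by stopwords in the original text become adjacent, which matches the abstract's wording "contiguous words after removal of stopwords".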
The goal of multi-objective query optimization (MOQO) is to find query plans that realize a good compromise between conflicting objectives such as minimizing execution time and minimizing monetary fees in a Cloud scenario. A previously proposed exhaustive MOQO algorithm needs hours to optimize even simple TPC-H queries. This is why we propose several approximation schemes for MOQO that generate guaranteed near-optimal plans in seconds where exhaustive optimization takes hours. We integrated all MOQO algorithms into the Postgres optimizer and present experimental results for TPC-H queries; we extended the Postgres cost model and optimize for up to nine conflicting objectives in our experiments. The proposed algorithms are based on a formal analysis of typical cost functions that occur in the context of MOQO. We identify properties that hold for a broad range of objectives and can be exploited for the design of future MOQO algorithms.
Join optimization has been dominated by Selinger-style, pairwise optimizers for decades. But Selinger-style algorithms are asymptotically suboptimal for applications in graph analytics. This suboptimality is one of the reasons that many have advocated supplementing relational engines with specialized graph processing engines. Recently, new join algorithms have been discovered that achieve optimal worst-case run times for any join or even so-called beyond worst-case (or instance optimal) run time guarantees for specialized classes of joins. These new algorithms match or improve on those used in specialized graph-processing systems. This paper asks: can these new join algorithms allow relational engines to close the performance gap with graph engines? We examine this question for graph-pattern queries or join queries. We find that classical relational databases like Postgres and MonetDB or newer graph databases/stores like Virtuoso and Neo4j may be orders of magnitude slower than a fully featured RDBMS, LogicBlox, that uses these new ideas. Our results demonstrate that an RDBMS with such new algorithms can perform as well as specialized engines like Gra
Decades of research have sought to improve transaction processing performance and scalability in database management systems (DBMSs). However, significantly less attention has been dedicated to the predictability of performance: how often individual transactions exhibit execution latency far from the mean? Performance predictability is vital when transaction processing lies on the critical path of a complex enterprise software or an interactive web service, as well as in emerging database-as-a-service markets where customers contract for guaranteed levels of performance. In this paper, we take several steps towards achieving more predictable database systems. First, we propose a profiling framework called VProfiler that, given the source code of a DBMS, is able to identify the dominant sources of variance in transaction latency. VProfiler automatically instruments the DBMS source code to deconstruct the overall variance of transaction latencies into variances and covariances of the execution time of individual functions, which in turn provide insight into the root causes of variance. Second, we use VProfiler to analyze MySQL and Postgres - two of the most popular and complex open-s
Encrypted database systems provide a great method for protecting sensitive data in untrusted infrastructures. These systems are built either using special-purpose cryptographic algorithms that support operations over encrypted data, or by leveraging trusted computing co-processors. Strong cryptographic algorithms (e.g., public-key encryption, garbled circuits) usually result in high performance overheads, while weaker algorithms (e.g., order-preserving encryption) result in large leakage profiles. On the other hand, some encrypted database systems (e.g., Cipherbase, TrustedDB) leverage non-standard trusted computing devices, and are designed to work around the architectural limitations of the specific devices used. In this work we build StealthDB - an encrypted database system from Intel SGX. Our system can run on any newer generation Intel CPU. StealthDB has a very small trusted computing base, scales to large transactional workloads, requires minor DBMS changes, and provides relatively strong security guarantees at steady state and during query execution. Our prototype on top of Postgres supports the full TPC-C benchmark with a 30% decrease in the average throughput over an un
UDO is a versatile tool for offline tuning of database systems for specific workloads. UDO can consider a variety of tuning choices, ranging from picking transaction code variants, through index selections, to database system parameter tuning. UDO uses reinforcement learning to converge to near-optimal configurations, creating and evaluating different configurations via actual query executions (instead of relying on simplifying cost models). To cater to different parameter types, UDO distinguishes heavy parameters (which are expensive to change, e.g. physical design parameters) from light parameters. Specifically for optimizing heavy parameters, UDO uses reinforcement learning algorithms that allow delaying the point at which the reward feedback becomes available. This gives us the freedom to optimize the point in time and the order in which different configurations are created and evaluated (by benchmarking a workload sample). UDO uses a cost-based planner to minimize reconfiguration overheads. For instance, it aims to amortize the creation of expensive data structures by consecutively evaluating configurations using them. We evaluate UDO on Postgres as well as MySQL and on TPC-H a
Astronomers may have witnessed one of the rarest and most dramatic cosmic events ever seen: a long-sought intermediate-mass black hole ripping apart a dense white dwarf star and devouring it. The Einstein Probe space telescope caught the explosion in its earliest moments, revealing an unusual sequence of intense X-ray flashes unlike anything seen i
A surprisingly simple fuel modification could help tackle one of diesel engines’ biggest problems: pollution. Researchers reviewing studies from around the world found that mixing small amounts of water into diesel fuel can dramatically reduce harmful emissions, including nitrogen oxides and soot, while maintaining or even improving engine efficien