A minimal perfect hash function (MPHF) maps a set S of n keys to the first n integers without collisions. There is a lower bound of n*log(e)=1.44n bits needed to represent an MPHF. This can be reached by a brute-force algorithm that tries e^n hash function seeds in expectation and stores the first seed leading to an MPHF. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables for minimal perfect hashing. ShockHash uses two hash functions h_0 and h_1, hoping for the existence of a function f : S->{0, 1} such that x -> h_{f(x)}(x) is an MPHF on S. It then uses a 1-bit retrieval data structure to store f using n + o(n) bits. In graph terminology, ShockHash generates n-edge random graphs until stumbling on a pseudoforest - where each component contains as many edges as nodes. Using cuckoo hashing, ShockHash then derives an MPHF from the pseudoforest in linear time. We show that ShockHash needs to try only about (e/2)^n=1.359^n seeds in expectation. This reduces the space for storing the seed by roughly n bits (maintaining
使用 AI 将内容摘要翻译为中文,便于快速阅读
使用 AI 分析这篇文章的核心发现、关键要点和深度见解
由 DeepSeek AI 提供分析 · 首次使用需配置 API Key