LeMonHash
LeMonHash copied to clipboard
Learned Monotone Minimal Perfect Hashing
Learned Monotone Minimal Perfect Hashing

LeMonHash is a Monotone Minimal Perfect Hash function that combines the PGM-Index for space-efficient ranking and BuRR for low-overhead retrieval.
A variant for variable-length strings provides significantly faster queries than competitors.
Requirements
- GCC 11 or later
- libxxhash v0.8.0 or later
Usage
Clone the repository (as a submodule) and add the following to your CMakeLists.txt
.
add_subdirectory(path/to/LeMonHash)
target_link_libraries(YourTarget PRIVATE LeMonHash)
Then you can use the straight-forward interface of LeMonHash:
std::vector<uint64_t> inputData {0, 1, 7, 15, 23, 42, 250};
lemonhash::LeMonHash<> hashFunc(inputData);
for (uint64_t x : inputData) {
std::cout << x << ": \t" << hashFunc(x) << std::endl;
}
License
This code is licensed under the GPLv3. If you use the project in an academic context or publication, please cite our paper:
@inproceedings{DBLP:conf/esa/FerraginaL0V23,
author = {Paolo Ferragina and
Hans{-}Peter Lehmann and
Peter Sanders and
Giorgio Vinciguerra},
title = {Learned Monotone Minimal Perfect Hashing},
booktitle = {{ESA}},
series = {LIPIcs},
volume = {274},
pages = {46:1--46:17},
publisher = {Schloss Dagstuhl - Leibniz-Zentrum f{\"{u}}r Informatik},
year = {2023},
doi = {10.4230/LIPICS.ESA.2023.46}
}
The code of the experiments comparing LeMonHash to competitors from the literature is available here.