anonlink
Investigate effect of not including start/end bigrams
When creating the bigrams, each word is padded with a whitespace character at both ends, so the first and last bigrams each contain a space.
This is a weakness because it makes it easier for an attacker to find the beginning and the end of a word. Intuitively, the padding helps with matching, so we should investigate whether dropping it decreases matching accuracy.
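
To make the question concrete, here is a minimal sketch of bigram generation with and without padding, compared using a Dice coefficient on the raw bigram sets. The function names `bigrams` and `dice` and the `pad` parameter are hypothetical and written for this issue; this is not the project's actual tokenizer, and it compares plain bigram sets rather than Bloom filters.

```python
def bigrams(word: str, pad: bool = True) -> list[str]:
    """Split a word into character bigrams.

    With pad=True a single space is added to both ends of the word,
    so the first and last bigrams mark the word boundaries.
    (Hypothetical helper for illustration only.)
    """
    text = f" {word} " if pad else word
    return [text[i:i + 2] for i in range(len(text) - 1)]


def dice(a: list[str], b: list[str]) -> float:
    """Dice coefficient of two bigram lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # assumption: two empty sets count as no similarity
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


if __name__ == "__main__":
    for pad in (True, False):
        left = bigrams("smith", pad=pad)
        right = bigrams("smyth", pad=pad)
        print(pad, left, right, round(dice(left, right), 3))
```

For this one pair, the padded bigrams score about 0.667 and the unpadded ones 0.5, which matches the intuition that the boundary bigrams help similar words match. A single example says nothing about overall matching accuracy, so the investigation would still need to run on real linkage data.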
Aha! Link: https://csiro.aha.io/features/ANONLINK-72
From Cryptanalysis of Basic Bloom Filters paper:
