bloom-filters
Browser friendliness?
I'd really like to use this module in browsers, but the bundle size ends up being very large: just importing CuckooFilter adds over 50KB of minified code to the bundle.
Using esbuild to bundle just this script:
import { CuckooFilter } from 'bloom-filters'
const filter = new CuckooFilter(1, 2, 3, 4)
console.info(filter.has('hello world'))
This creates a 216KB bundle (63KB minified), and I have to shim some Node internals (e.g. the Buffer class) for it to work.
The CuckooFilter implementation itself is 8.4KB un-minified so there's a lot of unused code here.
Would you be open to some PRs that make this module more browser friendly?
Some low hanging fruit I can see immediately:
- Switch tsc to output ESM for better tree shaking
- Replace use of node Buffers with built-in Uint8Arrays
- Remove, replace or optimise use of lodash
- Remove long dependency and use built-in BigInts
The first is a breaking change but will yield the biggest benefit - the breaking change is that consumers will no longer be able to require the module, they must import it via static or dynamic imports.
The second is breaking where Buffers are used as return values which from what I can see only appears to be in the Invertible Bloom Lookup Table implementation.
The rest are just internal changes so are non-breaking.
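To make the Buffer and long points a bit more concrete, here is a rough sketch of the kind of replacement I have in mind. The helper names (xorBytes, toHex, mul64) are made up for illustration and are not taken from the bloom-filters source; they only show that typed arrays and BigInt cover the usual byte and 64-bit operations without any shim.

```typescript
// Hypothetical helpers for illustration only; not part of bloom-filters.

/** XOR two equal-length byte arrays, using Uint8Array where a Buffer might be used today. */
function xorBytes(a: Uint8Array, b: Uint8Array): Uint8Array {
  if (a.length !== b.length) {
    throw new RangeError('inputs must have the same length')
  }
  const out = new Uint8Array(a.length)
  for (let i = 0; i < a.length; i++) {
    out[i] = a[i] ^ b[i]
  }
  return out
}

/** Hex-encode bytes, standing in for Buffer#toString('hex'). */
function toHex(bytes: Uint8Array): string {
  let hex = ''
  for (let i = 0; i < bytes.length; i++) {
    // Pad single-digit values so every byte yields two characters.
    hex += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16)
  }
  return hex
}

/** Unsigned 64-bit multiplication with wrap-around, using BigInt instead of a long-style class. */
function mul64(a: bigint, b: bigint): bigint {
  // BigInt.asUintN keeps only the low 64 bits of the product.
  return BigInt.asUintN(64, a * b)
}
```

Both Uint8Array and BigInt are built into browsers and Node, so neither would need a polyfill in the bundle.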
One spanner in the works here is that this module uses an older, incompatible version of JavaScript decorators so updating TypeScript is quite painful.
The decorators just add saveAsJSON/fromJSON methods to the filters. Is it necessary to use decorators for this? Could they just be regular methods instead?
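For what it's worth, here is a minimal sketch of how that could look with regular methods. TinyFilter and its JSON shape are hypothetical and far simpler than the real filters; only the saveAsJSON/fromJSON method names come from the module.

```typescript
// Hypothetical, simplified filter; the real classes in bloom-filters hold more state.
interface TinyFilterJSON {
  type: string
  size: number
  bits: number[]
}

class TinyFilter {
  readonly size: number
  private readonly bits: Uint8Array

  constructor(size: number) {
    this.size = size
    this.bits = new Uint8Array(size)
  }

  set(index: number): void {
    this.bits[index % this.size] = 1
  }

  get(index: number): boolean {
    return this.bits[index % this.size] === 1
  }

  /** Export the filter state as a plain, JSON-serialisable object. */
  saveAsJSON(): TinyFilterJSON {
    return { type: 'TinyFilter', size: this.size, bits: Array.from(this.bits) }
  }

  /** Rebuild a filter from the output of saveAsJSON(). */
  static fromJSON(json: TinyFilterJSON): TinyFilter {
    if (json.type !== 'TinyFilter') {
      throw new TypeError('cannot create a TinyFilter from this JSON export')
    }
    const filter = new TinyFilter(json.size)
    // Copy the exported bits into the new filter's backing array.
    filter.bits.set(json.bits)
    return filter
  }
}
```

Written this way, the serialisation code needs no decorator support from the compiler, so it should not stand in the way of a TypeScript upgrade.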
Hello 🖖 I'm working on a 4.0.0-alpha.0 available here https://github.com/Callidon/bloom-filters/tree/next/4.0.0. What's new in this version?
- All packages are updated to their latest versions. This broke a lot of things, which are now partially fixed; I still need to fix some tests.
- Use @node-rs/xxhash instead of xxhashjs. It is a very good implementation and gives a very good boost in performance. Plus, it uses WASM when bundled in a browser 👍
- I removed the lodash package in favour of importing only the specific ones needed, via lodash.xxxx.
- Switched the package to ESM. It is still written in TypeScript (.mts) but compiled to .mjs.
- I added examples for building with rspack and webpack.
- I added an example using pure mjs with Node.
- eslint is now very strict; it extends @typescript-eslint/strict-type-checked.
- Use jest instead of mocha; removed all describe blocks to be able to use beforeAll/beforeEach if necessary.
- The hashing library is only used in the Hashing class, in a static variable called lib which only exports xxh32 and xxh64, so the functions are available everywhere by calling this._hashing.lib.xxh[32/64] or Hashing.lib.xxh[32/64].
- The package manager is now yarn 2 (yarn@berry).
- The test suite is now a complete run of:
  - compile with tsc
  - run rspack bundle
  - run webpack bundle
  - run prettier
  - run eslint
  - run jest
- No more decorators; fromJSON and saveAsJSON are now plain functions in the different classes.
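To illustrate the Hashing point, here is a rough sketch of the static lib pattern. It is not the code from the branch: the two hash functions are FNV-1a stand-ins rather than xxHash so that the snippet runs without @node-rs/xxhash installed, and HashLib and indexFor are hypothetical names. The part that matters is that the 32-bit hash is a number while the 64-bit hash is a bigint, and the two cannot be mixed in arithmetic.

```typescript
// Assumed shape of the backend: a 32-bit hash as a number, a 64-bit hash as a bigint.
interface HashLib {
  xxh32(input: Uint8Array): number
  xxh64(input: Uint8Array): bigint
}

const FNV64_OFFSET = BigInt('14695981039346656037')
const FNV64_PRIME = BigInt('1099511628211')

// Stand-in for xxh32: 32-bit FNV-1a (NOT xxHash).
function fnv1a32(input: Uint8Array): number {
  let hash = 0x811c9dc5
  for (let i = 0; i < input.length; i++) {
    hash ^= input[i]
    hash = Math.imul(hash, 0x01000193)
  }
  return hash >>> 0 // force an unsigned 32-bit result
}

// Stand-in for xxh64: 64-bit FNV-1a (NOT xxHash), computed with BigInt.
function fnv1a64(input: Uint8Array): bigint {
  let hash = FNV64_OFFSET
  for (let i = 0; i < input.length; i++) {
    hash ^= BigInt(input[i])
    hash = BigInt.asUintN(64, hash * FNV64_PRIME)
  }
  return hash
}

class Hashing {
  // Single place where the hashing backend is wired in.
  static lib: HashLib = { xxh32: fnv1a32, xxh64: fnv1a64 }

  /** Map a 64-bit hash onto [0, size). Hypothetical helper. */
  static indexFor(input: Uint8Array, size: number): number {
    // Convert size to a bigint first: `bigint % number` throws a TypeError.
    return Number(Hashing.lib.xxh64(input) % BigInt(size))
  }
}
```

With this layout, swapping the backend only means changing what is assigned to Hashing.lib; callers keep going through Hashing.lib.xxh32 and Hashing.lib.xxh64.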
I would also like to remove the long and buffer dependencies, so if you want to help, you are welcome to create a PR from this new branch.
Let me get to a stable state before going on! I need to fix the tests. With the new xxhash package I introduced some real bugs dealing with bigints. Fixing them will prepare the work for removing the long package.
Update: buffer and long are not used anymore. The draft https://github.com/Callidon/bloom-filters/pull/71 is still in progress but is working as expected. Take a look; I will be happy to get feedback, especially on the usage of the WASM, which takes around 347KB 😬