simhash topic

List simhash repositories

stopwords

136 Stars, 25 Forks

Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics.
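As a rough illustration of what stop-word removal looks like, here is a minimal sketch in Python. It is not the stopwords repository's API: the `EN_STOP_WORDS` set and the `remove_stop_words` function are hypothetical names, and the word list is a tiny hand-picked sample rather than the curated statistics the project describes.

```python
import re

# Hypothetical, deliberately tiny stop-word list used only for illustration.
# A real list would be derived from word-frequency statistics per language.
EN_STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on"}


def remove_stop_words(text, stop_words=EN_STOP_WORDS):
    """Return the tokens of `text` that are not in `stop_words`.

    Tokens are runs of ASCII letters and apostrophes; the comparison is
    case-insensitive, but the original casing of kept tokens is preserved.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    return [t for t in tokens if t.lower() not in stop_words]


if __name__ == "__main__":
    print(remove_stop_words("The cat is on the mat"))
```

Filtering like this is commonly applied before fingerprinting a document, so that very frequent words do not dominate the result.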

nlp

434 Stars, 45 Forks

Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang

python-hashes

238 Stars, 43 Forks

Interesting (non-cryptographic) hashes implemented in pure Python.

simhash-java

152 Stars, 81 Forks

A simple implementation of the simhash algorithm in Java.

gosimhash

19 Stars, 5 Forks

A simhasher for Chinese documents implemented in Golang, a straightforward translation of yanyiwu/gosimhash.

simhash-js

38 Stars, 14 Forks

Simhash implementation in JavaScript

semantic-sh

23 Stars, 3 Forks

semantic-sh is a SimHash implementation that detects and groups similar texts by harnessing the power of word vectors and transformer-based language models (BERT).

simhash_similarity

24 Stars, 9 Forks

Text similarity using simhash
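For readers unfamiliar with the technique, simhash-based text similarity can be sketched as follows. This is a generic illustration, not code from any repository listed here. It assumes unweighted word tokens, a 64-bit fingerprint, and MD5 as the per-token hash, all of which are choices made for this example. The names `simhash` and `hamming_distance` are hypothetical.

```python
import hashlib
import re

FINGERPRINT_BITS = 64  # assumption: 64-bit fingerprints, a common choice


def simhash(text):
    """Compute a 64-bit simhash fingerprint of `text`.

    Each token is hashed to 64 bits. For every bit position, the token
    votes +1 if its hash has that bit set and -1 otherwise. The final
    fingerprint has a 1 wherever the vote total is positive.
    """
    votes = [0] * FINGERPRINT_BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")  # first 64 bits
        for i in range(FINGERPRINT_BITS):
            votes[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    a = simhash("the quick brown fox jumps over the lazy dog")
    b = simhash("the quick brown fox jumped over the lazy dog")
    print(hamming_distance(a, b))
```

A smaller Hamming distance between two fingerprints suggests more similar texts. The threshold for treating two documents as near-duplicates depends on the data and is left to the caller.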

spirit_fingers

18 Stars, 1 Fork

Elixir SimHash NIFs written in Rust

superminhash

19 Stars, 7 Forks

SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Simhash and SimhashIndex