dsds
Traditional NLP
- Stemming. Snowball stemming, English only.
- Count Vectorizer, mostly done
- TFIDF Vectorizer, mostly done
- Text-related metrics. cosine_similarity and Levenshtein distance are done. Others are still open, e.g. Jaccard similarity for sets of text.
- Naive Bayes Classification.
- Latent Semantic Analysis. Needs more ndarray + nalgebra integration in Rust.
- Other generic text cleaning/processing.
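
As a rough illustration of the text metrics listed above, here is a minimal Rust sketch of Jaccard similarity over token sets, next to a plain Levenshtein distance. This is not the project's implementation: the function names `jaccard_similarity` and `levenshtein`, the whitespace tokenization, the lowercasing, and the convention that two empty texts have similarity 1.0 are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Jaccard similarity |A ∩ B| / |A ∪ B| between the sets of lowercased,
/// whitespace-separated tokens of two texts (tokenization is an assumption).
fn jaccard_similarity(a: &str, b: &str) -> f64 {
    let a_lower = a.to_lowercase();
    let b_lower = b.to_lowercase();
    let set_a: HashSet<&str> = a_lower.split_whitespace().collect();
    let set_b: HashSet<&str> = b_lower.split_whitespace().collect();

    // Assumed convention: two empty token sets count as identical.
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0;
    }

    let intersection = set_a.intersection(&set_b).count();
    let union = set_a.len() + set_b.len() - intersection;
    intersection as f64 / union as f64
}

/// Levenshtein (edit) distance over Unicode scalar values, using the
/// row-by-row dynamic programming formulation.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] = distance between the first i chars of `a` and first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        curr.push(i + 1); // deleting i + 1 chars of `a` to reach the empty prefix of `b`
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let substitute = prev[j] + cost;
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

For example, `jaccard_similarity("the quick brown fox", "the lazy brown dog")` shares 2 of 6 distinct tokens, and `levenshtein("kitten", "sitting")` needs 3 edits. Working on token sets keeps Jaccard independent of word order and repetition, which is what separates it from cosine similarity on count vectors.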