
Traditional NLP

Open abstractqqq opened this issue 2 years ago • 0 comments

  1. Stemming: Snowball stemming, English only.
  2. Count Vectorizer: mostly done.
  3. TF-IDF Vectorizer: mostly done.
  4. Text-related metrics: cosine_similarity and Levenshtein distance are done. Are others worth adding, such as Jaccard similarity for sets of text?
  5. Naive Bayes classification.
  6. Latent Semantic Analysis. Needs more ndarray + nalgebra integration in Rust.
  7. Other generic text cleaning/processing.
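
For the Jaccard similarity mentioned in item 4, here is a minimal plain-Python sketch of what the metric computes on sets of tokens. The function name `jaccard_similarity`, the lowercasing, and the whitespace tokenization are assumptions made for illustration; none of this is the project's actual API.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two strings.

    Tokens are lowercased, whitespace-separated words (an assumption
    made for this sketch; a real implementation may tokenize differently).
    """
    set_a = set(a.lower().split())
    set_b = set(b.lower().split())
    # Assumed convention: two empty token sets count as identical.
    if not set_a and not set_b:
        return 1.0
    # |A ∩ B| / |A ∪ B|
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Shares "the" and "cat" out of four distinct tokens overall.
    print(jaccard_similarity("the cat sat", "the cat ran"))
```

The empty-input convention is a design choice: returning 1.0 avoids a division by zero, but returning 0.0 or a null value would be equally defensible depending on how downstream code treats missing text.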

abstractqqq, Aug 07 '23 19:08