Add hazm

Open ayub-kokabi opened this issue 2 years ago • 0 comments

What is this Python project?

Hazm is a Python library for natural language processing of Persian text. You can use it to normalize text, tokenize sentences and words, lemmatize words, assign part-of-speech tags, identify dependency relations, create word and sentence embeddings, and read popular Persian corpora.

Features:

  • Normalization: Converts text to a standard form, for example by removing diacritics and correcting spacing.
  • Tokenization: Splits text into sentences and words.
  • Lemmatization: Reduces words to their base forms.
  • POS tagging: Assigns a part of speech to each word.
  • Dependency parsing: Identifies the syntactic relations between words.
  • Embedding: Creates vector representations of words and sentences.
  • Persian corpora reading: Easily read popular Persian corpora with ready-made scripts and minimal code.
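To make the first two features concrete, here is a minimal sketch of what normalization and tokenization mean for Persian text. It uses only the Python standard library and is not Hazm's implementation or API: the function names `normalize`, `sent_tokenize`, and `word_tokenize`, the character mapping, and the punctuation sets are all assumptions chosen for illustration, and a real library handles many more cases.

```python
import re
import unicodedata

# Assumption for illustration: map two Arabic letters that commonly
# appear in Persian text to their Persian forms.
ARABIC_TO_PERSIAN = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9"})


def normalize(text):
    """Unify letter forms, drop diacritics, and collapse whitespace."""
    text = text.translate(ARABIC_TO_PERSIAN)
    # Combining marks (Unicode category Mn) include Arabic-script diacritics.
    # The zero-width non-joiner (U+200C) is category Cf, so it is preserved.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # Collapse any run of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()


def sent_tokenize(text):
    """Split on whitespace that follows ., !, ?, or the Arabic question mark."""
    parts = re.split(r"(?<=[.!?\u061F])\s+", text)
    return [p for p in parts if p]


def word_tokenize(sentence):
    """Return words and punctuation marks as separate tokens."""
    # Punctuation set: . ! ? plus Arabic question mark and Arabic comma.
    return re.findall(r"[^\s.!?\u061F\u060C]+|[.!?\u061F\u060C]", sentence)


if __name__ == "__main__":
    raw = "\u0643\u062A\u0627\u0628\u0650   \u0645\u0646."
    clean = normalize(raw)
    for sentence in sent_tokenize(clean):
        print(word_tokenize(sentence))
```

The sketch keeps the zero-width non-joiner inside words because Persian uses it to join word parts without a visible space, which is one reason a plain whitespace split is not enough for this language.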

What's the difference between this Python project and similar ones?

As far as I know, no other Python library offers a comparable set of tools for Persian text.

Anyone who agrees with this pull request can submit an Approve review.

ayub-kokabi • Aug 01 '23 09:08