urduhack
An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data in the easiest way possible.
## How to reproduce the problem
I ran the following code after installing urduhack as
```bash
pip install Urduhack[tf]
```
from here: [UrduHack Docs](https://docs.urduhack.com/en/stable/installation.html)
After that I ran the following...
## Feature description
We found that this tool does not support romanization.
Tried to run the example given in the documentation for normalization and the results do not match. ```python normalize("پی ایس ایل میں 69 مقامی اور کرس گیل، ڈیرن سیمی، کیون...
Does installation require both TensorFlow and TensorFlow-GPU?
```bash
pip install urduhack[tf-gpu]
```
It seems both are required to use urduhack.
## Your Environment
* Operating System: Ubuntu 18
* Python Version...
## How to reproduce the behavior

    import urduhack
    urduhack.download()
    from urduhack.tokenization import word_tokenizer
    a = "احسن فاروقی"
    word_tokenizer(a)

## Your Environment
* Operating System: Windows 10
* Python Version Used:...
When I tried to download urduhack, I received a lot of errors. My Python version is 2.8.
Sometimes, a single Unicode character `ﷲ` is used to denote `اللہ`. Please normalize this as part of the Arabic->Urdu conversion.
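The request above could be sketched as a plain string replacement. This is only an illustration: `expand_allah_ligature` is a hypothetical helper and not part of urduhack's API, and the replacement sequence (alef, lam, lam, heh goal) is an assumption based on the spelling `اللہ` used in this issue.

```python
# Hypothetical helper for illustration only; not an urduhack function.

# U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM (the single character ﷲ)
ALLAH_LIGATURE = "\uFDF2"

# Assumed Urdu spelling اللہ: alef (U+0627), lam (U+0644), lam (U+0644),
# heh goal (U+06C1)
ALLAH_URDU = "\u0627\u0644\u0644\u06C1"


def expand_allah_ligature(text: str) -> str:
    """Replace every occurrence of the ligature with its spelled-out form."""
    return text.replace(ALLAH_LIGATURE, ALLAH_URDU)


if __name__ == "__main__":
    sample = "\uFDF2 \u0627\u06A9\u0628\u0631"  # ligature followed by another word
    print(expand_allah_ligature(sample))
```

An explicit mapping is used here rather than `unicodedata.normalize("NFKC", ...)`, because compatibility normalization expands the ligature with the Arabic heh (U+0647) rather than the Urdu heh goal (U+06C1).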
I ran the following code to tokenize the words in my Urdu text corpus: ``` import urduhack nlp = urduhack.Pipeline() urduhack.download() doc = nlp(text) for sentence in doc.sentences: for word...
## Feature description
Would it be possible to extend the Standard Urdu script to add support for Shahmukhi (Punjabi) and Sindhi's additional characters as well? That is, it would be...
BERT ?!
## BERT Usage
Salaam team, great work first of all 🥇 Can we consider adding the BERT model to the list of models?