urduhack

An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.

Results 14 urduhack issues

## How to reproduce the problem

I ran the following code after installing urduhack as

```bash
pip install Urduhack[tf]
```

from here: [UrduHack Docs](https://docs.urduhack.com/en/stable/installation.html)

After that I ran the following...

## Feature description

We found that this tool does not support romanization.

Tried to run the example given in the documentation for normalization, and the results do not match.

```python
normalize("پی ایس ایل میں 69 مقامی اور کرس گیل، ڈیرن سیمی، کیون...
```

bug

Install requires both TensorFlow and TensorFlow-GPU?

```bash
pip install urduhack[tf-gpu]
```

It seems both are required to use urduhack.

## Your Environment

* Operating System: Ubuntu 18
* Python Version...

## How to reproduce the behavior

    import urduhack
    urduhack.download()
    from urduhack.tokenization import word_tokenizer
    a = "احسن فاروقی"
    word_tokenizer(a)

## Your Environment

* Operating System: Windows 10
* Python Version Used:...

When I try to download urduhack, I receive a lot of errors. My Python version is 2.8.

invalid

Sometimes, a single unicode character `ﷲ` is used to denote `اللہ`. Please normalize this as part of the Arabic->Urdu conversion.

bug
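The ligature request above can be illustrated with a minimal sketch. It assumes the single character in question is U+FDF2 (ARABIC LIGATURE ALLAH ISOLATED FORM) and that the desired Urdu spelling ends in heh goal (U+06C1). The names `LIGATURE_MAP` and `expand_ligatures` are hypothetical and are not part of urduhack's API.

```python
# Hypothetical helper for illustration only; not urduhack's implementation.
# Maps single-code-point ligatures to their spelled-out Urdu form.
LIGATURE_MAP = {
    # U+FDF2 (ligature) -> alef, lam, lam, heh goal
    "\uFDF2": "\u0627\u0644\u0644\u06C1",
}

# str.maketrans accepts a dict whose keys are single characters and whose
# values may be strings of any length.
_TRANSLATION_TABLE = str.maketrans(LIGATURE_MAP)


def expand_ligatures(text: str) -> str:
    """Return text with every mapped ligature replaced by its expansion."""
    return text.translate(_TRANSLATION_TABLE)
```

An explicit table is used here because Unicode compatibility normalization (NFKC) would expand the ligature with the Arabic heh (U+0647) rather than the Urdu heh goal, which may not be what an Urdu pipeline wants.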

I run the following code to generate word tokenization for my Urdu text corpus:

```
import urduhack
nlp = urduhack.Pipeline()
urduhack.download()
doc = nlp(text)
for sentence in doc.sentences:
    for word...
```

## Feature description

Would it be possible to extend the Standard Urdu script to add support for the additional characters of Shahmukhi (Punjabi) and Sindhi as well? That is, it would be...

## BERT Usage

Salaam team, great work first of all 🥇 Can we consider adding the BERT model to the list of models?