urdu-words
📝 A text file containing 150,000 Urdu words for all your dictionary and word-based projects, e.g. auto-completion and autosuggestion.
A collection of 150k+ unique Urdu words
Text files containing 150k+ Urdu words for all your dictionary and word-based projects, e.g. auto-completion, autosuggestion, embedding networks, and tagging.
I pulled the words out into simple newline-delimited text files, which are more useful when building apps or importing into databases.
Files you may be interested in:
- words.txt Contains all Urdu words.
- bigram_words.txt Contains all Urdu bigram words.
- trigram_words.txt Contains all Urdu trigram words.
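As an illustration of the auto-completion use case, here is a minimal Python sketch that loads a newline-delimited word file and returns words starting with a given prefix. The function names `load_words` and `complete` are hypothetical helpers, not part of this repository, and the sketch assumes the file is UTF-8 encoded with one entry per line.

```python
from bisect import bisect_left
from pathlib import Path


def load_words(path):
    """Read a newline-delimited word list into a sorted list of unique words.

    Assumes UTF-8 text with one entry per line; blank lines are skipped.
    """
    text = Path(path).read_text(encoding="utf-8")
    return sorted({line.strip() for line in text.splitlines() if line.strip()})


def complete(words, prefix, limit=10):
    """Return up to `limit` words from the sorted list that start with `prefix`."""
    matches = []
    # In a sorted list, all words sharing a prefix sit next to each other,
    # so a binary search finds where that run begins.
    index = bisect_left(words, prefix)
    while index < len(words) and len(matches) < limit:
        if not words[index].startswith(prefix):
            break
        matches.append(words[index])
        index += 1
    return matches
```

With the repository checked out, something like `complete(load_words("words.txt"), "کتا")` would list suggestions for that prefix. Sorting here is by Unicode code point, which is enough for prefix lookup but is not a linguistic Urdu collation order.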
NER Labels
I have added word lists for labelling Named Entity Recognition (NER) data. They cover categories such as persons, locations, organizations, and dates, and give a good starting point for labelling NER data. The files below contain the words for each label.
- locations.txt Contains locations from across the world.
- persons.txt Contains person names.
- organizations.txt Contains organization names.
- dates.txt Contains time- and date-related words.
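One way these lists could seed NER labelling is a simple gazetteer lookup, sketched below in Python. The label names (`LOCATION`, `PERSON`, and so on), the `O` tag for unmatched tokens, and the helpers `load_gazetteer` and `tag_tokens` are assumptions made for illustration; the sketch also assumes each file is UTF-8 with one entry per line.

```python
from pathlib import Path

# Hypothetical label names mapped to the files listed above.
LABEL_FILES = {
    "LOCATION": "locations.txt",
    "PERSON": "persons.txt",
    "ORGANIZATION": "organizations.txt",
    "DATE": "dates.txt",
}


def load_gazetteer(directory, label_files=LABEL_FILES):
    """Build a dict mapping each listed entry to its label."""
    gazetteer = {}
    for label, filename in label_files.items():
        text = (Path(directory) / filename).read_text(encoding="utf-8")
        for line in text.splitlines():
            entry = line.strip()
            if entry:
                # If an entry appears in several files, the first label wins.
                gazetteer.setdefault(entry, label)
    return gazetteer


def tag_tokens(tokens, gazetteer):
    """Pair each token with its gazetteer label, or "O" when it is not listed."""
    return [(token, gazetteer.get(token, "O")) for token in tokens]
```

This matches single tokens only, so any multi-word entries in the files would need phrase matching on top. The resulting tags are a rough first pass to be corrected by hand, not finished annotations.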
Table of contents
- Contributing
- Bugs and feature requests
- Contributors
- Copyright and license
Contributing
All contributions are more than welcome. A contribution may close an issue, fix a bug (reported or not), improve the existing code, and so on. If you would like to add a word or a new set of words, send a PR.
Bugs and feature requests
Have a bug or a feature request? Please open a new issue. If you wish to remove or update some of the words, file an issue before sending a PR on the repo.
Contributors
Special thanks to everyone who contributed to getting the Urdu hack to its current state. Thanks to the Center for Language Engineering for providing the word list.
Backers 
Thank you to all our backers! 🙏 [Become a backer]
Sponsors 
Support this project by becoming a sponsor. [Become a sponsor]
Copyright and license
Code released under the MIT License.