wordfreq
Access a database of word frequencies, in various natural languages.
I'm working on an app that uses this library, but best I can tell, when I pack my app into an executable using PyInstaller, it doesn't realize that the data...
Bumps [ipython](https://github.com/ipython/ipython) from 7.34.0 to 8.10.0. Commits 15ea1ed release 8.10.0 560ad10 DOC: Update what's new for 8.10 (#13939) 7557ade DOC: Update what's new for 8.10 385d693 Merge pull request from...
Hello, my name is Mikel. I would like to know if there is any possibility of adding a new language to the library: the **Basque** language. And if the answer...
[`mecab-python3`](https://github.com/SamuraiT/mecab-python3#Dictionaries) itself doesn't recommend `ipadic` anymore. In order to use MeCab, you must install a dictionary. There are many different dictionaries available for MeCab. These UniDic packages, which include slight...
Is it possible to get a character list for each language ordered by frequency?
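One possible way to approximate this, sketched under assumptions: weight every character of each word by that word's frequency, then sort the totals. The helper name `char_frequencies` and the sample data are hypothetical. The input is any mapping from word to frequency, which could presumably come from wordfreq's `get_frequency_dict(lang)`, though that wiring is not shown or tested here.

```python
from collections import Counter


def char_frequencies(word_freqs):
    """Return (character, relative frequency) pairs, most frequent first.

    word_freqs: mapping of word -> frequency (hypothetical input; any
    word-frequency table with non-negative values would do).
    """
    totals = Counter()
    for word, freq in word_freqs.items():
        # Each occurrence of a character contributes the word's frequency.
        for ch in word:
            totals[ch] += freq

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    # Sort by descending share; break ties alphabetically for stable output.
    return sorted(
        ((ch, weight / grand_total) for ch, weight in totals.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Made-up sample data, for illustration only.
sample = {"the": 0.5, "to": 0.25}
ranking = char_frequencies(sample)
```

Note that this counts characters in tokenized, normalized words only, so punctuation and whitespace would not appear in the result.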
Getting this error when trying to execute as a .py file. Is it even possible to execute this as a standalone script to, for example, get a quick wordlist? `ImportError:...
There is a project under the Unicode License (GNU-like) called Unilex, which gathers frequency data for 1000 languages. It is based on Google's corpuscrawler, which is written in Python and hand-feeds links to...
Hi, I am trying to use wordfreq with Japanese on CentOS 7. I keep getting the error `Couldn't find the MeCab dictionary named 'mecab-ipadic-utf8'`; however, there's no such package on...
Contrary to the 'no-break space' ("\u00A0"), the 'narrow no-break space' ("\u202f") is not recognized as a word boundary. tokenize("La vois-tu souvent ?", "fr") returns ['la', 'vois', 'tu', 'souvent\u202f'] instead of...
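A possible caller-side workaround, offered only as a sketch: replace the narrow no-break space with an ordinary space before the text reaches the tokenizer. The helper name `normalize_narrow_nbsp` is hypothetical and is not part of wordfreq.

```python
NARROW_NBSP = "\u202f"  # NARROW NO-BREAK SPACE, common before "?" and "!" in French


def normalize_narrow_nbsp(text):
    """Replace each narrow no-break space with a regular space.

    Hypothetical preprocessing step; it does not change the tokenizer itself.
    """
    return text.replace(NARROW_NBSP, " ")


raw = "La vois-tu souvent\u202f?"
cleaned = normalize_narrow_nbsp(raw)
```

The cleaned string could then be passed to `tokenize(cleaned, "fr")` in place of the raw text.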
Would be wonderful to be able to get frequencies for the different sources!