GrimPixel
In fact, not only Chinese but also Japanese and Korean have this problem. In Vietnamese, spaces are used to separate syllables; in Thai and Lao, spaces are used to separate sentences....
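To illustrate the point about spaces, here is a minimal Python sketch that only splits on whitespace. The helper name `whitespace_tokens` and the sample sentences are my own assumptions for illustration; they are not taken from any of the tools discussed.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace only; no language-aware segmentation is done."""
    return text.split()


# Hypothetical sample sentences, all roughly meaning "I like learning languages".
samples = {
    "English": "I like learning languages",
    "Vietnamese": "Tôi thích học ngôn ngữ",
    "Chinese": "我喜欢学习语言",
}

for language, sentence in samples.items():
    tokens = whitespace_tokens(sentence)
    print(f"{language}: {len(tokens)} token(s) -> {tokens}")
```

The English sentence splits into four words, the Vietnamese one into five syllables (the two-syllable word "ngôn ngữ" is broken apart), and the Chinese one stays a single unsegmented token. That is why a dedicated word segmentation tool is needed for these languages.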
I only knew that tools for word segmentation exist and saw that you needed them. I have no experience with them.
@unfa Yes, a very large one that is enough for mathematical use.
It looks like the data is not released for free. Now I doubt whether that is legal.
I saw the page. They allow redistribution of their free data. But I am not sure where the lists for languages other than English come from. https://www.wordfrequency.info/samples.asp
There is no TSV support yet, which is a slight shortcoming.
I don't know much about it. I just found a repo and wanted to share it. Maybe it helps. https://github.com/NanoMichael/MicroTeX
People can consider establishing something new instead of sticking to proprietary content. We need a framework to build virtual worlds with non-proprietary content. The modding community is so powerful. I...
Here is a new tool: https://codeberg.org/GrimPixel/Text_to_Wordlist Place your text file in the corresponding directory `0_text`, check `text_setting.yaml` and `dictionary_setting.yaml`, then run `extract_text.py` and `extract_dictionary.py` to generate...
You can simply imagine what needs to be done in the scenario of a round table. The benefit of a round table is that everyone can face the centre of...