Hannes Krumbiegel

Results: 91 comments by Hannes Krumbiegel

I wrote some not very well tested code here (the API might change): https://github.com/Vuizur/wiktextract-lemmatization The theory is that you can put a forms array in there and get a fixed one...

Nice, the word detection now works flawlessly for Russian. 👍

These results are extremely impressive! I recently tried to implement something [similar](https://github.com/Vuizur/gpt3-chatbot) in Python, not locally but using various online APIs, and it felt worse than your demo...

I'll do it with the next released dump. 👍 Thanks a lot for the work!

The external hard drive I ran the calculations on seems to be dying, so I will have to find another way to repeat them 😅.

I noticed that the English translations have the key "code" for the items of the translation array, whereas other languages such as Spanish have the key "lang_code". (I think lang_code...
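
A minimal sketch of how a consumer of the data could work around the key mismatch described above. Only the key names `code` and `lang_code` come from the comment; the function name `normalize_translation` and the example entries are hypothetical, and this is not part of Wiktextract itself.

```python
def normalize_translation(item: dict) -> dict:
    """Return a copy of a translation entry that always uses "lang_code".

    Assumption: entries are plain dicts and the language code is stored
    under either "code" or "lang_code", as described in the comment.
    """
    normalized = dict(item)  # copy so the caller's data is left untouched
    if "lang_code" not in normalized and "code" in normalized:
        normalized["lang_code"] = normalized.pop("code")
    return normalized


# Hypothetical entries in the two shapes mentioned above.
english_style = {"code": "es", "word": "perro"}
spanish_style = {"lang_code": "en", "word": "dog"}

unified = [normalize_translation(t) for t in (english_style, spanish_style)]
```

Copying each entry before renaming the key keeps the original parsed data intact, which is convenient when the same records are reused elsewhere.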

I haven't quite gotten it to work; my current version prints a huge number of errors. A small selection: ``` еділя: DEBUG: UNIMPLEMENTED top-level template: -uk- {} at ['неділя', '-uk-']...

Thanks for the help. The error might be in the data files, so I will have to keep looking. I also can't seem to get the program to run completely (using...

Hmm, I haven't gotten it to work yet (but I also didn't have that much time recently). I think Wiktextract works pretty decently with the Chinese Wiktionary because they have...

Removing that one character works fine for books created by my program. For general-purpose use, one should maybe use the more sophisticated remove_accents function like in Proficiency, which can also...
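
To illustrate the difference between the two approaches, here is a rough sketch. It assumes that the "one character" is the combining acute accent (U+0301) used as a stress mark; the comment does not say which character is meant. The function names are hypothetical, and the second function is a generic Unicode approach, not the actual remove_accents implementation from Proficiency.

```python
import unicodedata

# Assumption: the single character in question is the combining acute accent.
COMBINING_ACUTE = "\u0301"


def strip_stress_mark(text: str) -> str:
    """Remove only the combining acute accent and leave everything else alone."""
    return text.replace(COMBINING_ACUTE, "")


def strip_all_combining_marks(text: str) -> str:
    """Decompose the text and drop every combining mark.

    Caveat: this also changes letters such as "й" and "ё", because they
    decompose into a base letter plus a combining mark.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)
```

The caveat in the second function is one reason a general-purpose solution may need more care than blanket mark removal.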