molokanov50
Hello team, Based on a trained multilingual Fairseq model (e.g. M2M-100), I run my translations as a service in a Docker container according to the following scheme: as an input...
Hello team, I'm trying to translate long texts consisting of 6-8 sentences but not exceeding 100 tokens in total (so as not to exceed the model's memory limits)...
Hi team, The ability to translate from different source languages in parallel (in a single batch) is of particular interest. The current obstacle lies in the fact that the tokenizer...
Hi. My goal is to finetune a large Transformer-based MT model (e.g. `NLLB-200-1.3B`) on new words that are outside the model's vocabulary. I managed to finetune it only from a...
The LID model presented here (`lid218e.bin`) behaves incorrectly. In particular: `Как сообщает пресса` is correctly identified as `rus_Cyrl`, but `КАК СООБЩАЕТ ПРЕССА` is identified as `eng_Latn`; `Добрый...
Hello, Where can I find the list of languages the LID model can identify? It seems that only FastText's list of 157 languages is published but it is an older version...
I need to determine the grammatical case of terms in the texts of a large dataset. I found that the increment of memory usage as large as 0.3...
I want to finetune an NLLB model on my own data, so as I see it, the task is relatively simple: convert my dataset to fairseq format. So...
I run `python test.py --cfg ./configs/caddm_test.cfg` as indicated in your README, and I expect the evaluation task to complete successfully. Instead, I get an error: ``` (caddm) molokanov@molokanov-pc:~/CADDM$ python...