Sergei Averkiev
@cxcxcxcx, in the /Multilingual_Parallel_Corpus directory there are files that you can use to reproduce all the parallel text files.
@Zachary-YL, in the /Multilingual_Parallel_Corpus directory there are files that you can use to reproduce all the parallel text files.
@fansiawang, in the /Multilingual_Parallel_Corpus directory there are files that you can use to reproduce all the parallel text files.
Some thoughts: - unaligned sentence pairs will be checked in a separate section - the calculation can also be performed using the conflicts section
For efficient search, an indexing tool such as Elasticsearch can be used.
Hi! Please provide the text you are trying to split and the lang_code.
Hello! Here is a working Colab: https://colab.research.google.com/drive/1_ics0YzWg5qIZIPhA1X_Wbfg0XZzRO-p Please try it with your texts. Let me know if you run into further errors.
Yes, you need to prepare the dataset in Turkish and train the model.
@vengodelsur I asked @nreimers (the [sentence-transformers](https://www.sbert.net/) maintainer) for advice, and he was very helpful. Please see this issue: [https://github.com/UKPLab/sentence-transformers/issues/634](https://github.com/UKPLab/sentence-transformers/issues/634) I'll investigate it too. The plan is as follows: - Get rid of...
PR is finally merged. https://github.com/onnx/onnx/pull/3741