Sergei Averkiev comments

Results: 19 comments of Sergei Averkiev

Cannot fetch data files due to data quota

@cxcxcxcx The /Multilingual_Parallel_Corpus directory contains files that you can use to reproduce all the parallel text files.

Problem when to download the Bilingual_Parallel_Corpus

@Zachary-YL The /Multilingual_Parallel_Corpus directory contains files that you can use to reproduce all the parallel text files.

Bilingual_Parallel_Corpus: some links failure

@fansiawang The /Multilingual_Parallel_Corpus directory contains files that you can use to reproduce all the parallel text files.

Alignment completion flag

Some thoughts:

- Unaligned sentence pairs will be checked in a separate section.
- The calculation can also be performed using the conflicts section.

Search sentences in splitted texts

For efficient search, an indexing tool such as Elasticsearch can be used.
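To illustrate what such an indexing tool does, here is a minimal in-memory inverted index in Python. It is only a sketch of the idea, not Elasticsearch and not code from the project: the function names (`tokenize`, `build_index`, `search`) and the sample sentences are hypothetical, and a real deployment would need language-aware tokenization.

```python
import re
from collections import defaultdict


def tokenize(sentence):
    # Naive tokenizer (assumption): lowercase and split on word characters.
    return re.findall(r"\w+", sentence.lower())


def build_index(sentences):
    # Map each token to the set of sentence ids that contain it.
    index = defaultdict(set)
    for sentence_id, sentence in enumerate(sentences):
        for token in tokenize(sentence):
            index[token].add(sentence_id)
    return index


def search(index, query):
    # Return the sorted ids of sentences containing every query token.
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)


# Hypothetical sample data standing in for the split text.
sentences = [
    "The parallel corpus is split by sentences.",
    "Each sentence gets an index.",
    "Search the corpus by word.",
]
index = build_index(sentences)
hits = search(index, "corpus")
```

Each lookup touches only the posting sets of the query tokens instead of scanning every sentence, which is the property that makes a dedicated index worthwhile for large texts.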

A error when I use “splitter.split_by_sentences_wrapper”，please help check the error

Hi! Please provide the text you are trying to split and the lang_code.

A error when I use “splitter.split_by_sentences_wrapper”，please help check the error

Hello! Here is a working Colab: https://colab.research.google.com/drive/1_ics0YzWg5qIZIPhA1X_Wbfg0XZzRO-p Please try it with your texts and let me know if you run into further errors.

Turkish Training

Yes, you need to prepare the dataset in Turkish and train the model.

Extract/train bilingual model for zh-ru

@vengodelsur I asked @nreimers ([sentence-transformers](https://www.sbert.net/) maintainer) for advice and he was very helpful. Please see this issue: [https://github.com/UKPLab/sentence-transformers/issues/634](https://github.com/UKPLab/sentence-transformers/issues/634) I'll investigate it too. The plan is as follows: - Get rid of...

Exporting the operator stft to ONNX opset version 9 is not supported.

The PR is finally merged: https://github.com/onnx/onnx/pull/3741