yanmtt
Yet Another Neural Machine Translation Toolkit
Exactly what the title says. Find an undocumented part of the code and document it. I will give 1 potato per pull request.
Currently, model information such as losses and gradients is saved every N steps to a local file, which is then visualized using TensorBoard. WandB is the cool new kid...
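One way to support both backends is to decouple the every-N-steps logging from any specific library behind a tiny scalar-logger interface. The sketch below is a minimal illustration of that pattern; the names (`ScalarLogger`, `maybe_log`, the `train/...` metric keys) and the cadence are assumptions, not the toolkit's actual API. It uses an in-memory backend so it runs without TensorBoard or WandB installed; a WandB adapter would simply call `wandb.log({name: value}, step=step)` in `log_scalar`.

```python
from typing import Protocol


class ScalarLogger(Protocol):
    """Anything that can record one scalar at one training step."""
    def log_scalar(self, name: str, value: float, step: int) -> None: ...


class InMemoryLogger:
    """Stand-in backend so this sketch runs without tensorboard/wandb."""
    def __init__(self) -> None:
        self.records: list[tuple[str, float, int]] = []

    def log_scalar(self, name: str, value: float, step: int) -> None:
        self.records.append((name, value, step))


# A TensorBoard adapter would wrap SummaryWriter.add_scalar, and a WandB
# adapter would wrap wandb.log, each implementing the same log_scalar method.

def maybe_log(loggers, step: int, log_every: int, loss: float, grad_norm: float) -> None:
    """Every `log_every` steps, fan the scalars out to all configured backends."""
    if step % log_every == 0:
        for lg in loggers:
            lg.log_scalar("train/loss", loss, step)
            lg.log_scalar("train/grad_norm", grad_norm, step)


mem = InMemoryLogger()
for step in range(1, 101):
    maybe_log([mem], step, log_every=50, loss=1.0 / step, grad_norm=0.5)
# Only steps 50 and 100 are recorded, two scalars each.
```

With this shape, adding WandB alongside (or instead of) TensorBoard is just another adapter in the `loggers` list, and the training loop never changes.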
The two core files, `pretrain_nmt.py` and `train_nmt.py`, contain a lot of repeated monolithic code. For example, the loss-computation part of the code is mostly duplicated. Desired cleanup: 1. Identify the...
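As a sketch of what that cleanup could look like, the duplicated loss arithmetic could move into one shared helper that both scripts import. The function name, signature, and the label-smoothing scheme below are illustrative assumptions for the refactor, not the toolkit's actual code; it operates on plain lists so the example is self-contained.

```python
import math


def label_smoothed_nll(log_probs: list[float], target: int, epsilon: float = 0.1) -> float:
    """Shared loss helper: label-smoothed negative log-likelihood for one
    target position. `log_probs` holds log-probabilities over the vocabulary,
    `target` is the gold token index. Both pretrain_nmt.py and train_nmt.py
    would call this instead of inlining the same computation."""
    nll = -log_probs[target]                      # plain cross-entropy term
    smooth = -sum(log_probs) / len(log_probs)     # uniform-prior smoothing term
    return (1.0 - epsilon) * nll + epsilon * smooth


# With a uniform distribution, smoothing changes nothing:
uniform = [math.log(0.25)] * 4
loss = label_smoothed_nll(uniform, target=2)  # equals -log(0.25)
```

Once a helper like this exists, the two training loops only differ in how they build batches, and the per-token loss logic lives in exactly one place.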
Hi, after training for 1.5 million steps with the settings from issue https://github.com/prajdabre/yanmtt/issues/39, I checked the loss in TensorBoard; the image is as follows, and referring to run_train.log, the printed loss has...
Hi, when I use `train_mbart_model.sh` to continue pre-training mBART-50, after 300k batches this error occurs: `RuntimeError: CUDA out of memory. Tried to allocate 1.90 GiB (GPU 0; 39.44`...
Using the command below:

```shell
python pretrain_nmt.py -n 1 -nr 0 -g 1 --use_official_pretrained --pretrained_model ai4bharat/IndicBART --tokenizer_name_or_path ai4bharat/IndicBART --langs hi,kn,bn --mono_src /home/aniruddha/all_data/train.hi,/home/aniruddha/all_data/train.kn,/home/aniruddha/all_data/train.bn --batch_size 8 --batch_size_indicates_lines --shard_files --model_path aibharat/IndicBART/model --port 7878
```
...
Hi, when I use `train_mbart_model.sh` to continue pre-training from mBART-large-50 (https://huggingface.co/facebook/mbart-large-50), there is no problem when I run on a single GPU, but when...
I have pre-trained the IndicBART model on new monolingual data, and two models are saved in the model path: 1) IndicBART and 2) IndicBART_puremodel. Which one should we use during...
I have some confusion about pretrain_nmt.py. In its first few lines I see `from transformers import AutoTokenizer, MBartTokenizer, MBart50Tokenizer, BartTokenizer, AlbertTokenizer` followed by `from transformers import`...
Hi, I ran into a tricky problem with pretrain_nmt.py. My command:

```shell
CUDA_VISIBLE_DEVICES=3 python pretrain_nmt.py -n 1 -nr 0 -g 1 --pretrained_model facebook/bart-base --use_official_pretrained --tokenizer_name_or_path facebook/bart-base --is_summarization --warmup_steps 500 --save_intermediate_checkpoints --mono_src /home/WwhStuGrp/yyfwwhstu16/yanmtt/dataset/pubmed/pubmed-dataset/train_fineshed.txt
```
...