yanmtt
Yet Another Neural Machine Translation Toolkit
Exactly what the title says. Find an undocumented part of the code and document it. I will give 1 potato per pull request.
Currently, model information such as losses and gradients is saved every N steps to a local file, which is then visualized using TensorBoard. WandB is the cool new kid...
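One way to support both backends is to decouple the every-N-steps logging from any specific library behind a tiny scalar-logger interface. The sketch below is a minimal illustration of that pattern; the names (`ScalarLogger`, `maybe_log`, the `train/...` metric keys) and the cadence are assumptions, not the toolkit's actual API. It uses an in-memory backend so it runs without TensorBoard or WandB installed; a WandB adapter would simply call `wandb.log({name: value}, step=step)` in `log_scalar`.

```python
from typing import Protocol


class ScalarLogger(Protocol):
    """Anything that can record one scalar at one training step."""
    def log_scalar(self, name: str, value: float, step: int) -> None: ...


class InMemoryLogger:
    """Stand-in backend so this sketch runs without tensorboard/wandb."""
    def __init__(self) -> None:
        self.records: list[tuple[str, float, int]] = []

    def log_scalar(self, name: str, value: float, step: int) -> None:
        self.records.append((name, value, step))


# A TensorBoard adapter would wrap SummaryWriter.add_scalar, and a WandB
# adapter would wrap wandb.log, each implementing the same log_scalar method.

def maybe_log(loggers, step: int, log_every: int, loss: float, grad_norm: float) -> None:
    """Every `log_every` steps, fan the scalars out to all configured backends."""
    if step % log_every == 0:
        for lg in loggers:
            lg.log_scalar("train/loss", loss, step)
            lg.log_scalar("train/grad_norm", grad_norm, step)


mem = InMemoryLogger()
for step in range(1, 101):
    maybe_log([mem], step, log_every=50, loss=1.0 / step, grad_norm=0.5)
# Only steps 50 and 100 are recorded, two scalars each.
```

With this shape, adding WandB alongside (or instead of) TensorBoard is just another adapter in the `loggers` list, and the training loop never changes.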
The two core files, `pretrain_nmt.py` and `train_nmt.py`, contain a lot of repeated monolithic code. For example, the loss-computation part of the code is mostly duplicated. Desired cleanup: 1. Identify the...
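As a sketch of what that cleanup could look like, the duplicated loss arithmetic could move into one shared helper that both scripts import. The function name, signature, and the label-smoothing scheme below are illustrative assumptions for the refactor, not the toolkit's actual code; it operates on plain lists so the example is self-contained.

```python
import math


def label_smoothed_nll(log_probs: list[float], target: int, epsilon: float = 0.1) -> float:
    """Shared loss helper: label-smoothed negative log-likelihood for one
    target position. `log_probs` holds log-probabilities over the vocabulary,
    `target` is the gold token index. Both pretrain_nmt.py and train_nmt.py
    would call this instead of inlining the same computation."""
    nll = -log_probs[target]                      # plain cross-entropy term
    smooth = -sum(log_probs) / len(log_probs)     # uniform-prior smoothing term
    return (1.0 - epsilon) * nll + epsilon * smooth


# With a uniform distribution, smoothing changes nothing:
uniform = [math.log(0.25)] * 4
loss = label_smoothed_nll(uniform, target=2)  # equals -log(0.25)
```

Once a helper like this exists, the two training loops only differ in how they build batches, and the per-token loss logic lives in exactly one place.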
Hi, after training for 1.5 million steps with the settings from issue https://github.com/prajdabre/yanmtt/issues/39, I checked the loss in TensorBoard; the image is as follows, and referring to run_train.log, the printed loss has...
Hi, when I use `train_mbart_model.sh` to continue pre-training mBART-50, after 300k batches this error occurs: `RuntimeError: CUDA out of memory. Tried to allocate 1.90 GiB (GPU 0; 39.44`...
Using the command below:

```shell
python pretrain_nmt.py -n 1 -nr 0 -g 1 --use_official_pretrained --pretrained_model ai4bharat/IndicBART --tokenizer_name_or_path ai4bharat/IndicBART --langs hi,kn,bn --mono_src /home/aniruddha/all_data/train.hi,/home/aniruddha/all_data/train.kn,/home/aniruddha/all_data/train.bn --batch_size 8 --batch_size_indicates_lines --shard_files --model_path aibharat/IndicBART/model --port 7878
```
...
Hi, when I use `train_mbart_model.sh` to continue pre-training from mBART-large-50 (https://huggingface.co/facebook/mbart-large-50), there is no problem when I run on a single GPU, but when...
I have pre-trained the IndicBART model on new monolingual data, and two models are saved in the model path: 1) IndicBART and 2) IndicBART_puremodel. Which one should we use during...
I have some confusion about pretrain_nmt.py. In its first few lines I see `from transformers import AutoTokenizer, MBartTokenizer, MBart50Tokenizer, BartTokenizer, AlbertTokenizer` followed by `from transformers import`...
Hi, I ran into a tricky problem with pretrain_nmt.py. My command:

```shell
CUDA_VISIBLE_DEVICES=3 python pretrain_nmt.py -n 1 -nr 0 -g 1 --pretrained_model facebook/bart-base --use_official_pretrained --tokenizer_name_or_path facebook/bart-base --is_summarization --warmup_steps 500 --save_intermediate_checkpoints --mono_src /home/WwhStuGrp/yyfwwhstu16/yanmtt/dataset/pubmed/pubmed-dataset/train_fineshed.txt
```
...