LaTeX-OCR Re Training on im2latex-100k dataset


Open sravanthOppo27 opened this issue 2 years ago • 0 comments

I have been training on the im2latex-100k dataset, but the validation results are still bad even after 60 epochs. I am attaching the config file and the graphs of the BLEU scores. Can anyone please help?

Config file:

gpu_devices: null #[0,1,2,3,4,5,6,7]
backbone_layers:
- 2
- 3
- 7
betas:
- 0.9
- 0.999
batchsize: 64
bos_token: 1
channels: 1
data: ./data_kaggle_processed/train_modified.pkl
debug: false
decoder_args:
  attn_on_attn: true
  cross_attend: true
  ff_glu: true
  rel_pos_bias: false
  use_scalenorm: false
dim: 256
encoder_depth: 4
eos_token: 2
epochs: 60
gamma: 0.9995
heads: 8
id: null
load_chkpt: null
lr: 0.001
lr_step: 30
max_height: 192
max_seq_len: 512
max_width: 672
micro_batchsize: -1
min_height: 32
min_width: 32
model_path: checkpoints
name: pix2tex
num_layers: 4
num_tokens: 8000
optimizer: Adam
output_path: outputs
pad: false
pad_token: 0
patch_size: 16
sample_freq: 3000
save_freq: 5
scheduler: StepLR
seed: 42
encoder_structure: hybrid
temperature: 0.2
test_samples: 5
testbatchsize: 20
tokenizer: dataset/tokenizer.json
valbatches: 100
valdata: ./data_kaggle_processed/validate_modified.pkl
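For anyone checking the schedule, here is a minimal sketch of what the `scheduler: StepLR` settings above (`lr: 0.001`, `lr_step: 30`, `gamma: 0.9995`) would imply for the learning rate. It uses the standard StepLR decay rule, `lr * gamma ** (steps // step_size)`. Whether the training loop steps the scheduler once per epoch or once per batch is not stated here, so both cases are shown as assumptions. The helper `step_lr` and the value of `BATCHES_PER_EPOCH` are hypothetical and only for illustration.

```python
def step_lr(initial_lr: float, gamma: float, step_size: int, n_steps: int) -> float:
    """Learning rate after n_steps scheduler steps under StepLR-style decay."""
    # The rate is multiplied by gamma once every step_size scheduler steps.
    return initial_lr * gamma ** (n_steps // step_size)


# Values taken from the config above.
LR, GAMMA, LR_STEP, EPOCHS = 0.001, 0.9995, 30, 60

# Case A (assumption): the scheduler is stepped once per epoch.
lr_per_epoch = step_lr(LR, GAMMA, LR_STEP, EPOCHS)

# Case B (assumption): the scheduler is stepped once per batch.
BATCHES_PER_EPOCH = 1000  # hypothetical; depends on dataset size and batchsize
lr_per_batch = step_lr(LR, GAMMA, LR_STEP, EPOCHS * BATCHES_PER_EPOCH)

print(f"stepped per epoch: final lr = {lr_per_epoch:.3e}")
print(f"stepped per batch: final lr = {lr_per_batch:.3e}")
```

Under the per-epoch assumption the rate decays only twice over 60 epochs, so it stays essentially at 0.001 for the whole run. Under the per-batch assumption the decay is noticeably stronger. Knowing which case applies may help when reading the BLEU curves.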

Mar 01 '23 05:03 sravanthOppo27
