LaTeX-OCR
Retraining on the im2latex-100k dataset
I have been training on the im2latex-100k dataset, but the validation results are still bad even after 60 epochs. I am attaching the config file and the graphs of the BLEU scores. Can anyone please help?
Config file:
gpu_devices: null #[0,1,2,3,4,5,6,7]
backbone_layers:
- 2
- 3
- 7
betas:
- 0.9
- 0.999
batchsize: 64
bos_token: 1
channels: 1
data: ./data_kaggle_processed/train_modified.pkl
debug: false
decoder_args:
  attn_on_attn: true
  cross_attend: true
  ff_glu: true
  rel_pos_bias: false
  use_scalenorm: false
dim: 256
encoder_depth: 4
eos_token: 2
epochs: 60
gamma: 0.9995
heads: 8
id: null
load_chkpt: null
lr: 0.001
lr_step: 30
max_height: 192
max_seq_len: 512
max_width: 672
micro_batchsize: -1
min_height: 32
min_width: 32
model_path: checkpoints
name: pix2tex
num_layers: 4
num_tokens: 8000
optimizer: Adam
output_path: outputs
pad: false
pad_token: 0
patch_size: 16
sample_freq: 3000
save_freq: 5
scheduler: StepLR
seed: 42
encoder_structure: hybrid
temperature: 0.2
test_samples: 5
testbatchsize: 20
tokenizer: dataset/tokenizer.json
valbatches: 100
valdata: ./data_kaggle_processed/validate_modified.pkl
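
For reference, here is a minimal sketch of how the learning rate would evolve under the `lr`, `gamma`, and `lr_step` values in this config. It assumes `scheduler: StepLR` follows the standard StepLR rule (multiply the rate by `gamma` once every `lr_step` scheduler steps). Whether the training loop steps the scheduler per epoch or per batch is not something I have confirmed, so both cases are shown. The helper `step_lr` and the figure of 1000 batches per epoch are hypothetical and only for illustration.

```python
def step_lr(base_lr: float, gamma: float, step_size: int, n_steps: int) -> float:
    """Closed form of the StepLR rule: decay by gamma once every step_size steps."""
    return base_lr * gamma ** (n_steps // step_size)


# Values copied from the config.
BASE_LR = 0.001
GAMMA = 0.9995
LR_STEP = 30
EPOCHS = 60

# Assumption 1: the scheduler is stepped once per epoch.
lr_per_epoch = step_lr(BASE_LR, GAMMA, LR_STEP, EPOCHS)

# Assumption 2: the scheduler is stepped once per batch.
# BATCHES_PER_EPOCH is a made-up number, not taken from the dataset.
BATCHES_PER_EPOCH = 1000
lr_per_batch = step_lr(BASE_LR, GAMMA, LR_STEP, EPOCHS * BATCHES_PER_EPOCH)

print(f"final lr if stepped per epoch: {lr_per_epoch:.6e}")
print(f"final lr if stepped per batch: {lr_per_batch:.6e}")
```

If the scheduler is stepped per epoch, the rate is decayed only twice over the 60 epochs and stays very close to 0.001 for the whole run. If it is stepped per batch, the decay is noticeably larger, with the exact amount depending on the real number of batches per epoch.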