TITC

Results 27 comments of TITC

GPU6-new version:

```YAML
gpu_devices: [6] #[0,1,2,3,4,5,6,7]
num_workers: 20
...
```

GPU7-previous version:

```YAML
gpu_devices: [7] #[0,1,2,3,4,5,6,7]
... same config
```

![Video_20220524205302_Trim](https://user-images.githubusercontent.com/35098797/170043871-047f97a1-2cee-443c-af31-565976613b54.gif)

Training time comparison

GPU6 group

```shell
BLEU: 0.044, ED:...
```

`pin_memory` does not give any further speedup in this situation.

```python
class Dataloader(DataLoader):
    def __init__(self, dataset: Im2LatexDataset, shuffle=False, *args, **kwargs):
        self.dataset = dataset
        self.tokenizer = dataset.tokenizer
        self.dataset.update(shuffle=shuffle, *args, **kwargs)
        print("Here is the Dataloader...
```

1. eval needs the tokenizer: https://github.com/lukas-blecher/LaTeX-OCR/blob/d0251fa3ba130e4a29faf4642214fe9f86e60573/pix2tex/eval.py#L55-L56
2. `pin_memory` should work in theory, because when `pin_memory` is set to False, data is in virtual memory, but when it is set to True, data is kept in memory...
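To make the `pin_memory` point concrete, here is a minimal sketch of how the flag is usually combined with `non_blocking=True` in plain PyTorch. It uses a synthetic `TensorDataset` rather than the repository's `Im2LatexDataset`, and the tensor shapes and batch size are made up for illustration. Pinning is only requested when CUDA is available, since it has no effect on a CPU-only machine.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data (hypothetical shapes, not the real training set).
dataset = TensorDataset(
    torch.randn(64, 1, 32, 32),       # fake grayscale images
    torch.randint(0, 10, (64,)),      # fake integer labels
)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# pin_memory=True asks the loader to return batches in page-locked host
# memory, which lets the host-to-GPU copy below run asynchronously.
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=False,
    num_workers=0,
    pin_memory=use_cuda,
)

n_seen = 0
for images, labels in loader:
    # non_blocking only helps when the source tensor is pinned.
    images = images.to(device, non_blocking=use_cuda)
    labels = labels.to(device, non_blocking=use_cuda)
    n_seen += images.shape[0]

print(f"moved {n_seen} samples to {device}")
```

Whether this helps in practice depends on where the time goes: if the loader spends most of its time on preprocessing rather than on the copy to the GPU, pinning may show little or no gain.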

The gray curve is the new version. All configs are the same. This is not quite what I expected. The purple curve achieves a higher BLEU at epoch 5, but the gray one...

The purple run is based on https://github.com/lukas-blecher/LaTeX-OCR/commit/dbf75d97f5d256ad3eae130ea6688bf2396df18d. See the wall-time [comparison](https://wandb.ai/yht4work/LaTeX-OCR-pix2tex/reports/Wall-time-compare--VmlldzoyMDY0NzQy?accessToken=uw8sal842pjsdpctn8tccser8i4vpor7dueer39tlyt6bn3md7eihc0hxvpvlnwt).

---

~I thought the team would share the training log automatically, but it is blank. I am trying to find...

> So none of them are with the new dataloader setup?

The [grey](https://wandb.ai/pix2tex/LaTeX-OCR-pix2tex/runs/3k0z6n58/overview?workspace=user-yht4work) one is with `num_workers`=35. The purple one is without. That's the only difference between their configs.

The test data in https://github.com/lukas-blecher/LaTeX-OCR/pull/154#issuecomment-1135895560 was collected with `dim`=256; in that case GPU utilization averages about 20%. In the other comparison, https://github.com/lukas-blecher/LaTeX-OCR/pull/154#issuecomment-1136795322, with `dim`=756, GPU utilization is approximately 60%. With the...
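One way to read the two utilization figures is as a rough ceiling on what a faster dataloader could gain. The sketch below assumes a deliberately simplified model that is not from the original discussion: all GPU idle time is caused by waiting for data, so removing every stall would raise utilization to 100%. The function name is hypothetical.

```python
def max_speedup_from_faster_loading(gpu_utilization: float) -> float:
    """Upper bound on speedup if every data-loading stall disappeared.

    Assumption: the GPU is idle only because it waits for the dataloader,
    so the best achievable step time is the currently busy fraction.
    """
    if not 0.0 < gpu_utilization <= 1.0:
        raise ValueError("gpu_utilization must be in (0, 1]")
    return 1.0 / gpu_utilization


# Approximate utilization figures quoted above, keyed by `dim`.
observed = {256: 0.20, 756: 0.60}

bounds = {dim: max_speedup_from_faster_loading(u) for dim, u in observed.items()}
for dim, bound in bounds.items():
    print(f"dim={dim}: at most {bound:.2f}x faster under this model")
```

Under this assumption the small model has much more headroom (about 5x) than the large one (about 1.67x), which would be consistent with the dataloader change mattering less as `dim` grows. Real runs can deviate, since GPU idle time also comes from other sources.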

> The purple run is based on [dbf75d9](https://github.com/lukas-blecher/LaTeX-OCR/commit/dbf75d97f5d256ad3eae130ea6688bf2396df18d).
>
> But the grey run is also based on https://github.com/lukas-blecher/LaTeX-OCR/commit/dbf75d97f5d256ad3eae130ea6688bf2396df18d so the `num_workers` argument doesn't change anything because it is not implemented yet....

> I can't follow your calculation

The calculation is not rigorous and may be wrong. I just want to try to explain the speedup gap. Hypothesis: `num_worker`=35 means 35 times...

pix2tex-vit-5.27.11.00
- num_worker: 20
- hash: b90567d07f9c0225fe49b626ddcbf20731050b12

pix2tex-vit-5.25.14.08
- hash: dbf75d97f5d256ad3eae130ea6688bf2396df18d

pix2tex-vit-5.24.23.09
- num_worker: 35
- hash: 9c6f96e538b49382d6e9f79c695a586bdcff6ddf

---

[wandb-graph](https://wandb.ai/pix2tex/LaTeX-OCR-pix2tex/reports/speed-compare--VmlldzoyMDc5NTY4?accessToken=f31r3708gicht0llpuf9e5cye8qiqa13z1231xb34tmhhyacqqnzqkcrqcmf9dfr)

In the relative-time view, at 14 hours pix2tex-vit-5.27.11.00 achieved epoch...