Giovanni Puccetti
@iejMac my idea about the issue seems to be wrong. Another possibility is that the loss looks higher because pad tokens are now ignored in it. Tomorrow I will...
@iejMac sorry I didn't have time to work on this sooner, and also for wasting some compute: - the higher loss is due to now ignoring pad_tokens, and indeed performance is...
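To illustrate why ignoring pad tokens can make the reported loss look higher, here is a minimal sketch in plain Python. It is not the code from the PR: `PAD_ID`, `mean_nll`, and the probabilities are made up for the example. The assumption is that pad positions are easy to predict, so including them in the average pulls the mean loss down.

```python
import math

PAD_ID = 0  # hypothetical pad token id, chosen only for this example


def mean_nll(token_probs, targets, ignore_pad):
    """Average negative log-likelihood over target tokens.

    token_probs[i] is the probability the model assigned to targets[i].
    When ignore_pad is True, positions whose target is PAD_ID are skipped.
    """
    losses = [
        -math.log(p)
        for p, t in zip(token_probs, targets)
        if not (ignore_pad and t == PAD_ID)
    ]
    return sum(losses) / len(losses)


# Three real tokens followed by three pad tokens (illustrative values).
targets = [5, 9, 3, PAD_ID, PAD_ID, PAD_ID]
# Pads are assumed to be predicted almost perfectly, real tokens less so.
token_probs = [0.5, 0.25, 0.5, 0.99, 0.99, 0.99]

loss_with_pad = mean_nll(token_probs, targets, ignore_pad=False)
loss_without_pad = mean_nll(token_probs, targets, ignore_pad=True)

print(f"pads included: {loss_with_pad:.4f}")
print(f"pads ignored:  {loss_without_pad:.4f}")
```

With these numbers the loss that ignores pads is the larger of the two, even though the predictions on real tokens are identical, so the two values are not directly comparable across runs.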
@Soonhwan-Kwon I am working on adding COCO as a dataset so we can run evaluation automatically. I should open the PR in a couple of days, unless you are already doing it
No, I think it used the default ones. I think the VisionTransformer doesn't call it either? I mean, it calls it, but it does nothing
@iejMac I added one more change that should make this ready for the tentative retraining
@rwightman @rom1504 @iejMac hi, I worked on this PR. As it stands, it has a few changes in tests, adds transformers compat, and fixes the issues. This is the best...
> @gpucce so discussing here so I might possibly combine this with #660 checks, this was days before my second child was born so yeah, it got lost in the...
Hi, there is a PR, #551, to fix this, but I think nobody has time to review it
@JaejinCho hi, do not worry about tagging :) In general, I think I used the model with larger batch sizes on smaller GPUs. Looking at the error it looks like...
@JaejinCho @tillaczel I am using 1.13.1 but it might not be the issue. Are you trying to fine-tune a pre-trained model or pretrain a new one?