Ross Wightman
@jn2clark there are no defaults for the architecture; the arch config covers only the model. The preprocess cfg (mean/std) is part of the pretrained mappings and there is one per...
I have thought about this a little bit in the context of #883 (that solution doesn't work), but could add support for saving/loading a folder w/ the full config + checkpoint.
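A minimal sketch of what "folder w/ the full config + checkpoint" could look like, using only the standard library. Everything named here is an assumption for illustration, not an existing API: the helper names `save_model_folder` / `load_model_folder`, the file names `config.json` and `checkpoint.bin`, the `model_cfg` / `preprocess_cfg` keys, and the example mean/std values. The point is just that the arch config and the preprocess cfg travel together with the weights.

```python
import json
import tempfile
from pathlib import Path


def save_model_folder(folder, arch_cfg, preprocess_cfg, checkpoint_bytes):
    """Write one combined config file plus the raw checkpoint into `folder`.

    Hypothetical layout: config.json holds both the architecture config and
    the preprocess config (mean/std), checkpoint.bin holds the weights.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    full_cfg = {"model_cfg": arch_cfg, "preprocess_cfg": preprocess_cfg}
    (folder / "config.json").write_text(json.dumps(full_cfg, indent=2))
    (folder / "checkpoint.bin").write_bytes(checkpoint_bytes)


def load_model_folder(folder):
    """Read back (arch_cfg, preprocess_cfg, checkpoint_bytes) from `folder`."""
    folder = Path(folder)
    full_cfg = json.loads((folder / "config.json").read_text())
    checkpoint_bytes = (folder / "checkpoint.bin").read_bytes()
    return full_cfg["model_cfg"], full_cfg["preprocess_cfg"], checkpoint_bytes


# Round trip with placeholder values (the numbers are illustrative only).
with tempfile.TemporaryDirectory() as tmp_dir:
    save_model_folder(
        tmp_dir,
        arch_cfg={"embed_dim": 512, "image_size": 224},
        preprocess_cfg={"mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]},
        checkpoint_bytes=b"placeholder-weights",
    )
    demo_arch, demo_preprocess, demo_ckpt = load_model_folder(tmp_dir)
```

Lists are used for mean/std rather than tuples, since JSON would turn tuples into lists on the way back anyway.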
@coyotte508 thanks! my day to day does not involve any internal repos so not an hf-internal member and can't see the code there. Might be good time for me to...
So looks like the metadata does indeed have the right info. The config.jsons for timm are not Transformers though, so adding those fields doesn't make sense, it'd be more infer...
@julien-c Can use all of the timm models as image classifiers or feature extractors with transformers, including the AutoModel/AutoProcessor and pipeline APIs (https://huggingface.co/blog/timm-transformers). Also allows timm models to work with...
@khalidsaifullaah yeah, it's not working quite as efficiently as it should. I feel my current isend/irecv impl, while in theory should be reasonable, it appears it may not a well...
@khalidsaifullaah I'm experimenting with diff impl of the loss to see if any scale better in #971 ... feel free to try, feedback would be welcome
@khalidsaifullaah @long8v FWIW I wouldn't necessarily say no extra overhead as the world size increases is the passing criteria, I feel with gradient buffers, allocator behaviour, etc there's still likely...
@alita-moore hmm, yeah, might be a concern, have you compared the results... force a situation where the padding is needed (it's not usually active) and then see how the accuracy...
@alita-moore the models weren't trained with that padding. It won't be active unless you use resize inputs, set `strict_img_size=False`, `always_partition=True`, etc... these are non-standard settings to allow flexibility for some...
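For anyone wanting to check when that padding would kick in: a rough sketch, assuming the padding in question is the kind that rounds a feature map up to a multiple of the partition (window) size. `window_padding` is a hypothetical helper written for this illustration, not a function from the library, and the sizes used are just examples.

```python
def window_padding(feat_h, feat_w, window_size):
    """Return (pad_h, pad_w) needed to make a feature map divisible by window_size.

    Both values are 0 when the feature map already tiles evenly, which matches
    the "not usually active" case for standard input sizes.
    """
    pad_h = (window_size - feat_h % window_size) % window_size
    pad_w = (window_size - feat_w % window_size) % window_size
    return pad_h, pad_w


# A feature map that tiles evenly needs no padding.
even_case = window_padding(56, 56, 7)

# A resized input that no longer tiles evenly forces padding on both axes.
resized_case = window_padding(60, 60, 7)
```

Comparing accuracy on inputs where both values are 0 against inputs where they are not is one way to force the situation and see how much the untrained padding matters.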