Ross Wightman
@EIFY did you try forcing non-reentrant checkpointing? Could look at changing the default if that works...
@thepowerfuldeez thanks for the PR, will say that we need to do this one carefully as it impacts the output interface. I recognize that people want this, but it's been...
Quite possibly a data-loading / efficiency problem. I wouldn't recommend CSV-based datasets. Can you compare single-GPU vs 2-GPU stats? And ignore GPU utilization %, what's the...
I never got around to hooking that up to timm based models. It's possible, but requires a bit of extra code to handle the resizing properly....
@mitchellnw wouldn't it be better to highlight the best of each and use the 336, 384, and 378 res results for OpenAI, SigLIP, and DFN?
@Akshay1-6180 the original OpenAI CLIP model has no bias on the final vision and text tower projections, so this was to stick closer to that... but, no reason it wouldn't...
You mean the issue is with --lock-text? Text locking needs support for the base model, see #523 ... that issue is almost working but I think it needs a few changes,...
@EIFY I don't think this is quite the case. In an autocast context it returns float32 because it's upcast to float32 under AMP. But we aren't using this when...
@estherxue does it behave differently than with the normal CLIP (InfoNCE) loss on the exact same setup?
We've done a lot of large-scale training (long durations, big datasets) and never found any noteworthy issues with dataloader memory leaks or the webdataset code. We don't use csv...