Ross Wightman
@rsomani95 I was working w/ the grad caching a bit, but it is a bit messy and I was never satisfied it was working 100% as it should. It also...
@mattdeitke I'm not following this exactly; the L/14 LAION-2B (`laion2b_s32b_b82k`) is the only model that should be returning transforms with 0.5, 0.5, 0.5. Are you seeing that for other models?
@mattdeitke I can assure you that other models here were not trained with 0.5; that option was only added at the time of my last training of L/14 (it wasn't...
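
A minimal sketch of how to check which normalization a given pretrained tag resolves to, assuming the standard `open_clip.create_model_and_transforms` entry point and a torchvision `Normalize` inside the returned preprocessing pipeline:

```python
# Sketch: inspect the image normalization each pretrained tag resolves to.
# Assumes the returned preprocessing is a torchvision Compose containing
# a Normalize transform (the usual open_clip behaviour).
import open_clip
from torchvision.transforms import Normalize

def get_norm(model_name, pretrained):
    _, _, preprocess = open_clip.create_model_and_transforms(model_name, pretrained=pretrained)
    for t in preprocess.transforms:
        if isinstance(t, Normalize):
            return t.mean, t.std
    return None

# laion2b_s32b_b82k should come back with mean/std of (0.5, 0.5, 0.5);
# the other pretrained weights should use the OpenAI CLIP defaults.
print(get_norm('ViT-L-14', 'laion2b_s32b_b82k'))
print(get_norm('ViT-B-32', 'openai'))
```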
@lopho in released timm that model only exists as an in22k fine-tuned variant, but that's changing in a coming PR. I left it in here as I was testing locally and...
I have not spent time on this; it looks like https://huggingface.co/docs/transformers/model_doc/vision-text-dual-encoder might be the approach. Would need to remap the checkpoints from OpenCLIP (I have something hacked together here for...
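
For context, a rough sketch of what the vision-text-dual-encoder route looks like on the `transformers` side, following the linked docs; the OpenCLIP checkpoint remapping is the part that still needs to be worked out and is not shown here:

```python
# Sketch only: assembling a transformers VisionTextDualEncoderModel from
# existing pretrained vision/text backbones, per the HF docs linked above.
# Remapping an OpenCLIP state_dict onto this structure is the missing piece.
from transformers import (
    VisionTextDualEncoderModel,
    VisionTextDualEncoderProcessor,
    AutoImageProcessor,
    AutoTokenizer,
)

model = VisionTextDualEncoderModel.from_vision_text_pretrained(
    "google/vit-base-patch16-224", "bert-base-uncased"
)
processor = VisionTextDualEncoderProcessor(
    AutoImageProcessor.from_pretrained("google/vit-base-patch16-224"),
    AutoTokenizer.from_pretrained("bert-base-uncased"),
)
```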
@mehdidc The exists check and error are there to prevent a mistake where one stomps over a previous run with a new one (i.e. it'd overwrite old checkpoints, etc. if...
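
A minimal sketch of that kind of guard, with illustrative names (`logs_root`, `--name`) rather than the train script's actual arguments:

```python
# Sketch of a "don't stomp on a previous run" guard. Names are illustrative;
# the actual train script's args/paths may differ.
import os

def resolve_experiment_dir(logs_root, name, resume=False):
    exp_dir = os.path.join(logs_root, name)
    if os.path.exists(exp_dir) and not resume:
        # Refuse to reuse the directory so old checkpoints/logs aren't overwritten.
        raise FileExistsError(
            f"Experiment {exp_dir} already exists; pick a new --name or resume explicitly."
        )
    os.makedirs(exp_dir, exist_ok=True)
    return exp_dir
```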
@rom1504 yup, done and working quite nicely with lots of testing on both stability and juwels over the past week and a bit. I tested extensively with tensorboard enabled. I...
@Quan-Sun @gabrielilharco I think the changes are reasonable, I use layer decay extensively in timm fine-tuning these days, and I feel it'd be useful here, especially when initializing one or...
FWIW, timm's LD impl is here https://github.com/rwightman/pytorch-image-models/blob/e98c93264cde1657b188f974dc928b9d73303b18/timm/optim/optim_factory.py#L92-L153 ... all models have a fn that returns the group metadata, and the grouper fn can be used, so that would be basis...
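
A boiled-down sketch of the layer-decay grouping idea, in the spirit of the linked timm code rather than a copy of it; `get_layer_id` is an assumed helper that maps a parameter name to a depth index:

```python
# Sketch of layer-wise LR decay: parameters in earlier layers get a smaller
# learning rate scale, decaying geometrically from the head backwards.
# `get_layer_id` is an assumed helper mapping a parameter name to a layer
# index in [0, num_layers]; the real timm code derives this per architecture.
import torch

def param_groups_layer_decay(model, base_lr, weight_decay, layer_decay, get_layer_id, num_layers):
    groups = {}
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        layer_id = get_layer_id(name)
        scale = layer_decay ** (num_layers - layer_id)  # deeper layers -> larger lr
        wd = 0.0 if param.ndim <= 1 or name.endswith('.bias') else weight_decay
        key = (layer_id, wd)
        if key not in groups:
            groups[key] = {'params': [], 'lr': base_lr * scale, 'weight_decay': wd}
        groups[key]['params'].append(param)
    return list(groups.values())

# usage (illustrative values):
# optimizer = torch.optim.AdamW(
#     param_groups_layer_decay(model, 1e-4, 0.05, 0.75, get_layer_id, num_layers=12))
```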
We should probably let the dust settle on the major PRs being added right now and see what their demands will be in terms of train code and modelling structure before...