Ross Wightman
@lucidrains thinking about possible config designs, curious if you have a full example of what the pydantic based scheme would look like? Does it allow easy interaction with human readable...
@rom1504 for black, adding `--skip-string-normalization` is a little less opinionated and reduces diff quite a bit... But yeah, should probably focus on the major PR and some refactoring / design...
@rom1504 from this point, CLIP models could be supported better in the Hub UI by also adding a community inference pipeline. To get them natively in Transformers, that's another step....
I think we've got two 'easy' options right now, DeepSpeed Zero (PR for this #264 might be worth testing) or PyTorch native FSDP. Talking w/ someone close to TPUs &...
@CloudRR yeah, something is wrong there but really hard to say what it is. BTW that 20% graph is also low, is that actually the one in the README for...
@mitchellnw heh, I was actually thinking about this an hour ago... * saving latest as done now is a bit error prone / slightly wasteful, it's done after the numbered...
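One way to make the "latest" save less error prone is to copy the just-written numbered checkpoint and swap it in with an atomic rename, rather than serializing the state dict a second time. A minimal sketch under that assumption (the function and file names are illustrative, not open_clip's actual code):

```python
import os
import shutil


def save_latest(numbered_path: str, latest_path: str) -> None:
    # Copy the already-saved numbered checkpoint to a temp file,
    # then atomically rename it into place. A crash mid-copy can
    # never leave a half-written "latest" checkpoint behind.
    tmp_path = latest_path + ".tmp"
    shutil.copy2(numbered_path, tmp_path)
    os.replace(tmp_path, latest_path)  # atomic on POSIX filesystems
```

This avoids the double serialization cost and the window where "latest" is partially written.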
@hetong007 no immediate plans to train such a model but a possibility. open to contributions but will close this for now
@mitchellnw that's interesting, I haven't observed that before in previous runs. I went back to check across some old resumes and even in the overlap (where there were logs before...
@mitchellnw coming back to this one, I don't feel the explanation makes sense, logit scale dips should have no correlation with the end of SCI in terms of dataset randomness....
@rom1504 args.checkpoint_path is constrained to the current name by default `args.checkpoint_path = os.path.join(args.logs, args.name, "checkpoints")` but the get_latest_checkpoint fn could be used to search across multiple folders if you passed...
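A minimal sketch of what a checkpoint-search helper along these lines might look like, searching across one or more folders and returning the most recent file (the glob pattern and sorting key are assumptions for illustration, not open_clip's actual implementation):

```python
import glob
import os


def get_latest_checkpoint(root: str):
    # Search the given folder (and any subfolders) for checkpoint
    # files; the "*.pt" naming is an assumption for illustration.
    checkpoints = glob.glob(os.path.join(root, "**", "*.pt"), recursive=True)
    if not checkpoints:
        return None
    # Pick the most recently modified file as the latest checkpoint.
    return max(checkpoints, key=os.path.getmtime)
```

Passing a directory above several run folders would let it resume from the newest checkpoint across all of them.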