Ross Wightman
it's been a long time, but no hard coding is needed anymore, it's passed through to the folder, huggingface, webdataset, and tfds dataset readers .. only torchvision datasets aren't easily supported
@banda-larga I actually looked at this around xmas time and was surprised, the checkpoints were awful (very poor validation). I tried all sensible input normalizations, etc., but always really bad...
actually not really an issue / bug due to head order, fewer layers than described in the issue are chopped off, and I don't feel it makes sense to change
supported on main branch now w/ NHWC output (see #1438 for more)
`bulk_runner.py` does this, been using it for mass benchmark and validation for a while
@JustinMBrown it's a reasonable idea, only issue is that it ends up being a big change, ALL pretrained checkpoints right now are bare state_dict with no extra layer in the...
@lucidrains inconclusive so far, managed to almost match some recent adamw results for large fine-tune, but took a fair bit of search. I feel unless very resource constrained adamw still...
@rom1504 k, will try and look at it soon
overall things look pretty good, I'm trying to get over a mental block re the loss naming, I realize why the feature_a/b changes were made to the loss but I...
@gpucce so discussing here so I might possibly combine this with #660 checks, this was days before my second child was born so yeah, it got lost in the stack...