Ross Wightman
I tried this again and it's still not working: I get past one layer of the export and another is broken. I tried the newer dynamo export, but that is also broken (and broken...
@leng-yue I noticed that, but `# the overhead is compensated only for a drop path rate larger than 0.1` suggested the impact isn't that great if you need at least...
@leng-yue sorry I've got quite a few other tasks to plow through so haven't had a chance to look more closely at this, I do want to test and weight...
@leng-yue yeah, I suppose a new Block would mitigate risk concerns for now, and also fix the breakage of other blocks that don't currently support it. Can figure out how...
@zw615 there were lots of nice updates to XLA on the horizon when I was still using it regularly via the `bits_and_tpu` branch. I was excited to start testing PJRT...
yeah, noticed this one. It is timm oriented but, as always, it bakes in square image size assumptions and puts the downsample at the end of the blocks, so it needs a...
@adamjstewart this one https://github.com/huggingface/pytorch-image-models/blob/main/results/benchmark-infer-amp-nchw-pt113-cu117-rtx3090.csv#L2 .. tinynet_e
@adamjstewart if you need even smaller, we could figure out a model to add an even smaller one to; if it needs weights with an ImageNet classifier, would need to...
@adamjstewart So, 1-2 layer models don't make sense as it breaks the model APIs and all assumptions re feature maps, etc. I have made some much smaller models in the...
Another thing that could be done within test modules, create & register own defs... e.g. If one had a unit test module, within that can define test specific models outside...