Ross Wightman
@bluesky314 I would like to try this, but need to get obj detection training running first, a bit busy for a while so not sure when I'll get to it...
@bluesky314 yeah, it should be fairly straightforward, but I'm still making big improvements in the core model/post processing. One concern I have with the segmentation with the TensorFlow SAME equivalent...
@dmatos2012 thanks for the impl, unfortunately I can't merge. YOLOv5 is GPL-3 licensed so I can't include any code from that project here as it would be in conflict with...
This isn't a bug, it's just functionality not implemented since it's non-trivial. See #89 and #32 ... I'll leave this one open so another issue isn't created. I have no...
@irinushirka colab isn't a normal filesystem, it's a FUSE filesystem on top of cloud storage and doesn't support hardlinks which the saver relies on for robust checkpoint saving (crash recovery)....
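For anyone wanting to check a target directory before pointing checkpoint output at it, here is a minimal sketch that probes for hardlink support. The helper name `supports_hardlinks` is hypothetical and is not part of the saver; it only uses the standard library (`os.link`, `tempfile`).

```python
import os
import tempfile


def supports_hardlinks(directory):
    """Return True if a hardlink can be created inside `directory`.

    Hypothetical helper for illustration only, not part of any checkpoint saver.
    """
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "probe_src")
        dst = os.path.join(tmp, "probe_dst")
        with open(src, "w") as f:
            f.write("probe")
        try:
            # Filesystems without hardlink support typically raise OSError here.
            os.link(src, dst)
        except (OSError, NotImplementedError, AttributeError):
            return False
        # Both names should now refer to the same underlying file.
        return os.path.samefile(src, dst)


if __name__ == "__main__":
    print(supports_hardlinks(tempfile.gettempdir()))
```

If this returns False for the output directory, writing checkpoints to local disk and copying them to the mounted storage afterwards is one possible workaround.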
To be more specific GroupNorm w/ groups=1 normalizes over C, H, W. LayerNorm as used in transformers normalizes over the channel dimension only. Since PyTorch LN doesn't natively support 2d...
@sacmehta the equivalence for GN and LN as per the paper is for NCHW tensors when LN is performed over all of C, H, W (minus the affine part as...
@sacmehta thanks for the update, looks like the channels-only LN is definitely not stable in this architecture.
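The GroupNorm / LayerNorm distinction from the comments above can be checked numerically. This is a small sketch, assuming a random NCHW tensor and no affine parameters; the tensor shape and tolerances are arbitrary choices for illustration. The channels-only variant is done by permuting to NHWC, since `F.layer_norm` normalizes over trailing dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 8, 5, 5)  # NCHW, shape chosen arbitrarily

# GroupNorm with a single group: statistics over (C, H, W) for each sample.
gn = nn.GroupNorm(1, 8, affine=False)
y_gn = gn(x)

# LayerNorm over all of (C, H, W): the same statistics, so the same output
# when the affine part is left out.
y_ln_chw = F.layer_norm(x, tuple(x.shape[1:]))

# Channels-only LayerNorm (transformer style): statistics over C at each
# spatial location. Permute to NHWC so C is the trailing dim, then permute back.
y_ln_c = F.layer_norm(x.permute(0, 2, 3, 1), (8,)).permute(0, 3, 1, 2)

same_as_gn = torch.allclose(y_gn, y_ln_chw, atol=1e-5)
channels_only_differs = not torch.allclose(y_gn, y_ln_c, atol=1e-3)

if __name__ == "__main__":
    print("GN(groups=1) matches LN over C,H,W:", same_as_gn)
    print("channels-only LN differs from GN(groups=1):", channels_only_differs)
```

With affine parameters enabled the two would no longer be interchangeable, since GroupNorm's scale and bias are per channel while a LayerNorm over (C, H, W) would have one per element.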
FYI, item 2 essentially means that all training ends up as ResampledShards(), as the distributed workers all get seeded differently (I confirmed this with a test).
@poor1017 you don't have enough shard files to distribute amongst the dataloader workers across all train processes. If a train process ends up with no shards, it will hang training...
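To see why too few shards is a problem, here is a rough sketch of how shards thin out as they are divided first across train processes and then across dataloader workers. It assumes a simple stride-based split for illustration, which may not match the loader's actual splitting logic; `assign_shards` and `starved_ranks` are hypothetical names.

```python
from itertools import islice


def assign_shards(shards, world_size, num_workers):
    """Map (rank, worker) -> list of shards under an assumed stride-based split."""
    assignment = {}
    for rank in range(world_size):
        # Each train process takes every world_size-th shard, starting at its rank.
        rank_shards = list(islice(shards, rank, None, world_size))
        for worker in range(num_workers):
            # Each dataloader worker then takes every num_workers-th of those.
            assignment[(rank, worker)] = list(
                islice(rank_shards, worker, None, num_workers)
            )
    return assignment


def starved_ranks(shards, world_size, num_workers):
    """Return the ranks whose workers received no shards at all."""
    assignment = assign_shards(shards, world_size, num_workers)
    return [
        rank
        for rank in range(world_size)
        if not any(assignment[(rank, w)] for w in range(num_workers))
    ]


if __name__ == "__main__":
    few = [f"shard-{i:04d}.tar" for i in range(3)]
    enough = [f"shard-{i:04d}.tar" for i in range(8)]
    print(starved_ranks(few, world_size=4, num_workers=2))
    print(starved_ranks(enough, world_size=4, num_workers=2))
```

Under this assumed split, having at least `world_size * num_workers` shards gives every worker in every process something to read.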