OliverRensu
Hi, we follow the PVT code for detection and segmentation: https://github.com/whai362/PVT/tree/v2/detection
Just translate it as "分流" (splitting the flow).

> What would be an accurate Chinese translation of "Shunted" in Shunted-Self-Attention?
Please use torch.utils.checkpoint; we train on 8 A6000 48GB GPUs. We will consider releasing the code.

> I used the shunted transformer in mmdetection and...
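A minimal sketch of how torch.utils.checkpoint can cut activation memory when the backbone does not fit on the GPU. The tiny `Block`/`Net` modules here are illustrative stand-ins, not the Shunted Transformer itself; this assumes a recent PyTorch that accepts the `use_reentrant` keyword.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical tiny residual block standing in for a transformer stage.
class Block(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim * 4)
        self.fc2 = nn.Linear(dim * 4, dim)

    def forward(self, x):
        return x + self.fc2(torch.relu(self.fc1(x)))

class Net(nn.Module):
    def __init__(self, depth=4, dim=64, use_checkpoint=True):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint and x.requires_grad:
                # Recompute blk's activations during backward instead of
                # storing them, trading extra compute for less memory.
                x = checkpoint(blk, x, use_reentrant=False)
            else:
                x = blk(x)
        return x

net = Net()
x = torch.randn(2, 16, 64, requires_grad=True)
loss = net(x).sum()
loss.backward()  # gradients flow through the checkpointed blocks
```

Checkpointing each stage this way usually lets a noticeably larger batch fit at the cost of roughly one extra forward pass in backward.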
Set --input-size to 256 or 128; you may need to finetune slightly.

> My images are 256 or 128 rather than 224. How should I modify the code to use Shunted-S as the backbone?
We will release the pretrained model at the end of this month.
Hi, the checkpoint of Shunted-B has been released at https://drive.google.com/drive/folders/15iZKXFT7apjUSoN2WUMAbb0tvJgyh3YP?usp=sharing

> We will release the pretrained model at the end of this month.
Hi, how about sending more details to my email ***@***.***? We can discuss them there.

> Thanks for your pretrained model. I train the segformer in...
Theoretically, we can split H heads into H modes. The key is how to choose the down-sampling rate $r$. For example, with two modes we choose r=4,8 at stage...
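A minimal sketch of the idea above: heads are split into groups, and each group attends to keys/values down-sampled at its own rate (r=4 and r=8, as in the example). This is an illustrative simplification using average pooling, not the official Shunted implementation; all layer names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoRateAttention(nn.Module):
    # Sketch: half the heads see K/V down-sampled by r=4, the other
    # half by r=8, so different heads capture different scales.
    def __init__(self, dim=64, num_heads=4, rates=(4, 8)):
        super().__init__()
        assert num_heads % len(rates) == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.rates = rates
        self.q = nn.Linear(dim, dim)
        # One K/V projection per head group (one per rate).
        self.kvs = nn.ModuleList(
            nn.Linear(dim, 2 * dim // len(rates)) for _ in rates)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, H, W):
        B, N, C = x.shape
        g = len(self.rates)        # number of head groups
        hpg = self.num_heads // g  # heads per group
        q = self.q(x).reshape(B, N, self.num_heads,
                              self.head_dim).transpose(1, 2)
        outs = []
        for i, r in enumerate(self.rates):
            # Pool the token grid by rate r before computing K/V, so
            # this group's heads attend to a coarser, cheaper context.
            feat = x.transpose(1, 2).reshape(B, C, H, W)
            feat = F.avg_pool2d(feat, r).flatten(2).transpose(1, 2)
            kv = self.kvs[i](feat).reshape(B, -1, 2, hpg, self.head_dim)
            k, v = kv.permute(2, 0, 3, 1, 4)  # each (B, hpg, N_r, hd)
            qi = q[:, i * hpg:(i + 1) * hpg]  # this group's queries
            attn = (qi @ k.transpose(-2, -1)) * self.head_dim ** -0.5
            outs.append(attn.softmax(-1) @ v)
        out = torch.cat(outs, dim=1).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

attn = TwoRateAttention()
y = attn(torch.randn(2, 8 * 8, 64), H=8, W=8)
print(y.shape)  # torch.Size([2, 64, 64])
```

Extending to H modes amounts to making `rates` one entry per head; the practical question remains which rates suit each stage's resolution.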
> 1. In ViT, W is also different for different heads, but it is implemented with one linear layer, which makes it look like a shared weight. For example, W is (*, 512)...
The models are different; parameter count and memory usage are not necessarily proportional. You could consider multiple GPUs or gradient accumulation.

> Question: when training on my own dataset (GPU: 3080 Ti, batch size 32 for both), Swin-B uses about 10.7 GB of memory, but Shunted-B immediately runs out of memory and cannot train. What might be the reason?
>
> PS: I counted the parameters of both models with summary: Swin-B has about 86M parameters, Shunted-B about 39M. Both models use only the forward_features part.
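Gradient accumulation, as suggested above, can be sketched as follows: run several small micro-batches, sum their scaled gradients, and step the optimizer once, so the update matches a larger batch that would not fit in memory. The tiny model and batch sizes here are illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical tiny model; imagine Shunted-B in its place.
model = nn.Linear(16, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

accum_steps = 4  # 4 micro-batches of 8 ~ one effective batch of 32
opt.zero_grad()
for step in range(8):
    x = torch.randn(8, 16)
    y = torch.randint(0, 4, (8,))
    # Scale the loss so the accumulated gradient averages over the
    # effective batch, matching a single large-batch step.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()  # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        opt.step()       # update once per effective batch
        opt.zero_grad()
```

Note that batch-dependent statistics (e.g. BatchNorm) still see only the micro-batch, which is one reason results can differ slightly from true large-batch training.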