ViT3D for Brain CT scans

Open Meddebma opened this issue 3 years ago • 6 comments

Dear lucidrains, thank you so much for this amazing code. I've tried to adapt the ViT to my 3D brain CT images, but unfortunately the training did not work as hoped. I would be very grateful if you could check my notebook and tell me whether you see a mistake. I usually use the MONAI framework and got an accuracy of 0.91 with DenseNet121 on this classification task.

https://github.com/Meddebma/pyradiomics/blob/master/Classification_HBI_ViT.ipynb

Thank you very much!

Meddebma avatar Nov 26 '21 15:11 Meddebma

@Meddebma you should try extending some of the more recent ViT variants to 3d, the ones with hierarchy, position generating convolutions, as well as local / global inductive biases. they should train a lot faster

lucidrains avatar Nov 26 '21 18:11 lucidrains

@lucidrains thanks a lot, do you have a finished notebook for 3D images? I can't find one.

Meddebma avatar Nov 26 '21 18:11 Meddebma

@Meddebma no, this repository is just for images, but you can extend any one of these architectures by simply accounting for the extra dimension

lucidrains avatar Nov 26 '21 20:11 lucidrains
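
To make "accounting for the extra dimension" concrete, here is a minimal sketch of a plain ViT on volumes, not code from the repository. The names `PatchEmbed3D` and `ViT3D`, the default sizes, and the use of `nn.TransformerEncoder` with mean pooling are all assumptions chosen for illustration. The only 3D-specific part is the patch embedding: a `Conv3d` whose kernel and stride both equal the patch size, which amounts to a linear projection of each non-overlapping 3D patch.

```python
import torch
from torch import nn


class PatchEmbed3D(nn.Module):
    """Hypothetical helper: cut a volume into non-overlapping 3D patches
    and project each patch to a token of size `dim`."""

    def __init__(self, volume_size=(64, 128, 128), patch_size=(8, 16, 16),
                 channels=1, dim=256):
        super().__init__()
        assert all(v % p == 0 for v, p in zip(volume_size, patch_size)), \
            "each volume dimension must be divisible by the patch size"
        grid = [v // p for v, p in zip(volume_size, patch_size)]
        num_patches = grid[0] * grid[1] * grid[2]
        # kernel == stride, so every output voxel sees exactly one patch
        self.proj = nn.Conv3d(channels, dim, kernel_size=patch_size,
                              stride=patch_size)
        # learned absolute position embedding, one vector per patch
        self.pos_emb = nn.Parameter(torch.randn(1, num_patches, dim) * 0.02)

    def forward(self, x):                      # x: (b, c, depth, height, width)
        x = self.proj(x)                       # (b, dim, d', h', w')
        x = x.flatten(2).transpose(1, 2)       # (b, num_patches, dim)
        return x + self.pos_emb


class ViT3D(nn.Module):
    """Sketch of a single-stage ViT classifier on 3D volumes."""

    def __init__(self, *, num_classes, dim=256, depth=4, heads=8,
                 mlp_dim=512, **patch_kwargs):
        super().__init__()
        self.embed = PatchEmbed3D(dim=dim, **patch_kwargs)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=mlp_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Sequential(nn.LayerNorm(dim),
                                  nn.Linear(dim, num_classes))

    def forward(self, x):
        tokens = self.encoder(self.embed(x))   # (b, num_patches, dim)
        return self.head(tokens.mean(dim=1))   # mean-pool tokens, then classify
```

A call such as `ViT3D(num_classes=2, volume_size=(16, 32, 32), patch_size=(8, 16, 16))` on a tensor of shape `(batch, 1, 16, 32, 32)` returns logits of shape `(batch, 2)`. The token count grows with the product of three grid dimensions, so patch size matters more for memory here than in 2D.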

dear @lucidrains,

I am trying to convert the crossformer to 3D, but I am honestly struggling with it. Could you please help me out with the lines I should change? Or would you recommend another architecture for 3D brain slices?

Thank you very much.

https://github.com/lucidrains/vit-pytorch/blob/79c864d7964e27043c2c4bc42627ba13b2eea9cf/vit_pytorch/crossformer.py#L204

Meddebma avatar Nov 29 '21 16:11 Meddebma

@Meddebma crossformer may be tricky, as you'd need to extend the dynamic positional embedding bias to 3d

lucidrains avatar Nov 30 '21 19:11 lucidrains
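
One way to read the 3D extension of a dynamic positional bias is sketched below. It assumes the 2D version feeds a relative (row, column) offset through a small MLP to get an attention bias, and simply widens the input to a (depth, row, column) offset. `DynamicPositionBias3D`, its hidden size, and the single shared scalar bias are hypothetical choices for illustration, not the crossformer code.

```python
import torch
from torch import nn


class DynamicPositionBias3D(nn.Module):
    """Hypothetical 3D relative position bias for a cubic attention window.
    An MLP maps each relative offset (dz, dy, dx) to one scalar bias."""

    def __init__(self, window_size, hidden_dim=32):
        super().__init__()
        w = window_size
        self.mlp = nn.Sequential(nn.Linear(3, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, 1))
        # every possible offset in [-(w-1), w-1]^3, enumerated in "ij" order
        r = torch.arange(-(w - 1), w)
        offsets = torch.stack(torch.meshgrid(r, r, r, indexing="ij"), dim=-1)
        offsets = offsets.reshape(-1, 3).float()          # ((2w-1)^3, 3)
        # coordinates of the w^3 tokens inside one window
        p = torch.arange(w)
        grid = torch.stack(torch.meshgrid(p, p, p, indexing="ij"), dim=-1)
        grid = grid.reshape(-1, 3)                        # (w^3, 3)
        # shift pairwise offsets to [0, 2w-2] and flatten to a lookup index
        rel = grid[:, None, :] - grid[None, :, :] + (w - 1)
        s = 2 * w - 1
        index = rel[..., 0] * s * s + rel[..., 1] * s + rel[..., 2]
        self.register_buffer("offsets", offsets, persistent=False)
        self.register_buffer("index", index, persistent=False)

    def forward(self):
        bias = self.mlp(self.offsets).squeeze(-1)         # ((2w-1)^3,)
        return bias[self.index]                           # (w^3, w^3)
```

The returned matrix would be added to the attention logits of a window containing `window_size ** 3` tokens. Token pairs with the same relative offset share the same bias value, which is the property the lookup index is built to guarantee.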

@Meddebma just do the original ViT, but do four stages with downsampling, and perhaps use 3d convs in the feedforwards

lucidrains avatar Nov 30 '21 19:11 lucidrains
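
The suggestion above (four stages with downsampling, 3D convs in the feedforwards) could look roughly like the following sketch. Everything here is an assumption made for illustration: the class names, the stage widths, depths and head counts, the strided `Conv3d` used for downsampling, the depthwise 3x3x3 convolution inside the feedforward, and the absence of an explicit position embedding (the sketch relies on the zero-padded convolutions to carry positional information).

```python
import torch
from torch import nn


class ConvFeedForward3D(nn.Module):
    """Feedforward on (b, c, d, h, w): pointwise convs around a depthwise 3D conv."""

    def __init__(self, dim, mult=4):
        super().__init__()
        hidden = dim * mult
        self.net = nn.Sequential(
            nn.Conv3d(dim, hidden, 1),
            nn.Conv3d(hidden, hidden, 3, padding=1, groups=hidden),  # depthwise
            nn.GELU(),
            nn.Conv3d(hidden, dim, 1),
        )

    def forward(self, x):
        return self.net(x)


class Block3D(nn.Module):
    """Full self-attention over all voxels of the stage, then the conv feedforward."""

    def __init__(self, dim, heads):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.GroupNorm(1, dim)   # normalizes over channels and space
        self.ff = ConvFeedForward3D(dim)

    def forward(self, x):
        b, c, d, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)                   # (b, d*h*w, c)
        n = self.norm1(t)
        t = t + self.attn(n, n, n, need_weights=False)[0]  # residual attention
        x = t.transpose(1, 2).reshape(b, c, d, h, w)
        return x + self.ff(self.norm2(x))                  # residual feedforward


class StagedViT3D(nn.Module):
    """Four stages; each starts with a strided Conv3d that downsamples the volume."""

    def __init__(self, num_classes, channels=1, dims=(32, 64, 128, 256),
                 depths=(1, 1, 2, 1), heads=(1, 2, 4, 8)):
        super().__init__()
        layers, in_dim = [], channels
        for i, (dim, depth, h) in enumerate(zip(dims, depths, heads)):
            stride = 4 if i == 0 else 2     # total downsampling factor: 32
            layers.append(nn.Conv3d(in_dim, dim, kernel_size=stride, stride=stride))
            layers += [Block3D(dim, h) for _ in range(depth)]
            in_dim = dim
        self.stages = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.LayerNorm(dims[-1]),
                                  nn.Linear(dims[-1], num_classes))

    def forward(self, x):                   # x: (b, c, depth, height, width)
        x = self.stages(x)
        return self.head(x.mean(dim=(2, 3, 4)))   # global average pool
```

With these defaults each input dimension must be divisible by 32. Attention in the first stage runs over every voxel of the 4x-downsampled volume, so for full-size CT scans a larger first stride or windowed attention would likely be needed to keep memory manageable.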