ElasticFace

CUDA error (core dump)

Open xuanson97hg opened this issue 3 years ago • 3 comments

When I train on my custom dataset with the default config.num_classes and config.num_image, it works. If I set config.num_classes and config.num_image to the number of classes and the number of images in my dataset, it fails with a CUDA error and a core dump.

xuanson97hg Nov 24 '21 03:11

How many identities does your training dataset have? You may want to try a smaller batch size so that the model fits in GPU memory.

fdbtrs Dec 16 '21 14:12

How many identities does your training dataset have? You may want to try a smaller batch size so that the model fits in GPU memory.

I tried a smaller batch size; the code then runs for a few minutes and core dumps again. My dataset has 12k images.

xuanson97hg Jan 06 '22 03:01

ResNet-100 with a batch size of 512 requires approximately 4 GPUs with 16 GB each. You need to set the batch size based on the number of GPUs and the memory available on each GPU.
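
As a rough starting point, the figure above works out to about 128 samples per 16 GB GPU (512 / 4). The sketch below scales that reference point to other setups. It assumes memory use grows linearly with batch size, which is only an approximation, since the model weights and the classification layer take a fixed share of memory regardless of batch size. The function names and the rounding to a multiple of 8 are illustrative and not part of the ElasticFace code.

```python
# Reference point taken from the comment above:
# ResNet-100, total batch 512, spread over 4 GPUs with 16 GB each.
REF_TOTAL_BATCH = 512
REF_NUM_GPUS = 4
REF_MEM_GB = 16


def per_gpu_batch_size(gpu_mem_gb, multiple_of=8):
    """Estimate a per-GPU batch size by linear scaling (an assumption).

    Treat the result as an upper bound to try first, and lower it
    if the run still fails with out-of-memory errors.
    """
    ref_per_gpu = REF_TOTAL_BATCH // REF_NUM_GPUS  # 128 samples per 16 GB
    scaled = int(ref_per_gpu * gpu_mem_gb / REF_MEM_GB)
    scaled -= scaled % multiple_of  # round down to a tidy multiple
    if scaled < 1:
        raise ValueError("GPU memory too small for this rough estimate")
    return scaled


def total_batch_size(num_gpus, gpu_mem_gb):
    """Total batch across all GPUs, given the per-GPU estimate."""
    return num_gpus * per_gpu_batch_size(gpu_mem_gb)


if __name__ == "__main__":
    for gpus, mem in [(4, 16), (1, 16), (2, 8), (1, 11)]:
        print(f"{gpus} GPU(s) x {mem} GB: "
              f"per-GPU {per_gpu_batch_size(mem)}, "
              f"total {total_batch_size(gpus, mem)}")
```

The per-GPU value is what would go into whichever batch-size setting the training config exposes. If the crash persists even at a small batch size, memory is probably not the only cause.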

fdbtrs Jan 30 '22 22:01