Fewshot_Detection
Training on COCO
I've converted the COCO dataset into the VOC style: I generated the per-class flag txt files in ImageSets and rewrote label and label_1c for COCO so that they produce the label txt files.
I don't think it is as straightforward as it sounds.
Actually, the data split for COCO is also released in the "data" folder. Just change the dataset config and the number of classes, and everything should be good.
Although you provide process_coco.py, it doesn't work, and I think it is missing the flag txt files for each class.
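For reference, here is a minimal sketch of the kind of conversion I mean, assuming the standard VOC ImageSets/Main presence-flag format (image stem followed by 1 or -1) and darknet-style normalized label txts; the annotation path, output folders, and file naming below are my assumptions, not necessarily what process_coco.py expects:

import os
from pycocotools.coco import COCO

ANN_FILE = 'annotations/instances_train2014.json'   # assumed annotation path
LABEL_DIR = 'labels'                                 # darknet-style label txts, one per image
IMAGESET_DIR = 'ImageSets/Main'                      # per-class flag txts, VOC style

for d in (LABEL_DIR, IMAGESET_DIR):
    if not os.path.isdir(d):
        os.makedirs(d)

coco = COCO(ANN_FILE)
cat_ids = coco.getCatIds()
cat_index = dict((cid, i) for i, cid in enumerate(cat_ids))   # contiguous 0..79 class ids

# One presence-flag file per class: "<image stem> 1" if the class appears, "-1" otherwise.
flag_files = {}
for cid in cat_ids:
    name = coco.loadCats([cid])[0]['name'].replace(' ', '_')
    flag_files[cid] = open(os.path.join(IMAGESET_DIR, name + '_train.txt'), 'w')

for img in coco.loadImgs(coco.getImgIds()):
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img['id'], iscrowd=False))
    stem = os.path.splitext(img['file_name'])[0]
    present = set()
    with open(os.path.join(LABEL_DIR, stem + '.txt'), 'w') as f:
        for ann in anns:
            x, y, w, h = ann['bbox']                  # COCO bbox: top-left corner + size, in pixels
            cx = (x + w / 2.0) / img['width']         # darknet label: normalized center + size
            cy = (y + h / 2.0) / img['height']
            f.write('%d %.6f %.6f %.6f %.6f\n' % (cat_index[ann['category_id']], cx, cy,
                                                  w / float(img['width']),
                                                  h / float(img['height'])))
            present.add(ann['category_id'])
    for cid in cat_ids:
        flag_files[cid].write('%s %d\n' % (stem, 1 if cid in present else -1))

for fh in flag_files.values():
    fh.close()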
And the batch-size setting:
batch=64
subdivisions=8
fills the GPU memory and raises an out-of-memory error.
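For reference, my understanding of the Darknet convention is that the images in a batch are processed in "subdivisions" sequential chunks, so only batch/subdivisions images should be resident on the GPU per step. Here is a small sketch for checking that number from the cfg; the cfg file name is hypothetical, and whether this PyTorch port actually honors subdivisions is an assumption:

# Effective per-step minibatch under the standard Darknet convention,
# where "batch" images are split into "subdivisions" sequential chunks.
def read_net_options(cfg_path):
    opts = {}
    section = None
    with open(cfg_path) as f:
        for line in f:
            line = line.split('#')[0].strip()      # drop comments and whitespace
            if line.startswith('['):
                section = line
            elif '=' in line and section == '[net]':
                key, val = [s.strip() for s in line.split('=', 1)]
                opts[key] = val
    return opts

opts = read_net_options('cfg/metayolo.cfg')        # hypothetical cfg file name
per_step = int(opts['batch']) // int(opts['subdivisions'])
print('images resident on the GPU per step (Darknet semantics): %d' % per_step)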
Now I have changed the batch-size setting to:
batch=8
subdivisions=8
and even tried the smallest batch size:
batch=4
subdivisions=4
The forward pass now succeeds, but I hit the same out-of-memory error during the backward pass:
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "train_meta.py", line 344, in <module>
train(epoch)
File "train_meta.py", line 242, in train
loss.backward()
File "/home/aringsan/anaconda2/envs/pytorch2/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/aringsan/anaconda2/envs/pytorch2/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58
Could you share your COCO settings and your hardware? I am using 4 Titan XP GPUs, each with 12 GB of memory.
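One thing I have been considering, though I cannot confirm whether train_meta.py already does it, is spreading each per-step minibatch across several of the cards with torch.nn.DataParallel. A minimal sketch with a stand-in model, assuming a reasonably recent PyTorch:

import torch
import torch.nn as nn

# Stand-in model only; the real network in this repo is built from its cfg files.
model = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.Conv2d(32, 30, 1))

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)          # split each minibatch across the visible GPUs
model = model.cuda()

images = torch.randn(8, 3, 416, 416).cuda()     # dummy 416x416 YOLO-sized inputs
preds = model(images)                           # each card holds only 8 / num_gpus images
loss = preds.mean()
loss.backward()                                 # gradients are reduced back onto GPU 0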
Thanks
Hello, I ran into the same problem as you. Have you solved it? Thanks very much!