keypoint_rcnn_training_pytorch

How to train my own dataset?

Open madenbr opened this issue 3 years ago • 11 comments

Congratulations on this work; it is a nice project. I followed your instructions for training, and it works when I have two keypoints. Now I have 10 keypoints, but training did not work when I had not marked all of the keypoints in every image. How can I handle this?

madenbr avatar Feb 16 '22 15:02 madenbr

I've trained this model with 8 keypoints, and it works very well.

It's important to have a large dataset to train the model well.

alexppppp avatar Feb 16 '22 16:02 alexppppp

The model works well if all keypoints are marked. If not every keypoint is annotated in an image, some keypoints become null when the txt files are converted to JSON. When I start training, it stops.

madenbr avatar Feb 16 '22 19:02 madenbr

There are two ways to solve the problem: (a) mark all the unmarked keypoints, or (b) remove the images where not all keypoints are marked.
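Option (b) can be sketched in a few lines. This is only an illustration: it assumes each annotation is a dict with `bboxes` and `keypoints` keys, where `keypoints` holds one list of `[x, y, visibility]` triples per object. The helper name `is_fully_annotated` and the sample file names are hypothetical.

```python
def is_fully_annotated(annotation, num_keypoints):
    """Return True if every object has num_keypoints complete [x, y, visibility] triples."""
    objects = annotation.get("keypoints") or []
    if not objects:
        return False
    for obj in objects:
        # An object with missing or extra keypoints counts as incomplete.
        if obj is None or len(obj) != num_keypoints:
            return False
        for kp in obj:
            # A null or partially filled keypoint also counts as incomplete.
            if kp is None or len(kp) != 3 or any(v is None for v in kp):
                return False
    return True


# Hypothetical in-memory annotations keyed by file name.
samples = {
    "img_001.json": {"bboxes": [[10, 10, 50, 50]], "keypoints": [[[12, 14, 1], [40, 44, 1]]]},
    "img_002.json": {"bboxes": [[10, 10, 50, 50]], "keypoints": [[[12, 14, 1], None]]},
    "img_003.json": {"bboxes": [[10, 10, 50, 50]], "keypoints": [[[12, 14, 1]]]},
}

keep = sorted(name for name, ann in samples.items() if is_fully_annotated(ann, num_keypoints=2))
drop = sorted(set(samples) - set(keep))
print("keep:", keep)
print("drop:", drop)
```

Only the files in `keep` would then be passed to the training dataset; the files in `drop` can be re-annotated or set aside.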

alexppppp avatar Feb 16 '22 19:02 alexppppp

I annotated all the points in the images, but I am getting this error.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_8167/1220785760.py in <module>
     11 
     12 model = get_model(num_keypoints = 5)
---> 13 model.to(device)
     14 
     15 params = [p for p in model.parameters() if p.requires_grad]

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
    897             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    898 
--> 899         return self._apply(convert)
    900 
    901     def register_backward_hook(

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    591             # `with torch.no_grad():`
    592             with torch.no_grad():
--> 593                 param_applied = fn(param)
    594             should_use_set_data = compute_should_use_set_data(param, param_applied)
    595             if should_use_set_data:

~/anaconda3/envs/point/lib/python3.8/site-packages/torch/nn/modules/module.py in convert(t)
    895                 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    896                             non_blocking, memory_format=convert_to_format)
--> 897             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    898 
    899         return self._apply(convert)

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
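The last line of the message names a debugging aid. A minimal sketch of using it from a notebook, assuming the variable is set in the first cell of a freshly restarted kernel, before `torch` is imported or any CUDA call is made (`enable_blocking_cuda_launches` is a hypothetical helper name):

```python
import os


def enable_blocking_cuda_launches():
    """Ask CUDA to run kernels synchronously so an error surfaces at the failing call."""
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_cuda_launches()
# Import torch, build the model, and run training only after this point.
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Because the error is reported asynchronously, the `model.to(device)` line in the traceback may not be the real source. Running a single batch on the CPU is another common way to get a readable Python exception.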

madenbr avatar Feb 17 '22 12:02 madenbr

I don't know; I didn't get this error.

If you share your notebook and dataset, I can check it once I have some free time.

alexppppp avatar Feb 17 '22 14:02 alexppppp

How can I send my dataset? By email or Drive?

madenbr avatar Feb 18 '22 09:02 madenbr

What is your email?

alexppppp avatar Mar 10 '22 16:03 alexppppp

@madenburak you are getting this error because you don't have enough memory in your system to process the batch of images.

sowmyakavali avatar Jun 29 '22 07:06 sowmyakavali

@madenburak How did you mark keypoints that are not visible in the image? I set them to [0,0,0], because in [x,y,visibility], visibility = 0 means that the keypoint is not visible.
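A minimal sketch of that convention, assuming the raw annotation for one object is a list in which unmarked keypoints appear as `None`, as `[]`, or are missing from the end. The function name `fill_missing_keypoints` is hypothetical, and whether the rest of the training pipeline accepts `[0, 0, 0]` entries is not confirmed here.

```python
def fill_missing_keypoints(keypoints, num_keypoints):
    """Return num_keypoints [x, y, visibility] triples, using [0, 0, 0] for gaps."""
    filled = []
    for i in range(num_keypoints):
        kp = keypoints[i] if i < len(keypoints) else None
        if not kp:
            # None or [] stands for a keypoint that was never marked.
            filled.append([0, 0, 0])
        else:
            filled.append(list(kp))
    return filled


raw = [[120, 80, 1], [], None, [200, 150, 1]]
padded = fill_missing_keypoints(raw, num_keypoints=5)
print(padded)
# [[120, 80, 1], [0, 0, 0], [0, 0, 0], [200, 150, 1], [0, 0, 0]]
```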

HanSeulChung avatar Apr 12 '23 07:04 HanSeulChung

@alexppppp

[[530, 555, 4400, 2025]]

ValueError                                Traceback (most recent call last)
in <cell line: 21>()
     20 
     21 for epoch in range(num_epochs):
---> 22     train_one_epoch(model, optimizer, data_loader_train, device, epoch, print_freq=1000)
     23     lr_scheduler.step()
     24     evaluate(model, data_loader_test, device)

13 frames
/usr/local/lib/python3.9/dist-packages/albumentations/core/keypoints_utils.py in convert_keypoint_to_albumentations(keypoint, source_format, rows, cols, check_validity, angle_in_degrees)
    197 
    198     if source_format == "xy":
--> 199         if len(keypoint[:2])== 0 | len(keypoint[2:])==0:
    200             (x, y), tail = [0,0], tuple(0, 0)
    201         else:

ValueError: not enough values to unpack (expected 2, got 0)

I have 5 keypoints, and some of them are not visible in the image, so after annotation I changed each empty list to [0,0,0]. What can I do?

HanSeulChung avatar Apr 12 '23 07:04 HanSeulChung

Please check your dataset. If a value in the dataset's annotations is empty, the above error occurs. I deleted the files that had empty values, and then it worked.
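A rough sketch of such a check, assuming one JSON file per image with `bboxes` and `keypoints` keys (the layout is an assumption, and `has_empty_value` and `find_empty_annotations` are hypothetical names). It only reports the affected files; it does not delete anything.

```python
import json
import tempfile
from pathlib import Path


def has_empty_value(value):
    """True if value is None, an empty list, or contains one at any nesting depth."""
    if value is None:
        return True
    if isinstance(value, list):
        return len(value) == 0 or any(has_empty_value(v) for v in value)
    return False


def find_empty_annotations(annotation_dir):
    """Return the names of JSON files with an empty or null bbox/keypoint entry."""
    bad = []
    for path in sorted(Path(annotation_dir).glob("*.json")):
        data = json.loads(path.read_text())
        if has_empty_value(data.get("bboxes")) or has_empty_value(data.get("keypoints")):
            bad.append(path.name)
    return bad


# Demonstration on two throwaway files; the values are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    good = {"bboxes": [[530, 555, 4400, 2025]], "keypoints": [[[600, 700, 1], [900, 1000, 1]]]}
    broken = {"bboxes": [[530, 555, 4400, 2025]], "keypoints": [[[600, 700, 1], []]]}
    Path(tmp, "a.json").write_text(json.dumps(good))
    Path(tmp, "b.json").write_text(json.dumps(broken))
    flagged = find_empty_annotations(tmp)

print(flagged)
```

Pointing `find_empty_annotations` at the real annotations folder would list the files to fix or remove before training.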

ericfried1204 avatar Jan 10 '24 14:01 ericfried1204