pytorch-dense-correspondence

Can't train tutorial on shoes.

Open DanielVlasic opened this issue 6 years ago • 5 comments

I downloaded the shoe data (https://github.com/RobotLocomotion/pytorch-dense-correspondence/blob/master/config/dense_correspondence/dataset/composite/shoes_all.yaml) and tried going through the tutorial training with it.

I received a "warning, empty mask b" message, followed by a "float division by zero" error.

Also, I'm not sure which training config is appropriate for the shoe data.

DanielVlasic avatar Sep 17 '19 21:09 DanielVlasic

It looks like one of the masks might be empty; one of the logs may be corrupted. Could you post the full error message? In general, training on the shoes can be done the same way as for the caterpillar in the tutorial if you want a class-consistent shoe network.

For some of the code used in the shoe experiments you can take a look at https://github.com/RobotLocomotion/pytorch-dense-correspondence/blob/master/dense_correspondence/experiments/shoes_consistent/training_shoes.ipynb.

manuelli avatar Sep 17 '19 23:09 manuelli

Here are some details.

I executed the tutorial: dense_correspondence/training/training_tutorial.ipynb.

I set the config to: config_filename = os.path.join(utils.getDenseCorrespondenceSourceDir(), 'config', 'dense_correspondence', 'dataset', 'composite', 'shoes_all.yaml')

Here is the full output of the training cell:

training descriptor of dimension 3
using SINGLE_OBJECT_WITHIN_SCENE
logging_dir: /home/dvlasic/data/pdc/trained_models/tutorials/shoes_3
Downloading: "https://download.pytorch.org/models/resnet34-333f7ec4.pth" to /home/dvlasic/.cache/torch/checkpoints/resnet34-333f7ec4.pth
100.0%
/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py:2622: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.")
/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2 * np.dot(q,r)**2 - 1)
/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2 * np.dot(q,r)**2 - 1)

empty data, continuing

/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2 * np.dot(q,r)**2 - 1)

empty data, continuing

empty data, continuing

empty data, continuing

warning, empty mask b

ZeroDivisionError Traceback (most recent call last)
in ()
      5 print "training descriptor of dimension %d" %(d)
      6 train = DenseCorrespondenceTraining(dataset=dataset, config=train_config)
----> 7 train.run()
      8 print "finished training descriptor of dimension %d" %(d)

/home/dvlasic/code/dense_correspondence/training/training.pyc in run(self, loss_current_iteration, use_pretrained)
    340 masked_non_matches_a, masked_non_matches_b,
    341 background_non_matches_a, background_non_matches_b,
--> 342 blind_non_matches_a, blind_non_matches_b)
    343
    344

/home/dvlasic/code/dense_correspondence/loss_functions/loss_composer.pyc in get_loss(pixelwise_contrastive_loss, match_type, image_a_pred, image_b_pred, matches_a, matches_b, masked_non_matches_a, masked_non_matches_b, background_non_matches_a, background_non_matches_b, blind_non_matches_a, blind_non_matches_b)
     31 masked_non_matches_a, masked_non_matches_b,
     32 background_non_matches_a, background_non_matches_b,
---> 33 blind_non_matches_a, blind_non_matches_b)
     34
     35 if (match_type == SpartanDatasetDataType.SINGLE_OBJECT_ACROSS_SCENE).all():

/home/dvlasic/code/dense_correspondence/loss_functions/loss_composer.pyc in get_within_scene_loss(pixelwise_contrastive_loss, image_a_pred, image_b_pred, matches_a, matches_b, masked_non_matches_a, masked_non_matches_b, background_non_matches_a, background_non_matches_b, blind_non_matches_a, blind_non_matches_b)
     82 matches_a, matches_b,
     83 masked_non_matches_a, masked_non_matches_b,
---> 84 M_descriptor=pcl._config["M_masked"])
     85
     86 if pcl._config["use_l2_pixel_loss_on_background_non_matches"]:

/home/dvlasic/code/dense_correspondence/loss_functions/pixelwise_contrastive_loss.pyc in get_loss_matched_and_non_matched_with_l2(self, image_a_pred, image_b_pred, matches_a, matches_b, non_matches_a, non_matches_b, M_descriptor, M_pixel, non_match_loss_weight, use_l2_pixel_loss)
     83
     84
---> 85 match_loss, _, _ = PCL.match_loss(image_a_pred, image_b_pred, matches_a, matches_b)
     86
     87

/home/dvlasic/code/dense_correspondence/loss_functions/pixelwise_contrastive_loss.pyc in match_loss(image_a_pred, image_b_pred, matches_a, matches_b)
    163 matches_b_descriptors = matches_b_descriptors.unsqueeze(0)
    164
--> 165 match_loss = 1.0 / num_matches * (matches_a_descriptors - matches_b_descriptors).pow(2).sum()
    166
    167 return match_loss, matches_a_descriptors, matches_b_descriptors

ZeroDivisionError: float division by zero

DanielVlasic avatar Sep 18 '19 14:09 DanielVlasic
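
A side note on the repeated "invalid value encountered in arccos" warning in the log above: one common cause is floating-point rounding pushing the argument of arccos slightly outside [-1, 1], for example when two quaternions are nearly identical. Whether that is what happens here is an assumption; the log alone doesn't confirm it. A minimal sketch of clamping the argument, using only the standard library (the function name `theta_from_quaternions` is hypothetical and not part of the repository):

```python
import math


def theta_from_quaternions(q, r):
    """Evaluate 2*arccos(2*dot(q, r)**2 - 1) with the arccos argument clamped.

    Hypothetical helper mirroring the expression shown in the warning;
    q and r are 4-element sequences of floats.
    """
    dot = sum(qi * ri for qi, ri in zip(q, r))
    cos_arg = 2.0 * dot * dot - 1.0
    # Rounding can leave cos_arg a hair outside [-1, 1]; clamp it first.
    cos_arg = max(-1.0, min(1.0, cos_arg))
    return 2.0 * math.acos(cos_arg)
```

Note that `math.acos` raises a `ValueError` on out-of-range input, whereas `np.arccos` returns NaN with a warning, so in NumPy code the analogous step would be to clip the argument before calling `np.arccos`.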

Thanks, I can fix this

peteflorence avatar Sep 19 '19 19:09 peteflorence

Hi Daniel, sorry to be so slow. Does this commit fix your issue? https://github.com/RobotLocomotion/pytorch-dense-correspondence/commit/ad541fca840f8a07c2bd42b08564bd645341faa8 We have fixed this issue in our private branch, and I think this should be all you need. Let me know if it doesn't work.

peteflorence avatar Oct 04 '19 17:10 peteflorence

Also, I am working on getting the new code open-sourced; it should be out soon.

peteflorence avatar Oct 04 '19 17:10 peteflorence
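
For readers who hit the same error: the traceback ends in `match_loss`, where `1.0 / num_matches` is evaluated with `num_matches` equal to zero, which fits the "empty mask" warning printed just before it. The following is a minimal sketch of one way to guard against that case; it is not a reproduction of the linked commit. It uses plain Python lists instead of PyTorch tensors so that it runs on its own, and the names `safe_match_loss` and `losses_for_batches` are hypothetical.

```python
def safe_match_loss(descriptors_a, descriptors_b):
    """Mean squared distance over matched descriptor pairs.

    Returns None when there are no matches, so the caller can skip the
    sample instead of dividing by zero.
    """
    num_matches = len(descriptors_a)
    if num_matches == 0:
        return None
    total = 0.0
    for da, db in zip(descriptors_a, descriptors_b):
        total += sum((x - y) ** 2 for x, y in zip(da, db))
    return total / num_matches


def losses_for_batches(batches):
    """Compute the loss per batch, skipping batches without matches."""
    losses = []
    for descriptors_a, descriptors_b in batches:
        loss = safe_match_loss(descriptors_a, descriptors_b)
        if loss is None:
            continue  # no correspondences (e.g. an empty mask); skip
        losses.append(loss)
    return losses
```

In the actual training loop the same idea would presumably mean checking the number of matches before the loss is computed and moving on to the next sample, much as the existing "empty data, continuing" messages suggest is already done for other empty inputs.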