Question: AssertionError
Describe the question
I doubt it's a bug; this is just a question.
I am new to PyTorch and related tools. I get this error with my own images, but not with images from a public dataset:
AssertionError:
It doesn't say much: there is nothing after the colon, just "AssertionError:" and nothing more. How can I figure out what I am doing wrong?
I wonder if it has to do with my images. They are all the same size, and I hoped the default transformation would be enough.
Code and Data
Share the code that caused the error as well as an example image/label from your dataset.
I started with your Colab notebook at https://colab.research.google.com/drive/1ISaTV5F-7b4i2QqtjTa7ToDPQ2k8qEe0 and used a few of my own images.
Code:
losses = model.fit(loader, val_dataset, epochs=3, verbose=True)
Output:
It looks like you're training your model on a CPU. Consider switching to a GPU; otherwise, this method can take hours upon hours or even days to finish. For more information, see https://detecto.readthedocs.io/en/latest/usage/quickstart.html#technical-requirements
Epoch 1 of 3
Begin iterating over training dataset
0%| | 0/15 [00:00<?, ?it/s]
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/tmp/ipykernel_77/1923109836.py in <cell line: 13>()
11 # Train the model! This step can take a while, so make sure you
12 # the GPU is turned on in Edit -> Notebook settings
---> 13 losses = model.fit(loader, val_dataset, epochs=3, verbose=True)
14
15 # Plot the accuracy over time
~/.conda/envs/default/lib/python3.9/site-packages/detecto/core.py in fit(self, dataset, val_dataset, epochs, learning_rate, momentum, weight_decay, gamma, lr_step_size, verbose)
521 # Calculate the model's loss (i.e. how well it does on the current
522 # image and target, with a lower loss being better)
--> 523 loss_dict = self._model(images, targets)
524 total_loss = sum(loss for loss in loss_dict.values())
525
~/.conda/envs/default/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
97 features = OrderedDict([("0", features)])
98 proposals, proposal_losses = self.rpn(images, features, targets)
---> 99 detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
100 detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes) # type: ignore[operator]
101
~/.conda/envs/default/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/models/detection/roi_heads.py in forward(self, features, proposals, image_shapes, targets)
749 matched_idxs = None
750
--> 751 box_features = self.box_roi_pool(features, proposals, image_shapes)
752 box_features = self.box_head(box_features)
753 class_logits, box_regression = self.box_predictor(box_features)
~/.conda/envs/default/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/ops/poolers.py in forward(self, x, boxes, image_shapes)
325 x_filtered = _filter_input(x, self.featmap_names)
326 if self.scales is None or self.map_levels is None:
--> 327 self.scales, self.map_levels = _setup_scales(
328 x_filtered, image_shapes, self.canonical_scale, self.canonical_level
329 )
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/ops/poolers.py in _setup_scales(features, image_shapes, canonical_scale, canonical_level)
121 original_input_shape = (max_x, max_y)
122
--> 123 scales = [_infer_scale(feat, original_input_shape) for feat in features]
124 # get the levels in the feature map by leveraging the fact that the network always
125 # downsamples by a factor of 2 at each level.
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/ops/poolers.py in <listcomp>(.0)
121 original_input_shape = (max_x, max_y)
122
--> 123 scales = [_infer_scale(feat, original_input_shape) for feat in features]
124 # get the levels in the feature map by leveraging the fact that the network always
125 # downsamples by a factor of 2 at each level.
~/.conda/envs/default/lib/python3.9/site-packages/torchvision/ops/poolers.py in _infer_scale(feature, original_size)
105 scale = 2 ** float(torch.tensor(approx_scale).log2().round())
106 possible_scales.append(scale)
--> 107 assert possible_scales[0] == possible_scales[1]
108 return possible_scales[0]
109
AssertionError:
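My reading of the failing check, for what it's worth: torchvision infers the feature-map scale separately for each axis by snapping feature_size / image_size to the nearest power of two, then asserts that both axes agree. With an extreme aspect ratio, the short side collapses to a single pixel at the deepest FPN level and rounds to a different power of two than the long side. A sketch with illustrative numbers (the sizes are my assumptions for roughly what a 260 x 7990 image becomes after the detector's internal resize, not values I observed):

```python
import math

def infer_scale_per_axis(feature_size, original_size):
    # Mirrors the idea in torchvision's _infer_scale: snap the
    # feature/image size ratio to the nearest power of two
    approx_scale = feature_size / original_size
    return 2.0 ** round(math.log2(approx_scale))

# Assumed post-resize size for a 260 x 7990 input (the long side gets
# clamped, leaving a very small short side), at the deepest FPN stride
h, w, stride = 43, 1333, 64
feat_h = max(1, h // stride)  # short side collapses to 1
feat_w = max(1, w // stride)  # 20

print(infer_scale_per_axis(feat_h, h))  # 0.03125  (1/32)
print(infer_scale_per_axis(feat_w, w))  # 0.015625 (1/64)
# The two axes disagree, so a check like
# assert possible_scales[0] == possible_scales[1] fails with no message
```

This would also explain why resizing the images makes the error go away: with a less extreme downsampling, both axes round to the same power of two.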
Environment:
- OS: Ubuntu 20.04 AWS Sagemaker studio lab
- Python version: Python 3.9.12
- Detecto version: 1.2.2
- torch version: 1.11.0
- torchvision version: 0.12.0
Additional context
All the files, including code, images, and XML files, are here: https://github.com/dgleba/ml635e
Specifically: https://github.com/dgleba/ml635e/blob/main/ir.ipynb and https://github.com/dgleba/ml635e/tree/main/ir
The example I implemented in https://github.com/dgleba/ml635e/blob/main/e03cast.ipynb and its associated cast03 folder works OK.
I tried to narrow down the cause of the error.
I got the same error both with detecto and with plain PyTorch (no detecto).
My images are 260 x 7990. When I resized them to a smaller size in PyTorch without detecto, the error went away and training completed.
The largest size I found that would work was 260 x 7500.
I expect that resizing would resolve the error in detecto as well.
It seems like the issue is something in PyTorch or torchvision.
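The resizing step can be sketched as pure arithmetic; note that the bounding boxes in the XML label files have to be rescaled by the same ratio or the annotations drift off the objects. MAX_LONG_SIDE = 7500 is simply the largest value that worked in my tests, not a documented limit:

```python
MAX_LONG_SIDE = 7500  # largest long side that trained successfully in my tests

def scaled_size_and_box(width, height, box):
    """Shrink (width, height) so the long side is <= MAX_LONG_SIDE and
    rescale an (xmin, ymin, xmax, ymax) box by the same ratio."""
    ratio = min(1.0, MAX_LONG_SIDE / max(width, height))
    new_size = (round(width * ratio), round(height * ratio))
    new_box = tuple(round(c * ratio) for c in box)
    return new_size, new_box

size, box = scaled_size_and_box(260, 7990, (10, 100, 200, 900))
print(size)  # (244, 7500)
print(box)   # (9, 94, 188, 845)
```

The actual pixel resize can then be done with an image library (e.g. Pillow's Image.resize(new_size)) before regenerating the XML annotations with the rescaled box coordinates.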