
[COLAB] Error decoding frame from video at Train Flow Generator step in colab:

Open karinmcode opened this issue 1 year ago • 0 comments

Hi,

I get this error after running the line `flow_generator = flow_generator_train(cfg)` in the Colab notebook:

... ValueError: error decoding frame 3885 from video /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/DATA/CAM1_m523_211109_001/CAM1_m523_211109_001.mp4 ...

What I tried:

  • I tried replacing my data with the test data (testing_deepethogram) provided on GitHub. ==> A different error message popped up after 70% of training:
IndexError                                Traceback (most recent call last)
[<ipython-input-19-1d7b632134f1>](https://localhost:8080/#) in <cell line: 1>()
----> 1 flow_generator = flow_generator_train(cfg)

51 frames
[/usr/local/lib/python3.10/dist-packages/kornia/augmentation/_2d/base.py](https://localhost:8080/#) in generate_transformation_matrix(self, input, params, flags)
     81         else:
     82             trans_matrix_A = self.identity_matrix(in_tensor)
---> 83             trans_matrix_B = self.compute_transformation(in_tensor[to_apply], params=params, flags=flags)
     84 
     85             if is_autocast_enabled():

IndexError: The shape of the mask [352] at index 0 does not match the shape of the indexed tensor [308, 3, 224, 224] at index 0
  • I tried different video files. ==> The same problem occurs
  • I played the videos until the frames that caused problems. ==> The video files are not corrupted. They also load and display fine in the GUI
  • I restarted the runtime ==> no difference
  • I tried upgrading ffmpeg and OpenCV. ==> no difference
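
A quick sanity check on the shapes in the IndexError above, as a sketch only. It assumes the test-data run used the same `n_rgb: 11` and `batch_size: 32` that appear in the config printed below; the helper `clips_in_batch` is a hypothetical name used for illustration. If that assumption holds, the mask matches a full batch while the tensor holds fewer clips, which might point to a smaller final batch, but this is a guess rather than a confirmed cause.

```python
def clips_in_batch(stacked_frames, frames_per_clip):
    """Return the number of clips if the frames divide evenly, else None."""
    clips, remainder = divmod(stacked_frames, frames_per_clip)
    return clips if remainder == 0 else None


FRAMES_PER_CLIP = 11  # assumed: flow_generator.n_rgb from the printed config
BATCH_SIZE = 32       # assumed: compute.batch_size from the printed config

mask_clips = clips_in_batch(352, FRAMES_PER_CLIP)    # mask shape [352]
tensor_clips = clips_in_batch(308, FRAMES_PER_CLIP)  # tensor shape [308, 3, 224, 224]

print(mask_clips, tensor_clips)  # 32 28
print(mask_clips == BATCH_SIZE)  # True: the mask corresponds to a full batch
```

Both sizes are exact multiples of 11, so the mismatch is 32 clips versus 28 clips, not a stray frame.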

Full error:

[2023-10-21 21:00:19,824] INFO [deepethogram.projects.convert_config_paths_to_absolute:1135] cwd in absolute: /content
[2023-10-21 21:00:19,830] INFO [deepethogram.projects.convert_config_paths_to_absolute:1178] after absolute: {'class_names': ['background', 'resting', 'adjusting', 'walking', 'running', 'grooming', 'sniffing', 'sound'], 'config_file': '/content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/project_config.yaml', 'data_path': '/content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/DATA', 'labeler': None, 'model_path': '/content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models', 'name': 'test3', 'path': '/content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram', 'pretrained_path': '/content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/pretrained_models'}
[2023-10-21 21:00:19,867] INFO [deepethogram.flow_generator.train.flow_generator_train:54] args: /usr/local/lib/python3.10/dist-packages/colab_kernel_launcher.py -f /root/.local/share/jupyter/runtime/kernel-0ecd2f06-7eed-429e-8f1f-7a3dea22e220.json
[2023-10-21 21:00:19,872] INFO [deepethogram.flow_generator.train.flow_generator_train:62] configuration used ~~~~~
[2023-10-21 21:00:19,892] INFO [deepethogram.flow_generator.train.flow_generator_train:63] split:
  reload: true
  file: null
  train_val_test:
  - 0.8
  - 0.2
  - 0.0
compute:
  fp16: false
  num_workers: 2
  batch_size: 32
  min_batch_size: 8
  max_batch_size: 512
  distributed: false
  gpu_id: 0
  dali: false
  metrics_workers: 0
reload:
  overwrite_cfg: false
  latest: false
notes: null
log:
  level: info
augs:
  brightness: 0.25
  contrast: 0.1
  hue: 0.1
  saturation: 0.1
  color_p: 0.5
  grayscale: 0.5
  crop_size: null
  resize:
  - 224
  - 224
  dali: false
  random_resize: false
  pad: null
  LR: 0.5
  UD: 0.0
  degrees: 10
  normalization:
    'N': 65286144
    mean:
    - 0.38076755964346387
    - 0.38076755964346387
    - 0.38076755964346387
    std:
    - 0.2487480534020329
    - 0.2487480534020329
    - 0.2487480534020329
train:
  lr: 0.0001
  scheduler: plateau
  num_epochs: 10
  steps_per_epoch:
    train: 1000
    val: 200
    test: 20
  min_lr: 5.0e-07
  stopping_type: learning_rate
  milestones:
  - 50
  - 100
  - 150
  - 200
  - 250
  - 300
  weight_loss: true
  patience: 3
  early_stopping_begins: 0
  viz_metrics: true
  viz_examples: 10
  reduction_factor: 0.1
  loss_weight_exp: 1.0
  loss_gamma: 1.0
  label_smoothing: 0.05
  oversampling_exp: 0.0
  regularization:
    style: l2_sp
    alpha: 1.0e-05
    beta: 0.001
flow_generator:
  type: flow_generator
  flow_loss: MotionNet
  flow_max: 10
  input_images: 11
  flow_sparsity: false
  smooth_weight_multiplier: 1.0
  sparsity_weight: 0.0
  loss: MotionNet
  max: 5
  n_rgb: 11
  arch: TinyMotionNet
  weights: pretrained
  'n': 10
feature_extractor:
  arch: resnet18
  n_flow: 10
  n_rgb: 1
cmap: deepethogram
control_arrow_jump: 31
label_view_width: 31
postprocessor:
  min_bout_length: 1
  type: min_bout_per_behavior
prediction_opacity: 0.2
project:
  class_names:
  - background
  - resting
  - adjusting
  - walking
  - running
  - grooming
  - sniffing
  - sound
  config_file: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version
    230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/project_config.yaml
  data_path: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version
    230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/DATA
  labeler: null
  model_path: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version
    230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models
  name: test3
  path: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers
    requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram
  pretrained_path: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper
    version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/pretrained_models
run:
  type: train
  model: flow_generator
  dir: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers
    requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/231021_210019_flow_generator_train
sequence:
  filter_length: 15
unlabeled_alpha: 0.1
vertical_arrow_jump: 3

[2023-10-21 21:00:26,104] INFO [deepethogram.flow_generator.train.flow_generator_train:67] Total trainable params: 1,951,784
[2023-10-21 21:00:41,911] INFO [deepethogram.projects.get_weightfile_from_cfg:1068] loading pretrained weights: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/pretrained_models/200221_115158_TinyMotionNet/checkpoint.pt
[2023-10-21 21:00:41,916] INFO [deepethogram.utils.load_state:341] loading from checkpoint file /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/pretrained_models/200221_115158_TinyMotionNet/checkpoint.pt...
reloading weights...
[2023-10-21 21:00:43,834] INFO [deepethogram.flow_generator.train.get_metrics:364] key metric is SSIM
[2023-10-21 21:00:43,935] INFO [deepethogram.data.augs.get_gpu_transforms:246] GPU transforms: {'train': Sequential(
  (0): ToFloat()
  (1): VideoSequential(
    (RandomHorizontalFlip_0): RandomHorizontalFlip(p=0.5, p_batch=1.0, same_on_batch=False)
    (RandomRotation_1): RandomRotation(degrees=10, p=0.5, p_batch=1.0, same_on_batch=False, resample=bilinear, align_corners=True)
    (ColorJitter_2): ColorJitter(brightness=0.25, contrast=0.1, saturation=0.1, hue=0.1, p=0.5, p_batch=1.0, same_on_batch=False)
    (RandomGrayscale_3): RandomGrayscale(p=0.5, p_batch=1.0, same_on_batch=False)
  )
  (2): NormalizeVideo()
  (3): StackClipInChannels()
), 'val': Sequential(
  (0): ToFloat()
  (1): NormalizeVideo()
  (2): StackClipInChannels()
), 'test': Sequential(
  (0): ToFloat()
  (1): NormalizeVideo()
  (2): StackClipInChannels()
), 'denormalize': Sequential(
  (0): UnstackClip()
  (1): DenormalizeVideo()
)}
[2023-10-21 21:00:43,936] INFO [deepethogram.base.__init__:95] scheduler mode: min
[2023-10-21 21:00:44,198] INFO [deepethogram.losses.get_regularization_loss:204] Regularization: L2_SP. Pretrained file: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/pretrained_models/200221_115158_TinyMotionNet/checkpoint.pt alpha: 1e-05 beta: 0.001
[2023-10-21 21:00:44,246] INFO [deepethogram.flow_generator.losses.__init__:178] Using MotionNet Loss with settings: smooth_weights: [0.01, 0.02, 0.04, 0.08, 0.16] flow_sparsity: False sparsity_weight: 0.0
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/callback_connector.py:90: LightningDeprecationWarning: Setting `Trainer(progress_bar_refresh_rate=1)` is deprecated in v1.5 and will be removed in v1.7. Please pass `pytorch_lightning.callbacks.progress.TQDMProgressBar` with `refresh_rate` directly to the Trainer's `callbacks` argument instead. Or, to disable the progress bar pass `enable_progress_bar = False` to the Trainer.
  rank_zero_deprecation(
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/data_connector.py:88: LightningDeprecationWarning: `reload_dataloaders_every_epoch` is deprecated in v1.4 and will be removed in v1.6. Please use `reload_dataloaders_every_n_epochs` in Trainer.
  rank_zero_deprecation(
[2023-10-21 21:00:44,258] INFO [pytorch_lightning.utilities.distributed._info:93] GPU available: True, used: True
[2023-10-21 21:00:44,259] INFO [pytorch_lightning.utilities.distributed._info:93] TPU available: False, using: 0 TPU cores
[2023-10-21 21:00:44,264] INFO [pytorch_lightning.utilities.distributed._info:93] IPU available: False, using: 0 IPUs
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/configuration_validator.py:275: LightningDeprecationWarning: The `on_keyboard_interrupt` callback hook was deprecated in v1.5 and will be removed in v1.7. Please use the `on_exception` callback hook instead.
  rank_zero_deprecation(
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/configuration_validator.py:291: LightningDeprecationWarning: Base `Callback.on_train_batch_start` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/configuration_validator.py:291: LightningDeprecationWarning: Base `Callback.on_train_batch_end` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
[2023-10-21 21:00:44,282] INFO [pytorch_lightning.accelerators.gpu.set_nvidia_flags:59] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2023-10-21 21:00:52,250] INFO [deepethogram.base.configure_optimizers:227] learning rate: 0.0001
[2023-10-21 21:00:52,258] WARNING [pytorch_lightning.loggers.tensorboard._get_next_version:298] Missing logger folder: /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/models/231021_210019_flow_generator_train/default
[2023-10-21 21:00:52,263] INFO [pytorch_lightning.callbacks.model_summary.summarize:73] 
  | Name      | Type          | Params
--------------------------------------------
0 | model     | TinyMotionNet | 2.0 M 
1 | criterion | MotionNetLoss | 0     
--------------------------------------------
2.0 M     Trainable params
0         Non-trainable params
2.0 M     Total params
7.807     Total estimated model params size (MB)
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/data_loading.py:659: UserWarning: Your `val_dataloader` has `shuffle=True`, it is strongly recommended that you turn this off for val/test/predict dataloaders.
  rank_zero_warn(

Epoch 0: 3% 8/284 [01:27<50:25, 10.96s/it, loss=0.323, v_num=0]

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-12-1d7b632134f1>](https://localhost:8080/#) in <cell line: 1>()
----> 1 flow_generator = flow_generator_train(cfg)

22 frames
[/usr/local/lib/python3.10/dist-packages/torch/_utils.py](https://localhost:8080/#) in reraise(self)
    692             # instantiate since we don't know how to
    693             raise RuntimeError(msg) from None
--> 694         raise exception
    695 
    696 

AttributeError: Caught AttributeError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/deepethogram/data/datasets.py", line 342, in __getitem__
    image = reader[i + start_frame]
  File "/usr/local/lib/python3.10/dist-packages/vidio/read.py", line 70, in __getitem__
    return self.read(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/vidio/read.py", line 115, in read
    raise ValueError('error decoding frame {} from video {}'.format(framenum, self.filename))
ValueError: error decoding frame 3885 from video /content/drive/MyDrive/Research/Schneider lab/Paper/Karin paper version 230911/Reviewers requests/Fig 5 Automated behavior classification/DeepEthogram/test3_deepethogram/DATA/CAM1_m523_211109_001/CAM1_m523_211109_001.mp4

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/deepethogram/data/datasets.py", line 415, in __getitem__
    return self.dataset[index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataset.py", line 302, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/usr/local/lib/python3.10/dist-packages/deepethogram/data/datasets.py", line 344, in __getitem__
    image = self._zeros_image.copy().transpose(1, 2, 0)
AttributeError: 'NoneType' object has no attribute 'copy'
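
Reading the traceback, the `AttributeError` is raised while handling the original `ValueError`, so the decode failure is the first error and the `NoneType` message is secondary. The sketch below reproduces that pattern in plain Python. `FailingReader`, `ClipDataset`, and `root_cause` are hypothetical names, and the assumption that the fallback image is still `None` when the first decode fails is inferred from the traceback, not taken from the deepethogram source.

```python
class FailingReader:
    """Hypothetical stand-in for a video reader that cannot decode one frame."""

    def __init__(self, bad_frame):
        self.bad_frame = bad_frame

    def __getitem__(self, framenum):
        if framenum == self.bad_frame:
            raise ValueError(f"error decoding frame {framenum}")
        return [[0, 0], [0, 0]]  # dummy 2x2 "image"


class ClipDataset:
    """Hypothetical dataset mimicking the try/except pattern in the traceback."""

    def __init__(self, reader):
        self.reader = reader
        self._zeros_image = None  # assumed: fallback image never initialised

    def __getitem__(self, framenum):
        try:
            return self.reader[framenum]
        except ValueError:
            # The fallback itself fails because _zeros_image is still None.
            return self._zeros_image.copy()


def root_cause(exc):
    """Walk the implicit exception chain back to the first error."""
    while exc.__context__ is not None:
        exc = exc.__context__
    return exc


dataset = ClipDataset(FailingReader(bad_frame=3885))
try:
    dataset[3885]
except AttributeError as err:
    surfaced = err
    original = root_cause(err)
    print(type(surfaced).__name__, "masks", type(original).__name__)
    # AttributeError masks ValueError
```

Under these assumptions, fixing the fallback would only hide the symptom; the frame decode error would still need to be resolved.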

karinmcode · Oct 21 '23 21:10