hls-foundation-os
Config for irrigation_scenes and custom SpatioTemporalDataset loader
An mmsegmentation configuration file for the irrigation_scenes dataset hosted at https://huggingface.co/datasets/ibm-nasa-geospatial/hls_irrigation_scenes.
This is a time-series dataset with data from four months stored in four different folders, so a custom SpatioTemporalDataset class (subclassed from GeospatialDataset) and a LoadSpatioTemporalImagesFromFile class (subclassed from LoadGeospatialImageFromFile) were created to perform the data loading. For now, training uses only the first three months (June, July, August). The fine-tuning-examples/README.md has also been updated to describe how to run the irrigation_scenes setup.
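To illustrate the folder-per-month layout described above, here is a minimal sketch of how one scene's time series could be assembled into an ordered list of file paths. It is not the SpatioTemporalDataset code from this PR: the helper `timeseries_paths`, the folder names, the root directory, and the `.tif` file naming are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Sequence

# Hypothetical folder names, one per month. The PR only names June, July
# and August; "month_4" is a placeholder for the fourth folder.
ALL_MONTHS = ("june", "july", "august", "month_4")


def timeseries_paths(root: str, scene_id: str, months: Sequence[str]) -> List[Path]:
    """Return one image path per month, in temporal order, for a scene.

    Assumes (hypothetically) that every month folder stores the scene
    under the same file name.
    """
    return [Path(root) / month / f"{scene_id}.tif" for month in months]


# Use only the first three months, as in the current training setup.
paths = timeseries_paths("data/irrigation_scenes", "scene_0001", ALL_MONTHS[:3])
for path in paths:
    print(path.as_posix())
```

A loader built along these lines would read each path and stack the results along a time axis; extending to the fourth month would then only require passing a longer `months` sequence.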
Xref original work at https://github.com/NASA-IMPACT/hls-foundation/pull/30 and https://github.com/NASA-IMPACT/hls-foundation/pull/35
P.S. This is the same branch as #4, but that one got closed somehow during the private->public conversion of the repo.
I'm getting a TypeError: imgs must be a list, but got <class 'torch.Tensor'>
during the validation stage, raised in the forward_test function:
2023-08-03 17:03:11,629 - mmseg - INFO - workflow: [('train', 1)], max: 5000 iters
2023-08-03 17:03:11,629 - mmseg - INFO - Checkpoints will be saved to finetune_weights/irrigation_scenes/test_1/test_1 by HardDiskBackend.
2023-08-03 17:03:22,052 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
2023-08-03 17:03:58,935 - mmseg - INFO - Iter [20/5000] lr: 1.893e-07, eta: 3:15:17, time: 2.353, data_time: 0.046, memory: 6031, decode.loss_ce: 3.3195, decode.acc_seg: 6.3938, aux.loss_ce: 3.4039, aux.acc_seg: 1.7806, loss: 6.7234
[ ] 0/281, elapsed: 0s, ETA:Traceback (most recent call last):
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/.mim/tools/train.py", line 242, in <module>
main()
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/.mim/tools/train.py", line 231, in main
train_segmentor(
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/apis/train.py", line 194, in train_segmentor
runner.run(data_loaders, cfg.workflow)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
iter_runner(iter_loaders[i], **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 67, in train
self.call_hook('after_train_iter')
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
getattr(hook, fn_name)(self)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 262, in after_train_iter
self._do_evaluate(runner)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/core/evaluation/eval_hooks.py", line 117, in _do_evaluate
results = multi_gpu_test(
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/apis/test.py", line 208, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
return old_func(*args, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/models/segmentors/base.py", line 110, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/username/mambaforge/envs/hls/lib/python3.9/site-packages/mmseg/models/segmentors/base.py", line 74, in forward_test
raise TypeError(f'{name} must be a list, but got '
TypeError: imgs must be a list, but got <class 'torch.Tensor'>
This is the same error that was reported previously at https://github.com/NASA-IMPACT/hls-foundation/pull/30#issuecomment-1603652525. It was fixed back then with some hacky workarounds that modified the default collate function in mmsegmentation's code here:
https://github.com/NASA-IMPACT/hls-foundation/blob/35edfb54057b18d2840b0e674277248797208b6f/mmsegmentation/mmseg/models/segmentors/base.py#L72-L75
It doesn't look possible to apply the same workaround here anymore, so a different solution is needed. Xref upstream issue at https://github.com/open-mmlab/mmsegmentation/issues/2410
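For reference, the failure mode can be reproduced without mmsegmentation or torch. The sketch below mimics the type check that raises in the traceback above and shows that wrapping each validation input in a list satisfies it. This is an illustration of the error, not a proposed fix for this PR: `check_test_inputs`, `as_list`, and `FakeTensor` are hypothetical stand-ins, and where such wrapping should actually happen (pipeline, collate step, or model) is left open.

```python
from typing import Any, List


def check_test_inputs(imgs: Any, img_metas: Any) -> None:
    """Mimic the check in forward_test that raised the TypeError above."""
    for var, name in [(imgs, "imgs"), (img_metas, "img_metas")]:
        if not isinstance(var, list):
            raise TypeError(f"{name} must be a list, but got {type(var)}")


def as_list(value: Any) -> List[Any]:
    """Wrap a single value in a list; leave existing lists untouched."""
    return value if isinstance(value, list) else [value]


class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without torch."""


# A validation batch as the traceback suggests it arrives: a bare tensor.
batch = {"img": FakeTensor(), "img_metas": {"filename": "scene_0001.tif"}}

try:
    check_test_inputs(batch["img"], batch["img_metas"])
    raised = False
except TypeError as err:
    raised = True
    message = str(err)
    print(message)

# Wrapping every entry in a list makes the same check pass.
wrapped = {key: as_list(value) for key, value in batch.items()}
check_test_inputs(wrapped["img"], wrapped["img_metas"])
```

The check expects an outer list because forward_test treats each list element as one test-time augmentation, so a single un-augmented batch would be a list of length one.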