mmpretrain
[Bug] Error when training model - TypeError: BaseDataset.__init__() got an unexpected keyword argument 'split'
Branch
main branch (mmpretrain version)
Describe the bug
I have tried to train a model on a custom dataset using the mmpretrain library.
First I cloned the repository, then I created a dataset folder with the following structure:
data/
  custom_dataset/
    train/
    test/
    val/
Next I followed the documentation (https://mmpretrain.readthedocs.io/en/latest/user_guides/train.html) on how to train a classification model on a custom dataset.
I created a new configuration file:
configs/mobilenet_v2/mobilenet-v2_finetune.py
_base_ = [
    '../_base_/models/mobilenet_v2_1x.py',
    '../_base_/datasets/imagenet_bs32_pil_resize.py',
    '../_base_/schedules/imagenet_bs256_epochstep.py',
    '../_base_/default_runtime.py'
]

# model settings
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)

# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='val',
    ))
test_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))

# schedule settings
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
I then tried to train the model on my custom dataset with the command:
python ./tools/train.py ./configs/mobilenet_v2/mobilenet-v2_finetune.py
This produced the following output and error:
C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py:107: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ..\c10\cuda\CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
12/30 17:42:16 - mmengine - INFO -
------------------------------------------------------------
System environment:
sys.platform: win32
Python: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
CUDA available: False
numpy_random_seed: 1691281147
MSVC: Microsoft (R) C/C++-Optimierungscompiler Version 19.26.28806 für x64
GCC: n/a
PyTorch: 2.0.1+cu117
PyTorch compiling details: PyTorch built with:
- C++ Version: 199711
- MSVC 193431937
- Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
- OpenMP 2019
- LAPACK is enabled (usually provided by MKL)
- CPU capability usage: AVX2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2+cu117
OpenCV: 4.7.0
MMEngine: 0.10.2
Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 1691281147
deterministic: False
Distributed launcher: none
Distributed training: False
GPU number: 1
------------------------------------------------------------
12/30 17:42:16 - mmengine - INFO - Config:
auto_scale_lr = dict(base_batch_size=256)
data_preprocessor = dict(
mean=[
123.675,
116.28,
103.53,
],
num_classes=1000,
std=[
58.395,
57.12,
57.375,
],
to_rgb=True)
data_root = 'data/custom_dataset'
dataset_type = 'ImageNet'
default_hooks = dict(
checkpoint=dict(interval=1, type='CheckpointHook'),
logger=dict(interval=100, type='LoggerHook'),
param_scheduler=dict(type='ParamSchedulerHook'),
sampler_seed=dict(type='DistSamplerSeedHook'),
timer=dict(type='IterTimerHook'),
visualization=dict(enable=False, type='VisualizationHook'))
default_scope = 'mmpretrain'
env_cfg = dict(
cudnn_benchmark=False,
dist_cfg=dict(backend='nccl'),
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = None
log_level = 'INFO'
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
checkpoint=
'https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth',
prefix='backbone',
type='Pretrained'),
type='MobileNetV2',
widen_factor=1.0),
head=dict(
in_channels=1280,
loss=dict(loss_weight=1.0, type='CrossEntropyLoss'),
num_classes=10,
topk=(
1,
5,
),
type='LinearClsHead'),
neck=dict(type='GlobalAveragePooling'),
type='ImageClassifier')
optim_wrapper = dict(
optimizer=dict(lr=0.01, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = dict(
by_epoch=True,
gamma=0.1,
milestones=[
15,
],
step_size=1,
type='MultiStepLR')
randomness = dict(deterministic=False, seed=None)
resume = False
test_cfg = dict()
test_dataloader = dict(
batch_size=32,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='test',
data_root='data/custom_dataset',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
],
split='val',
type='CustomDataset'),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
topk=(
1,
5,
), type='Accuracy')
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
]
train_cfg = dict(by_epoch=True, max_epochs=300, val_interval=1)
train_dataloader = dict(
batch_size=32,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='train',
data_root='data/custom_dataset',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', scale=224, type='RandomResizedCrop'),
dict(direction='horizontal', prob=0.5, type='RandomFlip'),
dict(type='PackInputs'),
],
split='train',
type='CustomDataset'),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(backend='pillow', scale=224, type='RandomResizedCrop'),
dict(direction='horizontal', prob=0.5, type='RandomFlip'),
dict(type='PackInputs'),
]
val_cfg = dict()
val_dataloader = dict(
batch_size=32,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='val',
data_root='data/custom_dataset',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
],
split='val',
type='CustomDataset'),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
topk=(
1,
5,
), type='Accuracy')
vis_backends = [
dict(type='LocalVisBackend'),
]
visualizer = dict(
type='UniversalVisualizer', vis_backends=[
dict(type='LocalVisBackend'),
])
work_dir = './work_dirs\\mobilenet-v2_finetune'
12/30 17:42:21 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
12/30 17:42:21 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook
--------------------
before_train:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook
--------------------
before_train_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook
--------------------
before_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
--------------------
after_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
after_train_epoch:
(NORMAL ) IterTimerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
before_val:
(VERY_HIGH ) RuntimeInfoHook
--------------------
before_val_epoch:
(NORMAL ) IterTimerHook
--------------------
before_val_iter:
(NORMAL ) IterTimerHook
--------------------
after_val_iter:
(NORMAL ) IterTimerHook
(NORMAL ) VisualizationHook
(BELOW_NORMAL) LoggerHook
--------------------
after_val_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
--------------------
after_val:
(VERY_HIGH ) RuntimeInfoHook
--------------------
after_train:
(VERY_HIGH ) RuntimeInfoHook
(VERY_LOW ) CheckpointHook
--------------------
before_test:
(VERY_HIGH ) RuntimeInfoHook
--------------------
before_test_epoch:
(NORMAL ) IterTimerHook
--------------------
before_test_iter:
(NORMAL ) IterTimerHook
--------------------
after_test_iter:
(NORMAL ) IterTimerHook
(NORMAL ) VisualizationHook
(BELOW_NORMAL) LoggerHook
--------------------
after_test_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
--------------------
after_test:
(VERY_HIGH ) RuntimeInfoHook
--------------------
after_run:
(BELOW_NORMAL) LoggerHook
--------------------
Traceback (most recent call last):
File "C:\Users\tilof\PycharmProjects\DeepLearningProjects\OpenMMLab\mmpretrain\tools\train.py", line 162, in <module>
main()
File "C:\Users\tilof\PycharmProjects\DeepLearningProjects\OpenMMLab\mmpretrain\tools\train.py", line 158, in main
runner.train()
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1728, in train
self._train_loop = self.build_train_loop(
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1527, in build_train_loop
loop = EpochBasedTrainLoop(
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\loops.py", line 44, in __init__
super().__init__(runner, dataloader)
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\base_loop.py", line 26, in __init__
self.dataloader = runner.build_dataloader(
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1370, in build_dataloader
dataset = DATASETS.build(dataset_cfg)
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\registry\registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\registry\build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmpretrain\datasets\custom.py", line 207, in __init__
super().__init__(
TypeError: BaseDataset.__init__() got an unexpected keyword argument 'split'
Environment
{'sys.platform': 'win32', 'Python': '3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 ' '64 bit (AMD64)]', 'CUDA available': False, 'numpy_random_seed': 2147483648, 'MSVC': 'Microsoft (R) C/C++-Optimierungscompiler Version 19.26.28806 für x64', 'GCC': 'n/a', 'PyTorch': '2.0.1+cu117', 'TorchVision': '0.15.2+cu117', 'OpenCV': '4.7.0', 'MMEngine': '0.10.2', 'MMCV': '2.1.0', 'MMPreTrain': '1.1.1+e95d9ac'}
Other information
No response
I had the same problem following the guide How to Pretrain with Custom Dataset.
The problem is that the dataset you are overriding has a split argument (_base_/datasets/imagenet_bs32_pil_resize.py#L32) which doesn't work with CustomDataset.
The solution I found was to copy all the arguments and add an extra _delete_=True (doc). Something like this (to repeat for other datasets):
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224, backend='pillow'),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
        pipeline=train_pipeline,
        _delete_=True,
    ))
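For anyone wondering why _delete_=True makes a difference, here is a simplified, self-contained model of how a child config dict gets merged into its base. It is only a sketch of the behaviour described in the docs, not MMEngine's actual implementation; merge_cfg and the sample dicts are made up for illustration.

```python
def merge_cfg(base, child):
    """Merge `child` into `base`, roughly like config inheritance does.

    Hypothetical helper for illustration only (not an MMEngine API).
    """
    child = dict(child)
    if child.pop('_delete_', False):
        # `_delete_=True`: ignore everything inherited for this dict.
        return child
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged key by key, so base keys survive.
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed excerpt of what the ImageNet base config provides.
base = dict(
    batch_size=32,
    dataset=dict(type='ImageNet', data_root='data/imagenet', split='train'))

override = dict(dataset=dict(type='CustomDataset', data_prefix='train'))
override_with_delete = dict(
    dataset=dict(type='CustomDataset', data_prefix='train', _delete_=True))

leaky = merge_cfg(base, override)
clean = merge_cfg(base, override_with_delete)

print(leaky['dataset'])  # still contains the inherited split='train'
print(clean['dataset'])  # only the keys written in the child config
```

Note that in this model _delete_ also drops the other inherited keys of that dict (here data_root), which is why all the arguments have to be copied into the overriding dict.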
Hi, @leon-costa,
I tried this, but it is not working for me. Is there any other way to fix the problem above?
Hi everyone, any update? I am also having the exact same problem with CustomDataset.
I got it working.
@leon-costa's solution and the link he gave (https://mmpretrain.readthedocs.io/en/latest/user_guides/config.html#ignore-some-fields-in-the-base-configs) helped me better understand the problem.
In my case, I removed '../_base_/datasets/imagenet_bs32_pil_resize.py' from my config's _base_ list and then put the required dataset settings (without split, of course) directly into my config. Then it worked. Thanks, all, for the guidance.
@TNodeCode Just remove the split argument from each dataloader config.
train_dataloader = dict(
    batch_size=32,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='train',
        data_root='data/custom_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', scale=224, type='RandomResizedCrop'),
            dict(direction='horizontal', prob=0.5, type='RandomFlip'),
            dict(type='PackInputs'),
        ],
        split='train',  # <-- remove this line (same for val_dataloader)
        type='CustomDataset'),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
The split option is only used with datasets that have implemented the split feature, so if the split feature has not been specifically configured when using a custom dataset, it can be removed.
A prominent dataset that utilizes this feature is ImageNet.
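To illustrate where the TypeError in the traceback comes from, here is a minimal sketch with stand-in classes. BaseDatasetSketch and CustomDatasetSketch are made-up names with simplified signatures, not the real mmengine/mmpretrain classes; the only assumption taken from the traceback is that the custom dataset forwards its extra keyword arguments to a base __init__ that does not know split.

```python
class BaseDatasetSketch:
    """Stand-in for the base class: it has no `split` parameter."""

    def __init__(self, ann_file='', data_root='', data_prefix='', pipeline=()):
        self.ann_file = ann_file
        self.data_root = data_root
        self.data_prefix = data_prefix
        self.pipeline = list(pipeline)


class CustomDatasetSketch(BaseDatasetSketch):
    """Stand-in for the custom dataset: unknown kwargs go to the base."""

    def __init__(self, data_root='', data_prefix='', ann_file='', **kwargs):
        super().__init__(
            ann_file=ann_file,
            data_root=data_root,
            data_prefix=data_prefix,
            **kwargs)


def try_build(cfg):
    """Build the dataset from a dict and report what happened."""
    try:
        CustomDatasetSketch(**cfg)
    except TypeError as exc:
        return str(exc)
    return 'ok'


cfg = dict(data_root='data/custom_dataset', data_prefix='train', ann_file='')

print(try_build(cfg))                         # builds fine
print(try_build(dict(cfg, split='train')))    # inherited `split` breaks it
```

The second call fails for the same reason as in the issue: split is passed down through **kwargs to a constructor that has no such parameter.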
An alternative solution is to subclass CustomDataset and simply discard the split argument:
from mmpretrain.registry import DATASETS
from mmpretrain.datasets.custom import CustomDataset


@DATASETS.register_module()
class CustomDataset2(CustomDataset):
    """CustomDataset that accepts and ignores an inherited `split` argument."""

    def __init__(self, split=None, **kwargs):
        # `split` comes from the ImageNet base config; drop it here.
        super().__init__(**kwargs)
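A sketch of how the config could then point at this subclass. The module name my_datasets is hypothetical (it stands for whatever importable file contains CustomDataset2); custom_imports is the usual config mechanism for making such a module get imported so the class is registered.

```python
# Hypothetical module name: the file defining CustomDataset2 must be importable.
custom_imports = dict(imports=['my_datasets'], allow_failed_imports=False)

data_root = 'data/custom_dataset'

train_dataloader = dict(
    dataset=dict(
        type='CustomDataset2',  # the subclass that swallows `split`
        data_root=data_root,
        ann_file='',
        data_prefix='train',
    ))
```

With this variant the split value inherited from the base config is still merged in, but CustomDataset2.__init__ drops it before calling the parent constructor. The val and test dataloaders would need the same change of type.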