Support grayscale (one-channel) images

Open lucia-leonie opened this issue 3 months ago • 1 comment

These changes add support for training with grayscale images, enabled by setting num_channels = 1 in the experiment file (the default is 3):

The COCO and VOC dataset classes take a num_channels parameter, with an adapted load_img function that reads images as grayscale if num_channels == 1
Focus in the stem of the backbone allows in_channels to be 1 instead of being hard-coded to 3
model_info in model_utils adapts img to match the in_channels of the backbone stem
The data augmentation and mosaic detection files allow the channel dimension to be absent
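
To illustrate the last two points, here is a minimal sketch of how an image array could be normalised when the channel dimension may be absent, and how it could be adapted to a given channel count. The helper names ensure_channel_dim and match_channels are hypothetical and are not the functions changed in this PR; the plain channel average used for RGB to grayscale is also an assumption made for illustration.

```python
import numpy as np


def ensure_channel_dim(img: np.ndarray) -> np.ndarray:
    """Return an (H, W, C) array, adding a trailing axis to (H, W) input."""
    if img.ndim == 2:
        # Grayscale images are often loaded without a channel axis.
        return img[:, :, np.newaxis]
    if img.ndim == 3:
        return img
    raise ValueError(f"expected a 2-D or 3-D image, got shape {img.shape}")


def match_channels(img: np.ndarray, num_channels: int) -> np.ndarray:
    """Adapt img to num_channels (hypothetical helper, supports only 1 <-> 3)."""
    img = ensure_channel_dim(img)
    current = img.shape[2]
    if current == num_channels:
        return img
    if current == 3 and num_channels == 1:
        # Assumption: a plain average over channels, not a weighted luma conversion.
        return img.mean(axis=2, keepdims=True).astype(img.dtype)
    if current == 1 and num_channels == 3:
        # Repeat the single channel so a 3-channel stem can consume it.
        return np.repeat(img, 3, axis=2)
    raise ValueError(f"cannot convert {current} channels to {num_channels}")


gray = np.zeros((4, 5), dtype=np.uint8)       # (H, W), no channel axis
rgb = np.full((4, 5, 3), 9, dtype=np.uint8)   # (H, W, 3)

gray_hwc = ensure_channel_dim(gray)
gray_as_rgb = match_channels(gray, 3)
rgb_as_gray = match_channels(rgb, 1)
```

Keeping the channel axis explicit after loading means downstream code can index img.shape[2] without special-casing grayscale input.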

Apart from these specific changes to support the main goal of this PR (to allow grayscale images directly), there are two minor modifications:

In the COCO dataset, the remove_useless_info function no longer removes the info field. Since pycocotools version 2.0.9, the loadRes function contains the line res.dataset['info'] = copy.deepcopy(self.dataset['info']), which raises a KeyError during evaluation if the info field is missing
get_dataset and get_eval_dataset in the Experiment class in yolox_base.py now specify the name parameter, so that classes inheriting from this experiment class can easily specify a custom image folder (e.g. by overriding self.test_name in the __init__ function)
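
The first of these modifications can be sketched as follows. This is a simplified stand-in that works on a plain dict rather than a pycocotools COCO object; the function name strip_annotation_metadata and the list of per-image keys that get dropped are assumptions for illustration, not the PR's actual code. The only point it demonstrates is that info stays in place, so a later dataset['info'] lookup succeeds.

```python
import copy


def strip_annotation_metadata(dataset: dict) -> dict:
    """Drop bulky metadata from a COCO-style dict in place, but keep 'info'."""
    # 'info' is deliberately left alone: code that later reads dataset['info']
    # would raise a KeyError if it had been removed.
    dataset.pop("licenses", None)
    for img in dataset.get("images", []):
        # Assumed set of per-image fields that are not needed for training.
        for key in ("license", "coco_url", "date_captured", "flickr_url"):
            img.pop(key, None)
    return dataset


dataset = {
    "info": {"description": "toy dataset"},
    "licenses": [{"id": 1, "name": "example"}],
    "images": [{"id": 1, "file_name": "a.png", "license": 1, "coco_url": ""}],
    "annotations": [],
}

strip_annotation_metadata(dataset)

# Same access pattern as the loadRes line quoted above; works because 'info' survived.
info_copy = copy.deepcopy(dataset["info"])
```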

Sep 09 '25 21:09 lucia-leonie

All committers have signed the CLA.

Sep 09 '25 21:09 CLAassistant