Reproducibility in YOLOX

Open Ibraheem951 opened this issue 2 years ago • 4 comments

Currently, the YOLOX code is not reproducible: the outputs differ on every run. One reason is that the seed for the dataloader workers is set randomly in yolox/data/dataloading.py. Is there a way to make the pipeline reproducible?

Ibraheem951 avatar Feb 20 '23 11:02 Ibraheem951

Could you provide your command?

If you want reproducible code, the following gist might help.

def seed_everything(seed: int):
    import random, os
    import numpy as np
    import torch
    
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = True
    
seed_everything(42)

FateScript avatar Feb 28 '23 08:02 FateScript

Same problem here: I can't reproduce my results with exactly the same configuration (exp file), even with a seed set. What I have tried:

  1. Use seed in exp file: self.seed = 42
  2. Overwrite exp's seed in command line: python -m yolox.tools.train -f exps/example/custom/my_custom_exp.py -d 1 -b 64 -o -c pretrains/yolox_nano.pth seed 42
  3. Use @FateScript's seed_everything() from above in tools/train.py:
if exp.seed is not None:
    exp.seed = int(exp.seed)
    seed_everything(exp.seed)
    warnings.warn(
        "You have chosen to seed training. This will turn on the CUDNN deterministic setting, "
        "which can slow down your training considerably! You may see unexpected behavior "
        "when restarting from checkpoints."
    )
  4. Change the worker init function in yolox/data/dataloading.py and yolox/exp/yolox_base.py from
# yolox/data/dataloading.py
def worker_init_reset_seed(worker_id):
    seed = uuid.uuid4().int % 2**32
    random.seed(seed)
    torch.set_rng_state(torch.manual_seed(seed).get_state())
    np.random.seed(seed)

# yolox/exp/yolox_base.py
dataloader_kwargs["worker_init_fn"] = worker_init_reset_seed

to


# yolox/data/dataloading.py
def worker_init_reset_seed(seed):
    def _seed_all(worker_id):
        from yolox.utils import seed_everything
        worker_seed = worker_id + seed        
        seed_everything(worker_seed)
        print(f'DO WORKER INIT RESET SEED WITH SEED = {worker_seed}, WORKER_ID = {worker_id}')
    return _seed_all

# yolox/exp/yolox_base.py
dataloader_kwargs["worker_init_fn"] = worker_init_reset_seed(self.seed)

Is there any workaround? Many thanks!

dangnh0611 avatar Mar 07 '23 10:03 dangnh0611
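
For reference, the per-worker seeding idea from the comment above can also be written without a nested function. This is only a sketch under assumptions: `seed_worker` and `make_worker_init_fn` are hypothetical names, not part of YOLOX, and NumPy and PyTorch are imported defensively so the snippet also runs where they are not installed. A `functools.partial` over a module-level function can be pickled, which a locally defined closure cannot, and that matters when DataLoader workers are started with the spawn method.

```python
import random
from functools import partial


def seed_worker(worker_id: int, base_seed: int) -> None:
    """Seed the RNGs of one DataLoader worker from base_seed + worker_id."""
    # NumPy's legacy seeding only accepts values below 2**32.
    worker_seed = (base_seed + worker_id) % 2**32
    random.seed(worker_seed)
    try:  # optional import so the sketch also runs without NumPy
        import numpy as np
        np.random.seed(worker_seed)
    except ImportError:
        pass
    try:  # optional import so the sketch also runs without PyTorch
        import torch
        torch.manual_seed(worker_seed)
    except ImportError:
        pass


def make_worker_init_fn(base_seed: int):
    """Hypothetical helper: bind base_seed and return a worker_init_fn."""
    # The DataLoader calls the result as fn(worker_id).
    return partial(seed_worker, base_seed=base_seed)
```

Assuming the same hook as in the comment above, it would be wired in as `dataloader_kwargs["worker_init_fn"] = make_worker_init_fn(self.seed)`. Unlike the version above, it does not touch the cuDNN flags or environment variables inside each worker.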

I'm facing the same problem. Have you since found a way to make the training reproducible @daquexian @Ibraheem951?

wiedermaschinisch avatar Aug 16 '23 21:08 wiedermaschinisch

It seems that you need to set

cudnn.benchmark = False

Otherwise it conflicts with cudnn.deterministic = True and gives a different result on each training run.

pasin-k avatar Dec 04 '23 03:12 pasin-k
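
Folding that last suggestion into the `seed_everything` gist from earlier in the thread gives roughly the following. It is a sketch, not a confirmed fix for YOLOX: the only behavioural changes are `cudnn.benchmark = False` and seeding all CUDA devices, and NumPy and PyTorch are imported defensively (an assumption made here) so the snippet also runs where they are not installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy and PyTorch, and request deterministic cuDNN kernels."""
    random.seed(seed)
    # Only affects Python processes started after this point, not the current one.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:  # optional import so the sketch also runs without NumPy
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:  # optional import so the sketch also runs without PyTorch
        import torch
    except ImportError:
        return
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    torch.backends.cudnn.deterministic = True
    # With benchmark enabled, cuDNN auto-tunes and may pick different kernels per run.
    torch.backends.cudnn.benchmark = False


seed_everything(42)
```

If results still differ between runs, it may be worth checking that nothing later in the training script sets `cudnn.benchmark` back to `True`.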