sacred Error with @ex.main and if __name__ == '__main__':

When I use @ex.main together with if __name__ == '__main__': and call my_main() directly, MongoObserver collects no data.

Here is minimal code to reproduce the problem:

from sacred import Experiment
from sacred.observers import MongoObserver
ex = Experiment('OBB_Swin')
ex.observers.append(MongoObserver(url='localhost:27017', db_name='OBB'))

@ex.main
def my_main():
    print('test')

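if __name__ == '__main__':
    # ex.run_commandline()          # works: the observer receives the run
    # ex.run()                      # works: the observer receives the run
    my_main()                       # plain call: MongoObserver collects no data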

Looking forward to your reply!

Dec 22 '21 08:12 HanGuangXin

There are reasons I can't use ex.run_commandline() or ex.run(). ex.run_commandline() doesn't work with an existing argparse setup, and ex.run() doesn't work with multi-GPU training (for example: python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT $(dirname "$0")/train.py)

Dec 22 '21 08:12 HanGuangXin

Hi @HanGuangXin! Happy new year! Unfortunately, you have to use ex.run (or ex.run_commandline) for everything to work. ex.run contains the code to set up the configuration and observers. @ex.main doesn't modify my_main, it just registers it as the default main function for ex.run.

For the multi-GPU training: what exactly is not working and do you know why?

Jan 06 '22 07:01 thequilo
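
The distinction in the reply above (the decorator only registers the function, and only the run method sets up observers) can be illustrated with a toy model. This is a sketch and not sacred's actual implementation: ToyExperiment and its observed_events list are hypothetical names invented for the example.

```python
# Toy stand-in for an experiment object; NOT sacred's real implementation.
class ToyExperiment:
    def __init__(self):
        self.default_command = None
        self.observed_events = []  # what an observer would have recorded

    def main(self, func):
        # Register func as the default command and return it unchanged.
        self.default_command = func
        return func

    def run(self):
        # Only run() notifies the observers around the call.
        self.observed_events.append("started")
        result = self.default_command()
        self.observed_events.append("completed")
        return result


ex = ToyExperiment()


@ex.main
def my_main():
    return "test"


my_main()  # plain function call: nothing is recorded
events_after_direct_call = list(ex.observed_events)

ex.run()  # goes through the experiment: events are recorded
events_after_run = list(ex.observed_events)
```

In this model, calling my_main() directly leaves observed_events empty, which mirrors the MongoObserver collecting no data in the original report.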

+1. Multi-GPU training is used more and more frequently nowadays, but it does not work with sacred, because there is additional stuff on the command line used to start Python, just like what @HanGuangXin mentioned: python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT $(dirname "$0")/train.py

Apr 07 '22 02:04 Guptajakala
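
One possible workaround, sketched under assumptions and not confirmed anywhere in this thread: keep the existing argparse parser, let it tolerate launcher-added arguments via parse_known_args, and pass the parsed values to ex.run() through its config_updates parameter instead of using ex.run_commandline(). The --lr option and the split_args and run_experiment helpers are hypothetical, and the sketch assumes the launcher appends a --local_rank argument to the script's arguments.

```python
import argparse


def split_args(argv):
    """Parse the script's own options and collect anything unrecognised."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=0.01)  # hypothetical option
    # Assumption: the distributed launcher appends --local_rank to the script args.
    parser.add_argument("--local_rank", type=int, default=0)
    args, unknown = parser.parse_known_args(argv)
    return vars(args), unknown


def run_experiment(argv):
    # Imported here so the parsing helper above works without sacred installed.
    from sacred import Experiment

    ex = Experiment("OBB_Swin")
    # Observers would be appended here, as in the original post.
    ex.add_config(lr=0.01, local_rank=0)  # defaults that the parsed values override

    @ex.main
    def my_main(lr, local_rank):
        print(f"rank {local_rank}: lr={lr}")

    config_updates, _ = split_args(argv)
    # ex.run() does not parse sys.argv; the values are passed in explicitly.
    return ex.run(config_updates=config_updates)
```

The script would then call run_experiment(sys.argv[1:]) under if __name__ == '__main__':. Whether every process or only one rank should attach the observer is a separate decision that this sketch does not address.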

+1 Making sacred work alongside torch multiprocessing is an absolute pain.

Feb 22 '23 22:02 BDHU
