[WIP] (feat) Seeding torch & rlberry

Open TimotheeMathieu opened this issue 2 years ago • 3 comments

An attempt at fixing torch random generator compatibility.

In this PR, I removed set_external_seed from agent_manager by tracking the torch RNG's state instead.

This PR still needs more testing. I am putting it out there for discussion's sake, together with issue #157.

TimotheeMathieu, Mar 24 '22 15:03

@mmcenta I merged your PR here to check that everything goes green (I will remove this commit before merging to main), and it seems OK. I think we need to test it some more to see what is not going as planned. The main change is that safe_reseed now also reseeds the torch seed.

TimotheeMathieu, Mar 24 '22 18:03

For reference, here is a small profile of a call to an Agent Manager fitting a DQN environment:

[INFO] init new seeder at: agent_manager.py   __init__ 
[INFO] spawned at: agent_manager.py   __init__ 
[INFO] init new seeder at: seeder.py   <listcomp> 
[INFO] spawned at: agent_manager.py   _set_init_kwargs 
[INFO] init new seeder at: seeder.py   <listcomp> 
[INFO] spawned at: agent_manager.py   _reset_agent_handlers 
[INFO] init new seeder at: seeder.py   <listcomp> 
[INFO] spawned at: agent_manager.py   fit 
[INFO] init new seeder at: seeder.py   <listcomp> 
[INFO] init new seeder at: agent.py   __init__ 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: box.py   __init__ 
[INFO] init new seeder at: discrete.py   __init__ 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 1, 0, 0, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 0, 0, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 1, 0, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 0, 2) 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: box.py   __init__ 
[INFO] init new seeder at: discrete.py   __init__ 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 1, 0, 3, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 0, 3, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 1, 0, 4) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 0, 5) 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: agent.py   reseed 
[INFO] init new seeder at: basewrapper.py   reseed 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 3, 0, 0, 0, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 3, 0, 0, 0, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 3, 0, 0, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 3, 0, 0, 2) 
[INFO] reseed at: box.py   reseed seed is (0, 3, 0, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 3, 0, 2) 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 3, 0, 3, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 3, 0, 3, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 3, 0, 4) 
[INFO] reseed at: discrete.py   reseed seed is (0, 3, 0, 5) 
[INFO] init new seeder at: agent.py   __init__ 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: box.py   __init__ 
[INFO] init new seeder at: discrete.py   __init__ 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 1, 1, 0, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 1, 0, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 1, 1, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 1, 2) 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: box.py   __init__ 
[INFO] init new seeder at: discrete.py   __init__ 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 1, 1, 3, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 1, 3, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 1, 1, 4) 
[INFO] reseed at: discrete.py   reseed seed is (0, 1, 1, 5) 
[INFO] init new seeder at: model.py   __init__ 
[INFO] init new seeder at: basewrapper.py   reseed 
[INFO] init new seeder at: model.py   reseed 
[INFO] reseed at: box.py   reseed seed is (0, 2, 0, 0, 0) 
[INFO] reseed at: discrete.py   reseed seed is (0, 2, 0, 0, 1) 
[INFO] reseed at: box.py   reseed seed is (0, 2, 0, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 2, 0, 2) 
[INFO] reseed at: box.py   reseed seed is (0, 2, 1) 
[INFO] reseed at: discrete.py   reseed seed is (0, 2, 2) 

Each log line gives: a description of the action on the seeder, the file name and function of the caller, and the seed if the action is a reseed.

The profiling is done by adding a small traceback utility to seeder.py using the inspect library. @mmcenta maybe we could use this for metadata extraction. Maybe it could also include the torch RNG state.
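To make the idea concrete, here is a minimal sketch of such a traceback utility. It is not the code used in this PR: `log_seeder_event`, `ToySeeder`, the `depth` parameter and the logger name are all hypothetical, and only the standard `inspect` and `logging` modules are assumed.

```python
import inspect
import logging
import os

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("seeder_trace")


def log_seeder_event(action, seed=None, depth=1):
    """Log an action with the file name and function found `depth` frames up.

    depth=1 is the function that called this helper; depth=2 is its caller.
    """
    frame_info = inspect.stack()[depth]
    filename = os.path.basename(frame_info.filename)
    message = f"{action} at: {filename}   {frame_info.function}"
    if seed is not None:
        message += f" seed is {seed}"
    logger.info(message)
    return message


class ToySeeder:
    """Stand-in for a seeder class; this is NOT rlberry's Seeder."""

    def __init__(self):
        # depth=2 reports whoever constructed the seeder, not this __init__.
        self.last_event = log_seeder_event("init new seeder", depth=2)


def reseed():
    # depth=1 (the default) reports this function itself as the caller.
    return log_seeder_event("reseed", seed=(0, 1, 0))
```

Calling `logging.basicConfig(level=logging.INFO, format="[%(levelname)s] %(message)s")` before creating a `ToySeeder` should give lines shaped like the ones in the profile above. Note that `inspect.stack()` is fairly slow, so this kind of tracing is probably best kept behind a debug flag.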

TimotheeMathieu, Mar 26 '22 09:03

I was thinking about recording this metadata with Python's logging library, since it has all the features we need. I also noticed today that you might need to fork the RNG states when working on other devices (e.g. a GPU), but I am not exactly sure how to implement this forking with rlberry.
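As a starting point for the forking question, here is a minimal sketch using `torch.random.fork_rng`, which is a torch context manager that restores the RNG state on exit. How this would be wired into rlberry is left open; `sample_in_fork` is a hypothetical helper, and the sketch only forks the CPU generator (the assumption being that GPU indices would be passed through `devices` when needed).

```python
import torch


def sample_in_fork(seed, devices=()):
    """Draw random numbers under a local seed without touching the outer RNG.

    `devices` lists the CUDA device indices whose RNG state should also be
    forked; the empty default forks the CPU generator only.
    """
    with torch.random.fork_rng(devices=list(devices)):
        torch.manual_seed(seed)
        return torch.rand(3)


torch.manual_seed(0)
state_before = torch.get_rng_state()
first = sample_in_fork(123)
second = sample_in_fork(123)
state_after = torch.get_rng_state()

# The global CPU RNG state is the same before and after the forked calls.
outer_state_preserved = torch.equal(state_before, state_after)
```

The same seed gives the same draw inside each fork, while the state of the global generator is left where it was. On a GPU, something like `sample_in_fork(123, devices=[0])` would also save and restore the state of CUDA device 0.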

mmcenta, Mar 28 '22 13:03