rlberry
[WIP] (feat) Seeding torch & rlberry
Attempt at a fix for torch random generator compatibility.
In this PR I removed set_external_seed from agent_manager and track the torch RNG state instead.
This PR needs some more testing. I am putting it out there for discussion's sake, alongside issue #157.
@mmcenta I merged your PR here to check that it all goes green (I will remove this commit before merging to main), and it seems OK. I think we need to test it some more to see what is not going as planned. The main change is that safe_reseed now also reseeds torch.
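To make the idea of "reseeding torch from the seeder" concrete, here is a minimal sketch, not the code in this PR. It assumes the seeder is backed by a NumPy SeedSequence; the helper names derive_torch_seed and reseed_torch are hypothetical, and torch is treated as an optional import.

```python
import numpy as np

try:
    import torch
except ImportError:  # torch is treated as optional in this sketch
    torch = None


def derive_torch_seed(seed_seq: np.random.SeedSequence) -> int:
    """Derive one 32-bit integer from a SeedSequence (hypothetical helper)."""
    # generate_state is deterministic for a given SeedSequence.
    return int(seed_seq.generate_state(1, dtype=np.uint32)[0])


def reseed_torch(seed_seq: np.random.SeedSequence) -> int:
    """Seed torch's default generator from the SeedSequence, if torch is installed."""
    seed = derive_torch_seed(seed_seq)
    if torch is not None:
        torch.manual_seed(seed)
    return seed
```

With this approach the torch seed is a pure function of the seed sequence, so two runs that spawn the same sequence end up with the same torch seed without any separate external seed.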
As an indication, here is a little profiling of an Agent Manager call fitting a DQN environment:
[INFO] init new seeder at: agent_manager.py __init__
[INFO] spawned at: agent_manager.py __init__
[INFO] init new seeder at: seeder.py <listcomp>
[INFO] spawned at: agent_manager.py _set_init_kwargs
[INFO] init new seeder at: seeder.py <listcomp>
[INFO] spawned at: agent_manager.py _reset_agent_handlers
[INFO] init new seeder at: seeder.py <listcomp>
[INFO] spawned at: agent_manager.py fit
[INFO] init new seeder at: seeder.py <listcomp>
[INFO] init new seeder at: agent.py __init__
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: box.py __init__
[INFO] init new seeder at: discrete.py __init__
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 1, 0, 0, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 0, 0, 1)
[INFO] reseed at: box.py reseed seed is (0, 1, 0, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 0, 2)
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: box.py __init__
[INFO] init new seeder at: discrete.py __init__
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 1, 0, 3, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 0, 3, 1)
[INFO] reseed at: box.py reseed seed is (0, 1, 0, 4)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 0, 5)
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: agent.py reseed
[INFO] init new seeder at: basewrapper.py reseed
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 3, 0, 0, 0, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 3, 0, 0, 0, 1)
[INFO] reseed at: box.py reseed seed is (0, 3, 0, 0, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 3, 0, 0, 2)
[INFO] reseed at: box.py reseed seed is (0, 3, 0, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 3, 0, 2)
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 3, 0, 3, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 3, 0, 3, 1)
[INFO] reseed at: box.py reseed seed is (0, 3, 0, 4)
[INFO] reseed at: discrete.py reseed seed is (0, 3, 0, 5)
[INFO] init new seeder at: agent.py __init__
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: box.py __init__
[INFO] init new seeder at: discrete.py __init__
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 1, 1, 0, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 1, 0, 1)
[INFO] reseed at: box.py reseed seed is (0, 1, 1, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 1, 2)
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: box.py __init__
[INFO] init new seeder at: discrete.py __init__
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 1, 1, 3, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 1, 3, 1)
[INFO] reseed at: box.py reseed seed is (0, 1, 1, 4)
[INFO] reseed at: discrete.py reseed seed is (0, 1, 1, 5)
[INFO] init new seeder at: model.py __init__
[INFO] init new seeder at: basewrapper.py reseed
[INFO] init new seeder at: model.py reseed
[INFO] reseed at: box.py reseed seed is (0, 2, 0, 0, 0)
[INFO] reseed at: discrete.py reseed seed is (0, 2, 0, 0, 1)
[INFO] reseed at: box.py reseed seed is (0, 2, 0, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 2, 0, 2)
[INFO] reseed at: box.py reseed seed is (0, 2, 1)
[INFO] reseed at: discrete.py reseed seed is (0, 2, 2)
The syntax of each line above is: description of the action on the seeder, filename and function of the caller, and the seed if the action is a reseed.
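As a rough illustration of that line format, the trace can be turned into structured records with a small parser. This is only a sketch: the regular expression is inferred from the lines shown above, and parse_trace_line is a hypothetical name.

```python
import ast
import re

# Pattern inferred from the trace above: action, caller file, caller function,
# and an optional seed tuple on reseed lines.
LINE_RE = re.compile(
    r"^\[INFO\] (?P<action>init new seeder|spawned|reseed) at: "
    r"(?P<file>\S+) (?P<function>\S+)"
    r"(?: seed is (?P<seed>\(.*\)))?$"
)


def parse_trace_line(line):
    """Return a dict describing one trace line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    seed_text = match.group("seed")
    return {
        "action": match.group("action"),
        "file": match.group("file"),
        "function": match.group("function"),
        # literal_eval safely turns "(0, 1, 0)" into a tuple of ints.
        "seed": ast.literal_eval(seed_text) if seed_text else None,
    }
```

For example, parse_trace_line("[INFO] reseed at: box.py reseed seed is (0, 1, 0, 0, 0)") yields the caller box.py / reseed together with the seed tuple.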
The profiling is done by adding a small traceback utility to seeder.py using the inspect library. @mmcenta maybe we could use this for metadata extraction. Maybe this could also include the torch RNG state.
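For reference, a traceback utility of this kind could look roughly like the sketch below. It is not the code added in this PR: TracedSeeder, describe_caller and the logger name are hypothetical, and the message format simply mimics the trace above.

```python
import inspect
import logging
import os

logger = logging.getLogger("seeder_trace")  # hypothetical logger name


def describe_caller(skip=2):
    """Return "<file> <function>" for the frame `skip` levels up the stack.

    Index 0 is this function, index 1 is the method that called it, and
    index 2 is whoever called that method, which is what we want to report.
    """
    info = inspect.stack()[skip]
    return f"{os.path.basename(info.filename)} {info.function}"


class TracedSeeder:
    """Toy seeder that logs who created or reseeded it (illustration only)."""

    def __init__(self, seed=None):
        self.seed = seed
        logger.info("init new seeder at: %s", describe_caller())

    def reseed(self, seed):
        self.seed = seed
        logger.info("reseed at: %s seed is %s", describe_caller(), seed)
```

Note that inspect.stack() is fairly slow, so a utility like this is better suited to debugging runs than to being enabled by default.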
I was thinking about recording this metadata with Python's logging library, since it has all the features we need. I also noticed today that you might need to fork the RNG states when working on other devices (e.g. a GPU), but I am not exactly sure how to implement this forking with rlberry.
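On the torch side, one possible starting point for the forking is torch.random.fork_rng, which saves the RNG state of the CPU and of the listed CUDA devices and restores it on exit. The sketch below only shows that mechanism; run_with_forked_rng is a hypothetical helper, and how it would plug into rlberry's seeder is left open.

```python
try:
    import torch
except ImportError:  # torch is treated as optional in this sketch
    torch = None


def run_with_forked_rng(fn, seed):
    """Run fn() under a temporary torch seed, then restore the previous RNG state."""
    if torch is None:
        raise RuntimeError("torch is required for this sketch")
    # Fork the CPU generator plus every visible CUDA device (none without a GPU).
    if torch.cuda.is_available():
        devices = list(range(torch.cuda.device_count()))
    else:
        devices = []
    with torch.random.fork_rng(devices=devices):
        torch.manual_seed(seed)  # seeds the default generators inside the fork
        return fn()
```

Calling run_with_forked_rng(lambda: torch.rand(3), seed=123) twice should return the same tensor while leaving the global torch RNG state as it was before the call.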