dreamerv3

Fully deterministic runs

Open jadkins99 opened this issue 1 year ago • 10 comments

Awesome repo. quick question,

I ran the DMC WalkerWalk experiment 3 different times with the same seeds and got 3 different learning curves. How can I get reproducible experiments?

jadkins99 avatar Apr 15 '23 21:04 jadkins99

Hi, are you asking for fully deterministic runs? I haven't paid much attention to this, but I think the agent is already fully deterministic, so you'd probably just have to set the environment seed (if you use more than one environment instance, make sure the environments have different seeds so they produce different data).

danijar avatar Apr 16 '23 19:04 danijar
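
A minimal sketch of the suggestion above, assuming several environment instances that each accept an integer seed. The helper name `make_env_seeds` is hypothetical and not part of the repository; it only shows one way to derive distinct but reproducible seeds from a single base seed.

```python
import random


def make_env_seeds(base_seed: int, num_envs: int) -> list[int]:
    """Derive one distinct, reproducible seed per environment instance."""
    rng = random.Random(base_seed)
    # random.sample draws without replacement, so the seeds are distinct.
    return rng.sample(range(2**31 - 1), num_envs)


# The same base seed always yields the same list of per-environment seeds.
seeds = make_env_seeds(base_seed=0, num_envs=4)
```

Each entry of `seeds` would then be passed to one environment instance, so the instances produce different data while the overall run stays repeatable.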

Okay I will try that. Thank you for the quick response! What exactly is an "environment instance"? I couldn't find a clear definition in the paper.

jadkins99 avatar Apr 18 '23 19:04 jadkins99

Also, how many seeds were the non-Minecraft experiments run for?

jadkins99 avatar Apr 18 '23 19:04 jadkins99

+1 on the question above. Maybe it's not that apparent in the paper, but could you also clarify what the confidence intervals denote in the non-Minecraft experiments (DMLab, DMC Proprio, Crafter, etc.)? Is it the standard error across multiple seeds, the standard error across a window of timesteps with a single seed, or something else?

subho406 avatar Apr 18 '23 19:04 subho406

It's mean/std across seeds and at least 3 seeds per task, often more.

danijar avatar Apr 19 '23 20:04 danijar
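
As an illustration of "mean/std across seeds", here is a small sketch that aggregates per-seed learning curves step by step. The numbers are invented for the example, and the use of the sample standard deviation is an assumption; the thread does not say which estimator was used.

```python
import statistics

# Hypothetical learning curves: one list of returns per seed, all evaluated
# at the same training steps. These values are made up for illustration.
curves = {
    0: [10.0, 20.0, 30.0],
    1: [12.0, 18.0, 33.0],
    2: [14.0, 22.0, 27.0],
}


def aggregate(curves):
    """Return per-step mean and sample standard deviation across seeds."""
    per_step = list(zip(*curves.values()))  # group the values of all seeds by step
    means = [statistics.mean(values) for values in per_step]
    stds = [statistics.stdev(values) for values in per_step]
    return means, stds


means, stds = aggregate(curves)
```

With at least 3 seeds per task, the shaded band in a plot would be `means` plus or minus `stds` at each step.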

Update: I seeded dmc_control here and still got non-deterministic runs. Are there other, non-environment sources of randomness that are not seeded?

jadkins99 avatar Apr 25 '23 03:04 jadkins99

I found some non-seeded randomness in the repo. Namely here and here. Wouldn't these affect the agent?

jadkins99 avatar Apr 26 '23 02:04 jadkins99

I don't think those two methods are ever run. Could you check, e.g. by adding asdf to the two methods, to see if it errors?

danijar avatar Apr 27 '23 23:04 danijar
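
The check suggested above can be sketched as a deliberate tripwire: put a failing statement in the suspected method and see whether a run hits it. The class and method names below are hypothetical stand-ins, not the repository's actual code.

```python
class Agent:
    """Hypothetical stand-in for a class with a suspected-unused method."""

    def unseeded_sample(self):
        # Tripwire: any call to this method fails loudly.
        raise RuntimeError("unseeded_sample was called")

    def train_step(self):
        return "ok"


def run(agent):
    # Stand-in for the real training loop; it never calls unseeded_sample.
    return agent.train_step()


try:
    result = run(Agent())
    tripped = False
except RuntimeError:
    result = None
    tripped = True
```

If the run finishes without the error, the method is not on the executed code path and cannot be the source of the non-determinism.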

Seeding this removes randomness from the first 1000 steps, but runs are still non-deterministic afterwards.

swannercjj avatar Aug 17 '23 21:08 swannercjj
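
To pin down where two runs start to differ, as in the observation above about the first 1000 steps, one option is to compare a logged metric step by step. This is a sketch under the assumption that each run logs one numeric value per step; `first_divergence` is a hypothetical helper.

```python
from itertools import zip_longest


def first_divergence(run_a, run_b, tol=0.0):
    """Return the index of the first step where two logged runs differ, or None."""
    for step, (a, b) in enumerate(zip_longest(run_a, run_b)):
        # A missing value means one run is shorter, which also counts as a difference.
        if a is None or b is None or abs(a - b) > tol:
            return step
    return None


# Made-up logs: identical for the first two steps, different at step 2.
step = first_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.5])
```

Knowing the exact step at which the logs diverge helps narrow down which component first introduces randomness.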

@jadkins99 @swannercjj I guess you should check this issue: https://github.com/google/jax/issues/13672#issuecomment-1515544978

It seems that seeding jax.random alone does not make operations deterministic on GPU, because of optimizations in the graph compilation process.

Dongyeongkim avatar Jul 15 '24 03:07 Dongyeongkim
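
One setting that is often tried for this kind of GPU non-determinism is the XLA flag `--xla_gpu_deterministic_ops=true`. The sketch below is an assumption on my part: the thread does not name this flag or confirm that it makes DreamerV3 runs reproducible, and deterministic kernels may be slower. The helper name is hypothetical.

```python
import os


def enable_deterministic_gpu_ops():
    """Append the XLA determinism flag to XLA_FLAGS if it is not already set."""
    flag = "--xla_gpu_deterministic_ops=true"
    existing = os.environ.get("XLA_FLAGS", "")
    if flag not in existing.split():
        os.environ["XLA_FLAGS"] = f"{existing} {flag}".strip()


# Call this before importing jax so the flag is set when the backend initializes.
enable_deterministic_gpu_ops()
```

Setting the variable in the shell before launching training has the same effect.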