Both ant_ars and halfcheetah_ars broken

Open GlenHenshaw opened this issue 2 years ago • 2 comments

Julia 1.7.2 on macOS 12.3 (Apple M1). Once training completes, both examples throw the same error:

    episode 98 reward_evaluation -0.7817167591651243. Took 36.599132333 seconds
    episode 99 reward_evaluation -13.705568624874973. Took 36.127634708 seconds
    episode 100 reward_evaluation -10.772545859258324. Took 38.790820875 seconds
    rewards = [44.0188651955438, 84.07397185902886, 58.151621089522905, 74.00135312248605, 61.31391504999342]
    mean(train_time_best) = 2776.0249286366666
    std(train_time_best) = 178.33750793333417
    mean(rewards) = 64.31194526331501
    std(rewards) = 15.355529928613961
    WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
    ERROR: LoadError: UndefVarError: close not defined
    Stacktrace:
     [1] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
       @ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:189
     [2] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
       @ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:173
     [3] top-level scope
       @ ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124
     [4] include(fname::String)
       @ Base.MainInclude ./client.jl:451
     [5] top-level scope
       @ REPL[3]:1
    in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124

GlenHenshaw avatar Mar 22 '22 13:03 GlenHenshaw

I made a change to use Base.close. This should fix the issue you are seeing.
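
For readers hitting the same warning, here is a minimal sketch of the name clash and the qualified-call workaround. `FakeDojo` is a hypothetical stand-in module, not Dojo's actual code; it only assumes, as the warning suggests, that the package exports its own `close` rather than adding a method to `Base.close`.

```julia
# Hypothetical stand-in for a package that exports its own `close`
# (a brand-new function, not a method of Base.close).
module FakeDojo
export close
close(env::Symbol) = "FakeDojo.close called on $env"
end

using .FakeDojo

# Qualified calls are unambiguous and work regardless of the clash.
io = IOBuffer()
Base.close(io)                 # explicitly Base's close
msg = FakeDojo.close(:ant)     # explicitly the package's close

# The unqualified name is the problematic one: with both modules
# exporting `close`, Main cannot pick a binding. We record whatever
# happens instead of assuming a particular error type.
unqualified_error = try
    close(io)
    nothing
catch err
    err
end

println(msg)
println("unqualified call raised: ", typeof(unqualified_error))
```

On the Julia 1.7 setup from the report, the unqualified call is expected to fail with the `UndefVarError` shown in the stack trace. A package can avoid the clash entirely by defining `Base.close(env::MyEnv) = ...` for its own type (where `MyEnv` is again a placeholder) instead of exporting a separate `close`.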

thowell avatar Mar 22 '22 16:03 thowell

I just pulled Dojo#main and the problem is still there. Running halfcheetah_ars.jl:

    episode 28 reward_evaluation 5.2598850727605795. Took 32.763520375 seconds
    episode 29 reward_evaluation 6.304627245502632. Took 31.461139625 seconds
    episode 30 reward_evaluation 4.928108782533084. Took 31.937667875 seconds
    rewards = [8.6721774823914, 37.47027757269155, 70.57463206704156, 48.87879789080837, 63.227295557744455]
    mean(train_time_best) = 173.570155853
    std(train_time_best) = 7.095665020032407
    mean(rewards) = 45.76463611413546
    std(rewards) = 24.366089389557583
    WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
    ERROR: LoadError: UndefVarError: close not defined
    Stacktrace:
     [1] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
       @ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:189
     [2] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
       @ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:173
     [3] top-level scope
       @ ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84
     [4] include(fname::String)
       @ Base.MainInclude ./client.jl:451
     [5] top-level scope
       @ REPL[4]:1
    in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84

GlenHenshaw avatar Mar 23 '22 22:03 GlenHenshaw

Ant ARS should work again, although the hyperparameters might need some tuning for it to walk properly. The halfcheetah example has been removed, but the mechanism still exists, so people could create the example themselves.

janbruedigam avatar Apr 12 '23 07:04 janbruedigam