Dojo.jl
Both ant_ars and halfcheetah_ars broken
Julia 1.7.2 running on macOS 12.3, Apple M1 architecture. After training completes, both examples throw the same error:
episode 98 reward_evaluation -0.7817167591651243. Took 36.599132333 seconds
episode 99 reward_evaluation -13.705568624874973. Took 36.127634708 seconds
episode 100 reward_evaluation -10.772545859258324. Took 38.790820875 seconds
rewards = [44.0188651955438, 84.07397185902886, 58.151621089522905, 74.00135312248605, 61.31391504999342]
mean(train_time_best) = 2776.0249286366666
std(train_time_best) = 178.33750793333417
mean(rewards) = 64.31194526331501
std(rewards) = 15.355529928613961
WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
ERROR: LoadError: UndefVarError: close not defined
Stacktrace:
[1] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
@ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:189
[2] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
@ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:173
[3] top-level scope
@ ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124
[4] include(fname::String)
@ Base.MainInclude ./client.jl:451
[5] top-level scope
@ REPL[3]:1
in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124
I made a change to use Base.close. This should fix the issue you are seeing.
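For anyone hitting the same warning, here is a minimal sketch of why qualifying the call helps. It does not use Dojo itself: `ToyEnv` and `Env` are hypothetical stand-ins for a package that exports its own `close`, which is what the warning above reports for Dojo. When two modules brought in by `using` both export a name, an unqualified use in `Main` cannot be resolved, so each call has to name its module.

```julia
# Hypothetical stand-in for a package that exports its own `close`.
module ToyEnv
export Env, close

mutable struct Env
    open::Bool
end

# A new function owned by ToyEnv, not a method added to Base.close.
function close(env::Env)
    env.open = false
    return env
end
end # module

using .ToyEnv

env = Env(true)
io = IOBuffer()

# An unqualified `close(...)` here would be ambiguous between ToyEnv and Base,
# so each call names the module it means.
ToyEnv.close(env)   # the package's function
Base.close(io)      # the Base function for IO streams
```

Another option on the package side, assuming it fits the design, is to add a method to `Base.close` instead of exporting a separate function, so that only one `close` exists.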
I just pulled Dojo#main and the problem is still there. Running halfcheetah_ars.jl:
episode 28 reward_evaluation 5.2598850727605795. Took 32.763520375 seconds
episode 29 reward_evaluation 6.304627245502632. Took 31.461139625 seconds
episode 30 reward_evaluation 4.928108782533084. Took 31.937667875 seconds
rewards = [8.6721774823914, 37.47027757269155, 70.57463206704156, 48.87879789080837, 63.227295557744455]
mean(train_time_best) = 173.570155853
std(train_time_best) = 7.095665020032407
mean(rewards) = 45.76463611413546
std(rewards) = 24.366089389557583
WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
ERROR: LoadError: UndefVarError: close not defined
Stacktrace:
[1] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
@ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:189
[2] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
@ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:173
[3] top-level scope
@ ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84
[4] include(fname::String)
@ Base.MainInclude ./client.jl:451
[5] top-level scope
@ REPL[4]:1
in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84
Ant ARS should work again, although the hyperparameters might need some tuning for it to walk properly. The halfcheetah example has been removed, but the mechanism still exists, so people could create the example themselves.