REnforce Example fails when run with --release

The cartpole example works fine when I run the debug build, but if I run with --release, there seems to be a communication problem with the gym server:

$ cargo run cartpole --release
   Compiling renforce v0.1.0 (file:///tmp/REnforce)
    Finished release [optimized] target(s) in 3.54 secs
     Running `target/release/cartpole cartpole`
Training...
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { repr: Os { code: 99, message: "Cannot assign requested address" } })', /checkout/src/libcore/result.rs:860
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:60
             at /checkout/src/libstd/panicking.rs:355
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:371
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:549
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:511
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:495
   7: rust_begin_unwind
             at /checkout/src/libstd/panicking.rs:471
   8: core::panicking::panic_fmt
             at /checkout/src/libcore/panicking.rs:69
   9: core::result::unwrap_failed
  10: <cartpole::CartPole as renforce::environment::Environment>::step
  11: <core::iter::Map<I, F> as core::iter::iterator::Iterator>::next
  12: cartpole::main
  13: __rust_maybe_catch_panic
             at /checkout/src/libpanic_unwind/lib.rs:98
  14: std::rt::lang_start
             at /checkout/src/libstd/panicking.rs:433
             at /checkout/src/libstd/panic.rs:361
             at /checkout/src/libstd/rt.rs:59
  15: __libc_start_main
  16: _start

Jul 08 '17 05:07 robsmith11

I looked into the issue. I believe the cause is that the code runs too quickly in release, so the server gets sent too many requests and can't keep up (the Rust bindings for the gym server are not the best), but I could be wrong. I tried seeing if adding something like

while let Err(..) = obs { /* blah */ }

and retrying until an attempt succeeded would fix the issue, but that solved nothing. For the moment, the best I can think of is to run the gym examples only in debug.
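
For reference, a retry of this kind with a pause between attempts could be sketched as follows. This is only an illustration: `retry_with_backoff` is a hypothetical helper, not part of REnforce or the gym bindings, and the closure in `main` stands in for the real `step` request. Whether adding a delay avoids the error here is untested.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper: calls `op` until it succeeds or `max_attempts`
/// attempts have been made, sleeping between attempts and doubling the
/// delay each time.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    assert!(max_attempts > 0, "need at least one attempt");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                // Give the server time to recover before trying again.
                thread::sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Stand-in for a request that fails twice and then succeeds.
    let mut calls = 0;
    let result: Result<u32, &str> =
        retry_with_backoff(5, Duration::from_millis(10), || {
            calls += 1;
            if calls < 3 {
                Err("Cannot assign requested address")
            } else {
                Ok(calls)
            }
        });
    println!("{:?}", result); // prints Ok(3)
}
```

Unlike a bare `while let Err(..)` loop, this gives up after a fixed number of attempts and returns the last error instead of spinning forever.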

Jul 09 '17 00:07 NivenT

That makes sense. The HTTP client is creating a new connection for every request, which is certainly suboptimal. Is there an easy way to make hyper reuse connections? It seems Keep-Alive is enabled by default: https://docs.rs/hyper/0.11.1/hyper/client/struct.Config.html#method.keep_alive

Jul 09 '17 03:07 robsmith11

I don't know Hyper too well, but that's on the latest version. The bindings for the server use an earlier version where it seems you have to set this manually. I tried editing a local copy of the bindings code to include a call to headers.set(Connection::keep_alive());, but that didn't seem to fix things either.

Jul 09 '17 05:07 NivenT

I took another look at this issue to see if I could figure anything out, but no luck. I'm not sure there's a way to avoid this error without editing the server code (I don't know a lot about Flask or servers in general, so I'm not 100% sure what the options are here), but we might not have to avoid the error.

The error causing this to abort doesn't crash the server, so we should be able to just wait a little bit after receiving it and then continue with business as usual. It seems to me that a good long-term solution (which would take a while to implement) is to add better error checking to the library: define some REnforceError enum, return Results everywhere, and have a variant of the enum specific to this error so the client knows it's not urgent/fatal and just needs to wait a bit.
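
A rough sketch of what such an enum might look like is below. Nothing here exists in REnforce yet: the variant names, `is_transient`, and the choice to classify by OS error code 99 (the code from the panic message above) are all assumptions made for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// OS error code taken from the panic message in this issue
// ("Cannot assign requested address" on Linux).
const ADDR_NOT_AVAILABLE: i32 = 99;

/// Hypothetical library-wide error type.
#[derive(Debug)]
enum REnforceError {
    /// The gym server could not be reached right now; not fatal,
    /// the caller may wait a bit and try again.
    ServerBusy(io::Error),
    /// Any other I/O failure.
    Io(io::Error),
}

impl REnforceError {
    /// True when waiting and retrying is a reasonable response.
    fn is_transient(&self) -> bool {
        match self {
            REnforceError::ServerBusy(_) => true,
            REnforceError::Io(_) => false,
        }
    }
}

impl fmt::Display for REnforceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            REnforceError::ServerBusy(e) => write!(f, "gym server busy: {}", e),
            REnforceError::Io(e) => write!(f, "I/O error: {}", e),
        }
    }
}

impl Error for REnforceError {}

impl From<io::Error> for REnforceError {
    // Assumption: only OS error 99 is treated as "wait and retry".
    fn from(e: io::Error) -> Self {
        if e.raw_os_error() == Some(ADDR_NOT_AVAILABLE) {
            REnforceError::ServerBusy(e)
        } else {
            REnforceError::Io(e)
        }
    }
}

fn main() {
    let busy: REnforceError = io::Error::from_raw_os_error(ADDR_NOT_AVAILABLE).into();
    let other: REnforceError = io::Error::new(io::ErrorKind::Other, "boom").into();
    println!("{} (transient: {})", busy, busy.is_transient());
    println!("{} (transient: {})", other, other.is_transient());
}
```

With a `From<io::Error>` impl like this, a `step` that returns `Result<_, REnforceError>` could use `?` on the underlying request instead of `unwrap()`, and the caller could check `is_transient()` to decide whether to pause or abort.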

Jul 22 '17 12:07 NivenT
