balloon-learning-environment

Can't get running on Ubuntu 20.04

Open tkschuler opened this issue 2 years ago • 0 comments

I have had a lot of trouble installing this package and getting it to run. After several attempts, I reached the point where the package at least installs without errors:

The most important thing was installing the right versions of JAX and TensorFlow.

I am running cuda-nvcc 12.0.140 and cudatoolkit 11.8.0 with cuDNN 8.4.1.50.

  1. Before trying to install the BLE, set up a GPU-enabled TensorFlow conda environment with cudatoolkit>=11.4 and cudnn>=8.2 for JAX support. I have verified this is working with:
       python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
  2. The 'flax' module no longer supports 'optim', so downgrade it:
       python3.9 -m pip uninstall flax
       python3.9 -m pip install flax==0.5.3
  3. Install the BLE without acme.
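
Since the steps above depend on specific package versions, a quick version report can help when comparing environments. This is only a sketch: the package list is my assumption about which distributions matter here, and `installed_versions` is a hypothetical helper, not part of the BLE.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of relevant distributions; adjust for your environment.
    names = ["tensorflow", "jax", "jaxlib", "flax", "protobuf"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Running it with the same interpreter used for the install (python3.9 in my case) avoids reporting versions from a different environment.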

Now I can't run the benchmark example. I get this error:

ImportError: /home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/courier/python/libserialization_cc_proto.so: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameEv
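
To see what the undefined symbol in the ImportError above refers to, the mangled name can be decoded. The sketch below is a deliberately tiny demangler that handles only nested names with no arguments (it is not a general replacement for a tool like c++filt); `demangle_simple` is a hypothetical helper written for this one symbol.

```python
import re


def demangle_simple(symbol: str) -> str:
    """Demangle names of the form _ZN[K]<len><id>...Ev (no-argument functions only)."""
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...) are handled")
    pos = 3
    is_const = symbol[pos] == "K"  # 'K' marks a const member function
    if is_const:
        pos += 1
    parts = []
    while symbol[pos] != "E":  # 'E' ends the nested name
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError("unsupported mangling")
        length = int(match.group())
        pos += len(match.group())
        parts.append(symbol[pos:pos + length])
        pos += length
    if symbol[pos + 1:] != "v":  # 'v' means an empty parameter list
        raise ValueError("only no-argument functions are handled")
    return "::".join(parts) + "()" + (" const" if is_const else "")


if __name__ == "__main__":
    print(demangle_simple("_ZNK6google8protobuf7Message11GetTypeNameEv"))
```

The symbol decodes to a method of google::protobuf::Message, so the missing piece comes from the protobuf C++ library. My guess, which I have not confirmed, is that the courier extension was built against a different protobuf version than the one being loaded in this environment.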

Nor can I instantiate the balloon environment. I get this error:

>>> env = balloon_env.BalloonEnv()
2023-02-03 13:25:28.522181: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2389] Execution of replica 0 failed: INTERNAL: Failed to execute XLA Runtime executable: run time error: custom call 'xla.gpu.custom_call' failed: jaxlib/gpu/prng_kernels.cc:33: operation gpuGetLastError() failed: out of memory.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/config.py", line 1605, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/config.py", line 1582, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon_env.py", line 145, in __init__
    self.arena = balloon_arena.BalloonArena(self._feature_constructor_factory,
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon_arena.py", line 151, in __init__
    self._atmosphere = standard_atmosphere.Atmosphere(jax.random.PRNGKey(0))
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon/standard_atmosphere.py", line 74, in __init__
    self.reset(key)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon/standard_atmosphere.py", line 82, in reset
    alpha = jax.random.uniform(key).item()
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/jax/_src/random.py", line 265, in uniform
    return _uniform(key, shape, dtype, minval, maxval)  # type: ignore
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Failed to execute XLA Runtime executable: run time error: custom call 'xla.gpu.custom_call' failed: jaxlib/gpu/prng_kernels.cc:33: operation gpuGetLastError() failed: out of memory.
  In call to configurable 'BalloonEnv' (<class 'balloon_learning_environment.env.balloon_env.BalloonEnv'>)
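
One thing that might be worth trying for the out-of-memory error above is limiting how JAX and TensorFlow claim GPU memory, since by default JAX preallocates a large fraction of it and TensorFlow in the same process can take most of the rest. I have not confirmed that this is the cause here. The sketch below only sets documented environment variables; `configure_gpu_memory` is a hypothetical helper, and it has to run before jax or tensorflow is imported.

```python
import os
from typing import Dict, Optional


def configure_gpu_memory(preallocate: bool = False,
                         mem_fraction: Optional[float] = None) -> Dict[str, str]:
    """Set GPU memory environment variables; call before importing jax/tensorflow."""
    settings = {
        # Tell JAX whether to preallocate GPU memory on first use.
        "XLA_PYTHON_CLIENT_PREALLOCATE": "true" if preallocate else "false",
        # Ask TensorFlow to grow its GPU allocation on demand.
        "TF_FORCE_GPU_ALLOW_GROWTH": "true",
    }
    if mem_fraction is not None:
        # Fraction of GPU memory JAX may preallocate when preallocation is on.
        settings["XLA_PYTHON_CLIENT_MEM_FRACTION"] = f"{mem_fraction:.2f}"
    os.environ.update(settings)
    return settings


if __name__ == "__main__":
    print(configure_gpu_memory())
    # Only after this point: import jax, then create the environment.
```

If the error persists with preallocation disabled, the GPU may simply be occupied by another process, which nvidia-smi would show.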

tkschuler • Feb 03 '23 18:02