baselines About Tensorboard

Hello, I am not sure how to set up TensorBoard. I have set the environment variable, but I use PyCharm, so I do not know how to modify --log-dir.

Sep 17 '18 14:09 Nara0731

It shows "Logging to /tmp/openai-2018*". However, I cannot find the directory "/tmp".

Sep 18 '18 04:09 Nara0731

I have found the directory "/tmp". However, I only find 0.0.monitor.csv, log.txt, and progress.csv. Where should I find the TensorBoard files?

Sep 18 '18 06:09 Nara0731

Hi @Nara0731! We have recently added this section to the README: https://github.com/openai/baselines/blob/master/README.md#using-baselines-with-tensorboard

Basically, you need to set two environment variables: OPENAI_LOGDIR to where you want the tensorboard files to be saved, and OPENAI_LOG_FORMAT to 'stdout,tensorboard' (if you only need output to the command line and tensorboard). The tensorboard data should show up in OPENAI_LOGDIR (subfolder tb). You can launch tensorboard via

tensorboard --logdir=$OPENAI_LOGDIR

From the fact that logs are saved to the /tmp/openai-2018* location, I suspect that neither of the environment variables is actually set (at least from the Python interpreter's perspective). Could you run

import os; print(os.environ)

in Python and paste the output here? If OPENAI_LOGDIR and OPENAI_LOG_FORMAT are not there, you can set them directly from Python:

os.environ['OPENAI_LOGDIR'] = ...
os.environ['OPENAI_LOG_FORMAT'] = 'stdout,tensorboard'

(That has to happen before you start training.) Hope this helps!
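
A quicker way to run that check, without reading the whole environment dump, might look like the sketch below. It uses only the standard library; the helper name missing_log_vars is made up for illustration and is not part of baselines.

```python
import os

# The two variables discussed above.
REQUIRED_VARS = ("OPENAI_LOGDIR", "OPENAI_LOG_FORMAT")


def missing_log_vars(environ=None):
    """Return the names of the logging variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_log_vars()
    if missing:
        print("Not visible to this Python process:", ", ".join(missing))
    else:
        print("Both variables are set.")
```

Running it from the same run configuration as the training script (for example, the same PyCharm configuration) shows what that particular process sees, which can differ from what a terminal shows after export.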

Sep 19 '18 22:09 pzhokhov

Yeah, I have set

export OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard' # formats are comma-separated, but for tensorboard you only really need the last one
export OPENAI_LOGDIR=/tmp

Unfortunately, I did not find any TensorBoard-related files in "/tmp".

Sep 20 '18 15:09 Nara0731

hm... let's solve it one step at a time. Could you run import os; print(os.environ) in python?

Sep 20 '18 19:09 pzhokhov

Yes, it shows:

/usr/bin/python3.6 /home/ubuntu/baselines/baselines/run.py
Logging to /tmp/openai-2018-09-21-14-19-48-807530
environ({'PATH': '/home/ubuntu/bin:/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin', 'LC_MEASUREMENT': 'zh_CN.UTF-8', 'XAUTHORITY': '/home/ubuntu/.Xauthority', 'XMODIFIERS': '@im=ibus', 'LC_TELEPHONE': 'zh_CN.UTF-8', 'XDG_DATA_DIRS': '/usr/share/ubuntu:/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop:/var/lib/snapd/desktop', 'GDMSESSION': 'ubuntu', 'MANDATORY_PATH': '/usr/share/gconf/ubuntu.mandatory.path', 'LC_TIME': 'zh_CN.UTF-8', 'GTK_IM_MODULE': 'ibus', 'DBUS_SESSION_BUS_ADDRESS': 'unix:abstract=/tmp/dbus-mKrFfL3RGO', 'DEFAULTS_PATH': '/usr/share/gconf/ubuntu.default.path', 'XDG_CURRENT_DESKTOP': 'Unity', 'LD_LIBRARY_PATH': '/home/ubuntu/.mujoco/mjpro150/bin:/usr/lib/nvidia-384', 'UPSTART_SESSION': 'unix:abstract=/com/ubuntu/upstart-session/1000/1458', 'QT4_IM_MODULE': 'xim', 'LC_PAPER': 'zh_CN.UTF-8', 'SESSION_MANAGER': 'local/ubuntu-pc:@/tmp/.ICE-unix/1708,unix/ubuntu-pc:/tmp/.ICE-unix/1708', 'QT_LINUX_ACCESSIBILITY_ALWAYS_ON': '1', 'LOGNAME': 'ubuntu', 'JOB': 'unity-settings-daemon', 'PWD': '/home/ubuntu/baselines/baselines', 'IM_CONFIG_PHASE': '1', 'PYCHARM_HOSTED': '1', 'LANGUAGE': 'en_US', 'PYTHONPATH': '/home/ubuntu/baselines', 'SHELL': '/bin/bash', 'LC_ADDRESS': 'zh_CN.UTF-8', 'UNITY_HAS_3D_SUPPORT': 'true', 'GIO_LAUNCHED_DESKTOP_FILE': '/usr/share/applications/jetbrains-pycharm-ce.desktop', 'GTK2_MODULES': 'overlay-scrollbar', 'INSTANCE': '', 'OLDPWD': '/home/ubuntu/package/pycharm-community-2018.1.2/bin', 'GNOME_DESKTOP_SESSION_ID': 'this-is-deprecated', 'UPSTART_INSTANCE': '', 'CLUTTER_IM_MODULE': 'xim', 'XDG_SESSION_PATH': '/org/freedesktop/DisplayManager/Session0', 'COMPIZ_BIN_PATH': '/usr/bin/', 'SESSIONTYPE': 'gnome-session', 'XDG_SESSION_DESKTOP': 'ubuntu', 'SHLVL': '0', 'LC_IDENTIFICATION': 'zh_CN.UTF-8', 'LC_MONETARY': 'zh_CN.UTF-8', 'COMPIZ_CONFIG_PROFILE': 'ubuntu', 'QT_IM_MODULE': 'ibus', 'UPSTART_JOB': 'unity7', 'XDG_CONFIG_DIRS': '/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg', 'LANG': 'en_US.UTF-8', 'GNOME_KEYRING_CONTROL': '', 'XDG_SEAT_PATH': '/org/freedesktop/DisplayManager/Seat0', 'XDG_SESSION_ID': 'c2', 'XDG_SESSION_TYPE': 'x11', 'DISPLAY': ':0', 'UNITY_DEFAULT_PROFILE': 'unity', 'LC_NAME': 'zh_CN.UTF-8', 'GDM_LANG': 'en_US', 'PYTHONIOENCODING': 'UTF-8', 'XDG_GREETER_DATA_DIR': '/var/lib/lightdm-data/ubuntu', 'UPSTART_EVENTS': 'xsession started', 'GPG_AGENT_INFO': '/home/ubuntu/.gnupg/S.gpg-agent:0:1', 'DESKTOP_SESSION': 'ubuntu', 'SESSION': 'ubuntu', 'USER': 'ubuntu', 'XDG_MENU_PREFIX': 'gnome-', 'GIO_LAUNCHED_DESKTOP_FILE_PID': '1996', 'QT_ACCESSIBILITY': '1', 'LC_NUMERIC': 'zh_CN.UTF-8', 'SSH_AUTH_SOCK': '/run/user/1000/keyring/ssh', 'XDG_SEAT': 'seat0', 'PYTHONUNBUFFERED': '1', 'QT_QPA_PLATFORMTHEME': 'appmenu-qt5', 'LD_PRELOAD': '/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-384/libGL.so', 'XDG_VTNR': '7', 'XDG_RUNTIME_DIR': '/run/user/1000', 'HOME': '/home/ubuntu', 'GNOME_KEYRING_PID': ''})

Sep 21 '18 06:09 Nara0731

Thanks! Yeah, so basically one way or another the OPENAI_LOGDIR and OPENAI_LOG_FORMAT do not make it to the python process environment variables. The fix is really easy - add

import os
os.environ['OPENAI_LOGDIR']='/tmp'
os.environ['OPENAI_LOG_FORMAT']='stdout,tensorboard'

to the very top of your Python script, and try running it again. Ideally, the tensorboard event files should show up in the /tmp/tb folder. Please let me know if that does not work for you.
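
To check for that output after a run, something like the following sketch could help. It assumes the event files live in the tb subfolder and have "tfevents" in their file name, which is how TensorFlow event files are usually named; the helper find_event_files is hypothetical and not part of baselines.

```python
import os


def find_event_files(logdir):
    """List files under <logdir>/tb whose name contains 'tfevents'."""
    tb_dir = os.path.join(logdir, "tb")
    if not os.path.isdir(tb_dir):
        return []  # the tb subfolder was never created
    return sorted(
        os.path.join(tb_dir, name)
        for name in os.listdir(tb_dir)
        if "tfevents" in name
    )


if __name__ == "__main__":
    logdir = os.environ.get("OPENAI_LOGDIR", "/tmp")
    files = find_event_files(logdir)
    print(files if files else "No event files under " + os.path.join(logdir, "tb"))
```

An empty result means either the tb folder is missing (the format setting did not reach the logger) or it exists but nothing has been written to it yet.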

Sep 22 '18 01:09 pzhokhov

I cannot find "/tmp/tb"; I only find "/tmp".

Sep 22 '18 10:09 Nara0731

okay; could you post here your python code please? Thanks!

Sep 24 '18 20:09 pzhokhov

Hi, I had the same problem and solved it like this. Just modify the code at line 209 of run.py:

if MPI is None or MPI.COMM_WORLD.Get_rank() == 0:
    rank = 0
    logger.configure(dir='./log', format_strs=['stdout', 'log', 'csv', 'tensorboard'])

Oct 09 '18 02:10 smalltingting

Really? I will try it.

Oct 09 '18 03:10 Nara0731

Hi @pzhokhov @smalltingting I configured the logger setting as you have mentioned. I see it created a directory called "tb". However, it is empty. Any idea what is going on?

I am using deepq example but I think it shouldn't matter. This is how I configure it in my code:

def main():
    logger.configure(dir='.log', format_strs=['stdout', 'log', 'csv', 'tensorboard'])

Oct 09 '18 21:10 srivatsankrishnan

@srivatsankrishnan does logger print anything on the screen / in the log file? Logger only saves data when a logger.dumpkvs() (or logger.dump_tabular()) is called, which by default happens fairly rarely in deepq. Could you try with --print_freq=1 option?
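
To illustrate why an empty tb folder is consistent with that, here is a toy stand-in written for this thread. It is not the baselines logger: the class BufferedLogger is hypothetical and only mimics the log-then-dump pattern, where values accumulate in memory and reach the file only when a dump is requested.

```python
import csv
import os


class BufferedLogger:
    """Toy model: key/values stay in memory until dumpkvs() is called."""

    def __init__(self, path):
        self.path = path
        self.pending = {}
        # The output file is created right away, but it starts out empty.
        open(self.path, "w").close()

    def logkv(self, key, value):
        # Only the in-memory buffer changes here; nothing is written to disk.
        self.pending[key] = value

    def dumpkvs(self):
        if not self.pending:
            return
        write_header = os.path.getsize(self.path) == 0
        keys = sorted(self.pending)
        with open(self.path, "a", newline="") as f:
            writer = csv.writer(f)
            if write_header:
                writer.writerow(keys)
            writer.writerow([self.pending[k] for k in keys])
        self.pending.clear()
```

In this toy version, a run that ends before the first dumpkvs() call leaves the file in place but empty, which is one way to end up with output folders that exist and contain nothing.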

Oct 11 '18 02:10 pzhokhov

Hi @pzhokhov, the only thing the logger prints on the screen is this message: "Logging to .log"

It creates the following folder structure in .log:

/logs
|------tb
|------log
|------progress

The tb folder is empty, and progress.csv is also empty. The "log" file (the file that gets created inside the directory) basically has the same message that was printed in the console ("Logging to .log"). I tried setting --print_freq=1, but the results are the same.

I tried to hack the code where I create my model (models.py) to explicitly export my graph so I can visualize it in TensorBoard. This is what I use:

import tensorflow as tf  # TensorFlow 1.x API

tf_writer = tf.summary.FileWriter(LOGDIR)  # LOGDIR: directory for the event file
tf_writer.add_graph(tf.get_default_session().graph)

But the graph is too complex, and I can't trace it to my input and output nodes (honestly, I am still trying to make sense of it and have not given up yet). I assume the functionality that you enable with the logger for tensorboard will be a more structured or methodical way to visualize it.

Oct 11 '18 03:10 srivatsankrishnan

Hi @srivatsankrishnan! Sorry about the lag. If progress.csv is empty, the tb/ subfolder is empty, and nothing interesting is printed on the screen, it means that either

- the training did not progress to the point where it would save anything (call logger.dump_tabular()), or
- something bad happened to the logger module.

Could you try running a simple test with deepq, for instance:

export OPENAI_LOG_FORMAT=stdout,csv,tensorboard
export OPENAI_LOGDIR=.log
python -m baselines.run --alg=deepq --env=CartPole-v0 --print_freq=1 --num_timesteps=1e5

If everything works correctly, this should generate a long output that looks like:

-----------------------------------
| % time spent exploring  | 2     |
| episodes                | 843   |
| mean 100 episode reward | 190.8 |
| steps                   | 99081 |
-----------------------------------

and the files progress.csv, 0.0.monitor.csv, log.txt, and the subfolder tb in .log. If that works but your case still does not, it probably means that the logger or its configuration is messed up. If the test above does not work, then something in your Python environment is not quite right; in that case, I'd recommend installing baselines in a clean virtualenv and trying again.
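
A small check on the result of that test could look like the sketch below. It assumes progress.csv is an ordinary CSV file whose first non-empty row is a header; the helper count_progress_rows is made up for illustration.

```python
import csv
import os


def count_progress_rows(logdir):
    """Return the number of data rows in <logdir>/progress.csv, or None if the file is missing."""
    path = os.path.join(logdir, "progress.csv")
    if not os.path.isfile(path):
        return None
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    # The first non-empty row is assumed to be the header.
    return max(len(rows) - 1, 0)


if __name__ == "__main__":
    print(count_progress_rows(".log"))
```

A result of None means the file was never created, 0 means the logger was set up but never dumped anything, and a positive number means the logger wrote at least one row.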

Oct 16 '18 23:10 pzhokhov

Hi @pzhokhov! No worries. This one works, and I see the logs and an event file being generated. When I open tensorboard, it only has scalars (% time spent exploring, episodes, rewards, etc.). I don't see a graph in tensorboard. I was interested in seeing the graph of the neural net model to determine the input and output nodes. I just hacked the code where I define the model to capture the graph, so in a way I was able to get what I wanted.

As you put it: in my case, I had set just 100 steps for my environment with --print_freq=1 to quickly capture the graph. Maybe it never got to the point where logger.dump_tabular() was called.

On a different note, is there a plan to support saving the model in native tensorflow format along with the graph (.pb)? The reason is that there are lots of interesting tools in tensorflow, and they basically require the model in one of these formats.

Oct 17 '18 03:10 srivatsankrishnan

oh now I see :) Yeah, the logger only saves scalars. As for long-term support of saving entire models in tensorflow / tensorboard support and serialization in general - this has been a subject of quite a bit of debate. We will likely support custom serialization functions (so that every use case can pick its poison), but I don't have a timeline for that. If you could provide an example of useful functionality that is missing by not saving data in tensorflow format, we can speed it up somewhat :)

Oct 17 '18 17:10 pzhokhov

Hi @pzhokhov, thanks for your reply. There are lots of tools in tensorflow to fine-tune inference performance: (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/python/tools) (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md) The basic requirement for using these tools is to have models saved in native tensorflow format (checkpoints, .pb, etc.). I am particularly interested in using these tools and was able to hack the code to save the model in native tensorflow format. I am currently facing some tensorflow-related issues but will be able to test it out once I resolve those.

I have one more useful functionality in mind, but it's orthogonal to this discussion. Maybe I will open a new issue for it to avoid mixing it up with this one.

Oct 18 '18 18:10 srivatsankrishnan

Was there any resolution to this issue? I've tried the same suggestions that have been listed so far (os.environ['OPENAI_LOGDIR'] = ... and os.environ['OPENAI_LOG_FORMAT'] = 'stdout,tensorboard'), and I can get those to be listed by print(os.environ), but I am not getting any file outputs. Any ideas?

Jun 29 '20 23:06 ryanmaxwell96

I just wanted to know how to move the logs from the /tmp directory to a directory of my choice, as I have to manually save the /tmp/openai-2022..... files to get the checkpoints for training.

PS: I am using multiple GPUs for training; I hope the suggested methods work for multi-GPU training.

Aug 01 '22 18:08 KomputerMaster64

Hello, I have received your message. If it is something important, please let me know by text message!

Aug 01 '22 18:08 Nara0731
