hparams table not getting displayed when many hparams are being used
Consider Stack Overflow for getting support using TensorBoard—they have a larger community with better searchability:
https://stackoverflow.com/questions/tagged/tensorboard
Do not use this template for setup, installation, or configuration issues. Instead, use the “installation problem” issue template:
https://github.com/tensorflow/tensorboard/issues/new?template=installation_problem.md
To report a problem with TensorBoard itself, please fill out the remainder of this template.
Environment information (required)
Diagnostics output
--- check: autoidentify
INFO: diagnose_tensorboard.py version 4725c70c7ed724e2d1b9ba5618d7c30b957ee8a4
--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)
INFO: os.name: nt
INFO: os.uname(): N/A
INFO: sys.getwindowsversion(): sys.getwindowsversion(major=10, minor=0, build=14393, platform=2, service_pack='')
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: 'C:\\tensorflow_anduin'
--- check: installed_packages
INFO: installed: tensorboard==2.0.0
INFO: installed: tensorflow-gpu==2.0.0
INFO: installed: tensorflow-estimator==2.0.0
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '2.0.0'
--- check: tensorflow_python_version
2019-10-08 14:40:42.620638: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
INFO: tensorflow.__version__: '2.0.0'
INFO: tensorflow.__git_version__: 'v2.0.0-rc2-26-g64c3d382ca'
--- check: tensorboard_binary_path
INFO: which tensorboard: b'C:\\tensorflow_anduin\\Scripts\\tensorboard.exe\r\n'
--- check: readable_fqdn
INFO: socket.getfqdn(): '...'
--- check: stat_tensorboardinfo
INFO: directory: C:\Users\halle\AppData\Local\Temp\.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=3096224744103339, st_dev=2217911477, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1570538160, st_mtime=1570538160, st_ctime=1562760637)
INFO: mode: 0o40777
--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['C:\\tensorflow_anduin\\lib\\site-packages']; bad_roots (0): []
--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.7.1
adal==1.2.2
asn1crypto==0.24.0
astor==0.8.0
astroid==2.2.5
avro-python3==1.9.1
azure-common==1.1.23
azure-graphrbac==0.53.0
azure-keyvault==1.1.0
azure-mgmt-authorization==0.51.1
azure-mgmt-containerregistry==2.7.0
azure-mgmt-keyvault==1.1.0
azure-mgmt-msi==0.2.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-resource==2.2.0
azure-mgmt-storage==3.1.1
azure-nspkg==3.0.2
azure-storage-blob==1.5.0
azure-storage-common==1.4.2
blinker==1.4
boto3==1.9.238
botocore==1.12.238
cachetools==3.1.1
certifi==2019.9.11
cffi==1.12.3
chardet==3.0.4
Click==7.0
click-completion==0.5.1
clipboard==0.0.4
colorama==0.3.9
cryptography==2.7
cycler==0.10.0
docker==3.7.3
docker-pycreds==0.4.0
docutils==0.15.2
Flask==1.1.1
flatten-json==0.1.7
gast==0.2.2
gitdb2==2.0.6
GitPython==2.1.14
google-api-core==1.14.2
google-auth==1.6.3
google-cloud-core==1.0.3
google-cloud-kms==1.2.1
google-cloud-storage==1.20.0
google-pasta==0.1.7
google-resumable-media==0.4.1
googleapis-common-protos==1.6.0
grpc-google-iam-v1==0.12.3
grpcio==1.22.0
h5py==2.9.0
httplib2==0.14.0
humanize==0.5.1
idna==2.8
imageio==2.5.0
isodate==0.6.0
isort==4.3.21
itsdangerous==1.1.0
Jinja2==2.10.1
jmespath==0.9.4
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
lazy-object-proxy==1.4.1
lockfile==0.12.2
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
mccabe==0.6.1
missinglink==19.9.26557
missinglink-kernel==19.9.26893
missinglink-sdk==19.9.26893
ml-core==19.9.3999
ml-crypto==0.7.811
ml-legit==19.9.8734
msgpack==0.6.2
msrest==0.6.10
msrestazure==0.6.2
mypy==0.711
mypy-extensions==0.4.1
natsort==6.0.0
netifaces==0.10.9
numpy==1.17.2
oauthlib==3.1.0
opt-einsum==2.3.2
pandas==0.25.1
patsy==0.5.1
pep8==1.7.1
Pillow==6.1.0
pip==19.2.3
ply==3.11
protobuf==3.8.0
psutil==5.6.3
puremagic==1.5
pyasn1==0.4.7
pyasn1-modules==0.2.6
pycparser==2.19
pycryptodome==3.6.6
Pygments==2.4.2
PyJWT==1.7.1
pylint==2.3.1
pyparsing==2.4.0
pyperclip==1.7.0
pypiwin32==223
python-dateutil==2.8.0
pytz==2019.2
pywin32==225
PyYAML==5.1.1
requests==2.22.0
requests-oauthlib==1.2.0
retrying==1.3.3
rope==0.14.0
rsa==4.0
s3transfer==0.2.1
scipy==1.3.0
sentry-sdk==0.11.2
setuptools==41.0.1
shellingham==1.3.1
six==1.12.0
smmap2==2.0.5
sseclient==0.0.24
statsmodels==0.10.1
tensorboard==2.0.0
tensorflow-estimator==2.0.0
tensorflow-gpu==2.0.0
termcolor==1.1.0
terminaltables==3.1.0
tqdm==4.32.2
typed-ast==1.4.0
urllib3==1.24.3
wcwidth==0.1.7
websocket-client==0.56.0
Werkzeug==0.16.0
wheel==0.33.4
wrapt==1.11.2
- Browser: Chrome 76.0.3809.132
Issue description
If I use many hparams (e.g. 14) in TensorBoard, the hparams table doesn't display any results, although the table header is displayed correctly.
But when I delete some of the entries in the HPARAMS list, the row in the hparams table and the accuracy are displayed correctly.
HPARAMS = [
    HP_BATCH_SIZE,
    HP_OPTIMIZER,
    HP_W_PARAM_0,
    HP_W_PARAM_1,
    HP_W_PARAM_2,
    HP_W_PARAM_3,
    HP_CONV1_FILTER,
    HP_CONV1_KERNEL,
    HP_CONV2_FILTER,
    HP_CONV2_KERNEL,
    HP_CONV3_FILTER,
    HP_CONV3_KERNEL,
    HP_Conv_UP_1_UNITS,
    HP_Conv_UP_2_UNITS,
]

with file_writer.as_default():
    hp.hparams_config(
        hparams=HPARAMS,
        metrics=METRICS,
    )
    hp.hparams(hparams)
@Asorie Can you please try running this script in your notebook and let me know if you are facing the same issue?
I tried the script, but in section 4 the line %tensorboard --logdir logs/hparam_tuning produces an error:
ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
Traceback (most recent call last):
  File "/usr/local/bin/tensorboard", line 10, in <module>
    sys.exit(run_main())
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/main.py", line 64, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 220, in main
    server = self._make_server()
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 299, in _make_server
    self.assets_zip_provider)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 160, in standard_tensorboard_wsgi
    flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 228, in TensorBoardWSGIApp
    return TensorBoardWSGI(tbplugins, flags.path_prefix)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 279, in __init__
    raise ValueError('Duplicate plugins for name %s' % plugin.plugin_name)
ValueError: Duplicate plugins for name projector
It's because there might be multiple versions of TensorBoard installed on your system. Please find my GitHub gist here
I am able to see all the hyperparameters in TensorBoard using TensorFlow 2.0. There might be an issue with your TensorBoard installation. Please try to run the same script on your system and see whether the hparams are displayed. Thanks!
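One way to check for the duplicate-install situation mentioned above is to list the installed distributions that provide a `tensorboard` package. This is only a rough sketch: the helper names and the set of candidate distribution names are assumptions made for illustration, not part of TensorBoard.

```python
from importlib import metadata

# Distribution names assumed (for this sketch) to install a `tensorboard` package.
CANDIDATES = {"tensorboard", "tb-nightly", "tensorflow-tensorboard"}


def normalize(name):
    """Lower-case a distribution name and treat '_' and '.' like '-'."""
    return name.lower().replace("_", "-").replace(".", "-")


def find_tensorboard_dists(names):
    """Return the sorted subset of `names` that match a candidate distribution."""
    return sorted({normalize(n) for n in names if n and normalize(n) in CANDIDATES})


def installed_names():
    """Names of all distributions visible to the current interpreter."""
    return [dist.metadata["Name"] for dist in metadata.distributions()]


if __name__ == "__main__":
    found = find_tensorboard_dists(installed_names())
    print("TensorBoard distributions found:", found)
    if len(found) > 1:
        print("More than one is installed; uninstall all of them and reinstall one.")
```

If more than one name is printed, removing all of them and reinstalling a single one is the usual way to get rid of the "Duplicate plugins" error.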
The script works. I think the problem is that I tried to add new HP and write the logs to an already used tensorboard.
Yes. So, I think the problem here is resolved?
Not really. I think TensorBoard should look at the HPs used and add new ones to the table when a new HP is found.
If this line:
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))
gets changed to:
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'RMSprop']))
and the model is then trained to the same logdir, TensorBoard doesn't add this new model to the hparams table.
So it isn't possible to dynamically change the possible hparams in the same logdir?
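A possible workaround, based only on the observation above that the script works with an unused logdir, is to derive the log directory from the hparam names and domains so that every change lands in a fresh directory. The sketch below is an assumption, not a confirmed fix; `logdir_for` is a hypothetical helper and is not part of the hparams API.

```python
import hashlib
import json
import os


def logdir_for(root, hparam_domains):
    """Derive a log directory name from the hparam names and their domains.

    `hparam_domains` maps an hparam name to the list of values it may take.
    Changing a name or a domain yields a different directory.
    """
    canonical = json.dumps(
        {name: sorted(map(str, values)) for name, values in hparam_domains.items()},
        sort_keys=True,
    )
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:8]
    return os.path.join(root, "config_" + digest)


old = logdir_for("logs/hparam_tuning", {"optimizer": ["adam", "sgd"]})
new = logdir_for("logs/hparam_tuning", {"optimizer": ["adam", "sgd", "RMSprop"]})
print(old)
print(new)  # a different directory, because the domain changed
```

The drawback is that runs with different configurations then live in separate logdirs, so this does not answer the question of changing hparams within one logdir.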
I think I'm facing the same issue. Any updates here?
This issue, in particular, https://github.com/tensorflow/tensorboard/issues/2743#issuecomment-542057891, very much reminds me of #3597. There, the problem is that mixed-type (string + float, meaning some models use a string value, others a numerical value) parameters are all cast to string, but the filter in list_session_groups.py doesn't take that casting into account - it looks for 2.0 and doesn't find "2.0". As a result, only models with string parameter values are found - the other ones just don't show up. I have never used hp.HParam myself, so I cannot say if the two HP_OPTIMIZERs are seen as different types, but it sure feels like a similar issue.
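The mismatch described for #3597 can be illustrated in plain Python. This is a simplified model of the behaviour as described in this comment, not the actual code in list_session_groups.py; the function names and the value "auto" are made up for the example.

```python
def matches_naive(stored_values, wanted):
    """Compare the filter value directly against stored (string-cast) values."""
    return wanted in stored_values


def matches_cast(stored_values, wanted):
    """Cast the filter value to string first, mirroring how the values were stored."""
    return str(wanted) in stored_values


# Mixed-type hparam: one model used the float 2.0, another the string "auto".
# Both end up stored as strings.
stored = {str(v) for v in (2.0, "auto")}

print(matches_naive(stored, 2.0))     # False: 2.0 is not equal to "2.0"
print(matches_naive(stored, "auto"))  # True: string values are still found
print(matches_cast(stored, 2.0))      # True once the filter value is cast too
```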
I'm having the same issue. I am using torch + PPO in RLlib, and only half of my hyperparams show up in TensorBoard.
I've had the same issue with TensorboardX, the reason was that the metric name contained a whitespace.
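If whitespace in metric names is the cause, a small helper can normalize the names before they are logged. This is a minimal sketch; `sanitize_tag` is a hypothetical name, and whether the same cause applies outside TensorboardX is not confirmed in this thread.

```python
import re


def sanitize_tag(name):
    """Collapse each run of whitespace in a metric name into one underscore."""
    return re.sub(r"\s+", "_", name.strip())


print(sanitize_tag("val accuracy"))  # val_accuracy
```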
I hit the same issue: it appears when the number of hparams gets large.
Still an issue for me. Really annoying. Anyone have a solution?
I guess I will try to write the hparams structure to another file and replace it every time I change something. Not sure this works, though.
The original issue description here suggests the issue appears when "many hparams" are used. Then later it seems to be that users are trying to "add new HP and write the logs to an already used tensorboard".
So I'm not sure I'm understanding what the issue is. Are you logging more hparams data to the same log dir, and you want TB to read it? Does starting tensorboard again like tensorboard --logdir path/to/logs show everything you want to see? Do you have a small example to reproduce the issue?
@arcra it probably covers only one aspect of this issue, but https://github.com/tensorflow/tensorboard/issues/3597#issuecomment-1490793918 has very specific repro steps that I created "only" 7 months ago ("only" compared to the 4 years that this issue has been open).