tensorboard
Batch accuracy and batch loss are not being plotted in browser or vscode plugin
This link in the bug report text did not work for me:
https://raw.githubusercontent.com/tensorflow/tensorboard/master/tensorboard/tools/diagnose_tensorboard.py
/home/user/work/diagnose_tensorboard.py:32: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
Diagnostics
Diagnostics output
--- check: autoidentify
INFO: diagnose_tensorboard.py version df7af2c6fc0e4c4a5b47aeae078bc7ad95777ffa
--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=12, micro=2, releaselevel='final', serial=0)
INFO: os.name: posix
INFO: os.uname(): posix.uname_result(sysname='Linux', nodename='beast', release='6.6.21-1-lts', version='#1 SMP PREEMPT_DYNAMIC Wed, 06 Mar 2024 16:59:55 +0000', machine='x86_64')
INFO: sys.getwindowsversion(): N/A
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: None
--- check: installed_packages
INFO: installed: tensorboard==2.16.2
INFO: installed: tensorflow==2.16.1
WARNING: no installation among: ['tensorflow-estimator', 'tensorflow-estimator-2.0-preview', 'tf-estimator-nightly']
INFO: installed: tensorboard-data-server==0.7.2
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '2.16.2'
--- check: tensorflow_python_version
2024-03-14 13:58:36.829200: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-14 13:58:36.851097: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-14 13:58:37.239354: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
INFO: tensorflow.__version__: '2.16.1'
INFO: tensorflow.__git_version__: 'v2.16.1-0-g5bc9d26649c'
--- check: tensorboard_data_server_version
INFO: data server binary: '/home/john/work/Sleep/.venv/lib/python3.12/site-packages/tensorboard_data_server/bin/server'
INFO: data server binary version: b'rustboard 0.7.2'
--- check: tensorboard_binary_path
INFO: which tensorboard: b'/home/john/work/Sleep/.venv/bin/tensorboard\n'
--- check: addrinfos
socket.has_ipv6 = True
socket.AF_UNSPEC = <AddressFamily.AF_UNSPEC: 0>
socket.SOCK_STREAM = <SocketKind.SOCK_STREAM: 1>
socket.AI_ADDRCONFIG = <AddressInfo.AI_ADDRCONFIG: 32>
socket.AI_PASSIVE = <AddressInfo.AI_PASSIVE: 1>
Loopback flags: <AddressInfo.AI_ADDRCONFIG: 32>
Loopback infos: [(<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('::1', 0, 0, 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 0))]
Wildcard flags: <AddressInfo.AI_PASSIVE: 1>
Wildcard infos: [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('0.0.0.0', 0)), (<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('::', 0, 0, 0))]
--- check: readable_fqdn
INFO: socket.getfqdn(): 'beast'
--- check: stat_tensorboardinfo
INFO: directory: /tmp/.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=1853, st_dev=44, st_nlink=2, st_uid=1000, st_gid=1000, st_size=40, st_atime=1710438727, st_mtime=1710438986, st_ctime=1710438986)
INFO: mode: 0o40777
--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['/home/john/work/Sleep/.venv/lib/python3.12/site-packages']; bad_roots (0): []
--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==2.1.0
asttokens==2.4.1
astunparse==1.6.3
bidict==0.23.1
biosppy==2.1.2
certifi==2024.2.2
charset-normalizer==3.3.2
colorama==0.4.6
colorlog==6.8.2
comm==0.2.2
contourpy==1.2.0
cycler==0.12.1
debugpy==1.8.1
decorator==5.1.1
dm-tree==0.1.8
easydev==0.13.1
edfio==0.4.0
executing==2.0.1
flatbuffers==24.3.7
fonttools==4.49.0
future==1.0.0
gast==0.5.4
google-pasta==0.2.0
grpcio==1.62.1
h5py==3.10.0
idna==3.6
ipykernel==6.29.3
ipython==8.22.2
jedi==0.19.1
Jinja2==3.1.3
joblib==1.3.2
jupyter_client==8.6.1
jupyter_core==5.7.2
keras==3.0.5
kiwisolver==1.4.5
lazy_loader==0.3
libclang==16.0.6
lightgbm==4.3.0
line-profiler==4.1.2
lxml==5.1.0
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.8.3
matplotlib-inline==0.1.6
mdurl==0.1.2
ml-dtypes==0.3.2
mne==1.6.1
namex==0.0.7
nest-asyncio==1.6.0
nolds==0.5.2
numpy==1.26.4
nvidia-cublas-cu12==12.3.4.1
nvidia-cuda-cupti-cu12==12.3.101
nvidia-cuda-nvcc-cu12==12.3.107
nvidia-cuda-nvrtc-cu12==12.3.107
nvidia-cuda-runtime-cu12==12.3.101
nvidia-cudnn-cu12==8.9.7.29
nvidia-cufft-cu12==11.0.12.1
nvidia-curand-cu12==10.3.4.107
nvidia-cusolver-cu12==11.5.4.101
nvidia-cusparse-cu12==12.2.0.103
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
opencv-python==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pandas==2.2.1
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
pip==24.0
platformdirs==4.2.0
pooch==1.8.1
prompt-toolkit==3.0.43
protobuf==4.25.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
Pygments==2.17.2
pyhrv==0.4.1
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyWavelets==1.5.0
pyzmq==25.1.2
requests==2.31.0
rich==13.7.1
scikit-learn==1.4.1.post1
scipy==1.12.0
seaborn==0.13.2
setuptools==69.2.0
shortuuid==1.0.13
six==1.16.0
spectrum==0.8.1
stack-data==0.6.3
tensorboard==2.16.2
tensorboard-data-server==0.7.2
tensorflow==2.16.1
termcolor==2.4.0
threadpoolctl==3.3.0
tornado==6.4
tqdm==4.66.2
traitlets==5.14.2
typing_extensions==4.10.0
tzdata==2024.1
urllib3==2.2.1
wcwidth==0.2.13
Werkzeug==3.0.1
wheel==0.43.0
wrapt==1.16.0
In vscode plugin and Firefox, the same issue:
Issue description
The batch_accuracy and batch_loss scalars are not being plotted. There is a single dot at the center of each chart, but this screenshot was taken after some 3100 batches, so a line should have been plotted for both.
Callbacks in my model.fit:
callbacks=[
    tf.keras.callbacks.TensorBoard(log_dir=LOG_PATH, update_freq="batch"),
    chkpt_callback,
],
Would you mind running tensorboard --inspect --logdir <your log directory>
and providing the results?
Sure!
inspect output
======================================================================
Processing event files... (this can take a few minutes)
======================================================================
Found event files in:
logs/train
logs/validation
These tags are in logs/train:
audio -
histograms -
images -
scalars -
tensor
batch_accuracy
batch_loss
epoch_accuracy
epoch_learning_rate
epoch_loss
keras
======================================================================
Event statistics for logs/train:
audio -
graph
first_step 0
last_step 0
max_step 0
min_step 0
num_steps 1
outoforder_steps []
histograms -
images -
scalars -
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor
first_step 0
last_step 0
max_step 9
min_step 0
num_steps 10
outoforder_steps [(1, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0)]
======================================================================
These tags are in logs/validation:
audio -
histograms -
images -
scalars -
tensor
epoch_accuracy
epoch_loss
evaluation_accuracy_vs_iterations
evaluation_loss_vs_iterations
======================================================================
Event statistics for logs/validation:
audio -
graph -
histograms -
images -
scalars -
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor
first_step 4744
last_step 8
max_step 47480
min_step 0
num_steps 49
outoforder_steps [(4744, 0), (9488, 1), (4691, 0), (9382, 1), (14073, 2), (18764, 3), (23455, 4), (28146, 5), (32837, 6), (37528, 7), (42219, 8), (46910, 9), (4691, 0), (9382, 1), (14073, 2), (18764, 3), (23455, 4), (28146, 5), (32837, 6), (37528, 7), (42219, 8), (46910, 9), (4744, 0), (9488, 1), (14232, 2), (18976, 3), (23720, 4), (28464, 5), (33208, 6), (37952, 7), (42696, 8), (47440, 9), (4748, 0), (9496, 1), (14244, 2), (18992, 3), (23740, 4), (28488, 5), (33236, 6), (37984, 7), (42732, 8), (47480, 9), (4719, 0), (9438, 1), (14157, 2), (18876, 3), (23595, 4), (28314, 5), (33033, 6), (37752, 7), (42471, 8)]
======================================================================
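For anyone reading the `outoforder_steps` lists in the inspect output: as far as I can tell, each pair is (previous step, next step) for two adjacent events where the step number went down. Here is a minimal sketch of that check. The function name `out_of_order_pairs` and the sample sequences are made up for illustration; this is not TensorBoard's own code.

```python
def out_of_order_pairs(steps):
    """Return (previous, current) pairs where the step number decreased."""
    return [(prev, cur) for prev, cur in zip(steps, steps[1:]) if cur < prev]


# Hypothetical sequence: one tag counting epochs 0, 1, 2, interleaved with
# another tag that is written at step 0 every time.
interleaved = [0, 0, 1, 0, 2, 0]
print(out_of_order_pairs(interleaved))  # [(1, 0), (2, 0)]

# A sequence whose steps never decrease yields no pairs.
print(out_of_order_pairs([0, 1, 2, 3]))  # []
```

A tag that is repeatedly written at the same step would produce pairs of this shape. Whether that is what is happening here is only a guess from the output, not a confirmed cause.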
Also, the display in Scalars is the same, and the single dot is updated with the latest value each time the 30-second update triggers.