Unable to Retrieve Embedding Arrays From TensorBoard Logs
I am encountering difficulties in retrieving embedding arrays that were logged using add_embedding from TensorBoard logs. I am unable to locate the actual embedding arrays. Below is a detailed description of the issue and the steps I have taken so far.
Steps to Reproduce
Logging Embeddings:
I used add_embedding to log embeddings in TensorBoard. Example code for logging embeddings:
from torch.utils.tensorboard import SummaryWriter
import numpy as np
# Create a SummaryWriter
log_dir = 'logs/embedding_example'
writer = SummaryWriter(log_dir)
# Generate some dummy embeddings
embedding_data = np.random.randn(100, 64) # 100 items with 64-dim embeddings
metadata = [f'Label {i}' for i in range(100)]
# Write the embeddings
writer.add_embedding(mat=embedding_data, metadata=metadata, global_step=1)
writer.close()
Attempting to Retrieve Embeddings:
I tried using EventAccumulator to load and parse the event files but was unable to locate the embedding arrays. Example code for extracting embeddings:
import os
import numpy as np
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
def extract_embeddings_from_log(log_dir):
    event_acc = EventAccumulator(log_dir, size_guidance={'tensors': 0})
    event_acc.Reload()
    embeddings = {}
    # Get tags for tensors (embeddings should be listed here)
    tensor_tags = event_acc.Tags()
    print(tensor_tags)
I would appreciate any guidance or suggestions on how to properly retrieve the embedding arrays logged using add_embedding. Specifically, I am looking for:
- Confirmation on whether add_embedding embeddings should be accessible through EventAccumulator.
- Corrections to my approach or alternative methods to extract the embeddings.
- Any additional information on the correct tags or structures to look for within the TensorBoard logs.
Environment Details:
- Framework: PyTorch
- Logging Library: TensorBoard
- TensorBoard Version: 2.16.2
- Python Version: 3.10
- Operating System: Ubuntu 22.04
Thank you for your assistance.
Embeddings are treated differently from other logs because they really belong to the projector plugin. As a result, they are not recorded alongside the other summaries; they are described in a separate file, projector_config.pbtxt, which is only read by the projector plugin.
I'm not sure exactly what you're trying to read out, but you may find success using something like this.
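import os
import tensorflow as tf
from google.protobuf import text_format
from tensorboard.plugins import projector

# Same directory that was passed to SummaryWriter in the question
logdir = "logs/embedding_example"

# Parse the projector config written next to the event files
with tf.io.gfile.GFile(
    os.path.join(logdir, "projector_config.pbtxt")
) as f:
    config2 = projector.ProjectorConfig()
    text_format.Parse(f.read(), config2)
print(config2)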
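If the goal is the raw arrays rather than the config, here is a minimal sketch. It assumes the PyTorch writer stored each embedding as a tab-separated tensors.tsv file in a subdirectory of the log directory (something like 00001/default/), and that the tensor_path entries in the parsed config point at those files. That layout is an assumption worth checking against your own log directory. The helper name load_embedding_tsvs is hypothetical, and only the standard library is used.

```python
import csv
from pathlib import Path


def load_embedding_tsvs(log_dir):
    """Return {relative path of each tensors.tsv: list of float rows}.

    Hypothetical helper: assumes embeddings were saved as tab-separated
    text files named tensors.tsv somewhere below log_dir.
    """
    log_dir = Path(log_dir)
    embeddings = {}
    # Search recursively so the exact <step>/<tag> subdirectory need not be known.
    for tsv_path in sorted(log_dir.rglob("tensors.tsv")):
        with open(tsv_path, newline="") as f:
            reader = csv.reader(f, delimiter="\t")
            # One row per item, one column per embedding dimension; skip blank lines.
            rows = [[float(value) for value in row] for row in reader if row]
        embeddings[tsv_path.relative_to(log_dir).as_posix()] = rows
    return embeddings
```

Calling load_embedding_tsvs('logs/embedding_example') should then give one entry per logged step and tag, and np.asarray(rows) turns any entry back into a NumPy array. The labels, if they were written, would sit in a metadata.tsv file next to each tensors.tsv under the same assumption.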