tensorboardX
Expose add_scalar(ndarray)
I'd like to log an entire array to tensorboard. This is supported by https://tensorboardx.readthedocs.io/en/latest/tensorboard.html#tensorboardX.SummaryWriter.add_scalar but is not exposed by tensorboardX. Can you please expose this functionality?
The workaround of invoking add_scalars() multiple times in a for loop is so slow that it is unusable. It takes minutes to plot data that should take milliseconds.
What is the typical size of your data? Sometimes the slowness is caused by tensorboard itself. Another question is how we would define the global_step (the x-axis value of a point) if an entire array is passed to add_numpy_array(). Should that be inferred implicitly from each element's position in the array?
@lanpa I just realized that I filed this bug report against the wrong codebase :) I thought that pytorch uses this library under the hood, but I see now that it uses torch.utils.tensorboard. I looked at https://github.com/lanpa/tensorboardX/blob/054f1f3aa5e8313be42450f5e9ce1fc1799252a7/tensorboardX/writer.py#L416 but I could not figure out how https://tensorboardx.readthedocs.io/en/latest/tensorboard.html#tensorboardX.SummaryWriter.add_scalar logs an ndarray to tensorboard. As far as I can tell, https://github.com/lanpa/tensorboardX/blob/054f1f3aa5e8313be42450f5e9ce1fc1799252a7/tensorboardX/writer.py#L457 forces the value to be a float or an ndarray that squeezes down to a single element.
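A quick way to check that behavior (a minimal sketch; the exact assertion message and line numbers may differ by version):

    import numpy as np
    from tensorboardX import SummaryWriter

    writer = SummaryWriter("runs/scalar_check")  # hypothetical log directory

    # A single-element array squeezes down to a 0-d scalar and is accepted.
    writer.add_scalar("works", np.array([0.5]), global_step=0)

    # A multi-element array cannot be squeezed to a scalar and is rejected.
    try:
        writer.add_scalar("fails", np.array([0.5, 0.7]), global_step=0)
    except AssertionError as err:
        print("multi-element arrays are not accepted:", err)

    writer.close()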
I'm going to explain what I am trying to do in case you are aware of a better way to do it.
I am trying to predict a time series consisting of 300 points per sample. I want to plot the predicted vs target output of each sample to tensorboard every validation step so I can visually inspect how predictions improve over time.
I've got ~2500 samples in my validation set, so I want to log 300 * 2500 = 750,000 points. Currently, it takes ~3 seconds per 10 samples logged. Since I cannot plot an entire sample at a time, I am forced to plot a single point of a sample at a time, as follows:
for index, tensor in enumerate(actual_predictions[:10]):
    for x, y in enumerate(tensor):
        self.logger.experiment.add_scalars(f"predictions/{index}",
                                           {f"val_actual/epoch/{self.current_epoch}": y},
                                           global_step=x)
Ideally, I want to invoke:
for index, tensor in enumerate(actual_predictions[:10]):
    self.logger.experiment.add_scalars(f"predictions/{index}",
                                       {"val_actual": tensor},
                                       global_step=self.current_epoch)
and have tensorboard plot the entire tensor as its own entry in the "Time Series" tab. Any ideas?
Hi, if I understand correctly, the data you want to plot has four dimensions: 1. time, 2. the value at that time, 3. different samples, 4. different training epochs. As far as I know, TensorBoard's scalar dashboard can show each trace as a two-dimensional slice of that data in a plot, and if you tag the plots correctly, you can overlay several traces in one plot. In your case, one trace in a plot would be the predicted time series, the other trace would be its corresponding ground truth, and for each validation epoch you would have ~2500 plots to look at, correct? Exposing the interface and writing hundreds of points at a time to the TensorBoard event file is easy; I think the real problem is how you look through that much visualized data.
I looked around the "Time Series" tab; that visualization looks very similar to the ordinary "Scalars" plot.
def add_numpy_array(tag, numpy_array_of_length_larger_than_one):
    # Proposed signature: write one scalar event per array element.
    pass
So I think the exposed function should infer the global_step implicitly from each element's index in the numpy array.
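A minimal sketch of those semantics, written as an external helper on top of the existing public add_scalar (the helper, the writer argument, and start_step are hypothetical; a native implementation inside the writer could batch the events instead of paying the per-call overhead shown here):

    import numpy as np
    from tensorboardX import SummaryWriter

    def add_numpy_array(writer, tag, values, start_step=0):
        # Hypothetical helper illustrating the proposed semantics:
        # one scalar event per element, with global_step taken from the element's index.
        values = np.asarray(values).ravel()
        for step, value in enumerate(values, start=start_step):
            writer.add_scalar(tag, float(value), global_step=step)

    writer = SummaryWriter("runs/array_demo")  # hypothetical log directory
    add_numpy_array(writer, "predictions/0", np.sin(np.linspace(0, 10, 300)))
    writer.close()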
@lanpa Your design doesn't match what I had in mind. I don't want a single plot to compare the performance of different samples. Instead, I want to:
- Evaluate the performance of the model for a specific sample at a specific epoch.
- Compare the performance of a single sample across epochs to see whether predictions improve over time (and how fast).
Visually, I think I want a graph similar to the attached screenshot, without the gray line.
- For each sample (time series) I want the x-axis to denote time and the y-axis to denote the value at time x.
- Each plot compares the expected vs predicted values.
- I would have one plot per sample per epoch.
Maybe there is a better way to represent this visually, but this is what I had in mind. I've already got this working in Tensorboard, but plotting is extremely slow.
Yes, you can infer global_step implicitly. That said, I don't see how you could implement the above function. As far as I can see, there is no way to plot an entire array in tensorboard. The only way I found is plotting one point at a time, which is very slow.
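For reference, one way to avoid per-point scalar writes entirely is to render each sample's predicted vs. target series as a matplotlib figure and log it with add_figure, which gives one plot per sample per epoch with a single event per plot. A rough sketch, using hypothetical names and dummy data:

    import matplotlib.pyplot as plt
    import numpy as np
    from tensorboardX import SummaryWriter

    def log_prediction_figure(writer, index, predicted, target, epoch):
        # Hypothetical helper: one figure (one event) per sample per epoch,
        # with time on the x-axis and predicted vs. target values overlaid.
        fig, ax = plt.subplots()
        ax.plot(target, label="target")
        ax.plot(predicted, label="predicted")
        ax.set_xlabel("time step")
        ax.set_ylabel("value")
        ax.legend()
        writer.add_figure(f"predictions/{index}", fig, global_step=epoch)

    writer = SummaryWriter("runs/val_figures")  # hypothetical log directory
    t = np.linspace(0, 1, 300)
    log_prediction_figure(writer, index=0, predicted=np.sin(6 * t) + 0.1, target=np.sin(6 * t), epoch=3)
    writer.close()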