vowpal_wabbit
feat: [py] Providing Basic support for Tensorboard and Tensorwatch
Usage
- Created the VWtoTensorboard, VWtoTensorwatchStreamer, VWtoTensorwatchClient and DFtoVWtoTBorTW classes in python/vowpalwabbit/DFtoVWtoTBorTW.py
- Also created an example (.ipynb) file in python/examples/iris-df-to-vw-to-tensorboard-tensorwatch
- Edited pyvw.py to add a .get_label method and modified vw.learn to support Tensorboard and Tensorwatch logs
- Added Iris.csv to python/examples/iris-df-to-vw-to-tensorboard-tensorboard
- DFtoVWtoTBorTW creates Tensorboard logs, Tensorwatch logs, or both for vw metrics from a pandas DataFrame, a DFtoVW object and a vw object
- VWtoTensorboard supports Tensorboard logs for vw metrics: average_loss, since_last, a graph of the vw reductions, and the arguments as text (for now)
- VWtoTensorwatchStreamer and VWtoTensorwatchClient support Tensorwatch logs for vw metrics: average_loss, since_last, label and prediction (for now)
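import pandas as pd
from vowpalwabbit import pyvw
from vowpalwabbit.DFtoVW import DFtoVW, Feature, MulticlassLabel
# Import path assumed from the file added in this PR
# (python/vowpalwabbit/DFtoVWtoTBorTW.py)
from vowpalwabbit.DFtoVWtoTBorTW import (
    VWtoTensorboard, VWtoTensorwatchStreamer,
    VWtoTensorwatchClient, DFtoVWtoTBorTW,
)

df_simple = pd.DataFrame({
    "f1": [1, 2, 3],
    "f2": [1.2, 2.3, 3.4],
    "f3": [0.2, 0.8, 0.01],
    "l": [1, 2, 3],
})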
label = MulticlassLabel(label="l")
features = [Feature(col) for col in ["f1", "f2", "f3"]]
df_to_vw = DFtoVW(df=df_simple, label=label, features=features)
vw = pyvw.vw('--oaa 3 -P 1')
# For tensorboard
logdir = './logs'
vw_to_tb = VWtoTensorboard(logdir)
# For tensorwatch
logfile = './iris.log'
vw_to_tw = VWtoTensorwatchStreamer(logfile)
df_to_tb_tw = DFtoVWtoTBorTW(df_to_vw.convert_df(), vw)
df_to_tb_tw.fit(vw_to_tensorboard=vw_to_tb, vw_to_tensorwatch=vw_to_tw)
vw_to_tb.draw_reductions_graph(vw)
vw_to_tb.show_args_as_text(vw)
vw_to_tw_client = VWtoTensorwatchClient(logfile)
vw_to_tw_client.plot_metrics()
# Both vw_to_tb and vw_to_tw objects can be passed to a single DFtoVWtoTBorTW instance
Tensorboard logs are created when a VWtoTensorboard instance is passed to the .fit method.
Tensorwatch logs are created when a VWtoTensorwatchStreamer instance is passed to the .fit method.
To view the Tensorboard logs, run tensorboard --logdir ./logs from the directory in which the logs were created.
For Tensorwatch visualization, use the .plot_metrics method of the Tensorwatch client, as above.
Implementation Details
- Create a DFtoVW object from a pandas DataFrame object
- Create a vw object, specifying -P 1 while initializing it
- Create a VWtoTensorboard instance, a VWtoTensorwatchStreamer instance, or both
- Create a DFtoVWtoTBorTW object by passing the examples from DFtoVW and the vw object as parameters
- Call the .fit() method of DFtoVWtoTBorTW to start the vw object learning. The Tensorboard and Tensorwatch arguments are optional and default to None: Tensorboard logs are written if a VWtoTensorboard instance is passed, and Tensorwatch logs are written if a VWtoTensorwatchStreamer instance is passed
VWtoTensorboard is used to log metrics for Tensorboard.
A .get_label() method is also added to pyvw.py, before the pyvw.get_prediction implementation, and vw.learn has been modified to support Tensorboard logs (average_loss, since_last, the graph of vw reductions, the arguments as text) and Tensorwatch logs (average_loss, since_last, label, prediction).
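For readers unfamiliar with the two scalar metrics, here is a minimal sketch of how average_loss and since_last relate to per-example losses. It assumes the usual meaning of VW's progress columns (a weighted running average over all examples, and a weighted average over the examples since the previous report). The ProgressiveLoss class and its method names are hypothetical and are not part of pyvw or of this PR.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ProgressiveLoss:
    """Hypothetical tracker for average_loss and since_last (not a pyvw API)."""
    total_loss: float = 0.0     # weighted loss over every example seen so far
    total_weight: float = 0.0   # sum of example weights seen so far
    window_loss: float = 0.0    # weighted loss since the last report
    window_weight: float = 0.0  # sum of example weights since the last report

    def update(self, loss: float, weight: float = 1.0) -> None:
        """Record the loss of one example."""
        self.total_loss += loss * weight
        self.total_weight += weight
        self.window_loss += loss * weight
        self.window_weight += weight

    def report(self) -> Tuple[float, float]:
        """Return (average_loss, since_last) and start a new reporting window."""
        average_loss = self.total_loss / self.total_weight if self.total_weight else 0.0
        since_last = self.window_loss / self.window_weight if self.window_weight else 0.0
        self.window_loss = 0.0
        self.window_weight = 0.0
        return average_loss, since_last


# Reporting after every example mirrors what -P 1 asks for.
tracker = ProgressiveLoss()
points = []
for example_loss in [1.0, 0.0, 1.0, 0.0]:
    tracker.update(example_loss)
    points.append(tracker.report())
```

With a report after every example, since_last is simply the loss of the latest example, which is presumably why the PR requires -P 1 to get one log point per example.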
Limitations
- Currently it only works for vw objects where -P 1 is specified
- Currently there is no .predict() method for predicting on a data set in DFtoVWtoTensorboard
- Methods for these label types do not seem to be in vw currently, so they couldn't be accommodated in pyvw.get_label():
  - 5: lMAX
  - 6: lCONDITIONAL_CONTEXTUAL_BANDIT
  - 7: lSLATES
  - 8: lCONTINUOUS
- Tensorboard logs are not created for the label and prediction metrics, because a line plot is not suitable for them in Tensorboard. Tensorwatch supports bar plots, so they are logged there
- self.finish_example is not called in vw.learn if an example instance is passed as a parameter, so calling vw.learn(ec, vw_to_tb), with ec an example instance and vw_to_tb a VWtoTensorboard instance, gives an error because the metrics are not calculated correctly
- Label and prediction are logged for Tensorwatch only when ec is passed as a string
This pull request introduces 1 alert when merging 2c8642eb77200ad552002acd170c0745c9bb23cf into 6b45001e89ed9b732fee0288d046c155f1c69e6d - view on LGTM.com
new alerts:
- 1 for Unused import
It seems like we could reasonably implement this externally to the core library? Adding a Tensorboard dependency seems pretty heavy if it can be avoided.
Hey @jackgerrits - could you expand a bit on how you think it would be layered? Are you thinking of creating a proxy object that drives the underlying vw object? Or were you thinking more along the lines of expecting a specific callback interface into the normal learning code, like we do now, just without specifically typing it, then implementing the TW/TB projections externally?
I am primarily referring to a sort of proxy object. It seems like the information required for this module to operate is available "externally" to the individual predict and learn calls. Therefore, providing a separate, and importantly optional, module which wraps the necessary predict and learn calls seems to be a more flexible design. Not needing to change pyvw.py at all seems doable to me and is the best outcome I think.
The biggest factor here is that adding tensorboardX and tensorwatch dependencies to the core vowpalwabbit library is not a viable path forward.
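To make the proxy idea concrete, the following is a rough sketch of an optional wrapper that drives the learn and predict calls and forwards metrics to pluggable sinks, so that pyvw.py stays untouched. Every name here (LoggingLearner, FakeLearner, the sink signature, the metrics function) is hypothetical and only illustrates the layering. A real version would wrap a vw object and read its metrics through whatever accessors pyvw exposes.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# A sink receives the step number and a dict of metric name -> value.
MetricSink = Callable[[int, Dict[str, float]], None]


class LoggingLearner:
    """Hypothetical proxy: wraps a learner and reports metrics after each learn call."""

    def __init__(
        self,
        learner: Any,
        metrics_fn: Callable[[Any], Dict[str, float]],
        sinks: Iterable[MetricSink] = (),
    ) -> None:
        self._learner = learner        # the wrapped object, e.g. a vw instance
        self._metrics_fn = metrics_fn  # reads metrics from the wrapped object
        self._sinks = list(sinks)      # Tensorboard/Tensorwatch adapters would go here
        self._step = 0

    def learn(self, example: Any) -> Any:
        """Delegate to the wrapped learner, then push metrics to every sink."""
        result = self._learner.learn(example)
        self._step += 1
        metrics = self._metrics_fn(self._learner)
        for sink in self._sinks:
            sink(self._step, metrics)
        return result

    def predict(self, example: Any) -> Any:
        """Predictions pass straight through; nothing is logged."""
        return self._learner.predict(example)


class FakeLearner:
    """Stand-in for a vw object so the sketch runs without VW installed."""

    def __init__(self) -> None:
        self.seen = 0

    def learn(self, example: str) -> None:
        self.seen += 1

    def predict(self, example: str) -> int:
        return 0


# Usage: an in-memory sink stands in for a Tensorboard or Tensorwatch writer.
history: List[Tuple[int, Dict[str, float]]] = []
proxy = LoggingLearner(
    FakeLearner(),
    metrics_fn=lambda learner: {"examples_seen": float(learner.seen)},
    sinks=[lambda step, metrics: history.append((step, metrics))],
)
for line in ["1 | f1:1 f2:1.2", "2 | f1:2 f2:2.3"]:
    proxy.learn(line)
```

With this layering, the Tensorboard and Tensorwatch dependencies would live only in the module that defines the sinks, and users who do not import that module never need them installed.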
Going to go ahead and close this. If you would like to revisit in future please feel free to reopen.