reinforcement_learning feat: Initial Support for FlatBuffer context

Open lokitoth opened this issue 1 year ago • 0 comments

Add support for using FlatBuffer spans as input to RLClientLib when using a VW model.

Note: This currently works only with CB models when using choose_rank().

[x] Fix memory leak when providing a bad buffer (e.g. JSON) to a FlatBuffer-configured LiveModel
[x] Figure out why the build is breaking on fb_parser_test
- reinforcement_learning builds on ubuntu1804; vowpal_wabbit builds on ubuntu2004
[ ] Verify whether CA models work out of the box with FlatBuffer contexts
[ ] When no more changes to VW are needed, point the submodule back at a trunk commit of VW

Stretch:

[ ] Support RequestDecision/MultislotDecision (need to implement context introspection for FB contexts)
[ ] (unlikely) Support Episodic (need to implement history generation / merge; to do this right requires a fair bit of engineering)

Feb 14 '24 19:02 lokitoth