reinforcement_learning
feat: Initial Support for FlatBuffer context
Add support for using FlatBuffer spans as input to RLClientLib when using a VW model.
Note: This currently works only with CB models when using choose_rank().
- [x] Fix memory leak when providing bad buffer (e.g. JSON) to a Flatbuffer-configured LiveModel
- [x] Figure out why the build is breaking on fb_parser_test
- reinforcement_learning builds on ubuntu1804; vowpal_wabbit builds on ubuntu2004
- [ ] Verify whether CA models work out of the box with Flatbuffer contexts
- [ ] Once no more changes to VW are needed, point the submodule back at a trunk commit of VW
Stretch:
- [ ] Support RequestDecision/MultislotDecision (need to implement context introspection for FB contexts)
- [ ] (unlikely) Support Episodic (need to implement history generation / merge; doing this right requires a fair bit of engineering)