
How to process data when each event has a different number of available actions in VW CB?

Open • ruotaozhang opened this issue 3 years ago • 1 comment

Hi,

I was looking at the VW CB example given here.

In the example data, each event (i.e., row) has the same set of available actions, [1,2,3,4,5]. Each action has a feature called "topic", and the topics of the 5 actions are extracted from the original JSON log to form 5 columns. These 5 columns are then featurized and combined into a single column named "feature".

My question is: what if my data has a different set of actions for each event (i.e., row) in the JSON log? For instance, row 1 has two available actions [1,2], row 2 has three available actions [2,3,4], and row 3 has one available action [1]. In this case, how do I extract the action features and form the final "feature" column for the CB estimator?

Thank you!

AB#1837274

ruotaozhang, Jun 20 '22 13:06

Working on it offline. We are waiting for the next VW release to get the required changes into the Java binding.

eisber, Jul 09 '22 19:07
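
To make the data shape in the question concrete, here is a minimal sketch of how events with different action sets can be written in VW's plain-text action-dependent-features (cb_adf) format, where each event is one shared line followed by one line per available action. This is not the SynapseML API, and the thread does not say whether the CB estimator can consume this layout; per the reply above, support depends on changes to the Java binding. The helper `to_cb_adf_example`, the `User` and `Action` namespace names, and all feature values are hypothetical.

```python
from typing import Dict, Optional


def to_cb_adf_example(
    shared: Dict[str, str],
    actions: Dict[int, Dict[str, str]],
    chosen: Optional[int] = None,
    cost: Optional[float] = None,
    prob: Optional[float] = None,
) -> str:
    """Render one event as a multi-line VW cb_adf text example.

    The number of action lines simply follows the number of actions
    available for this event, so events need not share an action set.
    Feature names/values must not contain spaces, '|' or ':'.
    """

    def feats(d: Dict[str, str]) -> str:
        # "key=value" is treated by VW as a single categorical feature name.
        return " ".join(f"{k}={v}" for k, v in d.items())

    lines = [f"shared |User {feats(shared)}"]
    for action_id, action_feats in actions.items():
        # Only the logged (chosen) action carries a label: 0:cost:probability.
        label = f"0:{cost}:{prob} " if action_id == chosen else ""
        lines.append(f"{label}|Action id={action_id} {feats(action_feats)}")
    return "\n".join(lines)


# The three rows from the question, with made-up topics, costs and probabilities.
events = [
    to_cb_adf_example({"id": "a"}, {1: {"topic": "sports"}, 2: {"topic": "news"}},
                      chosen=2, cost=1.0, prob=0.5),
    to_cb_adf_example({"id": "b"}, {2: {"topic": "news"}, 3: {"topic": "music"},
                                    4: {"topic": "food"}},
                      chosen=3, cost=0.0, prob=0.25),
    to_cb_adf_example({"id": "c"}, {1: {"topic": "sports"}},
                      chosen=1, cost=1.0, prob=1.0),
]

# In a VW data file, consecutive multi-line examples are separated by a blank line.
vw_text = "\n\n".join(events)
```

The point of the sketch is that the action features live on per-action lines rather than in a fixed number of columns, so row 1 produces two action lines, row 2 three, and row 3 one.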