tsflex
be flexible in number of feature-function outputs
It is okay to be flexible in # of feature-function outputs, as long as it is consistent (which is a constraint that we already impose; e.g., hist4 => always returns 4 values).
=> We could lift the burden that we put on the user to always pass output_names when # function outputs > 1
If the number of outputs > 1 & no output_names are given => just add postfixes (e.g. _1, _2, ...) to the function name to form the feature names
IMO this is the best of both worlds;
- Users can still add more interpretable output names if they prefer to
- But they are not obligated to do so :) (will avoid a lot of wrapping functions in FuncWrapper)
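A minimal sketch of the fallback naming proposed above. This is plain Python, not tsflex code: `resolve_output_names` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Sequence


def resolve_output_names(
    func_name: str,
    n_outputs: int,
    output_names: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the feature names for a function with `n_outputs` outputs.

    Hypothetical helper: user-supplied names win; otherwise fall back to
    the function name, postfixed with _1, _2, ... when there is more than
    one output.
    """
    if output_names is not None:
        if len(output_names) != n_outputs:
            raise ValueError(
                f"{func_name} returned {n_outputs} outputs, but "
                f"{len(output_names)} output_names were given"
            )
        return list(output_names)
    if n_outputs == 1:
        # Single output: the function name itself is the feature name.
        return [func_name]
    # Multiple outputs and no names: generate postfixed names.
    return [f"{func_name}_{i}" for i in range(1, n_outputs + 1)]
```

With this, `resolve_output_names("hist4", 4)` would give `hist4_1` ... `hist4_4`, while passing explicit names keeps the more interpretable labels.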
A drawback of this is that we lose the deterministic behavior (in number of outputs) that we currently have.
What do you think @jonasvdd @emield12 ?
Great suggestion!
I just think it's a low-priority feature, as I think the added value is not huge.
I don't see wrapping functions with a FuncWrapper and passing output_names as a big burden :upside_down_face:.
On the other hand, I don't really understand the drawback you mention. Is this really a drawback? I see two possible ways to end up with a non-deterministic number of output features.
- The # of output features depends on function parameters. I don't really see this one as a problem, if you serialize your pipeline those function parameters will also be the same.
- The # of output features depends on the data. This is an issue and should not be allowed, IMHO. But I think this will throw an error now anyway, no? (Ohh, maybe I see an issue here... If dataset A always results in 3 output features and dataset B always in 4, you will not see any problem until you use a dataset C that combines data from A & B. But I consider this a very low-probability edge case; if we provide a clear error in that case, this will be alright.)
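To make the second case concrete, here is a rough sketch of what such a "clear error" could look like. It is not how tsflex does it; `apply_with_consistent_outputs` is a hypothetical helper that infers the number of outputs from the first window and raises as soon as a later window disagrees.

```python
import numpy as np


def apply_with_consistent_outputs(func, windows):
    """Apply `func` to every window and stack the results row-wise.

    Hypothetical helper: the expected number of outputs is taken from the
    first window; any later mismatch raises a descriptive ValueError.
    """
    expected = None
    rows = []
    for idx, window in enumerate(windows):
        out = np.atleast_1d(np.asarray(func(window)))
        if expected is None:
            expected = out.shape[0]
        elif out.shape[0] != expected:
            name = getattr(func, "__name__", repr(func))
            raise ValueError(
                f"{name} returned {out.shape[0]} outputs for window {idx}, "
                f"but {expected} for earlier windows; the number of outputs "
                "must not depend on the data"
            )
        rows.append(out)
    if not rows:
        return np.empty((0, 0))
    return np.vstack(rows)
```

A function such as `np.unique` would trip this check as soon as two windows contain a different number of distinct values, which is the dataset A / dataset B situation described above.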