Avro2TF
Add an option to sort the feature list and use the sorted feature list as the mapping
In many cases, vectorizing with a sorted feature list ensures the same feature-to-index mapping across different Avro2TF runs.
Thanks @ypyuan666 for opening this issue for us.
I will help to take a look.
When do you need this feature from us?
It would be great if we could have the feature by the end of July. Thanks!
Got it, thanks @ypyuan666. I will add a ticket to our July sprint.
@ypyuan666 Can you detail what the sorting should be based on? Right now, we sort the feature list by the term/entry's frequency.
@zhangxuhong Can we have an option to sort alphabetically by name/term? The challenge I have is that many single-term features may be present in every record, which leads to ties by frequency. Also, if the training period changes, the frequencies may change.
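To illustrate the tie problem, here is a minimal Python sketch. It is not Avro2TF code: `build_mapping`, its `sort_by` parameter, and the sample records are hypothetical, and it assumes ties in frequency fall back to first-seen order, which may differ from what Avro2TF actually does.

```python
from collections import Counter


def build_mapping(records, sort_by="frequency"):
    """Map each (name, term) feature to an integer index.

    Hypothetical helper for illustration only; not part of Avro2TF.
    """
    # Count how often each feature occurs across all records.
    counts = Counter(feature for record in records for feature in record)
    if sort_by == "frequency":
        # Most frequent first. sorted() is stable, so tied features keep
        # the order in which they were first seen in the data.
        ordered = sorted(counts, key=lambda feature: -counts[feature])
    elif sort_by == "name":
        # Alphabetical by (name, term); independent of counts and data order.
        ordered = sorted(counts)
    else:
        raise ValueError(f"unknown sort_by: {sort_by!r}")
    return {feature: index for index, feature in enumerate(ordered)}


# Two runs over the same features; only the order within records differs.
run_a = [[("country", "us"), ("device", "ios")]] * 2
run_b = [[("device", "ios"), ("country", "us")]] * 2

# Both features appear in every record, so they tie on frequency.
freq_a = build_mapping(run_a, sort_by="frequency")
freq_b = build_mapping(run_b, sort_by="frequency")
name_a = build_mapping(run_a, sort_by="name")
name_b = build_mapping(run_b, sort_by="name")
```

In this sketch the frequency-based mappings differ between the two runs because the tie is broken by data order, while the name-based mappings are identical.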