Avro2TF

Add an option to sort the feature list and use the sorted feature list as the mapping

Open ypyuan666 opened this issue 5 years ago • 5 comments

In many cases, using a sorted feature list for vectorization ensures the same mapping across different Avro2TF runs.

ypyuan666 avatar Jun 25 '19 16:06 ypyuan666

Thanks @ypyuan666 for opening this issue for us.

I will help to take a look.

When do you need this feature from us?

cyzhangchenya avatar Jun 25 '19 17:06 cyzhangchenya

It would be great to have this feature by the end of July. Thanks!

ypyuan666 avatar Jun 25 '19 18:06 ypyuan666

Got it, thanks @ypyuan666. I will add a ticket to our July sprint.

cyzhangchenya avatar Jun 25 '19 18:06 cyzhangchenya

@ypyuan666 Can you describe what the sorting should be based on? Right now, we sort the feature list by each term/entry's frequency.

zhangxuhong avatar Jun 25 '19 20:06 zhangxuhong

@zhangxuhong Can we have an option to sort alphabetically by name/term? The challenge I have is that many single-term features may be present in every record, which leads to ties when sorting by frequency. Also, if the training period changes, the frequencies may change, and the mapping with them.

ypyuan666 avatar Jun 25 '19 22:06 ypyuan666
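
To make the tie problem described above concrete, here is a minimal Python sketch. It is not Avro2TF's implementation: the `build_mapping` function, its `sort_by` parameter, and the sample records are all hypothetical and chosen only for illustration. It assumes that frequency ties are broken by first-seen order, which is one plausible behavior, and shows that alphabetical ordering yields the same mapping when the input order changes.

```python
from collections import Counter


def build_mapping(records, sort_by="frequency"):
    """Map each feature term to an integer index (hypothetical helper).

    records: iterable of lists of terms (assumed input shape).
    sort_by: "frequency" (most frequent first) or "name" (alphabetical).
    """
    # Counter keeps terms in first-seen order (dict ordering, Python 3.7+).
    counts = Counter(term for record in records for term in record)
    if sort_by == "name":
        ordered = sorted(counts)
    elif sort_by == "frequency":
        # sorted() is stable, so tied terms keep their first-seen order,
        # which depends on the order of the input records.
        ordered = sorted(counts, key=lambda term: -counts[term])
    else:
        raise ValueError(f"unknown sort_by: {sort_by!r}")
    return {term: index for index, term in enumerate(ordered)}


# Two runs over the same terms, seen in a different order.
# Every term appears in every record, so all frequencies tie.
run_a = [["age", "country", "device"], ["age", "country", "device"]]
run_b = [["device", "country", "age"], ["device", "country", "age"]]

freq_a = build_mapping(run_a, "frequency")
freq_b = build_mapping(run_b, "frequency")
name_a = build_mapping(run_a, "name")
name_b = build_mapping(run_b, "name")

print(freq_a == freq_b)  # False: ties resolve differently per run
print(name_a == name_b)  # True: alphabetical order is input-independent
```

In this sketch the alphabetical mapping depends only on the set of terms, so it stays the same as long as the vocabulary does, whereas the frequency-based mapping can shift with either record order or a different training period.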