Fix evaluation and other minor issues to adapt to multi-label classification
I tried to resolve the conflicts, but it seems that you have changed the structure a lot. Any suggestions on how to merge?
Yes, I refactored a lot. Basically, TensorFlow components are renamed to their original names with a TF suffix. Sorry, you will have to do a line-by-line merge, as I did when merging your commits into the new release. If you'd like to contribute, please rebase on dev.
By the way, since 2.1 officially depends on the wonderful huggingface transformers, it would also be great if you want to use their BERT. Here is some reference code: https://github.com/hankcs/HanLP/commit/1fe90f7040d591176712240285c8a514089ce73b
I eventually wrote my own script for the multi-label classification task. It basically uses a customized BCE with weights to deal with imbalanced classes, macro-F1 as the metric, and AdamW with AMSGrad enabled. With data augmentation added, performance improved. I will continue to use HanLP as an exploration tool and try to contribute when I can. Thanks!
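For anyone following along, here is a minimal sketch of what a weighted BCE loss and a macro-F1 metric could look like for multi-label outputs. This is not the script mentioned above; the function names `weighted_bce` and `macro_f1` and the per-label `pos_weight` scheme are assumptions made for illustration, written in plain Python so it runs without any framework. In PyTorch, the usual counterparts would be `torch.nn.BCEWithLogitsLoss(pos_weight=...)` and `torch.optim.AdamW(..., amsgrad=True)`.

```python
import math


def weighted_bce(logits, targets, pos_weight):
    """Mean binary cross-entropy over all (sample, label) pairs.

    logits, targets: lists of rows, one value per label.
    pos_weight[j] scales the loss of positive examples for label j,
    which is one common way to compensate for rare labels (assumed here).
    """
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for z, y, w in zip(row_logits, row_targets, pos_weight):
            # Numerically stable log(sigmoid(z)) and log(1 - sigmoid(z)).
            if z >= 0:
                log_p = -math.log1p(math.exp(-z))
                log_not_p = -z - math.log1p(math.exp(-z))
            else:
                log_p = z - math.log1p(math.exp(z))
                log_not_p = -math.log1p(math.exp(z))
            total += -(w * y * log_p + (1 - y) * log_not_p)
            count += 1
    return total / count


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over binary indicator rows.

    A label with no true and no predicted positives scores 0.0 here;
    other conventions exist.
    """
    num_labels = len(y_true[0])
    scores = []
    for j in range(num_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_labels
```

Macro averaging gives every label the same say in the final score regardless of how often it occurs, which is why it pairs naturally with a class-weighted loss on imbalanced data.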
Sure, feel free to explore 2.1.