HanLP Fix evaluation and other minor issues to adapt to multi-label classification

Dec 22 '20 06:12 callzhang

I tried to resolve the conflicts, but it seems that you have changed the structure a lot. Any suggestions on how to merge?

Jan 01 '21 08:01 callzhang

Yes, I refactored a lot. Basically, the TensorFlow components were renamed to their original names with a TF suffix. Sorry, you will have to do a line-by-line merge, as I did when merging your commits into the new release. If you'd like to contribute, please rebase on dev.

By the way, since 2.1 officially depends on the wonderful Hugging Face transformers, it would also be great if you used their BERT. Here is some reference code: https://github.com/hankcs/HanLP/commit/1fe90f7040d591176712240285c8a514089ce73b

Jan 01 '21 08:01 hankcs

I eventually wrote my own script for the multi-label classification task. Basically, it uses a customized BCE loss with class weights to deal with imbalanced classes, macro-F1 as the metric, and AdamW with AMSGrad enabled. Together with data augmentation, this improved performance. I will continue to use HanLP as an exploration tool and try to contribute when I can. Thanks!
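For illustration only, here is a minimal pure-Python sketch of the two ideas named above: a BCE loss whose positive term is weighted per label, and macro-F1 averaged over labels. This is not the actual script; the function names `weighted_bce` and `macro_f1` and the per-label `pos_weight` list are assumptions made for the example. In PyTorch the corresponding pieces would typically be `torch.nn.BCEWithLogitsLoss(pos_weight=...)` and `torch.optim.AdamW(params, amsgrad=True)`.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def weighted_bce(logits, targets, pos_weight):
    """Mean BCE over all (sample, label) pairs.

    pos_weight[j] scales the positive term of label j, so rare labels
    can be given a weight > 1 (an assumed weighting scheme).
    """
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for z, y, w in zip(row_logits, row_targets, pos_weight):
            # log(1 - sigmoid(z)) equals log(sigmoid(-z))
            total -= w * y * log_sigmoid(z) + (1 - y) * log_sigmoid(-z)
            count += 1
    return total / count


def macro_f1(preds, targets):
    """Unweighted mean of per-label F1; a label with no TP/FP/FN scores 0."""
    n_labels = len(targets[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for p, t in zip(preds, targets) if p[j] == 1 and t[j] == 1)
        fp = sum(1 for p, t in zip(preds, targets) if p[j] == 1 and t[j] == 0)
        fn = sum(1 for p, t in zip(preds, targets) if p[j] == 0 and t[j] == 1)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels
```

Because macro-F1 gives every label the same weight regardless of its frequency, it pairs naturally with a loss that up-weights rare positives.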

Jan 01 '21 08:01 callzhang

Sure, feel free to explore 2.1.

Jan 01 '21 08:01 hankcs