
AE affects APC

yassmine-lam opened this issue 3 years ago • 14 comments

Hi,

Thank you for sharing your code with us. As I understand it, the APC results are affected by the AE results, aren't they? You use the extracted aspect terms to identify the sentiment polarities instead of using the gold terms. But what if the AE results are very low? Wouldn't that hurt the APC performance?

Thank you

yassmine-lam avatar May 04 '21 21:05 yassmine-lam

Yes, but the impact on APC should be limited. This is an empirical conclusion; you can conduct experiments yourself if you want to verify it.
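For instance, a tiny illustration of such an experiment (toy numbers and hypothetical names, nothing here comes from this repo): score APC once on the gold aspect terms and once on the AE-extracted terms, then compare the gap.

```python
# Toy sketch of the experiment above: score APC on gold aspect spans and
# on AE-extracted spans, and compare. All names/numbers are made up.
def apc_accuracy(predictions, golds):
    """Fraction of aspects whose predicted polarity matches the gold one."""
    return sum(p == g for p, g in zip(predictions, golds)) / max(len(golds), 1)

# Polarity encoding: 0 = negative, 1 = neutral, 2 = positive.
gold_polarities     = [2, 0, 1, 2]
preds_on_gold_spans = [2, 0, 1, 2]  # APC fed the gold aspect terms
preds_on_ae_spans   = [2, 0, 0, 2]  # APC fed the terms AE extracted

print(apc_accuracy(preds_on_gold_spans, gold_polarities))  # 1.0
print(apc_accuracy(preds_on_ae_spans, gold_polarities))    # 0.75
# The gap between the two scores measures AE -> APC error propagation.
```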

yangheng95 avatar May 10 '21 09:05 yangheng95

Thank you for your reply.

I tested this model on a dataset in a language other than English and Chinese. When I used the multilingual BERT model I achieved high results, but when I used a monolingual model I obtained very low results (F1-score = 0 for the ATE task!), which is very strange. Normally monolingual models are better than multilingual ones, since they have a larger vocabulary for the target language, aren't they? Do you have any idea, please?

Thank you

yassmine-lam avatar Aug 04 '21 07:08 yassmine-lam

Which pretrained model do you use, and can you share a visualization of this problem (e.g., a code block)?

yangheng95 avatar Aug 04 '21 07:08 yangheng95

Note that this repo is hard-coded to use BERTPretrainedModel and its tokenizer; you may need to alter it to use AutoModel and AutoTokenizer instead.
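A minimal sketch of that swap, assuming a standard Hugging Face transformers setup (the checkpoint name below is only an example):

```python
# Minimal sketch: replace the hard-coded BERT classes with the Auto*
# classes so any compatible checkpoint can be loaded.
from transformers import AutoModel, AutoTokenizer

pretrained = "aubmindlab/bert-base-arabertv01"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModel.from_pretrained(pretrained)
```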

yangheng95 avatar Aug 04 '21 08:08 yangheng95

Hi,

I replaced the multilingual BERT model with aubmindlab/bert-base-arabertv01, and I also used AutoModel and AutoTokenizer in your code.

As I said, it gave me 0 for ATE and a low accuracy for APC.

[Screenshot attached (Aug 06 '21): training output showing the low ATE/APC scores.]

Thank you

yassmine-lam avatar Aug 06 '21 07:08 yassmine-lam

I don't have the dataset to debug with. Did you prepare the dataset in the provided format? I received a similar report that was caused by mis-annotation and incorrect label usage.
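For reference, the bundled .atepc data files use a token-per-line layout with three whitespace-separated columns (token, IOB aspect tag, polarity), roughly like the sketch below. The sentence is made up, and the exact polarity encoding should be checked against the files shipped with this repo; here -1 marks non-aspect tokens and aspect tokens carry the polarity label.

```
the      O      -1
battery  B-ASP   2
life     I-ASP   2
is       O      -1
great    O      -1
```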

yangheng95 avatar Aug 16 '21 08:08 yangheng95

Yes, you were right; there was a problem with the data format. I fixed it, but the accuracy is still very low with the monolingual BERT model compared to the multilingual one.

I really cannot understand that, because monolingual models are generally better than multilingual ones.

Do you have any idea, please? Thank you.

yassmine-lam avatar Sep 04 '21 08:09 yassmine-lam

Hi, I suggest you share your code on GitHub so I can review it. Otherwise I might have no idea where the problem comes from.

yangheng95 avatar Sep 04 '21 08:09 yangheng95

Thank you for your effort to help us fix these errors. I am working on Google Colab, so I shared the notebook and the code folder with you (my email address: [email protected]) to allow you to reproduce the results.

Thank you again for your effort.

yassmine-lam avatar Sep 04 '21 19:09 yassmine-lam

Did you solve the problem?

Astudnew avatar Sep 26 '21 04:09 Astudnew

Hi, unfortunately I am working on improving PyABSA and this repo is more or less out of maintenance. You can try PyABSA, which solves some dataset-related problems. Or you can provide me with a sample of your dataset so I can analyze it.
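For reference, a rough sketch of ATEPC training in PyABSA, based on the v1-era README; the API has changed across releases, so treat the exact imports and names below as assumptions and check the current PyABSA documentation.

```python
# Rough sketch of ATEPC training with PyABSA (v1-era API; names may have
# changed in later releases -- check the PyABSA README).
from pyabsa.functional import ATEPCModelList, ATEPCConfigManager
from pyabsa.functional import ABSADatasetList, ATEPCTrainer

config = ATEPCConfigManager.get_atepc_config_multilingual()
config.model = ATEPCModelList.LCF_ATEPC
config.pretrained_bert = "bert-base-multilingual-uncased"

# Train on a bundled dataset, or point `dataset` at your own folder
# laid out in the ATEPC format.
aspect_extractor = ATEPCTrainer(
    config=config, dataset=ABSADatasetList.Laptop14
).load_trained_model()
```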

yangheng95 avatar Sep 26 '21 04:09 yangheng95

I clicked the close button accidentally; I look forward to your reply.

yangheng95 avatar Sep 26 '21 04:09 yangheng95

@Phd-Student2018 No, not yet. Have you?

yassmine-lam avatar Nov 08 '21 06:11 yassmine-lam

I found no obvious error in your data. Maybe you can debug via PyCharm, etc., to see what happens during tokenization (I suspect the problem lies in the tokenization, or in using an incompatible tokenizer and model).
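One quick sanity check (illustrative; the sample sentence is a placeholder, use lines from your own dataset): tokenize a few sentences and look for [UNK] tokens, since an incompatible tokenizer often maps most tokens to [UNK], which would explain an ATE F1 of 0.

```python
# Quick sanity check: count how many tokens come back as [UNK].
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv01")
tokens = tokenizer.tokenize("النص العربي هنا")  # placeholder sample sentence
print(tokens)
print("UNK count:", tokens.count(tokenizer.unk_token))
```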

yangheng95 avatar Nov 08 '21 15:11 yangheng95